Alternative identifier:
(KITopen-DOI) 10.5445/IR/1000146629
Related identifier:
Creator(s):
Bach, Jakob https://orcid.org/0000-0003-0301-2798 [Institut für Programmstrukturen und Datenorganisation (IPD), Karlsruher Institut für Technologie (KIT)]

Iser, Markus [Institut für Informationssicherheit und Verlässlichkeit (KASTEL), Karlsruher Institut für Technologie (KIT)]
Contributors:
-
Title:
Experimental Data for the Paper "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"
Further titles:
-
Description:
(Abstract) These are the experimental data for the paper: Bach, Jakob, Markus Iser, and Klemens Böhm. "A Comprehensive Study of k-Portfolios of Recent SAT Solvers", published at the conference [*SAT 2022*](http://satisfiability.org/SAT22/). You can find the paper [here](https://www.doi.org/10.4230/LIPIcs.SAT.2022.2) and the code [here](https://github.com/Jakob-Bach/Small-Portfolios). See the `README` for details.
(Technical Remarks) # Experimental Data for the Paper "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"

These are the experimental data for the paper

> Bach, Jakob, Markus Iser, and Klemens Böhm. "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"

accepted at the conference [*SAT 2022*](http://satisfiability.org/SAT22/). Check our [GitHub repository](https://github.com/Jakob-Bach/Small-Portfolios) for the code and instructions to reproduce the experiments.

The data were obtained on a server with an `AMD EPYC 7551` [CPU](https://www.amd.com/en/products/cpu/amd-epyc-7551) (32 physical cores, base clock of 2.0 GHz) and 128 GB RAM. The Python version was `3.8.10`. With this configuration, the experimental pipeline (`run_experiments.py`) took about 25 h.

The commit hash for the last run of the experimental pipeline (`run_experiments.py`) is [`d402353e7f`](https://github.com/Jakob-Bach/Small-Portfolios/tree/d402353e7f5804d3b693c3057f203a2c719c4098). The commit hash for the last run of the evaluation pipeline (`run_evaluation.py`) is [`5ba0468088`](https://github.com/Jakob-Bach/Small-Portfolios/tree/5ba0468088fc0ce0ce3d2f04229946c1df83fa9d). Both commits are also tagged.

In the following, we describe:

- how to use the experimental data
- the structure/content of each data file

## Using the Data

Most of the experimental pipeline's input and output files are plain CSVs. If you are using Python, you can easily read any of the CSVs with `pandas`:

```python
import pandas as pd

results = pd.read_csv('<path>/search_results.csv')
```

All CSVs were written with `<DataFrame>.to_csv(<path>, index=False)`, i.e., they follow `pandas`' defaults for CSVs, so it is unnecessary to pass further parameters when reading them in with `pandas`. The CSVs mostly contain plain numbers and strings; the latter are only quoted if necessary, e.g., if they contain commas. However, the column `solvers` in `search_results.csv` contains lists of solver names.
You can convert this column from strings to proper Python lists as follows:

```python
import ast

search_results['solvers'] = search_results['solvers'].apply(ast.literal_eval)
```

## `*.db` Files and Corresponding `*.csv` Files

Raw instance-feature databases and solver-runtime databases from [GBD](https://gbd.iti.kit.edu), as well as CSV exports of them. Outputs of the script `prepare_dataset.py`.

- `meta`: Meta-data of SAT instances, e.g., in which competition(s) the instances were used. We only use this information to filter instances for the experimental datasets but not as features for predictions.
- `satzilla`: Instance features obtained with the feature extractor of [*SATzilla 2012*](https://www.cs.ubc.ca/labs/algorithms/Projects/SATzilla/Report_SAT_features.pdf). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance; each column represents a feature (column names are feature names). The value `timeout` represents missing values (the feature extractor exceeded time or memory limits).
- `sc2020`: Solver runtimes from the [*SAT Competition 2020*](https://satcompetition.github.io/2020/). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance (not all of them are actually from this SAT competition, so we filter instances later); each column represents a solver (column names are solver names). The values `failed`, `unknown`, and `timeout` represent missing values (the solver did not solve the instance within the cut-off time).
- `sc2021`: Solver runtimes from the [*SAT Competition 2021*](https://satcompetition.github.io/2021/). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance (not all of them are actually from this SAT competition, so we filter instances later); each column represents a solver (column names are solver names). The value `timeout` represents missing values (the solver did not solve the instance within the cut-off time).

## `sc(2020|2021)_features.csv`

Pre-processed instance-feature data for the *Main Track* of the SAT Competitions 2020 and 2021. Output of the script `prepare_dataset.py`; input to the script `run_experiments.py`. Numeric matrix, apart from the `hash` column that identifies instances (we do not have any categorical features; they would need to be encoded beforehand). Each row represents a SAT instance; each column represents a feature (column names are feature names). Has the same number of rows as the corresponding runtime file. The empty string represents missing values caused by the feature extractor exceeding time or memory limits; these missing values are handled (imputed) in the prediction pipeline.

## `sc(2020|2021)_runtimes.csv`

Pre-processed solver-runtime data from the *Main Track* of the SAT Competitions 2020 and 2021. Output of the script `prepare_dataset.py`; input to the script `run_experiments.py`. Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance; each column represents a solver (column names are solver names). Has the same number of rows as the corresponding feature file. Missing values were replaced with the double cut-off time according to PAR-2 scoring (= 10000).

## `search_results.csv`

Results of portfolio search, e.g., portfolios, train/test objective values, and search times. Output of the script `run_experiments.py`; input to the script `run_evaluation.py`.

- `solvers` (string, but actually a list of strings): List of the names of the solvers forming the portfolio. The solver names are column names in `sc(2020|2021)_runtimes.csv`.
- `train_objective` (float): Objective value of a solution (= portfolio) to the `K-Portfolio-Problem`, using the SAT instances from the training data. The objective is defined as the PAR-2 score of the portfolio's virtual best solver (VBS).
- `test_objective` (float): Objective value of the `K-Portfolio-Problem` for the SAT instances from the test data, i.e., take the solvers of the portfolio determined on the training data and compute their VBS on the test instances (without running portfolio search again).
- `(train|test)_portfolio_vws` (float): PAR-2 score of the virtual worst solver (VWS) formed from the portfolio, i.e., select the worst (in terms of PAR-2 score) solver from the portfolio for each instance. Might be used for comparison purposes; we do not use it in our evaluation. Bounds the objective value for portfolios with instance-specific solver selection.
- `(train|test)_portfolio_sbs` (float): PAR-2 score of the single best solver (SBS) from the portfolio, i.e., the best individual solver contained in the portfolio. Might be used for comparison purposes; we use it in Figures 4 and 5 as a baseline.
- `(train|test)_portfolio_sws` (float): PAR-2 score of the single worst solver (SWS) from the portfolio, i.e., the worst individual solver contained in the portfolio. Might be used for comparison purposes; we do not use it in our evaluation.
- `(train|test)_global_sws` (float): PAR-2 score of the single worst solver (SWS) from all solvers (independent of the current portfolio), i.e., the globally worst individual solver. Might be used for comparison purposes; we use it in Figures 1 and 2 for the submodularity-based upper bound.
- `search_time` (float): Runtime (in seconds) of the portfolio search (on the particular dataset and cross-validation fold, with the particular portfolio-search approach).
- `search_id` (int): Identifier denoting combinations of dataset, cross-validation fold, and portfolio-search approach. The experimental pipeline parallelizes these tasks. Each `search_id` might be associated with multiple portfolios; combining `search_id` and `solution_id` allows joining `search_results` and `prediction_results`.
- `solution_id` (int): Identifier to distinguish between multiple portfolios found with a particular portfolio-search approach on a particular dataset and cross-validation fold. Apart from `mip_search` (the *optimal solution*), all search approaches yield multiple portfolios.
- `fold_id` (int in `{0, 1, 2, 3, 4}`): Index of the cross-validation fold.
- `problem` (string, 2 different values): Dataset name (in our experiments: `SC2020` or `SC2021`).
- `algorithm` (string, 4 different values): Search approach used to determine portfolios (in our experiments: `beam_search`, `kbest_search`, `mip_search`, or `random_search`).
- `k` (int in `[1, 48]`): Desired number of solvers in the portfolio, an input parameter to portfolio search. The actual number of solvers in column `solvers` might differ, as *beam search* and *k-best search* are only run with the maximal `k` for each dataset but also yield all smaller portfolios (intermediate results).
- `w` (int in `[1, 100]`, but stored as float): Beam width if *beam search* was used, or number of random samples if *random search* was used. Missing value (represented as an empty string) for the other two search approaches. Input parameter to portfolio search.

## `prediction_results.csv`

Results of predictions with portfolios, e.g., train/test prediction performance, train/test objective values, and feature importances. Output of the script `run_experiments.py`; input to the script `run_evaluation.py`.

- `model` (string, 2 different values): Name of the prediction model (in our experiments: `Random Forest` or `XGBoost`). Each prediction model is trained for each portfolio from the search (thus, `prediction_results` has twice as many rows as `search_results`).
- `pred_time` (float): Runtime (in seconds) for training the prediction model (for one portfolio) on the training data and making predictions on both training and test data.
- `(train|test)_pred_mcc` (float in `[-1, 1]`): Prediction performance in terms of the Matthews Correlation Coefficient for predicting the best solver from the portfolio for each instance.
- `(train|test)_pred_objective` (float): PAR-2 score of the (instance-specific) solver recommendations made by the prediction model.
- `imp.<feature_name>` (float in `[0, 1]`): Feature importances extracted from the prediction model after training. Importances are normalized and should sum to one for each row. Missing values (represented as empty strings) occur if no prediction model was trained because the prediction target only had one class, e.g., the portfolio had size one or the same solver (out of multiple solvers) was always best.
- `search_id` (int), `solution_id` (int): Same as in `search_results.csv`; can be used for joining the results.

## `Evaluation_console_output.txt`

Output of the script `run_evaluation.py`, manually copied from the console to a file.
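To make the VBS-based objective concrete, here is a minimal sketch of how a portfolio's PAR-2 score can be computed from a runtime matrix shaped like `sc(2020|2021)_runtimes.csv`. The solver names, runtimes, and the 2-portfolio below are made up for illustration and are not part of the dataset:

```python
import pandas as pd

# Made-up runtimes of three hypothetical solvers on four instances;
# 10000 is the PAR-2 penalty (double cut-off time) for unsolved instances.
runtimes = pd.DataFrame({
    'hash': ['i1', 'i2', 'i3', 'i4'],
    'solver_a': [10.0, 10000.0, 300.0, 10000.0],
    'solver_b': [50.0, 200.0, 10000.0, 10000.0],
    'solver_c': [10000.0, 400.0, 100.0, 10000.0],
})

portfolio = ['solver_a', 'solver_b']  # a hypothetical 2-portfolio

# VBS: per instance, take the fastest solver from the portfolio;
# the objective is the mean of these per-instance runtimes (PAR-2 score).
vbs_par2 = runtimes[portfolio].min(axis=1).mean()
print(vbs_par2)  # (10 + 200 + 300 + 10000) / 4 = 2627.5
```

The other baselines in `search_results.csv` follow the same pattern, e.g., the VWS replaces `min` with `max`, and the SBS takes the column-wise mean first and then the minimum over solvers.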
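As described above, `search_id` and `solution_id` jointly identify a portfolio, so they serve as join keys between `search_results.csv` and `prediction_results.csv`. The following sketch uses tiny made-up frames that mimic a few of the documented columns (the real files have many more columns and rows):

```python
import pandas as pd

# Hypothetical miniature versions of the two result files.
search_results = pd.DataFrame({
    'search_id': [0, 0, 1],
    'solution_id': [0, 1, 0],
    'solvers': [['A'], ['A', 'B'], ['C']],
    'test_objective': [5000.0, 3000.0, 4000.0],
})
prediction_results = pd.DataFrame({
    'search_id': [0, 0, 0, 0, 1, 1],
    'solution_id': [0, 0, 1, 1, 0, 0],
    'model': ['Random Forest', 'XGBoost'] * 3,
    'test_pred_objective': [4800.0, 4900.0, 2800.0, 2900.0, 3900.0, 3950.0],
})

# Each portfolio from the search has two prediction rows (one per model),
# so an inner join on the two identifier columns aligns them:
merged = search_results.merge(prediction_results,
                              on=['search_id', 'solution_id'])
assert len(merged) == 2 * len(search_results)
```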
Keywords:
propositional satisfiability
solver portfolios
runtime prediction
machine learning
integer programming
Related information:
-
Language:
-
Year of creation:
Subject area:
Computer Science
Object type:
Dataset
Data source:
-
Software used:
-
Data processing:
-
Year of publication:
Rights holder(s):

Iser, Markus
Funding:
-
Status:
Published
Deposited by:
kitopen
Created on:
Archiving date:
2023-06-21
Archive size:
4.8 GB
Archive creator:
kitopen
Archive checksum:
aea46799aa64dc9ccef7fd48ea35ec24 (MD5)
Embargo period:
-