Alternative identifier:
(KITopen-DOI) 10.5445/IR/1000146629
Related identifier:
Creator(s):
Bach, Jakob https://orcid.org/0000-0003-0301-2798 [Institut für Programmstrukturen und Datenorganisation (IPD), Karlsruher Institut für Technologie (KIT)]

Iser, Markus [Institut für Informationssicherheit und Verlässlichkeit (KASTEL), Karlsruher Institut für Technologie (KIT)]
Contributors:
-
Title:
Experimental Data for the Paper "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"
Further titles:
-
Description:
(Abstract) These are the experimental data for the paper: Bach, Jakob, Markus Iser, and Klemens Böhm. "A Comprehensive Study of k-Portfolios of Recent SAT Solvers", published at the conference [*SAT 2022*](http://satisfiability.org/SAT22/). You can find the paper [here](https://www.doi.org/10.4230/LIPIcs.SAT.2022.2) and the code [here](https://github.com/Jakob-Bach/Small-Portfolios). See the `README` for details.
(Technical Remarks) # Experimental Data for the Paper "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"

These are the experimental data for the paper

> Bach, Jakob, Markus Iser, and Klemens Böhm. "A Comprehensive Study of k-Portfolios of Recent SAT Solvers"

accepted at the conference [*SAT 2022*](http://satisfiability.org/SAT22/). Check our [GitHub repository](https://github.com/Jakob-Bach/Small-Portfolios) for the code and instructions to reproduce the experiments.

The data were obtained on a server with an `AMD EPYC 7551` [CPU](https://www.amd.com/en/products/cpu/amd-epyc-7551) (32 physical cores, base clock of 2.0 GHz) and 128 GB RAM. The Python version was `3.8.10`. With this configuration, the experimental pipeline (`run_experiments.py`) took about 25 h.

The commit hash for the last run of the experimental pipeline (`run_experiments.py`) is [`d402353e7f`](https://github.com/Jakob-Bach/Small-Portfolios/tree/d402353e7f5804d3b693c3057f203a2c719c4098). The commit hash for the last run of the evaluation pipeline (`run_evaluation.py`) is [`5ba0468088`](https://github.com/Jakob-Bach/Small-Portfolios/tree/5ba0468088fc0ce0ce3d2f04229946c1df83fa9d). Both commits are also tagged.

In the following, we describe:

- how to use the experimental data
- the structure/content of each data file

## Using the Data

Most of the experimental pipeline's input and output files are plain CSVs. If you are using Python, you can easily read any of the CSVs with `pandas`:

```python
import pandas as pd

results = pd.read_csv('<path>/search_results.csv')
```

All CSVs were written with `<DataFrame>.to_csv(<path>, index=False)`, i.e., they follow `pandas`' defaults for CSVs, so it is unnecessary to pass further parameters when reading them in with `pandas`. The CSVs mostly contain plain numbers and strings; the latter are only quoted if necessary, e.g., if they contain commas. However, the column `solvers` in `search_results.csv` contains lists of solver names.
You can convert this column from strings to proper Python lists as follows:

```python
import ast

search_results['solvers'] = search_results['solvers'].apply(ast.literal_eval)
```

## `*.db` Files and Corresponding `*.csv` Files

Raw instance-feature databases and solver-runtime databases from [GBD](https://gbd.iti.kit.edu), as well as CSV exports of them. Outputs of the script `prepare_dataset.py`.

- `meta`: Meta-data of SAT instances, e.g., in which competition(s) the instances were used. We only use this information to filter instances for the experimental datasets but not as features for predictions.
- `satzilla`: Instance features obtained with the feature extractor of [*SATzilla 2012*](https://www.cs.ubc.ca/labs/algorithms/Projects/SATzilla/Report_SAT_features.pdf). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance; each column represents a feature (column names are feature names). The value `timeout` represents missing values (the feature extractor exceeded time or memory limits).
- `sc2020`: Solver runtimes from the [*SAT Competition 2020*](https://satcompetition.github.io/2020/). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance (not all of them are actually from this SAT competition, so we filter instances later); each column represents a solver (column names are solver names). The values `failed`, `unknown`, and `timeout` represent missing values (the solver did not solve the instance within the cut-off time).
- `sc2021`: Solver runtimes from the [*SAT Competition 2021*](https://satcompetition.github.io/2021/). Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance (not all of them are actually from this SAT competition, so we filter instances later); each column represents a solver (column names are solver names). The value `timeout` represents missing values (the solver did not solve the instance within the cut-off time).

## `sc(2020|2021)_features.csv`

Pre-processed instance-feature data for the *Main Track* of the SAT Competitions 2020 and 2021. Output of the script `prepare_dataset.py`; input to the script `run_experiments.py`. Numeric matrix, apart from the `hash` column that identifies instances (we do not have any categorical features; they would need to be encoded beforehand). Each row represents a SAT instance; each column represents a feature (column names are feature names). Has the same number of rows as the corresponding runtime file. The empty string represents missing values caused by the feature extractor exceeding time or memory limits; these missing values are handled (imputed) in the prediction pipeline.

## `sc(2020|2021)_runtimes.csv`

Pre-processed solver-runtime data from the *Main Track* of the SAT Competitions 2020 and 2021. Output of the script `prepare_dataset.py`; input to the script `run_experiments.py`. Numeric matrix, apart from the `hash` column that identifies instances. Each row represents a SAT instance; each column represents a solver (column names are solver names). Has the same number of rows as the corresponding feature file. Missing values were replaced with the double cut-off time according to PAR-2 scoring (= 10000).

## `search_results.csv`

Results of portfolio search, e.g., portfolios, train/test objective values, and search times. Output of the script `run_experiments.py`; input to the script `run_evaluation.py`.

- `solvers` (string, but actually a list of strings): List of the names of the solvers forming the portfolio. The solver names are column names in `sc(2020|2021)_runtimes.csv`.
- `train_objective` (float): Objective value of a solution (= portfolio) to the `K-Portfolio-Problem`, using the SAT instances from the training data. The objective is defined as the PAR-2 score of the portfolio's virtual best solver (VBS).
- `test_objective` (float): Objective value of the `K-Portfolio-Problem` for the SAT instances from the test data, i.e., take the solvers of the portfolio determined on the training data and compute their VBS on the test instances (without running portfolio search again).
- `(train|test)_portfolio_vws` (float): PAR-2 score of the virtual worst solver (VWS) formed from the portfolio, i.e., select the worst (in terms of PAR-2 score) solver from the portfolio for each instance. Might be used for comparison purposes; we do not use it in our evaluation. Bounds the objective value for portfolios with instance-specific solver selection.
- `(train|test)_portfolio_sbs` (float): PAR-2 score of the single best solver (SBS) from the portfolio, i.e., the best individual solver contained in the portfolio. Might be used for comparison purposes; we use it in Figures 4 and 5 as a baseline.
- `(train|test)_portfolio_sws` (float): PAR-2 score of the single worst solver (SWS) from the portfolio, i.e., the worst individual solver contained in the portfolio. Might be used for comparison purposes; we do not use it in our evaluation.
- `(train|test)_global_sws` (float): PAR-2 score of the single worst solver (SWS) from all solvers (independent of the current portfolio), i.e., the globally worst individual solver. Might be used for comparison purposes; we use it in Figures 1 and 2 for the submodularity-based upper bound.
- `search_time` (float): Runtime (in seconds) of the portfolio search (on the particular dataset and cross-validation fold, with the particular portfolio-search approach).
- `search_id` (int): Identifier denoting combinations of dataset, cross-validation fold, and portfolio-search approach. The experimental pipeline parallelizes these tasks. Each `search_id` might be associated with multiple portfolios; combining `search_id` and `solution_id` allows joining `search_results` and `prediction_results`.
- `solution_id` (int): Identifier to distinguish between multiple portfolios found with a particular portfolio-search approach on a particular dataset and cross-validation fold. Apart from `mip_search` (the *optimal solution*), all search approaches yield multiple portfolios.
- `fold_id` (int in `{0, 1, 2, 3, 4}`): Index of the cross-validation fold.
- `problem` (string, 2 different values): Dataset name (in our experiments: `SC2020` or `SC2021`).
- `algorithm` (string, 4 different values): Search approach used to determine portfolios (in our experiments: `beam_search`, `kbest_search`, `mip_search`, or `random_search`).
- `k` (int in `[1, 48]`): Desired number of solvers in the portfolio, an input parameter to portfolio search. The actual number of solvers in column `solvers` might differ, as *beam search* and *k-best search* are only run with the maximal `k` for each dataset but also yield all smaller portfolios (intermediate results).
- `w` (int in `[1, 100]`, but stored as float): Beam width if *beam search* was used, or number of random samples if *random search* was used. Missing value (represented as an empty string) for the other two search approaches. Input parameter to portfolio search.

## `prediction_results.csv`

Results of predictions with portfolios, e.g., train/test prediction performance, train/test objective values, and feature importances. Output of the script `run_experiments.py`; input to the script `run_evaluation.py`.

- `model` (string, 2 different values): Name of the prediction model (in our experiments: `Random Forest` or `XGBoost`). Each prediction model is trained for each portfolio from the search (thus, `prediction_results` has twice as many rows as `search_results`).
- `pred_time` (float): Runtime (in seconds) for training the prediction model (for one portfolio) on the training data and making predictions on both training and test data.
- `(train|test)_pred_mcc` (float in `[-1, 1]`): Prediction performance in terms of the Matthews Correlation Coefficient for predicting the best solver from the portfolio for each instance.
- `(train|test)_pred_objective` (float): PAR-2 score of the (instance-specific) solver recommendations made by the prediction model.
- `imp.<feature_name>` (float in `[0, 1]`): Feature importances extracted from the prediction model after training. Importances are normalized and should sum to one for each row. Missing values (represented as empty strings) occur if no prediction model was trained because the prediction target only had one class, e.g., the portfolio had size one or the same solver (out of multiple solvers) was always best.
- `search_id` (int), `solution_id` (int): Same as in `search_results.csv`; can be used for joining the results.

## `Evaluation_console_output.txt`

Output of the script `run_evaluation.py`, manually copied from the console to a file.
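To make the VBS-based objective concrete, here is a minimal sketch of how a portfolio's PAR-2 score can be computed from a runtime matrix shaped like `sc(2020|2021)_runtimes.csv`. The solver names, runtimes, and the 2-portfolio below are made up for illustration and are not part of the dataset:

```python
import pandas as pd

# Made-up runtimes of three hypothetical solvers on four instances;
# 10000 is the PAR-2 penalty (double cut-off time) for unsolved instances.
runtimes = pd.DataFrame({
    'hash': ['i1', 'i2', 'i3', 'i4'],
    'solver_a': [10.0, 10000.0, 300.0, 10000.0],
    'solver_b': [50.0, 200.0, 10000.0, 10000.0],
    'solver_c': [10000.0, 400.0, 100.0, 10000.0],
})

portfolio = ['solver_a', 'solver_b']  # a hypothetical 2-portfolio

# VBS: per instance, take the fastest solver from the portfolio;
# the objective is the mean of these per-instance runtimes (PAR-2 score).
vbs_par2 = runtimes[portfolio].min(axis=1).mean()
print(vbs_par2)  # (10 + 200 + 300 + 10000) / 4 = 2627.5
```

The other baselines in `search_results.csv` follow the same pattern, e.g., the VWS replaces `min` with `max`, and the SBS takes the column-wise mean first and then the minimum over solvers.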
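As described above, `search_id` and `solution_id` jointly identify a portfolio, so they serve as join keys between `search_results.csv` and `prediction_results.csv`. The following sketch uses tiny made-up frames that mimic a few of the documented columns (the real files have many more columns and rows):

```python
import pandas as pd

# Hypothetical miniature versions of the two result files.
search_results = pd.DataFrame({
    'search_id': [0, 0, 1],
    'solution_id': [0, 1, 0],
    'solvers': [['A'], ['A', 'B'], ['C']],
    'test_objective': [5000.0, 3000.0, 4000.0],
})
prediction_results = pd.DataFrame({
    'search_id': [0, 0, 0, 0, 1, 1],
    'solution_id': [0, 0, 1, 1, 0, 0],
    'model': ['Random Forest', 'XGBoost'] * 3,
    'test_pred_objective': [4800.0, 4900.0, 2800.0, 2900.0, 3900.0, 3950.0],
})

# Each portfolio from the search has two prediction rows (one per model),
# so an inner join on the two identifier columns aligns them:
merged = search_results.merge(prediction_results,
                              on=['search_id', 'solution_id'])
assert len(merged) == 2 * len(search_results)
```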
Keywords:
propositional satisfiability
solver portfolios
runtime prediction
machine learning
integer programming
Related information:
-
Language:
-
Year of creation:
Subject area:
Computer Science
Object type:
Dataset
Data source:
-
Software used:
-
Data processing:
-
Year of publication:
Rights holder(s):

Iser, Markus
Funding:
-
Status:
Published
Deposited by:
kitopen
Created on:
Archiving date:
2023-06-21
Archive size:
4.8 GB
Archive creator:
kitopen
Archive checksum:
aea46799aa64dc9ccef7fd48ea35ec24 (MD5)
Embargo period:
-