tdc.benchmark_group#
tdc.benchmark_group.base_group module#
- class tdc.benchmark_group.base_group.BenchmarkGroup(name, path='./data', file_format='csv')[source]#
Bases:
object
Base class for benchmark groups. It downloads, processes, and loads a set of benchmarks along with their splits, and provides evaluators and train/valid splitters.
- evaluate(pred, testing=True, benchmark=None, save_dict=True)[source]#
Automatic evaluation function.
- Parameters:
pred (dict) – a dictionary with the benchmark name as the key and the prediction array as the value
testing (bool, optional) – whether to evaluate against the test set (True) or the validation set (False)
benchmark (str, optional) – name of the benchmark
save_dict (bool, optional) – whether or not to save the evaluation results
- Returns:
a dictionary with the benchmark name as the key and, as the value, a dictionary mapping metric names to metric values
- Return type:
dict
- Raises:
ValueError – benchmark name not found
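For example, a single run can be scored as follows. This is a minimal sketch, assuming the ADMET group with 'Caco2_Wang' as an illustrative benchmark; the mean-of-training-labels baseline merely stands in for a real model.

```python
from tdc.benchmark_group import admet_group

group = admet_group(path='data/')
benchmark = group.get('Caco2_Wang')   # illustrative benchmark name
name, train_val, test = benchmark['name'], benchmark['train_val'], benchmark['test']

# Placeholder baseline: predict the mean training label for every test molecule.
y_pred = [train_val['Y'].mean()] * len(test)

# Keys of the prediction dict are benchmark names; values are the predictions.
results = group.evaluate({name: y_pred})
print(results)   # {benchmark name: {metric name: metric value}}
```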
- evaluate_many(preds, save_file_name=None, results_individual=None)[source]#
This function returns the results in the format required for a leaderboard submission.
- Parameters:
preds (list of dict) – a list of prediction dictionaries, where each item is an input to the evaluate function.
save_file_name (str, optional) – file name under which to save the result.
results_individual (list of dictionaries, optional) – if you already have results generated for each run, simply pass them in here so that this function will not call the evaluation function again.
- Returns:
a dictionary where the key is the benchmark name and the value is another dictionary mapping each metric name to a list [mean, std].
- Return type:
dict
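Continuing the single-run sketch above, a leaderboard-style aggregation over several runs might look like this (the same placeholder baseline is repeated five times purely to illustrate the expected input shape):

```python
# One prediction dict per run, each in the same format as the `evaluate` input.
predictions_list = [{name: [train_val['Y'].mean()] * len(test)} for _ in range(5)]

results = group.evaluate_many(predictions_list)
# results: {benchmark name: {metric name: [mean, std]}} aggregated across the runs
```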
- get_train_valid_split(seed, benchmark, split_type='default')[source]#
Obtain the training and validation split from the train_val file, given a split type.
- Parameters:
seed (int) – the random seed for the split
benchmark (str) – name of the benchmark
split_type (str, optional) – type of the split
- Returns:
the training and validation data frames
- Return type:
pd.DataFrame
- Raises:
NotImplementedError – split method not implemented
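A minimal sketch of how the per-seed split is typically used, again assuming the ADMET group and the illustrative 'Caco2_Wang' benchmark:

```python
from tdc.benchmark_group import admet_group

group = admet_group(path='data/')
benchmark = group.get('Caco2_Wang')   # illustrative benchmark name

for seed in [1, 2, 3, 4, 5]:
    # Both returned objects are pandas DataFrames drawn from the train_val file.
    train, valid = group.get_train_valid_split(
        benchmark=benchmark['name'], split_type='default', seed=seed)
    # train a model on `train`, tune it on `valid`, then predict on benchmark['test']
```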
tdc.benchmark_group.admet_group module#
- class tdc.benchmark_group.admet_group.admet_group(path='./data')[source]#
Bases:
BenchmarkGroup
Create an ADMET benchmark group class object.
- Parameters:
path (str, optional) – the path to store/retrieve the ADMET group datasets.
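Putting the pieces together, the typical multi-seed workflow for an ADMET leaderboard submission looks roughly like the following sketch, which combines get, get_train_valid_split, and evaluate_many. 'Caco2_Wang' is one illustrative benchmark, and the mean-label baseline stands in for an actual model:

```python
from tdc.benchmark_group import admet_group

group = admet_group(path='data/')
predictions_list = []

for seed in [1, 2, 3, 4, 5]:
    benchmark = group.get('Caco2_Wang')
    name, train_val, test = benchmark['name'], benchmark['train_val'], benchmark['test']
    train, valid = group.get_train_valid_split(benchmark=name, split_type='default', seed=seed)

    # --- train your model on `train`, tune it on `valid` ---
    # Placeholder: predict the mean training label for every test molecule.
    y_pred_test = [train['Y'].mean()] * len(test)

    predictions_list.append({name: y_pred_test})

results = group.evaluate_many(predictions_list)
```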
tdc.benchmark_group.docking_group module#
- class tdc.benchmark_group.docking_group.docking_group(path='./data', num_workers=None, num_cpus=None, num_max_call=5000)[source]#
Bases:
BenchmarkGroup
Create a docking group benchmark loader.
- Parameters:
path (str, optional) – the folder path to save/load the benchmarks.
pyscreener_path (str, optional) – the path to the pyscreener repository, used to compute docking scores.
num_workers (int, optional) – number of workers used to parallelize docking.
num_cpus (int, optional) – number of CPUs assigned to docking.
num_max_call (int, optional) – maximum number of oracle calls.
- evaluate(pred, true=None, benchmark=None, m1_api=None, save_dict=True)[source]#
Automatic evaluation function for the docking benchmarks.
- Parameters:
pred (dict) – a nested dictionary, where the first-level key is the docking target and the value is another dictionary whose key is the maximum number of oracle calls; the innermost value can take two forms: (1) a dictionary mapping SMILES strings to docking scores, or (2) a list of SMILES strings, in which case the function generates the docking scores automatically.
benchmark (str, optional) – name of the benchmark docking target.
m1_api (str, optional) – API token of Molecule.One. This is to use M1 service to generate synthesis score.
save_dict (bool, optional) – whether or not to save the results.
- Returns:
the evaluation results with all realistic metrics generated
- Return type:
dict
- Raises:
ValueError
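A minimal sketch of the nested prediction format, assuming the docking group with 'DRD3' as an illustrative docking target and pre-computed docking scores; the SMILES strings and scores are made up, and a real submission would contain the full set of generated molecules (plus an m1_api token if the synthesis score is desired).

```python
from tdc.benchmark_group import docking_group

group = docking_group(path='data/')

# target -> {max oracle calls -> {SMILES: docking score}}. Alternatively, the
# innermost value can be a plain list of SMILES, in which case the docking
# scores are generated automatically.
pred = {
    'DRD3': {                                 # illustrative docking target
        5000: {
            'CC(=O)Oc1ccccc1C(=O)O': -7.2,    # made-up docking scores
            'c1ccccc1O': -5.4,
        }
    }
}

results = group.evaluate(pred, save_dict=False)
```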
- evaluate_many(preds, save_file_name=None, m1_api=None, results_individual=None)[source]#
Evaluate many runs together and output a submission-ready pkl file.
- Parameters:
preds (list) – a list of predictions across runs, where each item follows the format of pred in the 'evaluate' function.
save_file_name (str, optional) – the name of the file to save the result.
m1_api (str, optional) – m1 API token for molecule synthesis score.
results_individual (list, optional) – if you have already generated the results from the evaluate function for each run, simply pass them in as a list and the results will not be regenerated.
- Returns:
the output result file.
- Return type:
- get(benchmark, num_max_call=5000)[source]#
Retrieve one benchmark given the benchmark name (docking target).
- get_train_valid_split(seed, benchmark, split_type='default')[source]#
No train/valid split is available for the docking group; calling this method raises an error.
- Raises:
ValueError – no split for docking group
tdc.benchmark_group.drugcombo_group module#
- class tdc.benchmark_group.drugcombo_group.drugcombo_group(path='./data')[source]#
Bases:
BenchmarkGroup
Create a drug combination benchmark group.
- Parameters:
path (str, optional) – path to save/load benchmarks
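A minimal instantiation sketch. Rather than hard-coding a benchmark name, it iterates over the group using the iteration interface assumed to be provided by the BenchmarkGroup base class:

```python
from tdc.benchmark_group import drugcombo_group

group = drugcombo_group(path='data/')

# Each yielded benchmark is a dict with 'name', 'train_val', and 'test' entries.
for benchmark in group:
    name, train_val, test = benchmark['name'], benchmark['train_val'], benchmark['test']
    print(name, len(train_val), len(test))
```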
tdc.benchmark_group.dti_dg_group module#
- class tdc.benchmark_group.dti_dg_group.dti_dg_group(path='./data')[source]#
Bases:
BenchmarkGroup
Create a DTI domain generalization benchmark group
- Parameters:
path (str, optional) – path to save/load benchmarks
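A minimal sketch, assuming 'BindingDB_Patent' is the benchmark name served by this group (an assumption) and that the default split type applies:

```python
from tdc.benchmark_group import dti_dg_group

group = dti_dg_group(path='data/')
benchmark = group.get('BindingDB_Patent')   # assumed benchmark name
name, train_val, test = benchmark['name'], benchmark['train_val'], benchmark['test']

train, valid = group.get_train_valid_split(benchmark=name, split_type='default', seed=1)
# train / valid / test are pandas DataFrames of drug–target pairs with labels
```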