tdc.single_pred#
tdc.single_pred.single_pred_dataset module#
- class tdc.single_pred.single_pred_dataset.DataLoader(name, path, label_name, print_stats, dataset_names, convert_format, raw_format='SMILES')[source]#
Bases:
DataLoader
A base data loader class.
- Parameters:
name (str) – the dataset name.
path (str) – The path to save the data file
label_name (str) – For multi-label dataset, specify the label name
print_stats (bool) – Whether to print basic statistics of the dataset
dataset_names (list) – A list of dataset names available for a task
convert_format (str) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe.
- entity1#
the single entities, one per data point
- Type:
Pandas Series
- entity1_idx#
the index of each single entity
- Type:
Pandas Series
- entity1_name#
the names of the single entities
- Type:
Pandas Series
- y#
the label of each single entity
- Type:
Pandas Series
- get_data(format='df')[source]#
- Parameters:
format (str, optional) – the format of the returned dataset, defaults to ‘df’
- Returns:
the dataset as a dataframe, or a dictionary holding the key information in the dataset
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if format is not one of the supported values (df, dict, DeepPurpose)
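The format dispatch described for get_data can be pictured with a small stand-in class. This is a minimal sketch, not the TDC implementation: TinyLoader, the column names Drug_ID, Drug and Y, and the shape returned for ‘DeepPurpose’ (a pair of entities and labels) are assumptions made for illustration, and a list of row dictionaries stands in for the pandas DataFrame so that the example has no dependencies.

```python
class TinyLoader:
    """Hypothetical stand-in for a single-instance loader (not the TDC class)."""

    def __init__(self, entity1, entity1_idx, y, entity1_name="Drug"):
        self.entity1 = list(entity1)
        self.entity1_idx = list(entity1_idx)
        self.y = list(y)
        self.entity1_name = entity1_name

    def get_data(self, format="df"):
        id_key = self.entity1_name + "_ID"  # assumed column name
        if format == "df":
            # A real loader would return a pandas DataFrame; rows are
            # plain dictionaries here to keep the sketch dependency-free.
            return [
                {id_key: i, self.entity1_name: e, "Y": label}
                for i, e, label in zip(self.entity1_idx, self.entity1, self.y)
            ]
        if format == "dict":
            return {
                id_key: self.entity1_idx,
                self.entity1_name: self.entity1,
                "Y": self.y,
            }
        if format == "DeepPurpose":
            # Assumed shape: entities and labels as two parallel sequences.
            return self.entity1, self.y
        # Any other value is rejected, as the Raises entry describes.
        raise AttributeError("Use the correct format input (df, dict, DeepPurpose)")


loader = TinyLoader(["CCO", "c1ccccc1"], ["d0", "d1"], [0.1, 0.9])
rows = loader.get_data()                  # default format is 'df'
columns = loader.get_data(format="dict")  # column-oriented dictionary
```

Raising on an unknown value, instead of silently falling back to ‘df’, keeps a misspelled format from going unnoticed.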
- get_split(method='random', seed=42, frac=[0.7, 0.1, 0.2])[source]#
- Parameters:
method (str, optional) – the splitting scheme; choose from random, cold_{entity}, or scaffold, defaults to ‘random’
seed (int, optional) – the random seed used to split the dataset, defaults to 42
frac (list, optional) – the train/val/test split fractions, defaults to [0.7, 0.1, 0.2]
- Returns:
a dictionary with three keys (‘train’, ‘valid’, ‘test’); each value is a pandas dataframe holding that portion of the split dataset
- Return type:
dict
- Raises:
AttributeError – if the requested split method is not available.
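To make the roles of seed and frac concrete, the random scheme can be sketched in plain Python. This illustrates the idea only and is not the code TDC runs: random_split is a hypothetical helper, it works on a list instead of a pandas dataframe, and rounding the fractions to whole counts is an assumption of this sketch.

```python
import random


def random_split(rows, seed=42, frac=(0.7, 0.1, 0.2)):
    """Shuffle rows with a fixed seed and cut them into train/valid/test."""
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("frac must sum to 1")
    indices = list(range(len(rows)))
    # A dedicated Random instance makes the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = round(frac[0] * len(rows))
    n_valid = round(frac[1] * len(rows))
    return {
        "train": [rows[i] for i in indices[:n_train]],
        "valid": [rows[i] for i in indices[n_train:n_train + n_valid]],
        # The test set takes whatever remains, so no row is dropped.
        "test": [rows[i] for i in indices[n_train + n_valid:]],
    }


splits = random_split(list(range(10)))
```

Calling the function twice with the same seed yields the same partition, which is what makes a reported benchmark split repeatable.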
tdc.single_pred.adme module#
- class tdc.single_pred.adme.ADME(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in ADME task. More info: https://tdcommons.ai/single_pred_tasks/adme/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
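A typical workflow is to construct a loader and then ask it for splits. The helper below is a sketch under stated assumptions: load_splits is a hypothetical name, it takes the loader class as an argument so that it can be exercised without downloading any data, and it relies only on the constructor and get_split signatures documented on this page.

```python
def load_splits(loader_cls, name, path="./data", method="random",
                seed=42, frac=(0.7, 0.1, 0.2)):
    """Build a loader and return its train/valid/test splits."""
    loader = loader_cls(name=name, path=path)
    return loader.get_split(method=method, seed=seed, frac=list(frac))


# With PyTDC installed this might be called as follows (assumptions: the
# class is importable from tdc.single_pred and "Caco2_Wang" is an
# available ADME dataset name):
#     from tdc.single_pred import ADME
#     splits = load_splits(ADME, "Caco2_Wang")
#     train, valid, test = splits["train"], splits["valid"], splits["test"]
```

Because every task-specific loader on this page shares the same constructor arguments, the same helper should work unchanged for the other classes, such as Tox or HTS.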
tdc.single_pred.crispr_outcome module#
- class tdc.single_pred.crispr_outcome.CRISPROutcome(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in CRISPROutcome task. More info: https://tdcommons.ai/single_pred_tasks/CRISPROutcome/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.develop module#
- class tdc.single_pred.develop.Develop(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in Develop task. More info: https://tdcommons.ai/single_pred_tasks/develop/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.epitope module#
- class tdc.single_pred.epitope.Epitope(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in Epitope Prediction task. More info: https://tdcommons.ai/single_pred_tasks/epitope/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.hts module#
- class tdc.single_pred.hts.HTS(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in HTS task. More info: https://tdcommons.ai/single_pred_tasks/hts/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.paratope module#
- class tdc.single_pred.paratope.Paratope(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in Paratope Prediction task. More info: https://tdcommons.ai/single_pred_tasks/paratope/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.qm module#
- class tdc.single_pred.qm.QM(name, path='./data', label_name=None, print_stats=False, convert_format=None, raw_format='Raw3D')[source]#
Bases:
DataLoader
Data loader class to load datasets in QM (Quantum Mechanics Modeling) task. More info: https://tdcommons.ai/single_pred_tasks/qm/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.test_single_pred module#
- class tdc.single_pred.test_single_pred.TestSinglePred(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to test the single instance prediction data loader.
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.tox module#
- class tdc.single_pred.tox.Tox(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in Tox (Toxicity Prediction) task. More info: https://tdcommons.ai/single_pred_tasks/tox/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.
tdc.single_pred.yields module#
- class tdc.single_pred.yields.Yields(name, path='./data', label_name=None, print_stats=False, convert_format=None)[source]#
Bases:
DataLoader
Data loader class to load datasets in Yields (Reaction Yields Prediction) task. More info: https://tdcommons.ai/single_pred_tasks/yields/
- Parameters:
name (str) – the dataset name.
path (str, optional) – The path to save the data file, defaults to ‘./data’
label_name (str, optional) – For multi-label dataset, specify the label name, defaults to None
print_stats (bool, optional) – Whether to print basic statistics of the dataset, defaults to False
convert_format (str, optional) – The target molecular format to which the MolConvert class automatically converts the SMILES strings; the converted values are stored as a separate column in the dataframe. Defaults to None.