tdc.generation
tdc.generation.generation_dataset module
- class tdc.generation.generation_dataset.DataLoader(name, path, print_stats, column_name)
Bases:
DataLoader
A base dataset loader class.
- get_data(format='df')
Return the data from the whole dataset.
- Parameters:
format (str, optional) – the desired format for the returned data, either 'df' or 'dict'.
- Returns:
a dataframe of the dataset, or a dictionary of its information
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if format is not one of the supported values (df, dict)
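As a rough illustration of the format check described above, the sketch below validates `format` before delegating to a loader's `get_data`. The helper `checked_get_data` and the `SUPPORTED_FORMATS` tuple are hypothetical names introduced here and are not part of TDC; only `get_data(format=...)` and the two format values come from this reference.

```python
SUPPORTED_FORMATS = ("df", "dict")  # the two values named in the Raises entry


def checked_get_data(loader, format="df"):
    """Call loader.get_data after checking the requested format.

    `loader` is any object exposing get_data(format=...), for example a
    tdc.generation data loader. This helper is hypothetical, not a TDC API.
    """
    if format not in SUPPORTED_FORMATS:
        raise AttributeError(
            f"Use the correct format as input {SUPPORTED_FORMATS}, got {format!r}"
        )
    return loader.get_data(format=format)
```

Checking the argument up front keeps a misspelled format from reaching the loader at all.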
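- get_split(method='random', seed=42, frac=[0.7, 0.1, 0.2])
Return the data split into train, valid, and test sets.
- Parameters:
method (str, optional) – the splitting scheme, either 'random' or 'scaffold'.
seed (int, optional) – the random seed.
frac (list, optional) – the train/valid/test fractions.
- Returns:
the train, valid, and test splits of the dataset
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if method is not one of the supported split methods (random, scaffold)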
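To make the roles of `seed` and `frac` concrete, here is a minimal sketch of a seeded random split over a plain Python list. It is an illustration only, not TDC's implementation (which works on pandas DataFrames); the function name `random_split` and the rounding rule are assumptions.

```python
import random


def random_split(records, seed=42, frac=(0.7, 0.1, 0.2)):
    """Shuffle records with a fixed seed and cut them into three parts.

    Illustrative only: this is not TDC's implementation.
    """
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("frac must sum to 1")
    shuffled = list(records)
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * frac[0])
    n_valid = round(len(shuffled) * frac[1])
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],  # the remainder goes to test
    }
```

Calling `random_split` twice with the same `seed` yields the same partition, which is the property the `seed` parameter exists to provide.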
- class tdc.generation.generation_dataset.DataLoader3D(name, path, print_stats, dataset_names, column_name)
Bases:
DataLoader
A basic class for generation of 3D biomedical entities. (under construction)
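- get_data(format='df', more_features='None')
Return the data from the whole dataset.
- Parameters:
format (str, optional) – the desired format for the returned data, either 'df' or 'dict'.
more_features (str, optional) – additional features to include; defaults to 'None'.
- Returns:
a dataframe of the dataset, or a dictionary of its information
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if format is not one of the supported values (df, dict)
ImportError – if rdkit is not installed; install it with 'conda install -c conda-forge rdkit'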
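- get_split(method='random', seed=42, frac=[0.7, 0.1, 0.2])
Return the data split into train, valid, and test sets.
- Parameters:
method (str, optional) – the splitting scheme, either 'random' or 'scaffold'.
seed (int, optional) – the random seed.
frac (list, optional) – the train/valid/test fractions.
- Returns:
the train, valid, and test splits of the dataset
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if method is not one of the supported split methods (random, scaffold)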
- class tdc.generation.generation_dataset.PairedDataLoader(name, path, print_stats, input_name, output_name)
Bases:
DataLoader
A basic class for generation of biomedical entities conditioned on other entities, such as reaction prediction.
- get_data(format='df')
Return the data from the whole dataset.
- Parameters:
format (str, optional) – the desired format for the returned data, either 'df' or 'dict'.
- Returns:
a dataframe of the dataset, or a dictionary of its information
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if format is not one of the supported values (df, dict)
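- get_split(method='random', seed=42, frac=[0.7, 0.1, 0.2])
Return the data split into train, valid, and test sets.
- Parameters:
method (str, optional) – the splitting scheme, either 'random' or 'scaffold'.
seed (int, optional) – the random seed.
frac (list, optional) – the train/valid/test fractions.
- Returns:
the train, valid, and test splits of the dataset
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if method is not one of the supported split methods (random, scaffold)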
tdc.generation.molgen module
- class tdc.generation.molgen.MolGen(name, path='./data', print_stats=False, column_name='smiles')
Bases:
DataLoader
Data loader class for the molecular generation task (distribution learning).
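A usage sketch for `MolGen`, based only on the signatures listed in this reference. The wrapper name `load_molgen_splits` is hypothetical, and the default dataset name `"ZINC"` is an assumption that does not appear on this page; running it requires the PyTDC package and, presumably, network access to fetch the dataset into `path`.

```python
def load_molgen_splits(name="ZINC", path="./data", seed=42):
    """Load a molecule generation dataset and split it at random.

    The default name "ZINC" is an assumption, not taken from this page.
    """
    # Imported lazily so this module can be loaded without PyTDC installed.
    from tdc.generation.molgen import MolGen

    data = MolGen(name=name, path=path)
    return data.get_split(method="random", seed=seed, frac=[0.7, 0.1, 0.2])
```

Passing the same `seed` on every run should keep the train, valid, and test sets stable across experiments.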
tdc.generation.reaction module
- class tdc.generation.reaction.Reaction(name, path='./data', print_stats=False, input_name='reactant', output_name='product')
Bases:
PairedDataLoader
Data loader class for the forward reaction prediction task.
tdc.generation.retrosyn module
- class tdc.generation.retrosyn.RetroSyn(name, path='./data', print_stats=False, input_name='product', output_name='reactant')
Bases:
PairedDataLoader
Data loader class for the retro-synthetic prediction task.
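- get_split(method='random', seed=42, frac=[0.7, 0.1, 0.2], include_reaction_type=False)
Return the data split into train, valid, and test sets.
- Parameters:
method (str, optional) – the splitting scheme, either 'random' or 'scaffold'.
seed (int, optional) – the random seed.
frac (list, optional) – the train/valid/test fractions.
include_reaction_type (bool, optional) – whether to include the reaction type in the returned data; defaults to False.
- Returns:
the train, valid, and test splits of the dataset
- Return type:
pandas DataFrame/dict
- Raises:
AttributeError – if method is not one of the supported split methods (random, scaffold)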
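The extra `include_reaction_type` argument can be passed through as in the sketch below. The wrapper `load_retrosyn_splits` is hypothetical, and the default dataset name `"USPTO-50K"` is an assumption not stated on this page; reaction types may not be available for every dataset.

```python
def load_retrosyn_splits(name="USPTO-50K", path="./data", include_reaction_type=False):
    """Load a retrosynthesis dataset (product as input, reactant as output) and split it.

    The default name "USPTO-50K" is an assumption, not taken from this page.
    """
    # Imported lazily so this module can be loaded without PyTDC installed.
    from tdc.generation.retrosyn import RetroSyn

    data = RetroSyn(name=name, path=path)
    return data.get_split(
        method="random",
        seed=42,
        frac=[0.7, 0.1, 0.2],
        include_reaction_type=include_reaction_type,
    )
```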