This repo compiles published organic chemistry data, including raw data and calculated descriptor sets for molecules and reactions.
Each folder contains one reaction (see publication list for details).
Every reaction folder is divided into two sub-folders:
- mols: a list of molecules categorized by reaction role
- rxns: a list of reaction entries, each with multiple reaction components and a yield
import pandas as pd
# base URL for raw files in this repo
REPO_PATH = 'https://raw.githubusercontent.com/beef-broccoli/ochem-data/main/'
# path of the file to load, relative to the repo root; change this
FP = 'deoxyF/paper-dft/train.csv'
df = pd.read_csv(REPO_PATH + FP)
# do things with df...
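If several files are loaded in one session, it can help to build the raw URL from path components rather than editing a single string. The sketch below assumes the same base URL as the snippet above; `raw_url` is a hypothetical helper and is not part of this repo, and the only path used is the one already shown.

```python
REPO_PATH = 'https://raw.githubusercontent.com/beef-broccoli/ochem-data/main/'


def raw_url(*parts):
    """Join path components onto REPO_PATH (hypothetical helper, not in the repo)."""
    # Strip stray slashes so that 'deoxyF/' and 'deoxyF' behave the same way.
    return REPO_PATH + '/'.join(p.strip('/') for p in parts)


# Same file as in the snippet above, written as separate components.
url = raw_url('deoxyF', 'paper-dft', 'train.csv')
print(url)
```

The resulting string can be passed to `pd.read_csv` exactly like `REPO_PATH + FP`.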
Each subfolder (ohe, mol2vec, mordred, ...) includes a different descriptor encoding for the list of molecules or reaction entries.
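To illustrate what one of these encodings looks like, here is a minimal sketch of one-hot encoding with pandas, assuming `ohe` stands for one-hot encoding. The table, its column names, and its values are made up for illustration and do not come from any file in this repo.

```python
import pandas as pd

# Toy reaction table (hypothetical columns and values, for illustration only).
toy = pd.DataFrame({
    'substrate': ['s1', 's1', 's2'],
    'base': ['b1', 'b2', 'b1'],
    'yield': [12.0, 45.5, 78.1],
})

# One-hot encode the categorical reaction components:
# each distinct value becomes its own 0/1 column.
features = pd.get_dummies(toy[['substrate', 'base']], dtype=int)
target = toy['yield']

print(features)
```

Each row has exactly one 1 per encoded component, so the feature matrix carries component identity only. Encodings such as mol2vec or mordred instead describe the structure or properties of each molecule.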
- asym_epox: Ni/Photoredox-Catalyzed Enantioselective Cross-Electrophile Coupling of Styrene Oxides with Aryl Iodides