Chemical Functional Descriptors 2
Paper: 06 Computational Biology
Principal Investigator: Dr. Vibha Dhawan, Distinguished Fellow and Sr. Director
The Energy and Resources Institute (TERI), New Delhi
Computational Biology
Biotechnology
Chemical Functional Descriptors - II
Description of Module
Subject Name Biotechnology
Module Id 09
Pre-requisites
Objectives
Keywords
P06-module 9 : Chemical Descriptors
The topics discussed are: what descriptors are, how to derive descriptors from chemical
structures, and what kinds of properties are relevant to biological functions. We start with 2D
descriptors: a SMILES string is converted to a chemical structure, and the descriptors are then
derived from that structure. Simply counting atoms gives the molecular weight (MW), which is a
very effective descriptor. Similarly, counts of hydrogen bond donors and acceptors, aromatic
rings, rotatable bonds, etc. correlate well with a molecule's biological functions. From such
simple descriptors we can proceed to more complicated ones, such as topological and electronic
descriptors, which are found to be highly correlated with the physico-chemical properties of
chemicals. Descriptors are chemical characteristics of compounds that explain the relevant
biological properties. Many structure-based descriptors are available, but molecular property
descriptors are also very important, for example cLogP, which is directly connected with the
hydrophobicity of the molecule.
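The atom-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the slides: it assumes the molecular formula has already been obtained from the structure, and the function names and the small table of atomic masses are chosen here for the example.

```python
import re

# Standard atomic masses in g/mol for a few common elements (extend as needed).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def count_atoms(formula: str) -> dict:
    """Count atoms per element in a simple formula such as 'C9H11NO3'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Sum atomic masses weighted by the atom counts."""
    return sum(ATOMIC_MASS[el] * n for el, n in count_atoms(formula).items())


tyrosine = "C9H11NO3"  # molecular formula of tyrosine
print(count_atoms(tyrosine))
print(round(molecular_weight(tyrosine), 2))  # 181.19
```

Counts of hydrogen bond donors, acceptors or rings need the full structure rather than the formula, so they are normally obtained from a cheminformatics toolkit after parsing the SMILES string.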
Topological descriptors are derived from structural connectivity, as shown in the figure here
for tyrosine. The connection table or the distance matrix is also used as a topological descriptor.
[Figure: structure of tyrosine with its non-hydrogen atoms numbered 1 to 13]
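Distance matrix for tyrosine (rows and columns are atom numbers; the diagonal shows the
element of each atom; each entry is the number of bonds between the two atoms):

       1   2   3   4   5   6   7   8   9  10  11  12  13
  1.   O   1   2   2   3   3   4   5   6   7   6   5   8
  2.   1   C   1   1   2   2   3   4   5   6   5   4   7
  3.   2   1   O   2   3   3   4   5   6   7   6   5   8
  4.   2   1   2   C   1   1   2   3   4   5   4   3   6
  5.   3   2   3   1   N   2   3   4   5   6   5   4   7
  6.   3   2   3   1   2   C   1   2   3   4   3   2   5
  7.   4   3   4   2   3   1   C   1   2   3   2   1   4
  8.   5   4   5   3   4   2   1   C   1   2   3   2   3
  9.   6   5   6   4   5   3   2   1   C   1   2   3   2
 10.   7   6   7   5   6   4   3   2   1   C   1   2   1
 11.   6   5   6   4   5   3   2   3   2   1   C   1   2
 12.   5   4   5   3   4   2   1   2   3   2   1   C   3
 13.   8   7   8   6   7   5   4   3   2   1   2   3   O

From this connectivity (distance) matrix one can derive the Wiener index, which has a strong
correlation with the boiling point of alkanes. It is given by

W = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} d_{ij}

where d_{ij} is the number of bonds on the shortest path between atoms i and j, and N is the
number of non-hydrogen atoms.

Similarly, the Zagreb index can also be derived: for each non-hydrogen atom, add up the squares
of the number of connections to other non-hydrogen atoms (regardless of bond order). It is
given by

M_1 = \sum_{i=1}^{N} \delta_i^2

where \delta_i is the number of non-hydrogen atoms connected to atom i. A few examples are
worked out in the slides.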
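As a rough check of the two indices, the sketch below computes them for tyrosine with the atom numbering used in this section. The bond list is an assumption read off from the entries equal to 1 in the distance matrix, and the function names are chosen here for illustration.

```python
from collections import deque

# Bonds between the 13 non-hydrogen atoms of tyrosine.
# Assumption: taken from the distance-matrix entries equal to 1.
BONDS = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6), (6, 7), (7, 8),
         (8, 9), (9, 10), (10, 11), (11, 12), (12, 7), (10, 13)]


def adjacency(bonds):
    """Build an undirected adjacency map {atom: set of neighbours}."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def distances_from(adj, start):
    """Breadth-first search: shortest bond count from start to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Half the sum of all entries of the distance matrix."""
    total = sum(d for a in adj for d in distances_from(adj, a).values())
    return total // 2


def zagreb_index(adj):
    """Sum of squared heavy-atom degrees (bond order is ignored)."""
    return sum(len(neighbours) ** 2 for neighbours in adj.values())


adj = adjacency(BONDS)
print(wiener_index(adj))  # 268
print(zagreb_index(adj))  # 60
```

The value 268 is also what one gets by summing the entries above the diagonal of the tyrosine distance matrix by hand.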
Electronic descriptors depict properties related to the valence state of bonded atoms, like the
δ value, which is given by δ = σ − h, where σ is the count of electrons in sigma orbitals and h
is the count of bonded hydrogen atoms. Simple descriptors like this explain many electronic
behaviours of the molecule: the count of adjacent atoms excluding hydrogens and the sigma-electron
characteristics are included in one descriptor. This explains why using descriptors one can [...]
There are 4 quantum numbers, as defined in the slides, and a very important descriptor
called the Kier-Hall electronegativity value can be derived from these quantum numbers.
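The following sketch shows how the simple δ value can be computed per atom. Only δ = σ − h comes from the text. The valence delta (valence electrons minus h) and the form (δv − δ) / N², with N the principal quantum number, are the commonly quoted Kier-Hall definitions and are assumptions here, as are the class and attribute names.

```python
from dataclasses import dataclass


@dataclass
class AtomState:
    label: str
    sigma_electrons: int    # electrons the atom contributes to sigma orbitals
    valence_electrons: int  # total valence electrons of the element
    hydrogens: int          # h: bonded hydrogen atoms
    principal_n: int        # N: principal quantum number of the valence shell

    @property
    def delta(self) -> int:
        """Simple delta: sigma - h (equals the number of non-hydrogen neighbours)."""
        return self.sigma_electrons - self.hydrogens

    @property
    def delta_v(self) -> int:
        """Valence delta (assumed definition): valence electrons - h."""
        return self.valence_electrons - self.hydrogens

    @property
    def kier_hall_en(self) -> float:
        """Kier-Hall electronegativity (assumed form): (delta_v - delta) / N^2."""
        return (self.delta_v - self.delta) / self.principal_n ** 2


atoms = [
    AtomState("-CH3", sigma_electrons=4, valence_electrons=4, hydrogens=3, principal_n=2),
    AtomState("-NH2", sigma_electrons=3, valence_electrons=5, hydrogens=2, principal_n=2),
    AtomState("-OH", sigma_electrons=2, valence_electrons=6, hydrogens=1, principal_n=2),
    AtomState("=O", sigma_electrons=1, valence_electrons=6, hydrogens=0, principal_n=2),
]
for atom in atoms:
    print(atom.label, atom.delta, atom.delta_v, atom.kier_hall_en)
```

All four groups have the same simple δ of 1, because each has one non-hydrogen neighbour. They differ only in the valence-based quantities, which is where the electronic information enters.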
From molecular connectivity some more important indices can be derived which
quantify the shape and structure of chemicals. The Kappa shape indices are the basis of a [...]
encoded into three indices (Kappa values). These Kappa values are derived from counts
of one-bond, two-bond and three-bond fragments, each count being made relative to
fragment counts in reference structures which possess a maximum and minimum value
for that number of atoms. Many such descriptors are discussed in the slides and relevant [...]
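A minimal sketch of the fragment counting is given below for tyrosine. The closed-form expressions used are the commonly quoted unmodified Kier formulas (no atom-type correction), which are an assumption here since the text does not state them. The bond list follows the atom numbering used for the tyrosine distance matrix, and the function names are illustrative.

```python
# Bonds between the 13 non-hydrogen atoms of tyrosine (assumed numbering).
BONDS = [(1, 2), (2, 3), (2, 4), (4, 5), (4, 6), (6, 7), (7, 8),
         (8, 9), (9, 10), (10, 11), (11, 12), (12, 7), (10, 13)]


def adjacency(bonds):
    """Build an undirected adjacency map {atom: set of neighbours}."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def count_paths(adj, length):
    """Count simple paths made of `length` bonds; each path is counted once."""
    def extend(path, remaining):
        if remaining == 0:
            return 1
        return sum(extend(path + [n], remaining - 1)
                   for n in adj[path[-1]] if n not in path)
    # Every path is found from both of its ends, hence the division by 2.
    return sum(extend([a], length) for a in adj) // 2


def kappa_indices(adj):
    """First-, second- and third-order Kappa values (assumed standard formulas)."""
    a = len(adj)  # number of non-hydrogen atoms
    p1, p2, p3 = (count_paths(adj, k) for k in (1, 2, 3))
    k1 = a * (a - 1) ** 2 / p1 ** 2
    k2 = (a - 1) * (a - 2) ** 2 / p2 ** 2
    if a % 2 == 1:
        k3 = (a - 1) * (a - 3) ** 2 / p3 ** 2
    else:
        k3 = (a - 3) * (a - 2) ** 2 / p3 ** 2
    return k1, k2, k3


adj = adjacency(BONDS)
print([count_paths(adj, k) for k in (1, 2, 3)])  # [13, 17, 18]
print([round(k, 3) for k in kappa_indices(adj)])
```

The numerators come from the reference structures: for a linear chain of the same size the one-bond count is A − 1, so the first Kappa value of a chain equals its atom count A, and more branched or cyclic molecules give smaller values. Very small molecules with no three-bond paths would need a guard against division by zero.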
Some other molecular properties are also used as descriptors, like clogP, the calculated
logarithm of the octanol-water partition coefficient; this has been found very useful in
predicting the [...] cross cell membranes but soluble enough in water not to get stuck there.
Many methods have been proposed for estimating this property from structures. One such early
estimation was proposed by Corwin Hansch and Al Leo at Pomona College: using a good set of
chemicals whose logP was known experimentally, a statistically relevant correlation was
developed with the fragment method. Later many [...]
Atom-based descriptors have also been very effective for predicting molar refractivity, charged [...]
It is noted that seminal work was done to convert simple SMILES to a 3D [...]
(Munich/Erlangen, 1990) and OMEGA (OpenEye, ~2000). Each of them uses very [...]
distinct and flexible 3D structures. However, X-ray crystallography data on small molecules
have already been collected and are available as experimental data in the Cambridge Structural
Database.
Milestones in molecular modeling and chemoinformatics came from the ligand-based
superposition in the active site: if different compounds bind to the same active site, they
must share some kind of 3D superposition. It can be by the structural [...]
so substructure-based searching became quite relevant. A more important method was
developed which takes into account the interaction energy surrounding the molecules that
bind to the same active site; it is called comparative molecular field analysis (CoMFA) driven
superposition of ligands.
A similar approach has also been seen using graph similarity in the structural superposition
of proteins of different sizes. An example of such a case has been shown here. Similar
computational tools are used for ligand superposition and protein superposition.
One exercise has been given for students to learn and verify the SMILES depiction; this [...]