
Introduction to Chemoinformatics

Alexandre Varnek

Definition (wikipedia)
Cheminformatics (also known as chemoinformatics and chemical
informatics) is the use of computer and informational techniques,
applied to a range of problems in the field of chemistry.

These in silico techniques are used in pharmaceutical companies
in the process of drug discovery.

In the U.S., recent NIH emphasis has been placed on developing
public domain Cheminformatics research by creating six
Exploratory Centers for Cheminformatics Research (ECCRs) as
part of the NIH Molecular Libraries Initiative.


[Figure: screening diagram. Labels: Target Protein, High Throughput Screening, Docking, Virtual Screening, Large libraries of molecules, Small library of selected hits]
Major applications of Chemoinformatics

- Storage and search of chemical information
- Structure-property modeling
- In silico
Chemoinformatics Why?
amount of information
many millions of compounds and reactions
many millions of publications

Storage, organization and search of experimental data

Chemical Databases
Problem: Flood of Information

> 47 million compounds
5-7 million new compounds / year
800,000 publications / year

[Figure: number of structures (0 to 30,000,000) versus year (1965 to 2000)]

=> can anyone read 4,000 publications / day?

Problem: Not Enough Information

> 47,000,000 chemical compounds

~500,000 3D structures in the
Cambridge Crystallographic File

we have 3D structures for only about 1 % of all compounds

Chemoinformatics Why?
complex relationships
structure - biological activity
chemical reactivity

Prediction of physical, chemical and biological properties

In silico design of new compounds


George S. Hammond
Norris Award Lecture, 1968
Chemoinformatics How?

- Storage, organization and search of experimental data
- Prediction of physical, chemical and biological properties

Example 1: Hansch Analysis
Biological Activity = f (Descriptors) + constant

$$\log(1/C) = a(\log P)^2 + b\,\log P + \rho\sigma + \delta E_s + \mathrm{const}$$

Hansch's descriptors can be broadly classified into three general types:

- Electronic (σ)
- Steric (Es)
- Hydrophobic (logP)
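
As a rough illustration of how a Hansch-type model is used, the sketch below evaluates log(1/C) as a quadratic function of logP plus linear electronic (sigma) and steric (Es) terms. The function names and all coefficient values are hypothetical and chosen only to make the arithmetic easy to follow; in practice the coefficients come from a regression on measured activities.

```python
def hansch_log_inv_c(log_p, sigma, e_s, a, b, rho, delta, const):
    """Evaluate log(1/C) = a*(logP)^2 + b*logP + rho*sigma + delta*Es + const."""
    return a * log_p ** 2 + b * log_p + rho * sigma + delta * e_s + const


def optimal_log_p(a, b):
    """Return the logP at the vertex of the parabola (a maximum only if a < 0)."""
    if a >= 0:
        raise ValueError("a must be negative for the parabola to have a maximum")
    return -b / (2 * a)


# Hypothetical coefficients, invented for this example only.
coeffs = dict(a=-0.5, b=2.0, rho=1.0, delta=0.5, const=3.0)

# Hypothetical descriptor values for a single compound.
activity = hansch_log_inv_c(log_p=2.0, sigma=0.25, e_s=-0.5, **coeffs)
best_log_p = optimal_log_p(coeffs["a"], coeffs["b"])

print(f"predicted log(1/C): {activity}")
print(f"logP at maximum activity: {best_log_p}")
```

With a negative quadratic coefficient the model predicts an optimal hydrophobicity: activity rises with logP up to the vertex of the parabola and falls beyond it.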
Example 2: Lipinski rule of five

Poor absorption or permeation is more likely when:

- There are more than 5 H-bond donors.
- There are more than 10 H-bond acceptors.
- The molecular weight is over 500.
- The logP is over 5.

A molecule is represented by 4 parameters:

- the number of H-bond donor groups;
- the number of H-bond acceptor groups;
- molecular weight;
- logP
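
The rule can be sketched as a simple filter over the four parameters listed above. This is a minimal illustration, not a reference implementation: the `Molecule` class and the two example molecules are hypothetical, and the molecular weight and logP cutoffs (500 and 5) are the values usually quoted for Lipinski's rule.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical container for the four rule-of-five parameters."""
    name: str
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # in g/mol
    log_p: float


def lipinski_violations(mol):
    """Return the list of rule-of-five criteria that the molecule violates."""
    violations = []
    if mol.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if mol.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    if mol.molecular_weight > 500:
        violations.append("molecular weight over 500")
    if mol.log_p > 5:
        violations.append("logP over 5")
    return violations


# Invented descriptor values, for illustration only.
small = Molecule("small_example", 1, 4, 180.0, 1.2)
large = Molecule("large_example", 7, 12, 650.0, 6.1)

for mol in (small, large):
    found = lipinski_violations(mol)
    print(mol.name, "->", found if found else "no violations")
```

Returning the list of violated criteria, rather than a single pass/fail flag, leaves it to the caller to decide how many violations are tolerated.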
Chemoinformatics definition

Chemoinformatics is a field dealing with molecular objects (graphs,
vectors) in multidimensional chemical space
Theoretical chemistry

Quantum Chemistry
Force Field Molecular Modelling

- Molecular model
- Basic concepts
- Major applications
- Learning approaches

Molecular Model

Quantum Chemistry: electrons and nuclei
Force Field Molecular Modelling: atoms and bonds
Chemoinformatics: objects in chemical space (graphs, vectors)
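
To make "graphs, vectors" concrete, the sketch below stores three small molecules as hydrogen-suppressed graphs (atoms as nodes, bonds as edges) and maps each graph to a point in a four-dimensional descriptor space. The descriptor set (counts of C, N and O atoms plus the number of bonds) and all function names are assumptions made for this illustration; real descriptor sets are much richer.

```python
import math
from collections import Counter

# Hydrogen-suppressed molecular graphs: atoms are nodes, bonds are index pairs.
ethanol = {"atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]}
dimethyl_ether = {"atoms": ["C", "O", "C"], "bonds": [(0, 1), (1, 2)]}
propane = {"atoms": ["C", "C", "C"], "bonds": [(0, 1), (1, 2)]}

ELEMENTS = ("C", "N", "O")  # illustrative choice of element counts


def descriptor_vector(mol):
    """Map a molecular graph to a vector: element counts, then the bond count."""
    counts = Counter(mol["atoms"])
    return [counts[element] for element in ELEMENTS] + [len(mol["bonds"])]


def euclidean_distance(u, v):
    """Distance between two molecules represented as descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


v_ethanol = descriptor_vector(ethanol)
v_ether = descriptor_vector(dimethyl_ether)
v_propane = descriptor_vector(propane)

print("ethanol vector:", v_ethanol)
print("ethanol vs propane:", euclidean_distance(v_ethanol, v_propane))
print("ethanol vs dimethyl ether:", euclidean_distance(v_ethanol, v_ether))
```

With such coarse descriptors the isomers ethanol and dimethyl ether land on the same point, although their graphs differ. The choice of descriptors therefore determines which molecules the chemical space can tell apart.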
Learning approach

Quantum Chemistry: deductive >> inductive
Force Field Molecular Modelling: deductive ≈ inductive
Chemoinformatics: deductive << inductive


[Figure: deductive learning versus inductive learning. Labels: knowledge, generalization, information, context]

Which approach is more useful for a theoretical design of
compounds possessing desired properties?

Quantum Chemistry
Force Field Modeling

They are complementary, but Chemoinformatics is the most suitable one for
quantitative predictions of properties.
Chemoinformatics definition
Chemoinformatics is a generic term that encompasses the design,
creation, organization, management, retrieval, analysis, dissemination,
visualization, and use of chemical information.

Chemoinformatics is the mixing of those information resources to
transform data into information and information into knowledge for the
intended purpose of making better decisions faster in the area of drug lead
identification and optimization.

Chemoinformatics is the application of informatics methods to solve
chemical problems.

Chemoinformatics is a field dealing with molecular objects (graphs,
vectors) in multidimensional chemical space.

A. Varnek, 2007
Recommended reading

Chemoinformatics - A Textbook, Johann Gasteiger and Thomas Engel, Wiley-VCH 2003.

Handbook of Chemoinformatics, Johann Gasteiger, Wiley-VCH 2003.

An Introduction to Chemoinformatics, Andrew R. Leach, Valerie J. Gillet, Springer 2007.
Short courses in chemoinformatics, 1-5 June 2009

Day 1
Morning: Computer representation of chemical structures (A. Varnek)
Afternoon: Creation and management of chemical databases (G. Marcou, A. Varnek)
Tutorials with the ChemAxon software
Day 2
Morning: Molecular Descriptors (A. Varnek)
Afternoon: Force Field approach. Conformational sampling (D. Horvath, A. Varnek)
Tutorials with MOE, CODESSA Pro
Day 3
Morning: Pharmacophores (T. Langer, D. Horvath)
Afternoon: Chemical space, similarity/diversity and chemical library design (J. Bajorath)
Tutorials with MOE
Day 4
Morning: Structure-Property modeling (G. Marcou, A. Varnek)
Afternoon: Tutorials with ISIDA, CODESSA Pro and WEKA
Day 5
Morning: Docking (E. Kellenberger)
Afternoon: Virtual screening (G. Marcou, D. Horvath, A. Varnek)
Tutorials with MOE
