Knime Overview PDF
Documentation
- Easy debugging, interruptions, data included, inspect
each step (2D, in Maestro/PyMOL)
Overview
• Organized by level:
– Get started
– Intermediate
– Advanced functionalities
• And by topics:
– KNIME desktop: GUI, specificities, nodes
– Schrödinger extensions: specificities, nodes
• You can jump between the sections using links (marked with ►
or ◄). See the overview slides.
• There are also links to use-case examples (marked with ♦).
Get started
KNIME desktop
– GUI ►
– Specificities ►
– Nodes ►
Schrödinger extensions
– Specificities ►
– Schrödinger nodes ►
► Intermediate
KNIME desktop GUI ◄
• Knime.org and Knime.com
• KNIME desktop
• Start Knime
• Create a new workflow and organize a workspace
• Run a node
• Import and export workflows
• Tips and tricks
• Documentation
Konstanz Information Miner and Ecosystem
KNIME.org
- Leading open-source ‘pipelining/workflow’ tool
- Freely available to academic and industrial researchers
- KNIME Desktop, based on Eclipse http://www.knime.org
- Community contributions:
- Modeling tools
- Marvin sketcher
- RDKit
- Indigo
- CDK
- R Scripting
- Erlwood
- Image Processing
- HCS Tools
- Next Generation Sequencing
- Palladian (mainly GPL3)
Schrödinger Extensions
- First released in 2007
- 150+ nodes
- Molecular mechanics
- Molecular dynamics
- Quantum mechanics
- Cheminformatics
- Pharmacophore modeling
- Combinatorial libraries
- Docking
- Protein structure prediction
- Structure and data manipulation
- Maestro integration
- Workflow execution
- Structure exchange
KNIME Desktop GUI
• Full screen mode
• Forget about Eclipse
specific menu items
More about:
• The console ►
Start KNIME
• Start up KNIME:
– On Linux: run $SCHRODINGER/knime
– On Windows: click on the icon
– Use -data MyWorkspace to open a specific workspace
– File > Switch workspace, but KNIME takes time to start up again
• Workspace, workflows and workflow groups:
Create a new workflow and organize a workspace
• Under the pop-up menu of Workflow Project repository:
– New KNIME workflow and New Workflow group
– Copy, Paste, Delete, Move, Rename
• Drag and drop the workflows in the Workflow project repository
Run a node
• Connectors
double click or
• Node status pop-up
And also:
• Java snippet, RowID and GroupBy node ►
• Schrödinger nodes for data manipulation ►
Data exchange
• As text files: File reader and csv writer nodes
• In Excel format: xls reader and xls writer nodes
• Between workflows: table reader and table writer
nodes
• See also among the Schrödinger nodes:
– Schrödinger reader and writer nodes
– CSV reader (read several files)
– View CSV ►
KNIME workbench nodes ►
• KNIME.com Labs nodes
• Scripting and run a third party tool
• Java snippet
• RowID
• Group by
• Miscellaneous nodes: Interactive table, Math formula, CDK Sketcher
• Plotting facilities
• Looping functionalities - Basics
• Model building nodes
Schrödinger extensions specificities ◄
• Canvas 2D renderer
• Grouped structures in a cell
• Output column structure options
• Jobcontrol tab
Canvas 2D renderer
• Preferences > KNIME > Preferred
renderer
Grouped structures in a cell
• #CTs: number of structures
• Set of conformations, Glide poses, Ligprep forms…
• Group and ungroup nodes, match option
• Also grouped SD, mol2
Output column structure options
• Input plus Output, Output replaces Input, Output only
• Extract MAE properties, Set MAE properties and Delete MAE
properties nodes.
[Figure: example workflow with the stages Read in Ligands, Ligand Preparation, Filtering, Docking, View Results]
Nodes of general use - Readers and converters
• Molecule reader, SD reader... Glide grid reader...
• Converters (Maestro, mae.gz, SD, sd.gz, mol2, PDB, smiles) including Molecule to
MAE, string to type. Canvas converters (Matrix, Fingerprint, Bitvector from and to
table). SD format checker
• Pose viewer to complexes and Complexes to PoseViewer
Run maestro command and Run Maestro
Nodes of general use - Structure manipulation
• Set MAE properties
http://www.schrodinger.com/knimeworkflows
Other KNIME Workflows
http://www.schrodinger.com/knimeworkflows/
Cheminformatics
• Cluster by Fingerprint
• Database Analysis
• Maximum Common Substructure Search (MCS)
• Select Diverse Molecules
• Similarity Search
• Substructure Search
Docking and Post-Processing
• Docking and Scoring
• Ensemble Docking
• Loop Over Docking Parameters
• Protein Preparation and Glide Grid Generation
• Validate Docking Parameters
• Virtual Screening
Pharmacophore Modeling
• Phase Hypothesis Identification
• Phase Screening
• Shape Screening
Molecular Dynamics
• Desmond Simulation
Molecular Mechanics
• Compare Conformational Search Methods
• Conformational Search and Post-Processing
Quantum Mechanics
• Conformational Search and QM Refinement
• ESP Charges
• Jaguar pKa
• Quantum Mechanical Properties
• Semi-empirical Optimization
Library Design
• Library Enumeration
Protein Modeling
• Induced Fit Docking Protocol
• Model Building
• SiteMap
• Split and Align Multimers
Workbench
• Group By Use-cases
• Group Looper
• Unpivot
Real World Examples
• Binding Site Shape Clustering
• Sitemap and Glide Grid Generation
• Vendor Database Preparation
Labs
• Glide Grid Writer
• Parameter Flow Variable Use-cases
• Run Maestro 1:1 Use-cases
General tools
• Chemistry External Tool Use-cases
• Ensure Molecule Title Uniqueness
• Output Column Structure Option Philosophy
• Protein Structure Alignment
• Python Script Node Use-cases
• Run Maestro Command Node Use-cases
• Run PyMOL
• Webservice
• Workflows in the Current Workspace
Schrödinger nodes ►
• Chemistry tool nodes
• Python nodes
• Row iterator loop start
• Look up and add vs. Joiner node
• Miscellaneous nodes: Compare ligands, Set molecule title
Intermediate
KNIME workbench
- GUI ►
- Nodes ►
Schrödinger extensions
- Specificities ►
- Schrödinger nodes ►
◄ Get started
► Advanced functionalities
KNIME workbench GUI ◄
• Preferences
• Advanced node functionalities
• Errors, warnings and Console information
• Flow variables and workflow variables
• Metanodes
• Memory limit
• Tips and tricks
Preferences > KNIME
• Directory for temporary files (See also Schrödinger preferences ►)
• KNIME GUI- disable the node reset, deletion and reconnection confirmation
• KNIME GUI- Console view log level: recommended to change to INFO.
Example of information provided by Schrödinger nodes ►
Advanced node functionalities
• Hovering over an input connector tells you what the node takes as input (eg
Molecules in Maestro, SMILES or SD format)
• Hovering over an output connector reports the number of rows and
columns in the output table
• Comment a workflow: Node pop-up menu > Node name and description
• Data table > change the renderer
Errors, warnings and Console information
• Popup-menu > View Std output/error
• Warning sign above the node status
when the node completed with potential
errors.
• Console information:
INFO HierarchicalClusteringNodeModel Preparing input file '/tmp/HierarchicalClustering_in_423741.mat'
INFO HierarchicalClusteringNodeModel Finished preparing file time 0.35 seconds
INFO HierarchicalClusteringNodeModel 10:42:45 11.17.2009:
Running cmdline[0]='=/usr/local/schro-latest/utilities/canvasHC -im HierarchicalClustering_in_4116794508031023741.mat -ot
HierarchicalClustering_in_4116794508031023741.tree -og HierarchicalClustering_in_4116794508031023741.csv -linkage schrodinger -n 123'
INFO HierarchicalClusteringNodeModel Completed time 1.626 seconds
INFO HierarchicalClusteringNodeModel Preparing output
INFO HierarchicalClusteringNodeModel Finished preparing output: time 0.06 seconds
INFO LocalNodeExecutionJob Hierarchical Clustering (from Matrix) 0:2:50 End execute (2 secs)
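When the Console is set to INFO, the per-step timings can be pulled out of the log with a few lines of Python. This is only a sketch: it assumes the timed lines end in "time <number> seconds" as in the sample above, and the function name step_timings is made up for illustration.

```python
import re

# Matches INFO lines that end in "time <seconds> seconds" (assumed format).
TIMING = re.compile(r"^INFO\s+(\S+)\s+(.*?):?\s*time\s+([\d.]+)\s+seconds\s*$")

def step_timings(log_text):
    """Return (node model, step description, seconds) for each timed INFO line."""
    timings = []
    for line in log_text.splitlines():
        match = TIMING.match(line.strip())
        if match:
            model, step, seconds = match.groups()
            timings.append((model, step.strip(), float(seconds)))
    return timings

# Lines copied from the sample console output.
sample = """\
INFO HierarchicalClusteringNodeModel Finished preparing file time 0.35 seconds
INFO HierarchicalClusteringNodeModel Completed time 1.626 seconds
INFO HierarchicalClusteringNodeModel Preparing output
INFO HierarchicalClusteringNodeModel Finished preparing output: time 0.06 seconds
"""

for model, step, seconds in step_timings(sample):
    print(f"{model}: {step} took {seconds} s")
```

Lines without a timing (such as "Preparing output") are skipped.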
Flow variables and workflow variables
• Flow variables are used to pass data between nodes, on top of the connections.
• In the Flow variable tab or the configuration panel for a couple of nodes:
• Global variables can also be set with the Java snippet node ►,
or in the Workflow project repository: select the workflow, then Workflow variables in the
pop-up menu.
See also Schrödinger specificities ► and nodes to edit variables ►
Metanodes
• To hide the complexity and organize a workflow
• Choose the number and type of input/output
• The metanodes open up in new tabs
Memory limit
• Check the memory limit: Help > About Knime > Installation details > Configuration and
search for a line starting with "eclipse.vmargs=-Xmx" (close to the top).
• Increase the memory allocated to KNIME:
– $SCHRODINGER\knime -maxHeap 4096m
– knime -Xmx4096m (as last option in the command line)
– in $SCHRODINGER\knime-v*\bin\*\knime.ini: change -Xmx1024M into -Xmx2048M (or
higher on 64 bit)
• The error message usually contains "Java heap space" when KNIME is running
out of memory.
• Preferences > General > Show heap status and use the garbage collector.
• KNIME and Schrödinger tools (eg Canvas) don't compete for memory.
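The knime.ini edit can also be scripted. The sketch below assumes that knime.ini holds one launcher argument per line with the heap limit on a line of its own (for example -Xmx1024M). The helper names and the sample file contents are made up for illustration, so back up the real file before changing it.

```python
import re

# A heap-limit line such as -Xmx1024M or -Xmx2g (assumed to sit on its own line).
XMX = re.compile(r"^-Xmx(\d+)([kKmMgG])$")

def heap_limit_mb(ini_text):
    """Return the -Xmx value in megabytes, or None if no such line exists."""
    factors = {"k": 1 / 1024, "m": 1, "g": 1024}
    for line in ini_text.splitlines():
        match = XMX.match(line.strip())
        if match:
            return int(match.group(1)) * factors[match.group(2).lower()]
    return None

def with_heap_limit(ini_text, megabytes):
    """Return a copy of the ini text with the -Xmx line set to the given size."""
    lines = [
        f"-Xmx{megabytes}M" if XMX.match(line.strip()) else line
        for line in ini_text.splitlines()
    ]
    return "\n".join(lines) + "\n"

# Made-up file contents, for illustration only.
sample_ini = "-vmargs\n-Xmx1024M\n-XX:MaxPermSize=256m\n"

print("current limit (MB):", heap_limit_mb(sample_ini))
print(with_heap_limit(sample_ini, 2048))
```

Only the -Xmx line is rewritten; every other line is passed through unchanged.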
Tips and tricks
• Copy and paste some nodes to a specific place:
Select, copy the nodes (Ctrl+C), right click where you want to paste the nodes and
select Paste in the pop-up menu.
With Ctrl+V instead, the nodes are pasted a little below the original ones.
• The keyboard shortcuts for items on the menus are listed as usual with the menu
item. In File > Preferences > General > Keys you can view all the key bindings to
commands, modify the bindings, and create your own shortcuts.
• All the branches can be run at the same time using the GUI toolbar Execute all
executable nodes button. See also Cancel all running nodes.
Known issues
• If you can't save the workflow because of a Java heap space error, try
disconnecting the last node or running the garbage collector.
KNIME workbench GUI ►
• Report designer
• Global variables
• Batch execution
• Tips and tricks
KNIME workbench nodes ◄
• KNIME.com Labs nodes
• Scripting and run a third party tool
• Java snippet use-cases
• Manipulate the table row IDs using the RowID node
• Aggregation using the GroupBy node
• Miscellaneous nodes: Interactive table, Math formula, CDK Sketcher
• Plotting facilities
• Looping functionalities- Basics
• Model building nodes
KNIME.com Labs nodes
• Pipeline Pilot Connector (other way around?)
• Web Service client, etc
Specific update site: http://labs.knime.org/
Scripting and run a third party tool
• Java snippet ►
• Jython and Schrödinger Python nodes ►
• Perl scripting
• External tool and Schrödinger Chemistry external tool nodes ►
• Run Maestro commands ►
Java snippet use-cases
• Duplicate numeric or string columns
• Create a new column from scratch (eg a tag)
• Combine columns (and flow variables) but use the Combiner node for simple
tasks
eg return "prefix-"+$$FlowVar$$+"_ref_"+$Col1$;
• Add a row index (see also Set MAE index)
See the corresponding
workflow example.
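For readers more used to Python than to the Java snippet syntax, the sketch below shows what the combine expression and the row index compute, row by row. It is not KNIME code: the table is modelled as a list of dicts, and the column and flow variable values are invented for illustration.

```python
def combine(row, flow_variables):
    """Mirror the expression "prefix-" + $$FlowVar$$ + "_ref_" + $Col1$ for one row."""
    return "prefix-" + str(flow_variables["FlowVar"]) + "_ref_" + str(row["Col1"])

# Invented table and flow variable values.
rows = [{"Col1": "ligand_A"}, {"Col1": "ligand_B"}]
flow_variables = {"FlowVar": "run7"}

# New column built from an existing column and a flow variable.
new_column = [combine(row, flow_variables) for row in rows]

# Row index added as an extra column, starting at 1.
indexed = [dict(row, Index=i) for i, row in enumerate(rows, start=1)]

print(new_column)
print(indexed)
```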
Manipulate the table row IDs using the RowID node
• Use data table column values as row IDs and store Row IDs in
a column. Use-cases:
– before transposing a data table
– Set the labels to be used by the Plotter node
• Ensure row ID uniqueness
– eg for Canvas tools, before creating a matrix
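The idea behind ensuring row ID uniqueness can be sketched in Python as follows. The "_1", "_2" suffix scheme is an assumption chosen for illustration and is not necessarily the scheme the RowID node applies.

```python
def unique_row_ids(values):
    """Turn column values into row IDs, suffixing repeats so every ID is unique."""
    used = set()
    counts = {}
    ids = []
    for value in values:
        candidate = value
        # Keep trying value_1, value_2, ... until the candidate is unused.
        while candidate in used:
            counts[value] = counts.get(value, 0) + 1
            candidate = f"{value}_{counts[value]}"
        used.add(candidate)
        ids.append(candidate)
    return ids

# Invented molecule titles with duplicates.
titles = ["aspirin", "caffeine", "aspirin", "aspirin"]
print(unique_row_ids(titles))
```

The first occurrence keeps its value, so IDs that are already unique stay untouched.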
Aggregation using the GroupBy node
Some of the aggregation methods:
- first, last
- max, min
- Mean
- Sum
- Concatenate
- (unique) count
- List
- Set
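To make the aggregation methods concrete, here is a small Python sketch that groups an invented table of docking scores by ligand and computes the methods listed above. It illustrates the concepts only and is not how the GroupBy node is implemented; the table, column names and function name are made up.

```python
from collections import defaultdict
from statistics import mean

def group_by(rows, key, column):
    """Group rows on one column and aggregate another with the listed methods."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    summary = {}
    for group, values in groups.items():
        summary[group] = {
            "first": values[0],
            "last": values[-1],
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "sum": sum(values),
            "concatenate": ", ".join(str(v) for v in values),
            "count": len(values),
            "unique count": len(set(values)),
            "list": list(values),
            "set": set(values),
        }
    return summary

# Invented data: several poses per ligand, one score each.
poses = [
    {"ligand": "L1", "score": -7},
    {"ligand": "L1", "score": -9},
    {"ligand": "L2", "score": -6},
    {"ligand": "L1", "score": -7},
]

summary = group_by(poses, "ligand", "score")
for ligand, stats in summary.items():
    print(ligand, stats["min"], stats["count"], stats["unique count"])
```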
Miscellaneous nodes
• Interactive table: Find & Find Next,
equivalent to the Schrödinger Text viewer node,
which has more functionalities
• Math formula
• CDK Structure sketcher
or Marvin sketch
(free of charge from Infocom)
Plotting facilities
• Data Views: Plotter, Histogram…
• Mining > Scoring: Enrichment plotter, ROC curve
• Advanced capabilities available in KNIME Report
designer ►
Looping facilities- Basics
• Loop start … Loop end
• Inject and extract variables
• TableRow/Column to and from variables
• Prebuilt protocols
◄ Intermediate
KNIME workbench GUI ◄
• Report designer
• Global variables
• Batch execution
• Tips and tricks
Report designer
• From knime.com but free of charge. Included in our distribution
• Include To report node(s) in the workflow (can’t be in metanodes) and switch to the
Report designer mode
Report designer- template mode
Report designer- Canvas 2D renderer
Structures can be shown in a report with the Canvas 2D renderer using the following procedure:
• 1. In the workflow, add a MAE-to-smiles node and a To report node.
• 2. In Reporting mode, in the Layout tab, add a table to the report (drag and drop from the Data set view).
• 3. Insert in the "[smiles]" cell (Table- detail row) an Image widget from the Report Items list.
• 4. Configure the widget (using "Edit" on the widget), select "Dynamic image", and press "Select Image
Data..." to select the source column (which should be the Smiles column). Delete "[smiles]" if you want just
the image and no SMILES. You may want to alter the size of the cell by dragging the border vertically and
horizontally if necessary.
• 5. Change the size of the image to something like 300x300 by editing the Data set view
(right click -> Edit -> Parameters) and changing the knime-image-height and knime-image-width
parameters (create them as new Parameters typed as integer if they don't exist yet).
• 6. Check the view in the Preview tab
Global variables
• In the Workflow project list, right-click on the workflow, under Workflow variables
Batch execution
• $SCHRODINGER/knime -batch -reset -nosplash -nosave
-workflowFile=<path>/<wkf>.zip or -workflowDir=<path>/<workspace>/<wkf>
• Alter some settings: -option=nodeNumber,valueName,value,type (type is int, double or String)
eg -option=7,filename,"/tmp/new-molprops.csv",String
Find the node number in the configuration panel header (add the metanode numbers)
eg 123/456/78 for the node 78 in the metanode 456 in the metanode 123
Find the option name in the workspace directory: <workflow>/node_name(#7)/node.xml eg:
<config key="DataURL">
<entry key="array-size" type="xint" value="1" />
<entry key="0" type="xstring" value="/C:/serotonin_unique.sdf" />
When the input is an array, address an element by its index:
-option=2,DataURL\0,"file:/tmp/new-input.mae",String
• Pass some variables: -workflow.variable=name,value,type (int, double or String)
• Workflows can be run from Maestro using a simple Python script wrapper
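A wrapper script can assemble these batch arguments programmatically. The sketch below only builds the argument list from the flags shown above and does not launch anything. The function names, installation path and workspace path are hypothetical, and this is not the Maestro wrapper itself.

```python
ALLOWED_TYPES = {"int", "double", "String"}

def node_path(*numbers):
    """Join metanode numbers and the node number, outermost first (eg 123/456/78)."""
    return "/".join(str(number) for number in numbers)

def _check_type(value_type):
    if value_type not in ALLOWED_TYPES:
        raise ValueError(
            f"type must be one of {sorted(ALLOWED_TYPES)}, got {value_type!r}"
        )

def batch_command(knime, workflow_dir, options=(), variables=()):
    """Assemble the argument list for a batch run of one workflow directory."""
    cmd = [knime, "-batch", "-reset", "-nosplash", "-nosave",
           f"-workflowDir={workflow_dir}"]
    for node, name, value, value_type in options:
        _check_type(value_type)
        cmd.append(f"-option={node},{name},{value},{value_type}")
    for name, value, value_type in variables:
        _check_type(value_type)
        cmd.append(f"-workflow.variable={name},{value},{value_type}")
    return cmd

# Hypothetical installation and workspace paths, for illustration only.
cmd = batch_command(
    "/opt/schrodinger/knime",
    "/home/user/workspace/my_workflow",
    options=[(node_path(7), "filename", "/tmp/new-molprops.csv", "String")],
    variables=[("cutoff", 5, "int")],
)
print(" ".join(cmd))
```

Passing the list to subprocess.run(cmd) would start the run. Because no shell is involved, the value does not need the surrounding quotes used on the command line.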
Tips and tricks
• Rearrange the panels
• Workflow Meta-Infos
• To open a workflow modified with a newer version of KNIME, alter the following
two lines of the file
<workspace>/<workflow>/workflow.knime:
<entry key="created_by" type="xstring" value="2.0.3.0021120"/>
<entry key="version" type="xstring" value="2.0.0"/>
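The two-line edit can be scripted when many workflows need it. The sketch below rewrites the value attribute of a given xstring entry in the text of workflow.knime. The helper name and the "newer" sample values are made up, and which target values are appropriate depends on the installed KNIME version. Work on a copy of the file.

```python
import re

def set_xstring_entry(xml_text, key, value):
    """Replace the value of <entry key="..." type="xstring" value="..."/> for one key."""
    pattern = re.compile(
        r'(<entry key="' + re.escape(key) + r'" type="xstring" value=")[^"]*(")'
    )
    # A function replacement avoids any backslash handling in the new value.
    return pattern.sub(lambda m: m.group(1) + value + m.group(2), xml_text)

# Made-up "newer" values; the target values are the ones shown on the slide.
newer = (
    '<entry key="created_by" type="xstring" value="2.1.0.0022222"/>\n'
    '<entry key="version" type="xstring" value="2.1.0"/>\n'
)

older = set_xstring_entry(newer, "created_by", "2.0.3.0021120")
older = set_xstring_entry(older, "version", "2.0.0")
print(older)
```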
KNIME workbench nodes ◄
• Edit variables and advanced looping functionalities
• Hilite functionalities
• Database nodes
• Miscellaneous useful nodes
Edit variables and advanced looping functionalities