Papers by Wolfram Gronwald
BMC Bioinformatics, Feb 19, 2008
Background: As a rule, peptides are more flexible and unstructured than proteins with their substantial stabilizing hydrophobic cores. Nevertheless, a few stably folding peptides have been discovered. This raises the question whether there may be more such peptides that are unknown as yet. These molecules could be helpful in basic research and medicine. Results: As a method to explore the space of conformationally stable peptides, we have developed an evolutionary algorithm that allows optimization of sequences with respect to several criteria simultaneously, for instance stability, accessibility of arbitrary parts of the peptide, etc. In a proof-of-concept experiment we have perturbed the sequence of the peptide Villin Headpiece, known to be stable in vitro. Starting from the perturbed sequence we applied our algorithm to optimize peptide stability and accessibility of a loop. Unexpectedly, two clusters of sequences were generated in this way that, according to our criteria, should form structures with higher stability than the wild-type. The structures in one of the clusters possess a fold that markedly differs from the native fold of Villin Headpiece. One of the mutants predicted to be stable was selected for synthesis, its molecular 3D-structure was characterized by nuclear magnetic resonance spectroscopy, and its stability was measured by circular dichroism. Predicted structure and stability were in good agreement with experiment. Eight other sequences and structures, including five with a non-native fold, are provided as bona fide predictions. Conclusion: The results suggest that many more conformationally stable peptides may exist than are known so far, and that small fold classes could comprise well-separated sub-folds.
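A minimal sketch of how candidates can be compared on several criteria at once. Pareto dominance is one common choice for multi-criteria selection; the abstract does not say that the published algorithm uses it, so this is an assumption. The sequence labels and the (stability, accessibility) scores are hypothetical, with higher values taken to be better.

```python
def dominates(a, b):
    """Return True if score tuple a is at least as good as b on every
    criterion and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, score in candidates.items():
        dominated = any(
            dominates(other_score, score)
            for other_name, other_score in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores: (stability, loop accessibility), not data from the paper.
candidates = {
    "seq_a": (0.9, 0.2),
    "seq_b": (0.5, 0.8),
    "seq_c": (0.4, 0.7),
    "seq_d": (0.9, 0.1),
}
front = pareto_front(candidates)
```

In an evolutionary loop, the non-dominated candidates would typically be kept as parents for the next round of mutation, which lets stability and accessibility improve together without collapsing them into a single weighted score.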
ABSTRACT A flexible tool for rigid systems. Residual dipolar couplings (RDCs) have proven to be valuable NMR structural parameters that provide insights into the backbone conformations of short linear peptidic foldamers, as illustrated here. This study demonstrates that RDCs at natural abundance can provide essential structural information even in the case of short linear peptides with unnatural amino acids. In addition, they allow for the detection of proline side-chain conformations and are used as a quality check for the parameterizations of rigid unnatural amino acids.
Methods in molecular biology, Nov 29, 2016
New technologies allow for high-dimensional profiling of patients. For instance, genome-wide gene expression analysis in tumors or in blood is feasible with microarrays, if all transcripts are known, or even without this restriction using high-throughput RNA sequencing. Other technologies like NMR fingerprinting allow for high-dimensional profiling of metabolites in blood or urine. Such technologies for high-dimensional patient profiling represent novel possibilities for molecular diagnostics. In clinical profiling studies, researchers aim to predict disease type, survival, or treatment response for new patients using high-dimensional profiles. In this process, they encounter a series of obstacles and pitfalls. We review fundamental issues from machine learning and recommend a procedure for the computational aspects of a clinical profiling study.
BMC Cancer, Apr 5, 2019
Background: MYC is a heterogeneously expressed transcription factor that plays a multifunctional role in many biological processes such as cell proliferation and differentiation. It is also associated with many types of cancer including the malignant lymphomas. There are two types of aggressive B-cell lymphoma, namely Burkitt lymphoma (BL) and a subgroup of diffuse large B-cell lymphoma (DLBCL), which both carry MYC translocations and overexpress MYC but differ significantly in their clinical outcome. In DLBCL, MYC translocations are associated with an aggressive behavior and poor outcome, whereas MYC-positive BL shows a superior outcome. Methods: To shed light on this phenomenon, we investigated the different modes of action of MYC in aggressive B-cell lymphoma cell lines subdivided into three groups: (i) MYC-positive BL, (ii) DLBCL with MYC translocation (DLBCLpos) and (iii) DLBCL without MYC translocation (DLBCLneg) for control. In order to identify genome-wide MYC-DNA binding sites, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) was performed. In addition, ChIP-Seq for H3K4me3 was used for determination of genomic regions accessible for transcriptional activity. These data were supplemented with gene expression data derived from RNA-Seq. Results: Bioinformatics integration of all data sets revealed different MYC-binding patterns and transcriptional profiles in MYC-positive BL and DLBCL cell lines, indicating different functional roles of MYC for gene regulation in aggressive B-cell lymphomas. Based on this multi-omics analysis we identified ADGRE5 (alias CD97), a member of the EGF-TM7 subfamily of adhesion G protein-coupled receptors, as a MYC target gene, which is specifically expressed in BL but not in DLBCL regardless of MYC translocation.
Conclusion: Our study describes a diverse genome-wide MYC-DNA binding pattern in BL and DLBCL cell lines with and without MYC translocations. Furthermore, we identified ADGRE5 as a MYC target gene able to discriminate between BL and DLBCL irrespective of the presence of MYC breaks in DLBCL. Since ADGRE5 plays an important role in tumor cell formation, metastasis and invasion, it might also be instrumental to better understand the different pathobiology of BL and DLBCL and help to explain discrepant clinical characteristics of BL and DLBCL.
Scientific Reports, Mar 9, 2018
Non-uniform sampling (NUS) allows the accelerated acquisition of multidimensional NMR spectra. The aim of this contribution was the systematic evaluation of the impact of various quantitative NUS parameters on the accuracy and precision of 2D NMR measurements of urinary metabolites. Urine aliquots spiked with varying concentrations (15.6-500.0 µM) of tryptophan, tyrosine, glutamine, glutamic acid, lactic acid, and threonine, which can only be resolved fully by 2D NMR, were used to assess the influence of the sampling scheme, reconstruction algorithm, amount of omitted data points, and seed value on the quantitative performance of NUS in 1H,1H-TOCSY and 1H,1H-COSY45 NMR spectroscopy. Sinusoidal Poisson-gap sampling and a compressed sensing approach employing the iterative re-weighted least squares method for spectral reconstruction allowed a 50% reduction in measurement time while maintaining sufficient quantitative accuracy and precision for both types of homonuclear 2D NMR spectroscopy. Together with other advances in instrument design, such as state-of-the-art cryogenic probes, use of 2D NMR spectroscopy in large biomedical cohort studies seems feasible.
Nephrology Dialysis Transplantation, Nov 13, 2014
Background. Reduced kidney function is a risk factor for hyperuricaemia and gout, but limited information on the burden of gout is available from studies of patients with chronic kidney disease (CKD). We therefore examined the prevalence and correlates of gout in the large prospective observational German Chronic Kidney Disease (GCKD) study. Methods. Data from 5085 CKD patients aged 18-74 years with an estimated glomerular filtration rate (eGFR) of 30 to <60 mL/min/1.73 m² or eGFR ≥60 and overt proteinuria at recruitment and non-missing values for self-reported gout, medications and urate measurements from a central laboratory were evaluated.
Journal of Computational Chemistry, May 31, 2011
One of the main challenges in protein–protein docking is a meaningful evaluation of the many putative solutions. Here we present a program (PROCOS) that calculates a probability‐like measure to be native for a given complex. In contrast to scores often used for analyzing complex structures, the calculated probabilities offer the advantage of providing a fixed range of expected values. This will allow, in principle, the comparison of models corresponding to different targets that were solved with the same algorithm. Judgments are based on distributions of properties derived from a large database of native and false complexes. For complex analysis PROCOS uses these property distributions of native and false complexes together with a support vector machine (SVM). PROCOS was compared to the established scoring schemes of ZRANK and DFIRE. Employing a set of experimentally solved native complexes, high probability values above 50% were obtained for 90% of these structures. Next, the performance of PROCOS was tested on the 40 binary targets of the Dockground decoy set, on 14 targets of the RosettaDock decoy set and on 9 targets that participated in the CAPRI scoring evaluation. Again the advantage of using a probability‐based scoring system becomes apparent and a reasonable number of near native complexes was found within the top ranked complexes. In conclusion, a novel fully automated method is presented that allows the reliable evaluation of protein–protein complexes. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011
Bioinformatics
Motivation: Mixed molecular data combines continuous and categorical features of the same samples, such as OMICS profiles with genotypes, diagnoses, or patient sex. Like all high-dimensional molecular data, it is prone to incorrect values that can stem from various sources, for example the technical limitations of the measurement devices, errors in the sample preparation, or contamination. Most anomaly detection algorithms identify complete samples as outliers or anomalies. However, in most cases, not all measurements of those samples are erroneous but only a few one-dimensional features within the samples are incorrect. These one-dimensional data errors are continuous measurements that are either located outside or inside the normal ranges of their features but in both cases show atypical values given all other continuous and categorical features in the sample. Additionally, categorical anomalies can occur, for example when the genotype or diagnosis was submitted wrongly. Results: We i...
Metabolites
Untargeted metabolomics is a promising tool for identifying novel disease biomarkers and unraveling underlying pathomechanisms. Nuclear magnetic resonance (NMR) spectroscopy is particularly suited for large-scale untargeted metabolomics studies due to its high reproducibility and cost effectiveness. Here, one-dimensional (1D) 1H NMR experiments offer good sensitivity at reasonable measurement times. Their subsequent data analysis requires sophisticated data preprocessing steps, including the extraction of NMR features corresponding to specific metabolites. We developed a novel 1D NMR feature extraction procedure, called Bucket Fuser (BF), which is based on a regularized regression framework with fused group LASSO terms. The performance of the BF procedure was demonstrated using three independent NMR datasets and was benchmarked against existing state-of-the-art NMR feature extraction methods. BF dynamically constructs NMR metabolite features, the widths of which can be adjusted via ...
Cancers
The isocitrate dehydrogenase (IDH) mutation status is an indispensable prerequisite for diagnosis of glioma (astrocytoma and oligodendroglioma) according to the WHO classification of brain tumors 2021 and is a potential therapeutic target. Usually, immunohistochemistry followed by sequencing of tumor tissue is performed for this purpose. In clinical routine, however, non-invasive determination of IDH mutation status is desirable in cases where tumor biopsy is not possible and for monitoring neuro-oncological therapies. In a previous publication, we presented reliable prediction of IDH mutation status employing proton magnetic resonance spectroscopy (1H-MRS) on a 3.0 Tesla (T) scanner and machine learning in a prospective cohort of 34 glioma patients. Here, we validated this approach in an independent cohort of 67 patients, for which 1H-MR spectra were acquired at 1.5 T between 2002 and 2007, using the same data analysis approach. Despite different technical conditions, a sensitivity...
Identification of chronic kidney disease (CKD) patients who are at risk of progressing to kidney failure requiring kidney replacement therapy (KRT), frequently also designated as end-stage kidney disease (ESKD), is important for clinical decision-making and clinical trial design and enrollment.
Metabolites, 2022
Due to organ shortage and rising life expectancy, the age of organ donors and recipients is increasing. Reliable biomarkers of organ quality that predict successful long-term transplantation outcomes are poorly defined. The aim of this study was the identification of age-related markers of kidney function that might accurately reflect donor organ quality. Histomorphometric, biochemical and molecular parameters were measured in young (3-month-old) and old (24-month-old) male Sprague Dawley rats. In addition to conventional methods, we used urine metabolomics by NMR spectroscopy and gene expression analysis by quantitative RT-PCR to identify markers of ageing relevant to allograft survival. Besides known markers of kidney ageing like albuminuria, changes in the concentration of urine metabolites such as trimethylamine-N-oxide, trigonelline, 2-oxoglutarate, citrate, hippurate, glutamine, acetoacetate, valine and 1-methyl-histidine were identified in association with ageing. In addition, ...
Based on a protein-protein docking approach we have developed a procedure to verify or falsify protein-protein interactions that were proposed by other methods such as yeast-2-hybrid assays. Our method currently utilizes intermolecular energies but can be expanded to incorporate additional terms such as amino acid based pair-potentials. We show some early results that demonstrate the general applicability of our approach.
Extracting biomedical information from large metabolomic datasets by multivariate data analysis is of considerable complexity. Common challenges include, among others, screening for differentially produced metabolites, estimation of fold changes, and sample classification. Prior to these analysis steps, it is important to minimize contributions from unwanted biases and experimental variance. This is the goal of data preprocessing. In this work, different data normalization methods were compared systematically employing two different datasets generated by means of nuclear magnetic resonance (NMR) spectroscopy. To this end, two different types of normalization methods were used, one aiming to remove unwanted sample-to-sample variation while the other adjusts the variance of the different metabolites by variable scaling and variance stabilization methods. The impact of all methods tested on sample classification was evaluated on urinary NMR fingerprints obtained from healthy volunteers and patients suffering from autosomal dominant polycystic kidney disease (ADPKD). Performance in terms of screening for differentially produced metabolites was investigated on a dataset following a Latin-square design, where varied amounts of 8 different metabolites were spiked into a human urine matrix while keeping the total spike-in amount constant. In addition, specific tests were conducted to systematically investigate the influence of the different preprocessing methods on the structure of the analyzed data. In conclusion, preprocessing methods originally developed for DNA microarray analysis, in particular, Quantile and Cubic-Spline Normalization, performed best in reducing bias, accurately detecting fold changes, and classifying samples.
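For readers unfamiliar with Quantile Normalization, the following is a minimal pure-Python sketch of the basic idea: every sample is forced onto the same intensity distribution by replacing each value with the across-sample mean at its rank. It is an illustration under simplifying assumptions (no missing values, ties broken by position), not the implementation used in the study; the function name and the toy spectra are hypothetical.

```python
def quantile_normalize(samples):
    """Quantile-normalize equally long samples (one list of intensities each).

    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position for simplicity.
    """
    n_features = len(samples[0])
    if any(len(s) != n_features for s in samples):
        raise ValueError("all samples must have the same number of features")

    # Mean intensity at every rank across all samples.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_features)
    ]

    normalized = []
    for s in samples:
        # Feature indices ordered from lowest to highest intensity.
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy "spectra" with three features each (made-up numbers).
spectra = [
    [5.0, 2.0, 3.0],
    [4.0, 1.0, 6.0],
]
normalized = quantile_normalize(spectra)
```

After normalization both toy samples contain exactly the same set of values, only in a different feature order, which is what removes sample-to-sample differences in overall intensity distribution.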
Metabolites, 2021
NMR spectroscopy is a widely used method for the detection and quantification of metabolites in complex biological fluids. However, the large number of metabolites present in a biological sample such as urine or plasma leads to considerable signal overlap in one-dimensional NMR spectra, which in turn hampers both signal identification and quantification. As a consequence, we have developed an easy-to-use R package that allows the fully automated deconvolution of overlapping signals into the underlying Lorentzian line-shapes. We show that precise integral values are computed, which are required to obtain both relative and absolute quantitative information. The algorithm is independent of any knowledge of the corresponding metabolites, which also allows the quantitative description of features of yet unknown identity.
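To make the role of Lorentzian line-shapes concrete, here is a small sketch of the model that a deconvolution fits: a spectrum is treated as a sum of Lorentzian peaks, and each fitted peak has a closed-form integral. The parameterization (center, half width at half maximum, height) and the function names are assumptions made for illustration; they are not taken from the R package described above, and no fitting step is shown.

```python
import math


def lorentzian(x, center, half_width, height):
    """Lorentzian line with the given peak height and half width at half maximum."""
    return height * half_width ** 2 / (half_width ** 2 + (x - center) ** 2)


def lorentzian_area(half_width, height):
    """Analytic integral of the Lorentzian over the whole axis: pi * half_width * height."""
    return math.pi * half_width * height


def spectrum(x, peaks):
    """Intensity at x of a spectrum modeled as a sum of Lorentzian peaks.

    peaks is a list of (center, half_width, height) tuples.
    """
    return sum(lorentzian(x, c, w, h) for c, w, h in peaks)


# Two overlapping hypothetical peaks on a ppm-like axis.
peaks = [(1.00, 0.01, 2.0), (1.02, 0.01, 1.0)]
intensity_between = spectrum(1.01, peaks)
areas = [lorentzian_area(w, h) for _, w, h in peaks]
```

Once the peak parameters are known, the integrals follow directly from the formula, which is why overlapping signals can still be quantified individually even though their raw intensities add up in the measured spectrum.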
Cancers, 2020
Isocitrate dehydrogenase (IDH)-1 mutation is an important prognostic factor and a potential therapeutic target in glioma. Immunohistological and molecular diagnosis of IDH mutation status is invasive. To avoid tumor biopsy, dedicated spectroscopic techniques have been proposed to detect D-2-hydroxyglutarate (2-HG), the main metabolite of IDH, directly in vivo. However, these methods are technically challenging and not broadly available. Therefore, we explored the use of machine learning for the non-invasive, inexpensive and fast diagnosis of IDH status in standard 1H-magnetic resonance spectroscopy (1H-MRS). To this end, 30 of 34 consecutive patients with known or suspected glioma WHO grade II-IV were subjected to metabolic positron emission tomography (PET) imaging with O-(2-18F-fluoroethyl)-L-tyrosine (18F-FET) for optimized voxel placement in 1H-MRS. Routine 1H-magnetic resonance (1H-MR) spectra of tumor and contralateral healthy brain regions were acquired on a 3 Tesla magnetic ...
arXiv: Applications, 2018
Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To that end, it is necessary to integrate omics data with other data types such as clinical, phenotypic, and demographic parameters of categorical or continuous nature. Here, we exemplify this data integration issue for a study on chronic kidney disease (CKD), where complex clinical and demographic parameters were assessed together with one-dimensional (1D) 1H NMR metabolic fingerprints. Routine analysis screens for associations of single metabolic features with clinical parameters, which requires confounding variables typically chosen by expert knowledge to be taken into account. This knowledge can be incomplete or unavailable. The results of this article are manifold. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-...
American Journal of Kidney Diseases, 2021
RATIONALE & OBJECTIVE Stratification of chronic kidney disease (CKD) patients at risk for progressing to end-stage kidney disease (ESKD) requiring kidney replacement therapy (KRT) is important for clinical decision-making and trial enrollment. STUDY DESIGN Four independent prospective observational cohort studies. SETTING & PARTICIPANTS The development cohort was comprised of 4,915 CKD patients and three independent validation cohorts were comprised of a total of 3,063. Patients were followed-up for approximately five years. NEW PREDICTORS & ESTABLISHED PREDICTORS 22 demographic, anthropometric and laboratory variables commonly assessed in CKD patients. OUTCOMES Progression to ESKD requiring KRT. ANALYTICAL APPROACH A Least Absolute Shrinkage and Selection Operator (LASSO) Cox proportional hazards model was fit to select laboratory variables that best identified patients at high risk for ESKD. Model discrimination and calibration were assessed and compared against the 4-variable Tangri (T4) risk equation. Both used a resampling approach within the development cohort and in the validation cohorts using cause-specific concordance (C) statistics, net reclassification improvement, and calibration graphs. RESULTS The newly derived 6-variable (Z6) risk score included serum creatinine, albumin, cystatin C and urea, as well as hemoglobin and the urine albumin-to-creatinine ratio. Based on the resampling approach, Z6 achieved a median C value of 0.909 (95% CI, 0.868-0.937) at two years after the baseline visit, whereas the T4 achieved a median C value of 0.855 (95% CI, 0.799-0.915). In the three independent validation cohorts, Z6 C values were 0.894, 0.921, and 0.891, whereas the T4 C values were 0.882, 0.913, and 0.862. LIMITATIONS The Z6 was both derived and tested only in White European cohorts.
CONCLUSIONS A new risk equation, based on six routinely available laboratory tests, facilitates identification of patients with CKD who are at high risk of progressing to ESKD.
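As background on the concordance (C) statistic reported above, the sketch below computes a plain Harrell-type C index for right-censored data: among all comparable patient pairs, it is the fraction in which the patient with the earlier event also has the higher predicted risk. This is a simplified illustration, assuming a single event type and no fixed time horizon; the study itself reports cause-specific C statistics, which this code does not implement. All names and numbers are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-type concordance index for right-censored survival data.

    times       -- follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- predicted risk (higher means earlier event expected)

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
c_value = concordance_index(times, events, risk_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported C values of the two risk equations.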
Scientific Reports, 2019
Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To this end, omics data are integrated with other data types, e.g., clinical, phenotypic, and demographic parameters of categorical or continuous nature. We exemplify this data integration issue for a chronic kidney disease (CKD) study, comprising complex clinical, demographic, and one-dimensional 1H nuclear magnetic resonance metabolic variables. Routine analysis screens for associations of single metabolic features with clinical parameters while accounting for confounders typically chosen by expert knowledge. This knowledge can be incomplete or unavailable. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-of-the-art implementation, and evaluate its performance by sanity checks and predictive performance assessment on...