Papers by Natalia Szostak
Frontiers in Oncology
Basal cell carcinoma (BCC) of the skin is the most common cancer in humans, characterized by the highest mutation rate among cancers, and is mostly driven by mutations in genes involved in the hedgehog pathway. To date, almost all BCC genetic studies have focused exclusively on protein-coding sequences; therefore, the impact of noncoding variants on the BCC genome is unrecognized. In this study, with the use of whole-exome sequencing of 27 tumor/normal pairs of BCC samples, we performed an analysis of somatic mutations in both protein-coding sequences and gene-associated noncoding regions, including 5’UTRs, 3’UTRs, and exon-adjacent intron sequences. Separately, in each region, we performed hotspot identification, mutation enrichment analysis, and cancer driver identification with OncodriveFML. Additionally, we performed a whole-genome copy number alteration analysis with GISTIC2. Of the >80,000 identified mutations, ~50% were localized in noncoding regions. The results of the an...
PLOS Computational Biology, May 31, 2018
Back in March 2012, PLOS Computational Biology launched its 'Topic Pages' project as a way to help fill important gaps in Wikipedia's coverage of computational biology content and to credit authors for their contributions. Topic Pages are written in the style of a Wikipedia article and are openly and publicly peer reviewed on the PLOS Wiki before being published in our PLOS journals, with a second, editable version posted to Wikipedia. Six years on, PLOS Computational Biology has published 11 Topic Pages covering a good range of subjects, from the Hypercycle to Approximate Bayesian Computation. The published articles have been widely viewed on Wikipedia as well as in the journal and well received by the community. We are welcoming submissions for further PLOS Computational Biology Topic Pages. We are looking for topics in computational biology that are of interest to our readership, the broader scientific community, and the public at large and that are not yet covered or insufficiently covered (i.e., exist as a 'stub') in Wikipedia. Last year, PLOS Genetics joined the Topic Pages initiative, as detailed in this blog post. We are also exploring how the Topic Pages approach could be extended to include Wikidata, the community-curated database connecting concepts covered in any Wikipedia article with the Semantic Web [1]. For instance, data from more and more research-related databases are being integrated with Wikidata or its semantic core, Wikibase. This creates the need to formalize data models: How should concepts like a disease outbreak, a cell-cycle checkpoint, a sequencer, biomineralization, or a functional magnetic resonance imaging (fMRI) data set be modelled in Wikidata or Wikibase?
Conversely, what workflows allow us to collect information about such concepts in Wikidata, to interlink it with related information, to validate it, and to keep it up to date? Or, how can the data from Wikidata be explored or put to use in other contexts relevant to computational biology? We are working on establishing the editorial workflows to handle such Wikidata-focused Topic Pages and would welcome submissions to test these waters. For some inspiration, we suggest taking a look at Wikidata-based tools for browsing microbial genomes [2], scholarly publications [3], or software and file formats [4]. The Author Guidelines for Wikipedia-focused Topic Pages are available here. If you've noticed a gap in Wikipedia's coverage of particular computational biology topics, we want to hear from you! Please send ideas for Topic Pages to [email protected].
Applied Sciences
Background: Breast cancer affects over 2 million women yearly. Its early detection allows for successful treatment, which motivates research into factors that enable an accurate diagnosis. miR-125a is one of them, correlating with different types of cancer. For example, the miR-125a level decreases in breast cancer tissues; polymorphisms in the miR-125a encoding gene are related to prostate cancer and the risk of radiotherapy-induced pneumonitis. Methods: In this work, we investigated two variants of the rs12976445 polymorphism in the context of breast cancer. We analyzed data from 175 blood samples from breast cancer patients and compared them with data from 129 control samples. Results: We observed a tendency for the TT genotype to appear slightly more frequently than the CC and CT genotypes in breast cancer cases (statistically nonsignificant). The TT genotype also appeared to be more frequent among human epidermal growth factor receptor 2 (HER2) positive patients, compared to HE...
Life
The catalytic effects of complex minerals or meteorites are often mentioned as important factors for the origins of life. To assess the possible role of nanoconfinement within a catalyst consisting of montmorillonite (MMT) and the impact of local electric field on the formation efficiency of the simple hypothetical precursors of nucleic acid bases or amino acids, we performed ab initio Car–Parrinello molecular dynamics simulations. We prepared four condensed-phase systems corresponding to previously suggested prototypes of a primordial soup. We monitored possible chemical reactions occurring within the four gas-like bulk and MMT-confined simulation boxes on a 20-ps time scale at 1 atm and 300 K, 400 K, and 600 K. Elevated temperatures did not affect the reactivity of the elementary components of the gas-like boxes considerably; however, the presence of the MMT nanoclay substantially increased the formation probability of new molecules. Approximately 20 different new compounds were found...
Current Bioinformatics
Background: Open science is an emerging movement underlining the importance of transparent, high-quality research where results can be verified and reused by others. However, one of the biggest problems in replicating experiments is the lack of access to the data used by the authors. This problem also occurs in the mathematical modeling of viral infections, a process that can provide valuable insights into viral activity or into a drug’s mechanism of action when conducted correctly. Objective: We present the VirDB database (virdb.cs.put.poznan.pl), which has two primary objectives. First, it is a tool that enables collecting data on viral infections that could be used to develop new dynamic models of infections using the FAIR data sharing principles. Second, it allows storing references to descriptions of viral infection models, together with their evaluation results. Methods: To facilitate the fast population of the database and the ease of exchange of scientific data, we decid...
Journal of Theoretical Biology
PLOS ONE
Despite years of study, it is still not clear how life emerged from inanimate matter and evolved into the complex forms that we observe today. One of the most recognized hypotheses for the origins of life, the RNA World hypothesis, assumes that life was sparked by prebiotic replicating RNA chains. In this paper, we address the problems caused by the interplay between hypothetical prebiotic RNA replicases and RNA parasitic species. We consider the coexistence of parasite RNAs and RNA replicases as well as the impact of parasites on the further evolution of replicases. For these purposes, we used multi-agent modeling techniques that allow for realistic assumptions regarding the movement and spatial interactions of modeled species. The general model used in this study is based on work by Takeuchi and Hogeweg. Our results confirm that the coexistence of parasite RNAs and replicases is possible in a spatially extended system, even if we take into consideration more realistic assumptions than Takeuchi and Hogeweg. However, we also showed that the presence of a trade-off that takes into account the RNA folding process could still pose a serious obstacle to the evolution of replication. We conclude that this might be a cause for one of the greatest transitions in life that took place early in evolution: the separation of function between DNA templates and protein enzymes, with a central role for RNA species.
European Review, 2016
According to some hypotheses, from a statistical perspective the origin of life seems to be a highly improbable event. Although there is no rigid definition of life itself, life as it is, is a fact. One of the most recognized hypotheses for the origins of life is the RNA world hypothesis. Laboratory experiments have been conducted to prove some assumptions of the RNA world hypothesis. However, despite some success in the ‘wet-lab’, we are still far from a complete explanation. Bioinformatics, supported by biomathematics, appears to provide the perfect tools to model and test various scenarios of the origins of life where wet-lab experiments cannot reflect the true complexity of the problem. Bioinformatics simulations of early pre-living systems may give us clues to the mechanisms of evolution. Whether or not this approach succeeds is still an open question. However, it seems likely that linking efforts and knowledge from the various fields of science into a holistic bioinformatics p...
PLOS Computational Biology, 2016
PLOS Computational Biology, 2016
A hypercycle is an abstract model of organization of self-replicating molecules connected in a cyclic, autocatalytic manner. It was introduced in an ordinary differential equation (ODE) form by the Nobel Prize winner Manfred Eigen in 1971 [1] and subsequently further extended in collaboration with Peter Schuster [2,3]. It was proposed as a solution to the error threshold problem encountered during modelling of replicative molecules that hypothetically existed on the primordial Earth (see: abiogenesis). The hypercycle is a special case of the replicator equation [4]. The most important properties of hypercycles are autocatalytic growth competition between cycles, once-for-ever selective behaviour, utilization of small selective advantage, rapid evolvability, increased information capacity, and selection against parasitic branches.
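As a rough illustration of the ODE form mentioned above, the sketch below integrates the elementary hypercycle written as a replicator equation, dx_i/dt = x_i (k_i x_{i-1} - phi), where phi = sum_j k_j x_j x_{j-1} keeps the total concentration constant. The function names, the rate constants, the forward Euler scheme, and the step size are assumptions chosen for illustration; they are not taken from the article.

```python
def hypercycle_step(x, k, dt):
    """One forward Euler step of the elementary hypercycle (illustrative only)."""
    n = len(x)
    # Replication of member i is catalysed by its cyclic predecessor i-1;
    # x[-1] wraps around to the last member, which closes the cycle.
    growth = [k[i] * x[i] * x[i - 1] for i in range(n)]
    # Dilution flux that keeps the total concentration constant.
    phi = sum(growth)
    return [x[i] + dt * (growth[i] - x[i] * phi) for i in range(n)]


def simulate(x0, k, dt=0.01, steps=20000):
    """Integrate from relative concentrations x0 (assumed to sum to 1)."""
    x = list(x0)
    for _ in range(steps):
        x = hypercycle_step(x, k, dt)
    return x


if __name__ == "__main__":
    # Hypothetical three-member cycle with equal rate constants.
    print(simulate([0.5, 0.3, 0.2], [1.0, 1.0, 1.0]))
```

For three members with equal rate constants the concentrations spiral in towards the symmetric coexistence state; cycles with five or more members are known to oscillate instead of settling.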
BMC bioinformatics, 2015
The function of RNA is strongly dependent on its structure, so an appropriate recognition of this structure, on every level of organization, is of great importance. One particular concern is the assessment of base-base interactions, described as the secondary structure, the knowledge of which greatly facilitates an interpretation of RNA function and allows for structure analysis on the tertiary level. The RNA secondary structure can be predicted from a sequence using in silico methods often adjusted with experimental data, or assessed from 3D structure atom coordinates. Computational approaches typically consider only canonical, Watson-Crick and wobble base pairs. Handling of non-canonical interactions, important for a full description of RNA structure, is still very difficult. We introduce our novel approach to assessing an extended RNA secondary structure, which characterizes both canonical and non-canonical base pairs, along with their type classification. It is based on predicti...
RNA Biology, 2014
Intercellular communication mediated by extracellular vesicles has proved to play an important role in normal and pathological scenarios. However, little information is available about the sorting mechanisms involved in loading the vesicles. Recently, our group has characterized the mRNA content of vesicles released by hepatic cellular systems, showing that a set of transcripts was particularly enriched in the vesicles in comparison with their intracellular abundance. In the current work, based on in silico bioinformatics tools, we have mapped a novel sequence of 12 nucleotides c
BMC Bioinformatics, 2012
Background: The structures of biological macromolecules provide a framework for studying their biological functions. Three-dimensional structures of proteins, nucleic acids, or their complexes, are difficult to visualize in detail on flat surfaces, and algorithms for their spatial superposition and comparison are computationally costly. Molecular structures, however, can be represented as 2D maps of interactions between the individual residues, which are easier to visualize and compare, and which can be reconverted to 3D structures with reasonable precision. There are many visualization tools for maps of protein structures, but few for nucleic acids. Results: We developed RNAmap2D, a platform-independent software tool for calculation, visualization and analysis of contact and distance maps for nucleic acid molecules and their complexes with proteins or ligands. The program addresses the problem of paucity of bioinformatics tools dedicated to analyzing RNA 2D maps, given the growing number of experimentally solved RNA structures in the Protein Data Bank (PDB) repository, as well as the growing number of tools for RNA 2D and 3D structure prediction. RNAmap2D allows for calculation and analysis of contacts and distances between various classes of atoms in nucleic acid, protein, and small ligand molecules. It also discriminates between different types of base pairing and stacking. Conclusions: RNAmap2D is an easy to use method to visualize, analyze and compare structures of nucleic acid molecules and their complexes with other molecules, such as proteins or ligands and metal ions. Its special features make it a very useful tool for analysis of tertiary structures of RNAs. RNAmap2D for Windows/Linux/MacOSX is freely available for academic users at
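To make the idea of 2D maps concrete, the sketch below shows one generic way a distance map and a contact map can be derived from 3D coordinates. It is not RNAmap2D's implementation: the function names, the use of one representative atom per residue, the sample coordinates, and the cutoff value are all assumptions made for illustration.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between representative atoms.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(coords, cutoff):
    """Boolean map: True where two residues lie within `cutoff` of each other."""
    return [[d <= cutoff for d in row] for row in distance_map(coords)]


if __name__ == "__main__":
    # Three made-up residue positions (angstroms) and an arbitrary cutoff.
    residues = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    for row in contact_map(residues, cutoff=4.0):
        print(row)
```

Both maps are symmetric square matrices indexed by residue, which is what makes them straightforward to plot and to compare between structures.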