Papers by Beatriz Stransky
bioRxiv (Cold Spring Harbor Laboratory), Oct 1, 2020
Coronavirus disease 2019 (COVID-19) rapidly transformed into a global pandemic, creating a demand for antivirals capable of targeting the SARS-CoV-2 RNA genome and blocking the activity of its genes. In this work, we propose a database of SARS-CoV-2 targets for siRNA approaches, aiming to speed the design process by providing a broad set of possible targets and siRNA sequences. Beyond target sequences, it also displays more than 170 features, including thermodynamic information, base context, target genes, and alignment information of sequences against the human genome and diverse SARS-CoV-2 strains, to assess whether siRNA targets bind off-target sequences. This dataset is available as a set of four tables in a single spreadsheet file, each table corresponding to sequences of 18, 19, 20, and 21 nucleotides in length, respectively.
Genome Medicine, 2009
Technological advances have enabled a better characterization of all the genetic alterations in tumors. A picture that emerges is that tumor cells are much more genetically heterogeneous than originally expected. Thus, a critical issue in cancer genomics is the identification of the genetic alterations that drive the genesis of a tumor. Recently, a systems biology approach has been used to characterize such alterations and find associations between them and the process of gliomagenesis. Here, we discuss some implications of this strategy for the development of new therapeutic and diagnostic protocols for cancer.
BMC Medical Informatics and Decision Making, Mar 10, 2020
Background: A variant of unknown significance (VUS) is a variant form of a gene that has been identified through genetic testing, but whose significance to the function of the organism is not known. A current challenge in precision medicine is to identify precisely which mutations detected by sequencing have a role in the treatment or diagnosis of a disease. The average accuracy of pathogenicity predictors is 85%. However, there is significant discordance among them about mutational impact and pathogenicity. Therefore, manual verification is necessary to confirm the real effect of a mutation in each specific case. Methods: In this work, we use variable categorization and selection to build a decision tree model, and we then measure and compare its accuracy with four known mutation predictors and seventeen supervised machine learning (ML) algorithms. Results: The results showed that the proposed tree reached the highest precision among all tested variables: 91% for True Neutrals, 8% for False Neutrals, 9% for False Pathogenic, and 92% for True Pathogenic. Conclusions: The decision tree demonstrated exceptionally high classification precision with cancer data, producing consistently relevant forecasts for the sample tests with an accuracy close to the best ones achieved by supervised ML algorithms. In addition, the decision tree algorithm is easier for non-IT experts to apply in clinical practice. From the perspective of the cancer research community, this approach can be successfully applied as an alternative for determining the potential pathogenicity of VUS.
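The four percentages reported in the Results are consistent with a confusion matrix normalized within each actual class, since 91% + 9% and 92% + 8% each sum to 100%. That reading is an assumption, not something the abstract states. A minimal Python sketch of the arithmetic follows; the counts (100 neutral and 100 pathogenic variants) and the function name are hypothetical, chosen only to reproduce the reported rates.

```python
def per_class_rates(true_neutral, false_pathogenic, true_pathogenic, false_neutral):
    """Normalize confusion-matrix counts within each actual class.

    Actual neutral variants are split into true neutral / false pathogenic;
    actual pathogenic variants are split into true pathogenic / false neutral.
    """
    neutral_total = true_neutral + false_pathogenic
    pathogenic_total = true_pathogenic + false_neutral
    return {
        "true_neutral": true_neutral / neutral_total,
        "false_pathogenic": false_pathogenic / neutral_total,
        "true_pathogenic": true_pathogenic / pathogenic_total,
        "false_neutral": false_neutral / pathogenic_total,
        # Overall accuracy: correctly classified variants over all variants.
        "accuracy": (true_neutral + true_pathogenic)
        / (neutral_total + pathogenic_total),
    }


# Hypothetical counts, not taken from the paper.
rates = per_class_rates(true_neutral=91, false_pathogenic=9,
                        true_pathogenic=92, false_neutral=8)
```

Under these assumed, balanced counts the overall accuracy works out to (91 + 92) / 200; with unbalanced classes the same per-class rates would give a different overall figure.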
Frontiers in Physiology, 2013
Tumorigenesis can be seen as an evolutionary process, in which the transformation of a normal cell into a tumor cell involves a number of limiting genetic and epigenetic events, occurring in a series of discrete stages. However, not all mutations in a cell are directly involved in cancer development and it is likely that most of them (passenger mutations) do not contribute in any way to tumorigenesis. Moreover, the process of tumor evolution is punctuated by selection of advantageous (driver) mutations and clonal expansions. Regarding these driver mutations, it is uncertain how many limiting events are required and/or sufficient to promote a tumorigenic process or what are the values associated with the adaptive advantage of different driver mutations. In spite of the availability of high-quality cancer data, several assumptions about the mechanistic process of cancer initiation and development remain largely untested, both mathematically and statistically. Here we review the development of recent mathematical/computational models and discuss their impact in the field of tumor biology.
Scientific Reports, Apr 23, 2021
Coronavirus disease 2019 (COVID-19) rapidly transformed into a global pandemic, creating a demand for antivirals capable of targeting the SARS-CoV-2 RNA genome and blocking the activity of its genes. In this work, we present a database of SARS-CoV-2 targets for small interfering RNA (siRNA) based approaches, aiming to speed the design process by providing a broad set of possible targets and siRNA sequences. The siRNA sequences are characterized and evaluated by more than 170 features, including thermodynamic information, base context, target genes, and alignment information of sequences against the human genome and diverse SARS-CoV-2 strains, to assess possible binding to off-target sequences. This dataset is available as a set of four tables, in spreadsheet and CSV (Comma-Separated Values) formats, each one corresponding to sequences of 18, 19, 20, and 21 nucleotides in length, aiming to meet the diversity of technology and expertise among laboratories around the world. A metadata table (Supplementary Table S1), which describes each feature, is also provided in the aforementioned formats. We hope that this database helps to speed up the development of new target antivirals for SARS-CoV-2, contributing to a possible strategy for a faster and more effective response to the COVID-19 pandemic. Starting in December 2019, coronavirus disease 2019 (COVID-19) rapidly transformed into a global pandemic, with an incidence of almost 100 M cases and more than 2 M deaths around the world in January 2021 [1], and with a strong impact on the global economy [2]. The SARS-CoV-2 genome is a 29,903-base, single-stranded, positive-sense RNA (SARS-CoV-2 Wuhan Hu-1 strain, Accession: NC_045512), and consists of fourteen open reading frames (ORFs), which code for twenty-seven structural and nonstructural proteins (nsps).
The genome organization of SARS-CoV is similar to that of other CoVs, and recent phylogenetic analyses indicated that SARS-CoV and the group 2 CoVs are closely related and may share a common ancestor [3]. A comparative analysis of SARS-CoV-2 and SARS-CoV showed that they present extensive homology at the genomic level, sharing approximately 79% sequence identity [4]. Currently, hundreds of SARS-CoV-2 variants are being sequenced [5], a handful of vaccines have been authorized, and many more vaccine candidates remain in development around the world [6]. However, despite all the scientific research and efforts, there is no specific treatment for those already infected by SARS-CoV-2. This scenario brought a huge demand for developing antivirals capable of targeting the SARS-CoV-2 RNA genome, and the RNA interference (RNAi) approach [7-9] emerged as a possible solution. Small interfering RNAs (siRNAs) are RNA sequences about 20 nt long that, together with the RNA-induced silencing complex (RISC), bind mRNA target molecules [9,10], inhibiting their translation and expression. Since the discovery of the RNAi mechanism in the late 1990s [7] and of its effect of precisely suppressing any gene by a base sequence match, the potential of its application became evident. It soon became a ubiquitous tool in biological research and applications, from functional genomics [11] to biomedicine [12-15] and pest control [16,17]. Following this 'silent revolution', in 2018 the US Food and Drug Administration approved the first RNAi therapeutic, a treatment for polyneuropathy caused by transthyretin (TTR) amyloidosis, from Alnylam Pharmaceuticals [18]. Many studies have proposed siRNA development for SARS-CoV [19-21], with reports of decreased viral levels [22], and recent works claim that it may also work for SARS-CoV-2 [23-26]. From experimental studies to patent applications, researchers have explored this approach as a potential treatment for COVID-19.
Supplementary Table S3 presents a compilation of recent scientific papers, patents, and product development projects
Cancer Research, 2009
With the availability of the human genome sequence, the sub-cellular localization of gene products is a critical property, since it affects our ability to use them as potential diagnostic and therapeutic targets. In this respect, the identification of cell surface proteins is critical, since they represent ideal therapeutic and diagnostic targets. By using bioinformatics tools integrating sequences with annotated trans-membrane (TM) domains reported by PFAM and translated sequences from the Reference Sequences to an algorithm that identifies TM domains using a Hidden Markov Model-based strategy, we generated a catalog of 3,702 TM proteins located at the cell surface of human cells (the human surfaceome). By integrating expression information from a variety of sources (MPSS, SAGE, microarrays), we were able to identify surfaceome genes with a restricted expression in normal tissues and/or differential expression in tumors, important characteristics for a putative tumor target. We employ...
2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology, 2012
Abstract form only given. Alternative splicing (AS) events are among the most significant factors determining the complexity of multi-cellular organisms. Most, if not all, multi-exonic human genes undergo AS. Many AS events are involved in the etiology of cancer, among many other common human disorders. The emergence of next-generation sequencing (NGS) offers a unique opportunity to explore the variability generated by AS in an exhaustive way. Furthermore, recent developments in new mass-spectrometry platforms have allowed a deeper survey of the human proteome. Here, an analysis of intron retention, the rarest type of AS, was performed integrating transcriptome and proteome data. Intron retention events were evaluated in relation to several features, focusing on whether they had biological significance or whether they were just spurious products from the splicing machinery. For the transcriptome analysis, the following dataset was used: 30,678 RefSeqs, 258,444 mRNAs, 6,987,423 ESTs and 9,565,439 sequences derived from NGS. For the proteome analysis, data from Geiger et al., MCP, 2012 were used. We were able to detect an intron retention event for 48% of all human genes. Confirming a previous publication from our group [1], these events are enriched at the 3' and 5' untranslated regions (UTRs). Retained introns were significantly enriched with coding potential, which supports a biological role for these events. Furthermore, they were enriched for targets of microRNAs, suggesting a role of this type of AS in the regulation of expression induced by these non-coding RNAs. A significant number of events were detected at the proteome level. This information was integrated with transcriptome data to further explore the role of intron retention in many biological phenomena.
This project provides a database of SARS-CoV-2 targets for siRNA approaches, aiming to speed the development of new target antivirals.
International journal of gynaecology and obstetrics, Jun 10, 2019
We identified mobile applications (apps) found on digital platforms (iTunes Store and Google Play) that addressed topics about gynecology and obstetrics.
The characterization and quantitative description of histological images is not a simple problem. To reach a final diagnosis, the specialist usually relies on the analysis of easily observed characteristics, such as cell size, shape, staining and texture, but also depends on the hidden information of tissue localization, physiological and pathological mechanisms, clinical aspects, or other etiological agents. In this paper, Mathematical Morphology (MM) and Machine Learning (ML) methods were applied to characterize and classify histological images. MM techniques were employed for image analysis. The measurements obtained from image and graph analysis were fed into ML algorithms, which were designed and developed to automatically learn to recognize complex patterns and make intelligent decisions based on data. Specifically, a linear Support Vector Machine (SVM) was used to evaluate the discriminatory power of the measures used. The results show that the methodology was successful in characterizing and classifying the differences between the architectural organization of epithelial and adipose tissues. We believe that this approach can also be applied to classify and help the diagnosis of many tissue abnormalities, such as cancers.
African Journal of AIDS Research, Jan 2, 2021
We evaluated existing mobile applications (apps) on both Android and iOS (Apple) platforms that are used by men who have sex with men (MSM) to obtain sexual encounters. The word “gay” was used to search for apps in the Apple and Google Play virtual stores. The 10 most downloaded apps were analysed concerning safe sexual practices (SSP) messages. Out of 245 apps selected, 213 were evaluated: 102 for Android and 111 for iOS. Most were social networking apps, of which 112 allow access to people aged 14 and over. Most of the apps could be downloaded in more than two languages. Of the 10 most downloaded and evaluated apps, 5 had no HIV/STI and SSP messages, only 3 contained HIV/STI and SSP messages, and 2 had information about one or the other. Several social networking apps are available; however, there is no information on HIV/STI in the most accessed apps.
IFMBE proceedings, 2019
The Infusion Pump (IP) is a Medical-Hospital Equipment (MHE) that is used to regulate the passage of liquids that will be infused into the patient by means of the positive pressure generated by the pump. There are three main types of failure causes that can occur in MHE: human error, technology failure, and external phenomena. At the Maternidade Escola Januario Cicco (MEJC), located in Natal, RN, most of the maintenance events in the IPs came from the lack of information among the professionals who handled the equipment, as well as from human error. The present study describes the IP training given to the medical staff (nurses, nursing technicians, and physicians), designed and delivered during the curricular internship of one of the authors, and aims to compare the number of maintenance calls for IPs before and after training to assess its impact. For this, the numbers of occurrences before and after training were obtained and compared. There was a drop of more than 70% in the number of internal maintenance services, showing that the training is very important within the hospital environment.
Cancers, Apr 24, 2022
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
Research Square (Research Square), Oct 22, 2020
This protocol aims to describe the building of a database of SARS-CoV-2 targets for siRNA approaches. Starting from the virus reference genome, we will derive sequences from 18 to 21 nt long and verify their similarity against the human genome and coding and non-coding transcriptome, as well as genomes from related viruses. We will also calculate a set of thermodynamic features for those sequences and will infer their efficiencies using three different predictors. The protocol has two main phases: in the first, we align sequences against reference genomes; in the second, we extract the features. The duration of the first phase varies, depending on the computational power of the running machine and the number of reference genomes. The second phase takes about thirty minutes to execute, also depending on the number of cores of the running machine. The constructed database aims to speed the design process by providing a broad set of possible SARS-CoV-2 target sequences and siRNA sequences.
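The first step of the protocol, deriving every 18 to 21 nt sequence from the reference genome, can be sketched as a sliding window. The Python sketch below is illustrative only and is not the authors' pipeline: the function names, the choice to skip windows containing ambiguous bases, and the GC-fraction helper (a stand-in for the much richer feature set the protocol describes) are all assumptions.

```python
from typing import Iterator, Tuple

# Watson-Crick complement for RNA bases.
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def candidate_targets(genome: str, lengths=(18, 19, 20, 21)) -> Iterator[Tuple[int, int, str]]:
    """Yield (length, 0-based start, target) for every window of each length.

    The genome is upper-cased and written as RNA (T -> U). Windows that
    contain anything other than A, C, G, U are skipped (an assumption).
    """
    rna = genome.upper().replace("T", "U")
    for k in lengths:
        for start in range(len(rna) - k + 1):
            window = rna[start:start + k]
            if set(window) <= set("ACGU"):
                yield k, start, window


def antisense(target: str) -> str:
    """Return the reverse complement of an RNA target sequence."""
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(target))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases, one simple base-composition feature."""
    return sum(base in "GC" for base in seq) / len(seq)


# Toy 26-base sequence, not a real SARS-CoV-2 fragment.
toy_genome = "ATGGCTAGCTAGGATCCGATCGATCG"
windows = list(candidate_targets(toy_genome))
```

For a genome of length n with no ambiguous bases, each length k contributes n - k + 1 windows, so a 29,903-base genome would yield on the order of 30,000 candidates per length before any alignment or filtering.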
bioRxiv (Cold Spring Harbor Laboratory), Dec 21, 2020
A new method is presented to detect bimodality in gene expression data using Gaussian Mixture Models to cluster samples in each mode. We have used the method to search for bimodal genes in data from 25 tumor types available from The Cancer Genome Atlas. The method identified 554 genes with bimodal gene expression, of which 46 were identified in more than one cancer type. To further illustrate the impact of the method, we show that 96 out of the 554 genes with bimodal expression patterns showed different prognoses when patients belonging to the two expression peaks are compared. The software to execute the method and the corresponding documentation are available at https://github.com/LabBiosystemUFRN/Bimodality_Genes. Introduction: Studies on gene expression and regulation have been directed towards a better understanding of a diverse range of biological processes, including the initial differentiation in the embryonic stage and changes in health and disease that occur during life. These patterns of gene expression have been extensively used to establish associations between phenotypes and genetic/epigenetic information [1-2]. The challenges for such studies are significant, however, and the identification of expression signatures enriched with bona fide phenotypic associations is particularly welcome. In that respect, bimodal gene expression is an interesting pattern, since its identification capitalizes on the availability of genetic and clinical data from large cohorts of samples and each mode can, in theory, correspond to a phenotypic state of the system. Few previous studies have searched for bimodality in large-scale gene expression data [3-5]. Causes for such bimodality have been discussed, including: i) differential .
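As a rough illustration of the idea of detecting bimodality with a Gaussian mixture, the Python sketch below fits a two-component mixture to one gene's expression values by expectation-maximization and compares it with a single Gaussian. It is not the software at the URL above: the function names, the deterministic initialization from the lower and upper halves of the sorted data, and the use of the Bayesian information criterion (BIC) as the decision rule are all assumptions made for this example.

```python
import math
from statistics import fmean, pvariance


def _log_normal(x, mu, var):
    """Log-density of a normal distribution N(mu, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def _mixture_terms(x, w, mu, var):
    """Per-component log joint terms and their log-sum (log mixture density)."""
    logs = [math.log(w[k]) + _log_normal(x, mu[k], var[k]) for k in range(2)]
    top = max(logs)
    return logs, top + math.log(sum(math.exp(l - top) for l in logs))


def fit_two_gaussians(values, iters=100, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    xs = sorted(values)
    if len(xs) < 4:
        raise ValueError("need at least four values")
    half = len(xs) // 2
    parts = (xs[:half], xs[half:])  # deterministic start (assumption)
    w = [0.5, 0.5]
    mu = [fmean(p) for p in parts]
    var = [max(pvariance(p), min_var) for p in parts]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            logs, norm = _mixture_terms(x, w, mu, var)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            w[k] = max(nk / len(xs), 1e-12)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, min_var)
    loglik = sum(_mixture_terms(x, w, mu, var)[1] for x in xs)
    return w, mu, var, loglik


def is_bimodal(values):
    """True when the two-component model has the lower BIC (assumed criterion)."""
    xs = list(values)
    n = len(xs)
    mean, var = fmean(xs), max(pvariance(xs), 1e-6)
    ll_one = sum(_log_normal(x, mean, var) for x in xs)
    ll_two = fit_two_gaussians(xs)[3]
    bic_one = 2 * math.log(n) - 2 * ll_one   # 2 free parameters
    bic_two = 5 * math.log(n) - 2 * ll_two   # 5 free parameters
    return bic_two < bic_one


# Hypothetical expression values with two well-separated modes.
low = [-1.0, -0.5, 0.0, 0.5, 1.0] * 4
high = [9.0, 9.5, 10.0, 10.5, 11.0] * 4
weights, means, variances, loglik = fit_two_gaussians(low + high)
```

Once the mixture is fitted, each sample can be assigned to the mode with the higher responsibility, which is what allows the two patient groups to be compared for prognosis.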
Springer eBooks, Nov 18, 2009
Cancer is a disease determined by several genetic and epigenetic alterations. Due to technological advances in the omics disciplines, cancer research is going through a revolution. The technological advances that led to the post-genome era have allowed molecular biologists to make meticulous studies on the DNA (genome), the mRNA (transcriptome) and the protein sequences (proteome). Initiatives that intend to describe
Renal carcinoma is a pathology of silent and multifactorial development characterized by a high rate of metastases in patients. After several studies have elucidated the activity of coding genes in the metastatic progression of renal carcinoma, new studies seek to evaluate the association of non-coding genes, such as competitive endogenous RNA (ceRNA). Thus, this study aims to build a gene signature for clear cell renal cell carcinoma (ccRCC) associated with metastatic development from a ceRNA network and to analyze the probable biological functions performed by the participants of the signature. Using ccRCC data from The Cancer Genome Atlas (TCGA), we constructed the ceRNA network with the differentially expressed genes, assembled nine gene signatures from eight feature selection techniques, and analyzed the evaluation metrics of the classification models in the benchmarking process. With the signature, we performed somatic and copy number alteration analysis, survival and metastat...