Papers (science) by Pierre-Luc Germain
Bioinformatics, 2022
microRNAs are important post-transcriptional regulators of gene expression, but the identification of functionally relevant targets is still challenging. Recent research has shown improved prediction of microRNA-mediated repression using a biochemical model combined with empirically-derived k-mer affinity predictions. Here, we translate this approach into a flexible and user-friendly Bioconductor package, scanMiR, also available through a web interface. Using lightweight linear models, scanMiR efficiently scans for binding sites, estimates their affinity, and predicts aggregated transcript repression. Moreover, flexible 3’-supplementary alignment enables the prediction of unconventional interactions, such as bindings potentially leading to target-directed microRNA degradation (TDMD) or slicing. We showcase scanMiR through a systematic scan for such unconventional sites on neuronal transcripts, including lncRNAs and circRNAs.
Science, 2022
Convergent evidence associates exposure to endocrine disrupting chemicals (EDCs) with major human diseases, even at regulation-compliant concentrations. This might be because humans are exposed to EDC mixtures, whereas chemical regulation is based on a risk assessment of individual compounds. Here, we developed a mixture-centered risk assessment strategy that integrates epidemiological and experimental evidence. We identified that exposure to an EDC mixture in early pregnancy is associated with language delay in offspring. At human-relevant concentrations, this mixture disrupted hormone-regulated and disease-relevant regulatory networks in human brain organoids and in the model organisms Xenopus laevis and Danio rerio, as well as behavioral responses. Reinterrogating epidemiological data, we found that up to 54% of the children had prenatal exposures above experimentally derived levels of concern, reaching, for the upper decile compared with the lowest decile of exposure, a 3.3 times higher risk of language delay.
f1000research, 2021
Doublets are prevalent in single-cell sequencing data and can lead to artifactual findings. A number of strategies have therefore been proposed to detect them. Building on the strengths of existing approaches, we developed scDblFinder, a fast, flexible and accurate Bioconductor-based doublet detection method. Here we present the method, justify its design choices, demonstrate its performance on both single-cell RNA and accessibility sequencing data, and provide some observations on doublet formation, detection, and enrichment analysis. Even in complex datasets, scDblFinder can accurately identify most heterotypic doublets, and was already found by an independent benchmark to outcompete alternatives.
Biological Psychiatry, 2021
Studying the stress response is a major pillar of neuroscience research not only because stress is a daily reality but also because the exquisitely fine-tuned bodily changes triggered by stress are a neuroendocrinological marvel. While the genome-wide changes induced by chronic stress have been extensively studied, we know surprisingly little about the complex molecular cascades triggered by acute stressors, the building blocks of chronic stress. The acute stress (or fight-or-flight) response mobilizes organismal energy resources to meet situational demands. However, successful stress coping also requires the efficient termination of the stress response. Maladaptive coping, particularly in response to severe or repeated stressors, can lead to allostatic (over)load, causing wear and tear on tissues, exhaustion, and disease. We propose that deep molecular profiling of the changes triggered by acute stressors could provide molecular correlates for allostatic load and predict healthy or maladaptive stress responses. We present a theoretical framework to interpret multiomic data in light of energy homeostasis and activity-dependent gene regulation, and we review the signaling cascades and molecular changes rapidly induced by acute stress in different cell types in the brain. In addition, we review and reanalyze recent data from multiomic screens conducted mainly in the rodent hippocampus and amygdala after acute psychophysical stressors. We identify challenges surrounding experimental design and data analysis, and we highlight promising new research directions to better understand the stress response on a multiomic level.
BMC Bioinformatics, 2021
Despite the importance of alternative poly-adenylation and 3′ UTR length for a variety of biological phenomena, there are limited means of detecting UTR changes from standard transcriptomic data. We present the diffUTR Bioconductor package which streamlines and improves upon differential exon usage (DEU) analyses, and leverages existing DEU tools and alternative poly-adenylation site databases to enable differential 3′ UTR usage analysis. We demonstrate the diffUTR features and show that it is more flexible and more accurate than state-of-the-art alternatives, both in simulations and in real data. In conclusion, diffUTR enables differential 3′ UTR analysis and more generally facilitates DEU and the exploration of their results.
Genome Biology, 2020
We present pipeComp (https://github.com/plger/pipeComp), a flexible R framework for pipeline comparison handling interactions between analysis steps and relying on multi-level evaluation metrics. We apply it to the benchmark of single-cell RNA-sequencing analysis pipelines using simulated and real datasets with known cell identities, covering common methods of filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction, and clustering. pipeComp can easily integrate any other step, tool, or evaluation metric, allowing extensible benchmarks and easy applications to other fields, as we demonstrate through a study of the impact of removal of unwanted variation on differential expression analysis.
Cell Reports, 2018
Transdifferentiation of fibroblasts into induced neuronal cells (iNs) by the neuron-specific transcription factors Brn2, Myt1l, and Ascl1 is a paradigmatic example of inter-lineage conversion across epigenetically distant cells. Despite tremendous progress regarding the transcriptional hierarchy underlying transdifferentiation, the enablers of the concomitant epigenome resetting remain to be elucidated. Here, we investigated the role of KMT2A and KMT2B, two histone H3 lysine 4 methylases with cardinal roles in development, through individual and combined inactivation. We found that Kmt2b, whose human homolog’s mutations cause dystonia, is selectively required for iN conversion through suppression of the alternative myocyte program and induction of neuronal maturation genes. The identification of KMT2B-vulnerable targets allowed us, in turn, to expose, in a cohort of 225 patients, 45 unique variants in 39 KMT2B targets, which represent promising candidates to dissect the molecular bases of dystonia.
Stem Cell Reports, 2017
Both the promises and pitfalls of the cell reprogramming research platform rest on human genetic variation, making the measurement of its impact one of the most urgent issues in the field. Harnessing large transcriptomics datasets of induced pluripotent stem cells (iPSC), we investigate the implications of this variability for iPSC-based disease modeling. In particular, we show that the widespread use of more than one clone per individual in combination with current analytical practices is detrimental to the robustness of the findings. We then proceed to identify methods to address this challenge and leverage multiple clones per individual. Finally, we evaluate the specificity and sensitivity of different sample sizes and experimental designs, presenting computational tools for power analysis. These findings and tools reframe the nature of replicates used in disease modeling and provide important resources for the design, analysis, and interpretation of iPSC-based studies.
Nucleic Acids Research, 2016
RNA sequencing (RNAseq) has become the method of choice for transcriptome analysis, yet no consensus exists as to the most appropriate pipeline for its analysis, with current benchmarks suffering from important limitations. Here, we address these challenges through a rich benchmarking resource harnessing (i) two RNAseq datasets including ERCC ExFold spike-ins; (ii) Nanostring measurements of a panel of 150 genes on the same samples; (iii) a set of internal, genetically-determined controls; (iv) a reanalysis of the SEQC dataset; and (v) a focus on relative quantification (i.e. across-samples). We use this resource to compare different approaches to each step of RNAseq analysis, from alignment to differential expression testing. We show that methods providing the best absolute quantification do not necessarily provide good relative quantification across samples, that count-based methods are superior for gene-level relative quantification, and that the new generation of pseudo-alignment-based software performs as well as established methods, at a fraction of the computing time. We also assess the impact of library type and size on quantification and differential expression analysis. Finally, we have created an R package and a web platform to enable the simple and streamlined application of this resource to the benchmarking of future methods.
Nature Genetics
Cell reprogramming promises to make characterization of the impact of human genetic variation on health and disease experimentally tractable by enabling the bridging of genotypes to phenotypes in developmentally relevant human cell lineages. Here we apply this paradigm to two disorders caused by symmetrical copy number variations of 7q11.23, which display a striking combination of shared and symmetrically opposite phenotypes: Williams-Beuren syndrome and 7q-microduplication syndrome. Through analysis of transgene-free patient-derived induced pluripotent stem cells and their differentiated derivatives, we find that 7q11.23 dosage imbalance disrupts transcriptional circuits in disease-relevant pathways beginning in the pluripotent state. These alterations are then selectively amplified upon differentiation of the pluripotent cells into disease-relevant lineages. A considerable proportion of this transcriptional dysregulation is specifically caused by dosage imbalances in GTF2I, which encodes a key transcription factor at 7q11.23 that is associated with the LSD1 repressive chromatin complex and silences its dosage-sensitive targets.
Journal of Clinical Investigation, 2013
[...], which impaired tumor growth. In conclusion, EZH2 sustains AID function and prevents terminal differentiation of GC B cells, which allows antibody diversification and affinity maturation. Dysregulation of the GC reaction by constitutively active EZH2 facilitates lymphomagenesis and identifies EZH2 as a possible therapeutic target in NHL and other GC-derived B cell diseases.
Transcription factor (TF)-induced reprogramming of somatic cells into induced pluripotent stem cells (iPSC) is associated with genome-wide changes in chromatin modifications. Polycomb-mediated histone H3 lysine-27 trimethylation (H3K27me3) has been proposed as a defining mark that distinguishes the somatic from the iPSC epigenome. Here, we dissected the functional role of H3K27me3 in TF-induced reprogramming through the inactivation of the H3K27 methylase EZH2 at the onset of reprogramming. Our results demonstrate that, surprisingly, the establishment of functional iPSC proceeds despite global loss of H3K27me3. iPSC lacking EZH2 efficiently silenced the somatic transcriptome and differentiated into tissues derived from the three germ layers. Remarkably, the genome-wide analysis of H3K27me3 in Ezh2 mutant iPSC revealed the retention of this mark on a highly selected group of Polycomb targets enriched for developmental regulators controlling the expression of lineage-specific genes. Erasure of H3K27me3 from these targets led to a striking impairment in TF-induced reprogramming. These results indicate that PRC2-mediated H3K27 trimethylation is required on a highly selective core of Polycomb targets whose repression enables TF-dependent cell reprogramming.
Papers (HPLS) by Pierre-Luc Germain
Penultimate draft
The notion of biological function is fraught with difficulties - intrinsically and irremediably so, we argue. The physiological practice of functional ascription originates from a time when organisms were thought to be designed and remained largely unchanged since. In a secularized worldview, this creates a paradox which accounts of functions as selected effect attempt to resolve. This attempt, we argue, misses its target in physiology and it brings problems of its own. Instead, we propose that a better solution to the conundrum of biological functions is to abandon the notion altogether, a prospect not only less daunting than it appears, but arguably the natural continuation of the naturalisation of biology.
Studies in History and Philosophy of Biological and Biomedical Sciences, 2017
This paper identifies a common political struggle behind debates on the validity and permissibility of animal experimentation, through an analysis of two recent European case studies: the Italian implementation of the European Directive 2010/63/EC regulating the use of animals in science, and the recent European Citizens' Initiative (ECI) 'Stop Vivisection'. Drawing from a historical parallel with Victorian antivivisectionism, we highlight important threads in our case studies that mark the often neglected specificities of debates on animal experimentation. From the representation of the sadistic scientist in the XIX century, to his/her claimed capture by vested interests and evasion of public scrutiny in the contemporary cases, we show that animals are not simply the focus of the debate, but also a privileged locus at which much broader issues are being raised about science, its authority, accountability and potential misalignment with public interest. By highlighting this common socio-political conflict underlying public controversies around animal experimentation, our work prompts the exploration of modes of authority and argumentation that, in establishing the usefulness of animals in science, avoid reenacting the traditional divide between epistemic and political fora.
History and Philosophy of the Life Sciences, 2018
Data-Centric Biology, which received the 2018 Lakatos award, fits squarely with Sabina Leonelli's 'empirical philosophy of science': it pays attention to actors and practices without losing touch with more abstract or general issues in the philosophy of science. At the core of the book is a philosophical view of data, which is seen as a relational category, applicable to portable (though not immutable) research outputs that have or are expected to have evidential value for knowledge claims. A key feature of data-centrism is then, for Leonelli, the recognition that this evidential value is underdetermined, i.e. that we cannot say in advance what claims any given data may bear upon. This prospective and open nature of data has a number of consequences on how it is to be dealt with, which much of the book sets out to explore.
In response to Germain (2012) have recently argued that many adaptations in cancer only make sense at the tumor level, and that cancer progression mirrors the major evolutionary transitions. While we agree that selection could potentially act at various levels of organization in cancers, we argue that tumor-level selection (MLS2) is unlikely to actually play a relevant role in our understanding of the somatic evolution of human cancers.
The ways in which other animal species can be informative about human biology are not exhausted by the traditional picture of the animal model. In this paper, I propose to distinguish two roles which animal models can have in biomedical research. In the more traditional case, organisms act as surrogates for human beings, and as such are expected to be more manageable replicas of humans. However, animal models can inform us about human biology in a much less straightforward way, by being used as measuring devices: what I call their instrumental role. I first characterize this role and provide criteria for it, before illustrating it with some examples from biomedical research, especially cancer research. In such an instrumental role, phenotypes are not expected to phenocopy human phenomena, but instead have the purely instrumental value of detecting or measuring differences. I argue that the instrumental role is more prevalent than might first be suspected, and that some characteristics of contemporary biomedical research are increasingly shifting the use of laboratory organisms to the instrumental role. Finally, in light of the distinction proposed, I discuss the meaning of the expression “animal model”.
In its last round of publications in September 2012, the Encyclopedia Of DNA Elements (ENCODE) assigned a biochemical function to most of the human genome, which was taken up by the media as meaning the end of ‘Junk DNA’. This provoked a heated reaction from evolutionary biologists, who among other things claimed that ENCODE adopted a wrong and much too inclusive notion of function, making its dismissal of junk DNA merely rhetorical. We argue that this criticism rests on misunderstandings concerning the nature of the ENCODE project, the relevant notion of function and the claim that most of our genome is junk. We argue that evolutionary accounts of function presuppose functions as ‘causal roles’, and that selection is but a useful proxy for relevant functions, which might well be unsuitable to biomedical research. Taking a closer look at the discovery process in which ENCODE participates, we argue that ENCODE’s strategy of biochemical signatures successfully identified activities of DNA elements with an eye towards causal roles of interest to biomedical research. We argue that ENCODE’s controversial claim of functionality should be interpreted as saying that 80% of the genome is engaging in relevant biochemical activities and is very likely to have a causal role in phenomena deemed relevant to biomedical research. Finally, we discuss ambiguities in the meaning of junk DNA and in one of the main arguments raised for its prevalence, and we evaluate the impact of ENCODE’s results on the claim that most of our genome is junk.
New Directions in Philosophy of Science, M.C. Galavotti et al. (eds.), 2014
I discuss the relationship between theoretical terms and measuring devices using a very peculiar example from biomedical research: cancer transplantation models. I do so through two complementary comparisons. I first show how a historical case study can shed light on a similar case from contemporary biomedical research. But I also compare both to a paradigmatic case of measurement in the physical sciences, thermometry, which reveals some of the most relevant epistemological issues. The comparison offers arguments for the recent debate on the operational definition of Cancer Stem Cells, and thereby suggests the relevance of a comparative approach in the history and philosophy of science.