COLOMBOS is a database that integrates publicly available transcriptomics data for several prokaryotic model organisms. Compared to the previous version, it has more than doubled in size, both in terms of species and data available. The manually curated condition annotation has been overhauled as well, giving more complete information about samples' experimental conditions and their differences. In terms of functionality, cross-species analyses now enable users to analyse expression data for all species simultaneously and to identify candidate genes with evolutionarily conserved expression behaviour. All the expression-based query tools have been substantially improved, overcoming the limit of enforced co-expression data retrieval and instead enabling the return of more complex patterns of expression behaviour. COLOMBOS is freely available through a web application at http://colombos.net/. The complete database is also accessible via REST API or downloadable as tab-delimited text files.
Reconstruction of the regulatory network is an important step in understanding how organisms control the expression of gene products and therefore phenotypes. Recent studies have pointed out the importance of regulatory network plasticity in bacterial adaptation and evolution. The evolution of such networks within and outside the species boundary is, however, still obscure. Sinorhizobium meliloti is an ideal species for such a study, having three large replicons, many available genomes and substantial knowledge of its transcription factors (TFs). Each replicon has a specific functional and evolutionary mark, which might also emerge from the analysis of their regulatory signatures. Here we have studied the plasticity of the regulatory network within and outside the S. meliloti species, looking for the presence of the binding motifs of 41 TFs in 51 strains and 5 related rhizobial species. We have detected a preference of several TFs for one of the three replicons, and the function of regulated genes was found to be in accordance with the overall replicon functional signature: house-keeping functions for the chromosome, metabolism for the chromid, symbiosis for the megaplasmid. This suggests a replicon-specific wiring of the regulatory network in the S. meliloti species. At the same time, a significant part of the predicted regulatory network is shared between the chromosome and the chromid, thus adding a further layer by which the chromid integrates itself into the core genome. Furthermore, the regulatory network distance was found to be correlated with the evolution of both promoter regions and the accessory genome inside the species, indicating that both pangenome compartments are involved in regulatory network evolution. We also observed that genes which are not included in the species regulatory network are more likely to belong to the accessory genome, indicating that regulatory interactions should also be considered when predicting gene conservation in bacterial pangenomes.
In all domains of life, proper regulation of the cell cycle is critical to coordinate genome replication, segregation and cell division. In some groups of bacteria, e.g. Alphaproteobacteria, tight regulation of the cell cycle is also necessary for the morphological and functional differentiation of cells. Sinorhizobium meliloti is an alphaproteobacterium that forms an economically and ecologically important nitrogen-fixing symbiosis with specific legume hosts. During this symbiosis S. meliloti undergoes an elaborate cellular differentiation within host root cells. The differentiation of S. meliloti results in massive amplification of the genome, cell branching and/or elongation, and loss of reproductive capacity. In Caulobacter crescentus, cellular differentiation is tightly linked to the cell cycle via the activity of the master regulator CtrA, and recent research in S. meliloti suggests that CtrA might also be key to cellular differentiation during symbiosis. However, the regulatory circuit driving cell cycle progression in S. meliloti is not well characterized in either the free-living or the symbiotic state. Here, we investigated the regulation and function of CtrA in S. meliloti. We demonstrated that depletion of CtrA causes cell elongation, branching and genome amplification, similar to that observed in nitrogen-fixing bacteroids. We also showed that the cell cycle-regulated proteolytic degradation of CtrA is essential in S. meliloti, suggesting a possible mechanism of CtrA depletion in differentiated bacteroids. Using a combination of ChIP-Seq and gene expression microarray analysis, we found that although S. meliloti CtrA regulates processes similar to those regulated by C. crescentus CtrA, it does so through different target genes. For example, our data suggest that CtrA does not control the expression of the Fts complex to control the timing of cell division during the cell cycle, but instead negatively regulates the septum-inhibiting Min system. Our findings provide valuable ... (PLOS Genetics)
The Codon Adaptation Index (CAI) was introduced by Sharp and Li in 1987 to quantify codon usage similarities between a coding sequence and a set of reference sequences. When synonymous codons for a given amino acid exist, highly expressed genes seem to prefer some of them, according to tRNA abundance and thermodynamic issues. Some authors have described CAI-based methods to derive expressivity measures for all genes in a genome, in a computational framework. Here we present the CAIAP (CAI Analyser Package), a platform-independent package of computer programs allowing the calculation of the CAI and a deep study of gene expressivity from raw gene sequences. Our approach implements and optimizes a procedure to derive the reference sequences from whole genomes and use their codon usage for CAI estimation. Moreover, a set of analysis tools is provided to perform statistical analyses and therefore to give robustness to results. Our efforts were aimed at producing an easy-to-use and fully a...
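As an illustration of the index described above, the following is a minimal sketch of a CAI calculation in the sense of Sharp and Li: each codon gets a relative adaptiveness weight (its count in the reference set divided by the count of the most used synonymous codon), and the CAI of a gene is the geometric mean of the weights of its codons. This is not the CAIAP implementation; the function names, the 0.5 pseudocount for unobserved codons, and the use of the standard genetic code are assumptions of this sketch.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for the organism of interest).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO)
)


def codons_of(seq):
    """Split a coding sequence into in-frame codons, dropping a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight of each codon: its reference count over the top synonymous count."""
    counts = Counter(
        c for s in reference_seqs for c in codons_of(s) if c in CODON_TABLE
    )
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    weights = {}
    for aa, family in families.items():
        # Stop codons and single-codon amino acids (Met, Trp) carry no signal.
        if aa == "*" or len(family) == 1:
            continue
        top = max(counts[c] for c in family)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in family:
            # Hypothetical choice: unobserved codons count as 0.5 so that a
            # single rare codon does not force the whole index to zero.
            weights[c] = max(counts[c], 0.5) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the scored codons in `seq`."""
    logs = [math.log(weights[c]) for c in codons_of(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no codons with a defined weight")
    return math.exp(sum(logs) / len(logs))
```

In this sketch, a gene built only from the codons preferred in the reference set scores 1, and the score falls as rarer synonymous codons are used. How the reference set itself is derived from a whole genome is outside the scope of the example.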
Several regulators are involved in the control of cell cycle progression in the bacterial model system Caulobacter crescentus, which divides asymmetrically into a vegetative G1-phase (swarmer) cell and a replicative S-phase (stalked) cell. Here we report a novel functional interaction between the enigmatic cell cycle regulator GcrA and the N6-adenosine methyltransferase CcrM, both highly conserved proteins among Alphaproteobacteria, that are activated early and at the end of S-phase, respectively. As no direct biochemical or regulatory relationship between GcrA and CcrM was known, we used a combination of ChIP (chromatin immunoprecipitation), biochemical and biophysical experimentation, and genetics to show that GcrA is a dimeric DNA-binding protein that preferentially targets promoters harbouring CcrM methylation sites. After tracing CcrM-dependent N6-methyl-adenosine promoter marks at a genome-wide scale, we show that these marks recruit GcrA in vitro and in vivo. Moreover, we found that, in the presence of a methylated target, GcrA recruits the RNA polymerase to the promoter, consistent with its role in transcriptional activation. Since methylation-dependent DNA binding is also observed with GcrA orthologs from other Alphaproteobacteria, we conclude that GcrA is the founding member of a new and conserved class of transcriptional regulators that function as molecular effectors of a methylation-dependent (non-heritable) epigenetic switch that regulates gene expression during the cell cycle.
Ensifer (syn. Sinorhizobium) meliloti is an important symbiotic bacterial species that fixes nitrogen. Strains BO21CC and AK58 were previously investigated for their substrate utilization and their plant-growth-promoting abilities, showing interesting features. Here, we describe the complete genome sequence and annotation of these strains. The BO21CC and AK58 genomes are 6,985,065 and 6,974,333 bp long, with 6,746 and 6,992 genes predicted, respectively.
We report the draft genome sequence of Pseudomonas alcaliphila 34, a Cr(VI)-hyperresistant and biofilm-producing bacterium that might be used for the bioremediation of chromate-polluted soils. The genome sequence might be helpful in exploring the mechanisms involved in chromium resistance and biofilm formation.
Biological networks are currently being studied with approaches derived from the mathematical and physical sciences. Their structural analysis makes it possible to highlight nodes with special properties that have sometimes been correlated with the biological importance of a gene or a protein. However, biological networks are dynamic both on the evolutionary time-scale and on the much shorter time-scale of physiological processes. There is therefore no unique network for a given cellular process, but potentially many realizations, each with different properties as a consequence of regulatory mechanisms. Such realizations provide snapshots of the same network in different conditions, enabling the study of condition-dependent structural properties. True dynamical analysis can be obtained through detailed mathematical modeling techniques that are not easily scalable to full network models.
Sinorhizobium meliloti is a soil bacterium that invades the root nodules it induces on Medicago sativa, whereupon it undergoes an alteration of its cell cycle and differentiates into a nitrogen-fixing, elongated and polyploid bacteroid with higher membrane permeability. In Caulobacter crescentus, a related alphaproteobacterium, the principal cell cycle regulator, CtrA, is inhibited by the phosphorylated response regulator DivK. The phosphorylation of DivK depends on the histidine kinase DivJ, while PleC is the principal phosphatase for DivK. Despite the importance of DivJ in C. crescentus, the mechanistic role of this kinase has never been elucidated in other Alphaproteobacteria. We show here that the histidine kinases DivJ, CbrA and PleC participate in a complex phosphorylation system of the essential response regulator DivK in S. meliloti. In particular, DivJ and CbrA are involved in DivK phosphorylation and in turn CtrA inactivation, thereby controlling correct cell cycle progression and the integrity of the cell envelope. In contrast, the essential PleC presumably acts as a phosphatase of DivK. Interestingly, we found that a DivJ mutant is able to elicit nodules and enter plant cells, but fails to establish an effective symbiosis, suggesting that a proper envelope and/or low CtrA levels are required for symbiosis.
A major problem for the identification of metabolic network models is parameter identifiability, that is, the possibility of unambiguously inferring the parameter values from the data. Identifiability problems may be due to the structure of the model, in particular implicit dependencies between the parameters, or to limitations in the quantity and quality of the available data. We address the detection and resolution of identifiability problems for a class of pseudo-linear models of metabolism, so-called linlog models. Linlog models have the advantage that parameter estimation reduces to linear or orthogonal regression, which facilitates the analysis of identifiability. We develop precise definitions of structural and practical identifiability, and clarify the fundamental relations between these concepts. In addition, we use singular value decomposition to detect identifiability problems and reduce the model to an identifiable approximation by a principal component analysis approach. The criterion is adapted to real data, which are frequently scarce, incomplete, and noisy. The test of the criterion on a model with simulated data shows that it is capable of correctly identifying the principal components of the data vector. The application to a state-of-the-art dataset on central carbon metabolism in Escherichia coli yields the surprising result that only 4 out of 31 reactions, and 37 out of 100 parameters, are identifiable. This underlines the practical importance of identifiability analysis and model reduction in the modeling of large-scale metabolic networks. Although our approach has been developed in ... This work was supported by the Agence Nationale de la Recherche under project MetaGenoReg (ANR-06-BYOS-0003). S. Berthoumieux et al.
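The link between linear regression and identifiability mentioned above can be made concrete with a small sketch. When estimation reduces to a linear regression y = X b, the parameters b can only be recovered unambiguously if the columns of the regressor matrix X are linearly independent. The abstract describes a singular value decomposition criterion adapted to noisy data; the example below deliberately swaps in a simpler technique, an exact rank test by Gaussian elimination, so that it runs with the standard library alone. It therefore only detects exact (structural) dependencies, not near-collinearity, and the function names and toy matrices are hypothetical.

```python
from fractions import Fraction


def column_rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column among the unused rows.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column depends on the previous ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def all_parameters_identifiable(rows):
    """True when the regressor matrix has full column rank."""
    return column_rank(rows) == len(rows[0])


# Toy regressor matrices (illustrative values, not taken from the paper).
# In `dependent`, the third column is the sum of the first two, so only two
# independent parameter combinations can be estimated from it.
dependent = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
independent = [[1, 0], [0, 1], [1, 1]]
```

With real measurements, an exact test of this kind is too strict, since noise makes every column nominally independent; that is the situation in which a criterion based on the magnitude of the singular values, as described in the abstract, becomes necessary.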
Papers by Matteo Brilli