Conspecific male animals fight for resources such as food and mating opportunities but typically stop fighting after assessing their relative fighting abilities to avoid serious injuries. Physiologically, how the fighting behavior is controlled remains unknown. Using the fighting fish Betta splendens, we studied behavioral and brain-transcriptomic changes during the fight between the two opponents. At the behavioral level, surface-breathing and biting/striking occurred only during intervals between mouth-locking. Eventually, the behaviors of the two opponents became synchronized, with each pair showing a unique behavioral pattern. At the physiological level, we examined the expression patterns of 23,306 brain transcripts using RNA-sequencing data from brains of fighting pairs after a 20-min (D20) and a 60-min (D60) fight. The two opponents in each D60 fighting pair showed a strong gene expression correlation, whereas those in D20 fighting pairs showed a weak correlation. Moreover, each fighting pair in the D60 group showed pair-specific gene expression patterns in a grade of membership (GoM) analysis and were grouped as a pair in the heatmap clustering. The observed pair-specific individualization in brain-transcriptomic synchronization (PIBS) suggested that this synchronization provides a physiological basis for the behavioral synchronization. An analysis using the synchronized genes in fighting pairs of the D60 group found genes enriched for ion transport, synaptic function, and learning and memory. Brain-transcriptomic synchronization could be a general phenomenon and may provide a new cornerstone with which to investigate coordinating and sustaining social interactions between two interacting partners of vertebrates.
In higher plants, whole-genome duplication (WGD) is thought to facilitate the evolution of C4 photosynthesis from C3 photosynthesis. To understand this issue, we used new and existing leaf-development transcriptomes to construct two coding sequence databases for C4 Gynandropsis gynandra and C3 Tarenaya hassleriana, which shared a WGD before their divergence. We compared duplicated genes in the two species and found that the WGD contributed to four aspects of the evolution of C4 photosynthesis in G. gynandra. First, G. gynandra has retained the duplicates of ALAAT (alanine aminotransferase) and GOGAT (glutamine oxoglutarate aminotransferase) for nitrogen recycling to establish a photorespiratory CO2 pump in bundle sheath (BS) cells for increasing photosynthesis efficiency, suggesting that G. gynandra experienced a C3-C4 intermediate stage during C4 evolution. Second, G. gynandra has retained almost all known vein-development-related paralogous genes derived from the WGD event, likely contributing to the high vein complexity of G. gynandra. Third, the WGD facilitated the evolution of C4 enzyme genes and their recruitment into the C4 pathway. Fourth, several genes encoding photosystem I proteins were derived from the WGD and are upregulated in G. gynandra, likely enabling the NADH dehydrogenase-like complex to produce extra ATP for the C4 CO2 concentration mechanism. Thus, the WGD apparently played an enabler role in the evolution of C4 photosynthesis in G. gynandra. Importantly, an ALAAT duplicate became highly expressed in BS cells in G. gynandra, facilitating nitrogen recycling and transition to the C4 cycle. This study revealed how WGD may facilitate C4 photosynthesis evolution.
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein-protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
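The idea of "controlling for the other indicators" can be illustrated with a first-order partial correlation. The abstract does not say which estimator the authors used, so this is only a minimal sketch on synthetic data; the function names `pearson` and `partial_corr` and the toy variables `x`, `y`, `z` are hypothetical and not taken from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic data (assumption): x and y are related only through z.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]      # shared hidden factor
x = [zi + rng.gauss(0, 0.5) for zi in z]        # toy "indicator"
y = [-zi + rng.gauss(0, 0.5) for zi in z]       # toy "evolutionary rate"

print("raw correlation:    ", round(pearson(x, y), 3))
print("partial correlation:", round(partial_corr(x, y, z), 3))
```

In this toy setup the raw correlation between `x` and `y` is strongly negative, but it nearly vanishes once `z` is controlled for. An indicator's apparent strength can change in this way when interrelated indicators are taken into account.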
Protein phosphorylation plays an important role in the regulation of protein function. Phosphorylated residues are generally assumed to be subject to functional constraint, but it has recently been suggested from a comparison of distantly related vertebrate species that most phosphorylated residues evolve at rates consistent with the surrounding regions. To resolve the controversy, we infer the ancestral phosphoproteome of human and mouse to compare the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues. This approach enables accurate estimation of evolutionary rates as it does not assume deep conservation of phosphorylated residues. We show that phosphorylated S/T residues tend to evolve more slowly than nonphosphorylated S/T residues not only in disordered but also in ordered protein regions, indicating evolutionary conservation of phosphorylated S/T residues in mammals. Thus, phosphorylated S/T residues tend to be subject to stronger functional constraint than nonphosphorylated residues regardless of the protein regions in which they reside. In contrast, phosphorylated Y residues evolve at rates similar to those of nonphosphorylated ones. We also find that the human lineage has gained more phosphorylated T residues and lost fewer phosphorylated Y residues than the mouse lineage. The cause of the gain/loss imbalance remains unknown and is worth exploring.
The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
The superior photosynthetic efficiency of C4 leaves over C3 leaves is due to their unique Kranz anatomy, in which the vein is surrounded by one layer of bundle sheath (BS) cells and one layer of mesophyll (M) cells. Kranz anatomy development starts from three contiguous ground meristem (GM) cells, but its regulators and underlying molecular mechanism are largely unknown. To identify the regulators, we obtained the transcriptomes of 11 maize embryonic leaf cell types from five stages of pre-Kranz cells starting from median GM cells and six stages of pre-M cells starting from undifferentiated cells. Principal component and clustering analyses of transcriptomic data revealed rapid pre-Kranz cell differentiation in the first two stages but slow differentiation in the last three stages, suggesting early Kranz cell fate determination. In contrast, pre-M cells exhibit a more prolonged transcriptional differentiation process. Differential gene expression and coexpression analyses identified...
It is well known that fish have fewer olfactory receptor (OR) genes than mammals do. To investigate the divergence of and evolutionary changes in OR genes in fish, 2 loach OR gene families (D1/3 and D32) from Taiwanese and Chinese populations of 2 species (Misgurnus anguillicaudatus and Paramisgurnus dabryanus) were analyzed. We found duplications in both D1/3 and D32 that had accumulated after the separation of the island of Taiwan from the Asian mainland ∼5 million years ago. The accumulation of duplicate copies in a relatively short time suggests that gaining extra copies of OR genes may have been selectively advantageous for adapting to local environments. On the other hand, the loss of some OR genes in the 2 species in Taiwan may reflect loss of selective constraints on some OR genes. Moreover, unlike the situation in mammals in which pseudogenes are found in each family, pseudogenes are clustered in an expanding gene family in P. dabryanus in Taiwan. The presence of pse...
The linkage disequilibrium in a subdivided population is shown to be equal to the sum of the average linkage disequilibrium for all subpopulations and the covariance between gene frequencies of the loci concerned. Thus, in a subdivided population the linkage disequilibrium may not be 0 even if the linkage disequilibrium in each subpopulation is 0. If a population is divided into two subpopulations between which migration occurs, the asymptotic rate of approach to linkage equilibrium is equal to either $r$ or $2(m_1 + m_2) - (m_1 + m_2)^2$, whichever is smaller, where $r$ is the recombination value and $m_1$ and $m_2$ are the proportions of immigrants in subpopulations 1 and 2, respectively. Thus, if the migration rate is high compared with the recombination value, the change of linkage disequilibrium in subdivided populations is similar to that of a single random mating population. On the other hand, if the migration rate is low, the approach to linkage equilibrium may be retarded in subdivided popula...
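The decomposition stated in this abstract (total linkage disequilibrium = average within-subpopulation disequilibrium + covariance of allele frequencies across subpopulations) can be checked numerically. The sketch below is only an illustration under simple assumptions: each subpopulation is described by a tuple of hypothetical values (frequency of the AB haplotype, frequency of allele A, frequency of allele B), subpopulations are weighted equally unless weights are given, and the function names are invented for this example.

```python
def pooled_ld(subpops, weights=None):
    """Return (D_total, mean within-subpopulation D, covariance term).

    subpops: list of (x_ab, p_a, p_b) tuples, one per subpopulation.
    weights: relative subpopulation sizes summing to 1 (equal if omitted).
    """
    n = len(subpops)
    w = weights if weights is not None else [1.0 / n] * n
    x_bar = sum(wi * x for wi, (x, _, _) in zip(w, subpops))
    p_bar = sum(wi * p for wi, (_, p, _) in zip(w, subpops))
    q_bar = sum(wi * q for wi, (_, _, q) in zip(w, subpops))
    d_total = x_bar - p_bar * q_bar
    mean_d = sum(wi * (x - p * q) for wi, (x, p, q) in zip(w, subpops))
    cov = sum(wi * p * q for wi, (_, p, q) in zip(w, subpops)) - p_bar * q_bar
    return d_total, mean_d, cov


def asymptotic_rate(r, m1, m2):
    """Smaller of r and 2(m1 + m2) - (m1 + m2)^2, as stated in the abstract."""
    m = m1 + m2
    return min(r, 2 * m - m * m)


# Two subpopulations, each in linkage equilibrium (x_ab = p_a * p_b),
# but with very different allele frequencies.
demo = [(0.9 * 0.9, 0.9, 0.9), (0.1 * 0.1, 0.1, 0.1)]
d_total, mean_d, cov = pooled_ld(demo)
print(d_total, mean_d, cov)
print(asymptotic_rate(0.5, 0.01, 0.01))
```

In this example every subpopulation has zero disequilibrium, yet the pooled population shows a nonzero value that comes entirely from the covariance term. With low migration (`m1 = m2 = 0.01`) and free recombination (`r = 0.5`), the rate is set by the migration expression rather than by `r`.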
GAMYB, UDT1, TIP2/bHLH142, TDR, and EAT1/DTD are important transcription factors (TFs) that play a crucial role during rice pollen development. This study demonstrates that bHLH142 acts downstream of UDT1 and GAMYB and works as a “hub” in these two pollen pathways. We show that GAMYB modulates bHLH142 expression through specific binding to the MYB motif of the bHLH142 promoter during the early stage of pollen development, while TDR acts as a transcriptional repressor of the GAMYB modulation of bHLH142 by binding to the E-box close to the MYB motif on the promoter. The up- and down-regulation of TFs highlights that tight, precise, and coordinated regulation among these TFs is essential for normal pollen development. Most notably, this study illustrates the regulatory pathways of GAMYB and UDT1 that rely on bHLH142 in a direct and an indirect manner, respectively, and function in different tissues with distinct biological functions during pollen development. This study advanc...
Proceedings of the National Academy of Sciences, 2019
Time-series transcriptomes of a biological process obtained under different conditions are useful for identifying the regulators of the process and their regulatory networks. However, such data are 3D (gene expression, time, and condition), and there is currently no method that can deal with their full complexity. Here, we developed a method that avoids time-point alignment and normalization between conditions. We applied it to analyze time-series transcriptomes of developing maize leaves under light–dark cycles and under total darkness and obtained eight time-ordered gene coexpression networks (TO-GCNs), which can be used to predict upstream regulators of any genes in the GCNs. One of the eight TO-GCNs is light-independent and likely includes all genes involved in the development of Kranz anatomy, which is a structure crucial for the high efficiency of photosynthesis in C4 plants. Using this TO-GCN, we predicted and experimentally validated a regulatory cascade upstream of SHORTROOT1...
C4 photosynthesis is more efficient than C3 photosynthesis for two reasons. First, C4 plants have evolved a repertoire of C4 enzymes to enhance CO2 fixation. Second, C4 leaves have Kranz anatomy with a high vein density in which the veins are surrounded by one layer of bundle sheath (BS) cells and one layer of mesophyll (M) cells. The BS and M cells are not only functionally well differentiated, but also well-coordinated for rapid transport of photo-assimilates between the two types of photosynthetic cells. Recent comparative transcriptomic and anatomical analyses of C3 and C4 leaves have revealed early onset of C4-related processes in leaf development, suggesting that delayed mesophyll differentiation contributes to higher C4 vein density, and have identified some candidate regulators for the higher vein density in C4 leaves. Moreover, comparative transcriptomics of maize husk (C3) and foliar leaves (C4) has identified a cohort of candidate regulators of Kranz anatomy development. In addition, there has been major progress in the identification of transcription factor binding sites, greatly increasing our knowledge of gene regulation in plants.
A practical way to reduce the cost of surveying single-nucleotide polymorphisms (SNPs) in a large number of individuals is to measure the allele frequencies in pooled DNA samples. Pyrosequencing™ has been frequently used for this application because signals generated by this approach are proportional to the amount of DNA templates. The Pyrosequencing™ pyrogram is determined by the dispensing order of dNTPs, which is usually designed based on the known SNPs to avoid asynchronous extensions of heterozygous sequences. Therefore, using the pyrogram signals to identify de novo SNPs in DNA pools has never been undertaken. Here, we developed an algorithm to address this issue. With the sequence and pyrogram of the wild-type allele known in advance, we could use the pyrogram obtained from the pooled DNA sample to predict the sequence of the unknown mutant allele (de novo SNP) and estimate its allele frequency. Both computational simulation and experimental Pyrosequencing™ test results suggested that our method performs well.
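The abstract does not describe the algorithm itself, so the following is only a toy sketch of the underlying idea, not the authors' method. It assumes an idealized, noise-free pyrogram in which each dispensed nucleotide yields a signal equal to the number of bases incorporated, and a pooled signal that is a frequency-weighted mixture of the wild-type and mutant pyrograms. The function names, sequences, and dispensing order are all hypothetical.

```python
def pyrogram(seq, dispensation):
    """Idealized pyrogram: bases incorporated at each nucleotide dispensation."""
    pos, signal = 0, []
    for nt in dispensation:
        n = 0
        # A dispensed nucleotide extends through a whole homopolymer run.
        while pos < len(seq) and seq[pos] == nt:
            n += 1
            pos += 1
        signal.append(n)
    return signal


def estimate_mutant_frequency(observed, wt_signal, mut_signal):
    """Least-squares f for the model observed = (1 - f) * wt + f * mut."""
    diff = [m - w for m, w in zip(mut_signal, wt_signal)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("wild-type and mutant pyrograms are identical")
    num = sum((o - w) * d for o, w, d in zip(observed, wt_signal, diff))
    return num / denom


# Hypothetical example: wild type ACGT, candidate mutant ATGT (C>T).
order = "ACTGT"
wt = pyrogram("ACGT", order)    # [1, 1, 0, 1, 1]
mut = pyrogram("ATGT", order)   # [1, 0, 1, 1, 1]
pooled = [0.75 * w + 0.25 * m for w, m in zip(wt, mut)]  # 25% mutant pool
print(estimate_mutant_frequency(pooled, wt, mut))
```

In a de novo search one could, in principle, repeat the estimate for every candidate single-base change and keep the candidate with the smallest residual. Real pyrograms also contain noise and asynchronous extension, which this sketch ignores.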
Both cis and trans mutations contribute to gene expression divergence within and between species. We used Saccharomyces cerevisiae as a model organism to estimate the relative contributions of cis and trans variations to the expression divergence between a laboratory (BY) and a wild (RM) strain of yeast. We examined whether genes regulated by a single transcription factor (TF; single input module, SIM genes) or genes regulated by multiple TFs (multiple input module, MIM genes) are more susceptible to trans variation. Because a SIM gene is regulated by a single immediate upstream TF, the chance for a change to occur in its trans-acting factors would, on average, be smaller than that for a MIM gene. We chose 232 genes that exhibited expression divergence between BY and RM to test this hypothesis. We examined the expression patterns of these genes in a BY-RM coculture system and in a BY-RM diploid hybrid. We found that trans variation is far more important than cis variation for expression divergence between the two strains. However, because cis variation has significantly contributed to expression divergence in 75% of the genes studied, cis change also plays a significant role in intraspecific expression evolution. Interestingly, we found that the proportion of genes with diverged expression between BY and RM is larger for MIM genes than for SIM genes; in fact, the proportion tends to increase with the number of transcription factors that regulate the gene. Moreover, MIM genes are, on average, subject to stronger trans effects than SIM genes, though the difference between the two types of genes is not conspicuous.
Pax genes are defined by the presence of a paired box that encodes a DNA-binding domain of 128 amino acids. They are involved in the development of the central nervous system, organogenesis, and oncogenesis. The known Pax genes are divided into five groups within two supergroups. By means of a novel combination of evolutionary analysis, in vitro binding assays and in vivo functional analyses, we have identified the key residues that determine the differing DNA-binding properties of the two supergroups and of the Pax-2, 5, 8 and Pax-6 subgroups within supergroup I. The differences in binding properties between the two supergroups are largely caused by amino acid changes at residues 20 and 121 of the paired domain. Although the paired domains of the Pax-2, 5, 8 and the Pax-6 group differ by >19 amino acids, their distinct DNA-binding properties are determined almost completely by a single amino acid change. Thus, a small number of amino acid changes can account in large part for the divergence in binding properties among the known paired domains. Our approach for selecting candidate sites responsible for the functional divergence between genes should also be useful for studying other gene families.
Conspecific male animals fight for resources such as food and mating opportunities but typically ... more Conspecific male animals fight for resources such as food and mating opportunities but typically stop fighting after assessing their relative fighting abilities to avoid serious injuries. Physiologically, how the fighting behavior is controlled remains unknown. Using the fighting fish Betta splendens, we studied behavioral and brain-transcriptomic changes during the fight between the two opponents. At the behavioral level, surface-breathing, and biting/striking occurred only during intervals between mouth-locking. Eventually, the behaviors of the two opponents became synchronized, with each pair showing a unique behavioral pattern. At the physiological level, we examined the expression patterns of 23,306 brain transcripts using RNA-sequencing data from brains of fighting pairs after a 20-min (D20) and a 60-min (D60) fight. The two opponents in each D60 fighting pair showed a strong gene expression correlation, whereas those in D20 fighting pairs showed a weak correlation. Moreover, each fighting pair in the D60 group showed pair-specific gene expression patterns in a grade of membership analysis (GoM) and were grouped as a pair in the heatmap clustering. The observed pair-specific individualization in brain-transcriptomic synchronization (PIBS) suggested that this synchronization provides a physiological basis for the behavioral synchronization. An analysis using the synchronized genes in fighting pairs of the D60 group found genes enriched for ion transport, synaptic function, and learning and memory. Brain-transcriptomic synchronization could be a general phenomenon and may provide a new cornerstone with which to investigate coordinating and sustaining social interactions between two interacting partners of vertebrates.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the ad... more This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
In higher plants, whole-genome duplication (WGD) is thought to facilitate the evolution of C 4 ph... more In higher plants, whole-genome duplication (WGD) is thought to facilitate the evolution of C 4 photosynthesis from C 3 photosynthesis. To understand this issue, we used new and existing leaf-development transcriptomes to construct two coding sequence databases for C 4 Gynandropsis gynandra and C 3 Tarenaya hassleriana, which shared a WGD before their divergence. We compared duplicated genes in the two species and found that the WGD contributed to four aspects of the evolution of C 4 photosynthesis in G. gynandra. First, G. gynandra has retained the duplicates of ALAAT (alanine aminotransferase) and GOGAT (glutamine oxoglutarate aminotransferase) for nitrogen recycling to establish a photorespiratory CO 2 pump in bundle sheath (BS) cells for increasing photosynthesis efficiency, suggesting that G. gynandra experienced a C 3-C 4 intermediate stage during the C 4 evolution. Second, G. gynandra has retained almost all known veindevelopment-related paralogous genes derived from the WGD event, likely contributing to the high vein complexity of G. gynandra. Third, the WGD facilitated the evolution of C 4 enzyme genes and their recruitment into the C 4 pathway. Fourth, several genes encoding photosystem I proteins were derived from the WGD and are upregulated in G. gynandra, likely enabling the NADH dehydrogenase-like complex to produce extra ATPs for the C 4 CO 2 concentration mechanism. Thus, the WGD apparently played an enabler role in the evolution of C 4 photosynthesis in G. gynandra. Importantly, an ALAAT duplicate became highly expressed in BS cells in G. gynandra, facilitating nitrogen recycling and transition to the C 4 cycle. This study revealed how WDG may facilitate C 4 photosynthesis evolution.
Many indicators of protein evolutionary rate have been proposed, but some of them are interrelate... more Many indicators of protein evolutionary rate have been proposed, but some of them are interrelated. The purpose of this study is to disentangle their correlations. We assess the strength of each indicator by controlling for the other indicators under study. We find that the number of microRNA (miRNA) types that regulate a gene is the strongest rate indicator (a negative correlation), followed by disorder content (the percentage of disordered regions in a protein, a positive correlation); the strength of disorder content as a rate indicator is substantially increased after controlling for the number of miRNA types. By dividing proteins into lowly and highly intrinsically disordered proteins (L-IDPs and H-IDPs), we find that proteins interacting with more H-IDPs tend to evolve more slowly, which largely explains the previous observation of a negative correlation between the number of protein-protein interactions and evolutionary rate. Moreover, all of the indicators examined here, except for the number of miRNA types, have different strengths in L-IDPs and in H-IDPs. Finally, the number of phosphorylation sites is weakly correlated with the number of miRNA types, and its strength as a rate indicator is substantially reduced when other indicators are considered. Our study reveals the relative strength of each rate indicator and increases our understanding of protein evolution.
Protein phosphorylation plays an important role in the regulation of protein function. Phosphoryl... more Protein phosphorylation plays an important role in the regulation of protein function. Phosphorylated residues are generally assumed to be subject to functional constraint, but it has recently been suggested from a comparison of distantly related vertebrate species that most phosphorylated residues evolve at the rates consistent with the surrounding regions. To resolve the controversy, we infer the ancestral phosphoproteome of human and mouse to compare the evolutionary rates of phosphorylated and nonphosphorylated serine (S), threonine (T), and tyrosine (Y) residues. This approach enables accurate estimation of evolutionary rates as it does not assume deep conservation of phosphorylated residues. We show that phosphorylated S/T residues tend to evolve more slowly than nonphosphorylated S/T residues not only in disordered but also in ordered protein regions, indicating evolutionary conservation of phosphorylated S/T residues in mammals. Thus, phosphorylated S/T residues tend to be subject to stronger functional constraint than nonphosphorylated residues regardless of the protein regions in which they reside. In contrast, phosphorylated Y residues evolve at similar rates as nonphosphorylated ones. We also find that the human lineage has gained more phosphorylated T residues and lost fewer phosphorylated Y residues than the mouse lineage. The cause of the gain/loss imbalance remains a mystery but should be worth exploring.
The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the co... more The evolution of duplicate genes has been a topic of broad interest. Here, we propose that the conservation of gene family size is a good indicator of the rate of sequence evolution and some other biological properties. By comparing the human-chimpanzee-macaque orthologous gene families with and without family size conservation, we demonstrate that genes with family size conservation evolve more slowly than those without family size conservation. Our results further demonstrate that both family expansion and contraction events may accelerate gene evolution, resulting in elevated evolutionary rates in the genes without family size conservation. In addition, we show that the duplicate genes with family size conservation evolve significantly more slowly than those without family size conservation. Interestingly, the median evolutionary rate of singletons falls in between those of the above two types of duplicate gene families. Our results thus suggest that the controversy on whether duplicate genes evolve more slowly than singletons can be resolved when family size conservation is taken into consideration. Furthermore, we also observe that duplicate genes with family size conservation have the highest level of gene expression/expression breadth, the highest proportion of essential genes, and the lowest gene compactness, followed by singletons and then by duplicate genes without family size conservation. Such a trend accords well with our observations of evolutionary rates. Our results thus point to the importance of family size conservation in the evolution of duplicate genes.
The superior photosynthetic efficiency of C4leaves over C3leaves is owing to their unique Kranz a... more The superior photosynthetic efficiency of C4leaves over C3leaves is owing to their unique Kranz anatomy, in which the vein is surrounded by one layer of bundle sheath (BS) cells and one layer of mesophyll (M) cells. Kranz anatomy development starts from three contiguous ground meristem (GM) cells, but its regulators and underlying molecular mechanism are largely unknown. To identify the regulators, we obtained the transcriptomes of 11 maize embryonic leaf cell types from five stages of pre-Kranz cells starting from median GM cells and six stages of pre-M cells starting from undifferentiated cells. Principal component and clustering analyses of transcriptomic data revealed rapid pre-Kranz cell differentiation in the first two stages but slow differentiation in the last three stages, suggesting early Kranz cell fate determination. In contrast, pre-M cells exhibit a more prolonged transcriptional differentiation process. Differential gene expression and coexpression analyses identified...
It is well known that fish have fewer olfactory receptor (OR) genes than mammals do. In order to ... more It is well known that fish have fewer olfactory receptor (OR) genes than mammals do. In order to investigate the divergence of and evolutionary changes in OR genes in fish, 2 loach OR gene families (D1/3 and D32) from Taiwanese and Chinese populations of 2 species (Misgurnus anguillicaudatus and Paramisgurnus dabryanus) were analyzed. We found duplications in both D1/3 and D32 that had accumulated after the separation of the island of Taiwan from the Asian mainland ∼5 million yrs ago. The accumulation of duplicate copies in a relatively short time suggests that gaining extra copies of OR genes may have been selectively advantageous for adapting to local environments. On the other hand, the loss of some OR genes in the 2 species in Taiwan may reflect loss of selective constraints on some OR genes. Moreover, unlike the situation in mammals in which pseudogenes are found in each family, pseudogenes are clustered in an expanding gene family in P. dabryanus in Taiwan. The presence of pse...
The linkage disequilibrium in a subdivided populaton is shown to be equal to the sum of the avera... more The linkage disequilibrium in a subdivided populaton is shown to be equal to the sum of the average linkage disequilibrium for all subpopulations and the covariance between gene frequencies of the loci concerned. Thus, in a subdivided population the linkage disequilibrium may not be 0 even if the linkage disequilibrium in each subpopulation is 0. If a population is divided into two subpopulations between which migration occurs, the asymptotic rate of approach to linkage equilibrium is equal to either r or 2(m 1 + m 2) - (m 1 + m 2)2, whichever is smaller, where r is the recombination value and m 1 and m 2 are the proportions of immigrants in subpopulations 1 and 2, respectively. Thus, if migration rate is high compared with recombination value, the change of linkage disequilibrium in subdivided populations is similar to that of a single random mating population. On the other hand, if migration rate is low, the approach to lnkage equilibrium may be retarded in subdivided popula...
GAMYB, UDT1, TIP2/bHLH142, TDR, and EAT1/DTD are important transcription factors (TFs) that play a crucial role during rice pollen development. This study demonstrates that bHLH142 acts downstream of UDT1 and GAMYB and works as a “hub” in these two pollen pathways. We show that GAMYB modulates bHLH142 expression through specific binding to the MYB motif of the bHLH142 promoter during the early stage of pollen development, while TDR acts as a transcriptional repressor of the GAMYB modulation of bHLH142 by binding to the E-box close to the MYB motif on the promoter. The up- and down-regulation of TFs highlights that tight, precise, and coordinated regulation among these TFs is essential for normal pollen development. Most notably, this study illustrates the regulatory pathways of GAMYB and UDT1 that rely on bHLH142 in a direct and an indirect manner, respectively, and function in different tissues with distinct biological functions during pollen development. This study advanc...
Proceedings of the National Academy of Sciences, 2019
Time-series transcriptomes of a biological process obtained under different conditions are useful for identifying the regulators of the process and their regulatory networks. However, such data are 3D (gene expression, time, and condition), and there is currently no method that can deal with their full complexity. Here, we developed a method that avoids time-point alignment and normalization between conditions. We applied it to analyze time-series transcriptomes of developing maize leaves under light–dark cycles and under total darkness and obtained eight time-ordered gene coexpression networks (TO-GCNs), which can be used to predict upstream regulators of any genes in the GCNs. One of the eight TO-GCNs is light-independent and likely includes all genes involved in the development of Kranz anatomy, which is a structure crucial for the high efficiency of photosynthesis in C4 plants. Using this TO-GCN, we predicted and experimentally validated a regulatory cascade upstream of SHORTROOT1...
C4 photosynthesis is more efficient than C3 photosynthesis for two reasons. First, C4 plants have evolved a repertoire of C4 enzymes to enhance CO2 fixation. Second, C4 leaves have Kranz anatomy with a high vein density in which the veins are surrounded by one layer of bundle sheath (BS) cells and one layer of mesophyll (M) cells. The BS and M cells are not only functionally well differentiated, but also well-coordinated for rapid transport of photo-assimilates between the two types of photosynthetic cells. Recent comparative transcriptomic and anatomical analyses of C3 and C4 leaves have revealed early onset of C4-related processes in leaf development, suggesting that delayed mesophyll differentiation contributes to higher C4 vein density, and have identified some candidate regulators for the higher vein density in C4 leaves. Moreover, comparative transcriptomics of maize husk (C3) and foliar leaves (C4) has identified a cohort of candidate regulators of Kranz anatomy development. In addition, there has been major progress in the identification of transcription factor binding sites, greatly increasing our knowledge of gene regulation in plants.
A practical way to reduce the cost of surveying single-nucleotide polymorphisms (SNPs) in a large number of individuals is to measure the allele frequencies in pooled DNA samples. Pyrosequencing™ has been frequently used for this application because signals generated by this approach are proportional to the amount of DNA templates. The Pyrosequencing™ pyrogram is determined by the dispensing order of dNTPs, which is usually designed based on the known SNPs to avoid asynchronous extensions of heterozygous sequences. Therefore, utilizing the pyrogram signals to identify de novo SNPs in DNA pools has never been undertaken. In this study, we developed an algorithm to address this issue. With the sequence and pyrogram of the wild-type allele known in advance, we could use the pyrogram obtained from the pooled DNA sample to predict the sequence of the unknown mutant allele (de novo SNP) and estimate its allele frequency. Both computational simulation and experimental Pyrosequencing™ test results suggested that our method performs well.
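The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes an idealized, noise-free pyrogram in which each peak height equals the length of the homopolymer run incorporated at that dispensation, and it assumes the pooled signal is a linear mixture of the wild-type and mutant pyrograms (consistent with the statement that signals are proportional to template amount). All function names are hypothetical. Each possible single-base substitution is simulated, the allele frequency is fitted by least squares, and the candidate with the smallest residual is reported.

```python
def pyrogram(template, dispensation):
    """Idealized peak heights for a template under a given dNTP dispensing order.

    Each dispensed base is incorporated through the whole matching
    homopolymer run at the current position; otherwise the peak is 0.
    """
    pos = 0
    peaks = []
    for base in dispensation:
        run = 0
        while pos < len(template) and template[pos] == base:
            run += 1
            pos += 1
        peaks.append(run)
    return peaks


def fit_frequency(observed, wt_peaks, mut_peaks):
    """Least-squares mutant frequency f for observed = (1 - f) * wt + f * mut.

    Returns (f, residual), or None if the two pyrograms are identical.
    """
    diff = [m - w for m, w in zip(mut_peaks, wt_peaks)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        return None
    f = sum(d * (o - w) for d, o, w in zip(diff, observed, wt_peaks)) / denom
    f = min(1.0, max(0.0, f))  # a frequency must lie in [0, 1]
    residual = sum((o - (w + f * d)) ** 2 for o, w, d in zip(observed, wt_peaks, diff))
    return f, residual


def predict_snp(wild_type, dispensation, observed):
    """Try every single-base substitution; return (residual, position, alt_base, f)."""
    wt_peaks = pyrogram(wild_type, dispensation)
    best = None
    for i, ref in enumerate(wild_type):
        for alt in "ACGT":
            if alt == ref:
                continue
            mutant = wild_type[:i] + alt + wild_type[i + 1:]
            fit = fit_frequency(observed, wt_peaks, pyrogram(mutant, dispensation))
            if fit is None:
                continue
            f, residual = fit
            if best is None or residual < best[0]:
                best = (residual, i, alt, f)
    return best


if __name__ == "__main__":
    wild_type = "ACGTAC"
    dispensation = "ACGT" * 3  # cyclic dispensing order (an assumption)
    wt = pyrogram(wild_type, dispensation)
    mut = pyrogram("ACTTAC", dispensation)
    # Simulated pool in which 30% of the templates carry the G->T substitution.
    pooled = [w + 0.3 * (m - w) for w, m in zip(wt, mut)]
    print(predict_snp(wild_type, dispensation, pooled))
```

Real pyrograms carry noise and signal decay, so a practical implementation would need an error model and a threshold for calling a SNP; this sketch ignores both.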
Both cis and trans mutations contribute to gene expression divergence within and between species. We used Saccharomyces cerevisiae as a model organism to estimate the relative contributions of cis and trans variations to the expression divergence between a laboratory (BY) and a wild (RM) strain of yeast. We examined whether genes regulated by a single transcription factor (TF; single input module, SIM genes) or genes regulated by multiple TFs (multiple input module, MIM genes) are more susceptible to trans variation. Because a SIM gene is regulated by a single immediate upstream TF, the chance for a change to occur in its trans-acting factors would, on average, be smaller than that for a MIM gene. We chose 232 genes that exhibited expression divergence between BY and RM to test this hypothesis. We examined the expression patterns of these genes in a BY-RM coculture system and in a BY-RM diploid hybrid. We found that trans variation is far more important than cis variation for expression divergence between the two strains. However, because in 75% of the genes studied, cis variation has significantly contributed to expression divergence, cis change also plays a significant role in intraspecific expression evolution. Interestingly, we found that the proportion of genes with diverged expression between BY and RM is larger for MIM genes than for SIM genes; in fact, the proportion tends to increase with the number of transcription factors that regulate the gene. Moreover, MIM genes are, on average, subject to stronger trans effects than SIM genes, though the difference between the two types of genes is not conspicuous.
Pax genes are defined by the presence of a paired box that encodes a DNA-binding domain of 128 amino acids. They are involved in the development of the central nervous system, organogenesis, and oncogenesis. The known Pax genes are divided into five groups within two supergroups. By means of a novel combination of evolutionary analysis, in vitro binding assays and in vivo functional analyses, we have identified the key residues that determine the differing DNA-binding properties of the two supergroups and of the Pax-2, 5, 8 and Pax-6 subgroups within supergroup I. The differences in binding properties between the two supergroups are largely caused by amino acid changes at residues 20 and 121 of the paired domain. Although the paired domains of the Pax-2, 5, 8 and the Pax-6 group differ by >19 amino acids, their distinct DNA-binding properties are determined almost completely by a single amino acid change. Thus, a small number of amino acid changes can account in large part for the divergence in binding properties among the known paired domains. Our approach for selecting candidate sites responsible for the functional divergence between genes should also be useful for studying other gene families.
Papers by Wen-hsiung Li