Ferroptosis is a form of regulated cell death driven by lipid peroxidation of polyunsaturated fatty acids (PUFAs). Lipid peroxidation can propagate through either a hydrogen-atom transfer (HAT) or a peroxyl radical addition (PRA) mechanism. However, the contribution of the PRA mechanism to the induction of ferroptosis has not been studied. In this study, we aim to elucidate the relationship between the reactivity and mechanisms of lipid peroxidation and ferroptosis induction. We found that while some peroxidation-reactive lipids, such as 7-dehydrocholesterol, vitamins D3 and A, and coenzyme Q10, suppress ferroptosis, both nonconjugated and conjugated PUFAs enhanced cell death induced by RSL3, a ferroptosis inducer. Importantly, we found that conjugated PUFAs, including conjugated linolenic acid (CLA 18:3) and conjugated linoleic acid (CLA 18:2), can induce or potentiate ferroptosis much more potently than nonconjugated PUFAs. We next sought to elucidate the mechanism underlying the different ferroptosis-inducing potency of conjugated and nonconjugated PUFAs. Lipidomics revealed that conjugated and nonconjugated PUFAs are incorporated into distinct cellular lipid species. The different peroxidation mechanisms predict the formation of higher levels of reactive electrophilic aldehydes from conjugated PUFAs than from nonconjugated PUFAs, which was confirmed by aldehyde trapping and mass spectrometry. RNA sequencing revealed that protein processing in the endoplasmic reticulum and the proteasome are among the most significantly upregulated pathways in cells treated with CLA 18:3, suggesting increased ER stress and activation of the unfolded protein response. Significantly, using click chemistry, we observed increased protein adduction by oxidized lipids in cells treated with an alkynylated CLA 18:2 probe. These results suggest that protein damage by lipid electrophiles is a key step in ferroptosis.
This paper aims to establish a link between aggregate organizational resilience capabilities and managerial risk perception aspects during a major global crisis. We argue that a multi-theory perspective, combining dynamic capability at an organizational level and enactment theory at a managerial level, allows us to better understand how the sensemaking process within managerial risk perception assists organizational resilience. We draw from in-depth interviews with 40 managers across the UK's food industry, which has been able to display resilience during the pandemic. In sensing supply chain risks (SCRs), managers within both authority-based and consensus-based organizational structures utilize risk-capture heuristics and enact actions related to effective communications, albeit at different information costs. In seizing, we found that managers adhere to distinct heuristics that are idiosyncratic to their organizational structures. Through limited horizontal communication channels, authority-based structures adhere to rudimentary how-to heuristics, whereas consensus-based structures use obtainable how-to heuristics. We contribute to the organizational resilience and dynamic capabilities literature by identifying assessment as an additional step prior to transforming, which depicts a retention process to inform future judgements. Our study presents a novel framework of organizational resilience to SCRs in equivocal environments by providing a nuanced understanding of the construction of dynamic capabilities through sensemaking.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this study, we investigate robustness against covariate drift in spoken language understanding (SLU). Covariate drift can occur in SLU when there is a drift between training and testing regarding what users request or how they request it. To study this, we propose a method that exploits natural variations in data to create a covariate drift in SLU datasets. Experiments show that a state-of-the-art BERT-based model suffers performance loss under this drift. To mitigate the performance loss, we investigate distributionally robust optimization (DRO) for fine-tuning BERT-based models. We discuss some recent DRO methods, propose two new variants, and empirically show that DRO improves robustness under drift.
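To make the DRO idea above concrete, here is a minimal sketch of one common formulation, group DRO, in which the objective is the worst per-group mean loss rather than the overall mean. This is an illustrative assumption, not necessarily any of the variants studied in the paper; the function name, group names, and loss values are hypothetical.

```python
from collections import defaultdict


def worst_group_loss(losses, groups):
    """Return (group, mean loss) for the group with the highest mean loss.

    Group DRO minimizes this worst-case value instead of the plain average.
    """
    per_group = defaultdict(list)
    for loss, group in zip(losses, groups):
        per_group[group].append(loss)
    means = {g: sum(v) / len(v) for g, v in per_group.items()}
    worst = max(means, key=means.get)
    return worst, means[worst]


# Hypothetical per-example losses, tagged by how frequent the request type is.
losses = [0.2, 0.4, 1.0, 1.4]
groups = ["frequent", "frequent", "rare", "rare"]

average_loss = sum(losses) / len(losses)
worst_group, worst_loss = worst_group_loss(losses, groups)
print(f"average loss: {average_loss:.2f}")
print(f"worst group: {worst_group}, mean loss: {worst_loss:.2f}")
```

The plain average hides how poorly the rare group is served, whereas the worst-group value exposes it, which is why optimizing the latter can be expected to help under drift toward under-represented requests.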
With the expanding role of voice-controlled devices, bootstrapping spoken language understanding models from little labeled data becomes essential. Semi-supervised learning is a common technique to improve model performance when labeled data is scarce. In a real-world production system, the labeled data and the online test data often come from different distributions. In this work, we use semi-supervised learning based on pseudo-labeling with an auxiliary task on incoming unlabeled noisy data, which is closer to the test distribution. We demonstrate empirically that our approach can mitigate the negative effects of training with non-representative labeled data as well as the negative impact of noise in the data, which is introduced by pseudo-labeling and automatic speech recognition.
An entry from the Cambridge Structural Database, the world's repository for small molecule crystal structures. The entry contains experimental data from a crystal diffraction study. The deposited dataset for this entry is freely available from the CCDC and typically includes 3D coordinates, cell parameters, space group, experimental conditions and quality measures.
International Journal of Production Economics, 2021
The management of seafood processing by-products (SPBPs) is an interesting but underexplored topic in the circular economy (CE) research stream. The extant CE literature is mainly devoted to the topic's theoretical aspects and largely neglects the linkages between theory and practice, particularly in developing countries. This paper aims to empirically investigate CE implementation and its associated drivers and barriers in the context of SPBP management in a developing country. A multiple-case design is used on a sample of five firms that engage in SPBP treatment in Vietnam. We find evidence of circular practices in SPBP management that aim at cascading use and higher value creation. We also delineate eight drivers and 14 barriers rooted in four clusters: regulatory, sociocognitive, economic and supply chain, and technological factors. In addition to generic factors, we identify three exclusive drivers and five unique barriers specific to our cases. The findings are then interpreted through the lens of extended institutional theory to derive a holistic framework that captures the dynamic influences of various factors on CE diffusion. Our framework includes two add-ons: institutional logic and uncertainty. 'Legitimacy-embedded efficiency' is established as a shared logic of CE, whereby economic growth is achieved in harmony with environmental protection via the optimal use of resources. Uncertainty moderates the relative influences of legitimacy- and efficiency-related factors on CE diffusion. Our practical contribution is to offer an actionable guide for key stakeholders of the SPBP supply chain, including local authorities, in the transition from low-efficiency practices to novel circular ones.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
This paper addresses the question of to what degree a BERT-based multilingual Spoken Language Understanding (SLU) model can transfer knowledge across languages. Through experiments, we show that, although the model performs well even on distant language groups, there is still a gap to the ideal multilingual performance. In addition, we propose a novel BERT-based adversarial model architecture to learn language-shared and language-specific representations for multilingual SLU. Our experimental results show that the proposed model is capable of narrowing the gap to the ideal multilingual performance.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021
In Natural Language Understanding (NLU), to facilitate Cross-Lingual Transfer Learning (CLTL), especially CLTL between distant languages, we integrate CLTL with Machine Translation (MT), and thereby propose a novel CLTL model named Translation Aided Language Learner (TALL). TALL is constructed as a standard transformer, where the encoder is a pre-trained multilingual language model. The training of TALL includes an MT-oriented pre-training and an NLU-oriented fine-tuning. To make use of unannotated data, we implement the recently proposed Unsupervised Machine Translation (UMT) technique in the MT-oriented pre-training of TALL. The experimental results show that the application of UMT enables TALL to consistently achieve better CLTL performance than our baseline model, which is the pre-trained multilingual language model serving as the encoder of TALL, without using more annotated data, and the performance gain is relatively prominent in the case of distant languages.
International Journal of Production Economics, 2021
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
This paper provides the first experimental study on the impact of using domain-specific representations on a BERT-based multi-task spoken language understanding (SLU) model for multi-domain applications. Our results on a real-world dataset covering three languages indicate that by using domain-specific representations learned adversarially, model performance can be improved across all three SLU subtasks: domain classification, intent classification and slot filling. Gains are particularly large for domains with limited training data.
In deployed real-world spoken language understanding (SLU) applications, data continuously flows into the system. This leads to distributional differences between training and application data that can deteriorate model performance. While regularly retraining the deployed model with new data helps mitigate this problem, it implies significant computational and human costs. In this paper, we develop a method that can help guide decisions on whether a model is safe to keep in production without notable performance loss or needs to be retrained. Towards this goal, we build a performance drop regression model for an SLU model that was trained offline to detect a potential model drift in the production phase. We present a wide range of experiments on multiple real-world datasets, indicating that our method is useful for guiding decisions in the SLU model development cycle and for reducing the cost of model retraining.
Although data imbalance is becoming more and more common in real-world Spoken Language Understanding (SLU) applications, it has not been studied extensively in the literature. To the best of our knowledge, this paper presents the first systematic study on handling data imbalance for SLU. In particular, we discuss the application of existing data balancing techniques for SLU and propose a multi-task SLU model for intent classification and slot filling. To avoid over-fitting, our model leverages data balancing methods indirectly via an auxiliary task which makes use of a class-balanced batch generator and (possibly) synthetic data. Our results on a real-world dataset indicate that i) our proposed model can boost performance on low-frequency intents significantly while avoiding a potential performance decrease on the head intents, ii) synthetic data are beneficial for bootstrapping new intents when realistic data are not available, but iii) once a certain amount of realistic data becomes available, using synthetic data in the auxiliary task only yields better performance than adding them to the primary task training data, and iv) in a joint training scenario, balancing the intent distribution individually improves not only intent classification but also slot filling performance.
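As a rough illustration of what a class-balanced batch generator can look like, the sketch below first draws an intent label uniformly at random and then draws an example of that intent, so rare intents appear about as often as frequent ones. This is a minimal sketch under that sampling assumption, not the paper's implementation; the function name, intent names, and utterances are hypothetical.

```python
import random
from collections import defaultdict


def balanced_batches(examples, labels, batch_size, seed=0):
    """Yield batches in which every class is equally likely per slot."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example, label in zip(examples, labels):
        by_label[label].append(example)
    classes = sorted(by_label)
    while True:
        batch = []
        for _ in range(batch_size):
            label = rng.choice(classes)              # uniform over classes
            example = rng.choice(by_label[label])    # uniform within the class
            batch.append((example, label))
        yield batch


# Hypothetical imbalanced data: 95 head-intent and 5 tail-intent utterances.
examples = [f"head utterance {i}" for i in range(95)]
examples += [f"tail utterance {i}" for i in range(5)]
labels = ["play_music"] * 95 + ["set_alarm"] * 5

generator = balanced_batches(examples, labels, batch_size=32)
batch = next(generator)
tail_count = sum(1 for _, label in batch if label == "set_alarm")
print(f"tail-intent examples in one batch of {len(batch)}: {tail_count}")
```

Because tail examples are resampled many times, a generator like this can encourage over-fitting if it feeds the primary task directly, which is consistent with the abstract's choice to apply balancing only through an auxiliary task.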
High-throughput, quantitative approaches have enabled the discovery of fundamental principles describing bacterial physiology. These principles provide a foundation for predicting the behavior of biological systems, a widely held aspiration. However, these approaches are often exclusively applied to the best-known model organism, E. coli. In this report, we investigate to what extent quantitative principles discovered in Gram-negative E. coli are applicable to Gram-positive B. subtilis. We found that these two extremely divergent bacterial species employ deeply similar strategies in order to coordinate growth, cell size, and the cell cycle. These similarities mean that the quantitative physiological principles described here can likely provide a beachhead for others who wish to understand additional, less-studied prokaryotes.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
A typical cross-lingual transfer learning approach for boosting model performance on a resource-poor language is to pre-train the model on all available supervised data from another resource-rich language. However, in large-scale systems, this leads to high training times and computational requirements. In addition, characteristic differences between the source and target languages raise a natural question of whether source-language data selection can improve the knowledge transfer. In this paper, we address this question and propose a simple but effective language-model-based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. The experimental results show that with data selection i) the amount of source data, and hence training time, is reduced significantly and ii) model performance is improved.
Bacillus subtilis and Escherichia coli are evolutionarily divergent model organisms that have elucidated fundamental differences between Gram-positive and Gram-negative bacteria, respectively. Despite their differences in cell cycle control at the molecular level, both organisms follow the same phenomenological principle for cell size homeostasis known as the adder. We thus asked to what extent B. subtilis and E. coli share common physiological principles in coordinating growth and the cell cycle. To answer this question, we measured physiological parameters of B. subtilis under various steady-state growth conditions with and without translation inhibition at both population and single-cell level. These experiments revealed core shared physiological principles between B. subtilis and E. coli. Specifically, we show that both organisms maintain an invariant cell size per replication origin at initiation, with and without growth inhibition, and even during nutrient shifts at the single-cell level....
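The adder principle mentioned above can be illustrated with a deliberately idealized, noise-free sketch: a cell adds a fixed size between birth and division, then divides symmetrically. The function name, units, and numbers are hypothetical and chosen only for illustration; real cells show noise in both the added size and the division ratio.

```python
def simulate_adder(birth_size, added_size, generations):
    """Track birth size over generations under an idealized adder.

    Each cell grows by a constant `added_size` and then splits in half,
    so the next birth size is (birth_size + added_size) / 2.
    """
    sizes = [birth_size]
    for _ in range(generations):
        division_size = sizes[-1] + added_size
        sizes.append(division_size / 2)
    return sizes


# Start from an abnormally large cell (3.0) with an added size of 1.0.
sizes = simulate_adder(birth_size=3.0, added_size=1.0, generations=20)
for generation in (0, 1, 2, 5, 20):
    print(f"generation {generation}: birth size {sizes[generation]:.4f}")
```

In this idealized setting the deviation from the steady-state birth size (equal to the added size) halves every generation, which is the sense in which the adder produces size homeostasis without the cell sensing its absolute size.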
A series of boron-functionalized BODIPY dyes with cyano groups were prepared from their corresponding BF2 derivatives using SnCl4/TMSCN at room temperature for 10 min. Replacement of the fluorines by cyano groups reduces the B–N bond lengths, decreases the charge on boron, and causes characteristic [Formula: see text]B NMR chemical shifts. The 4,4[Formula: see text]-dicyano-BODIPYs show significantly enhanced stability to acidic conditions (excess TFA) and, with one exception, enhanced fluorescence quantum yields. Furthermore, the B(CN)2-BODIPYs were non-cytotoxic to HEp2 cells, both in the dark and upon exposure to light (1.5 J/cm[Formula: see text]), and rapidly accumulated within cells, localizing mainly in the lysosomes, ER and Golgi.
Ferroptosis is a form of regulated cell death driven by lipid peroxidation of polyunsaturated fat... more Ferroptosis is a form of regulated cell death driven by lipid peroxidation of polyunsaturated fatty acids (PUFAs). Lipid peroxidation can propagate through either hydrogen-atom transfer (HAT) or peroxyl radical addition (PRA) mechanism. However, the contribution of the PRA mechanism to the induction of ferroptosis has not been studied. In this study, we aim to elucidate the relationship between the reactivity and mechanisms of lipid peroxidation and ferroptosis induction. We found that while some peroxidation-reactive lipids, such as 7-dehydrocholesterol, vitamins D3 and A, and coenzyme Q10, suppress ferroptosis, both nonconjugated and conjugated PUFAs enhanced cell death induced by RSL3, a ferroptosis inducer. Importantly, we found that conjugated polyunsaturated fatty acids (PUFAs), including conjugated linolenic acid (CLA 18:3) and conjugated linoleic acid (CLA 18:2) can induce or potentiate ferroptosis much more potently than nonconjugated PUFAs. We next sought to elucidate the mechanism underlying the different ferroptosis-inducing potency of conjugated and nonconjugated PUFAs. Lipidomics revealed that conjugated and nonconjugated PUFAs are incorporated into distinct cellular lipid species. The different peroxidation mechanisms predict the formation of higher levels of reactive electrophilic aldehydes from conjugated PUFAs than nonconjugated PUFAs, which was confirmed by aldehyde-trapping and mass spectrometry. RNA sequencing revealed that protein processing in the endoplasmic reticulum and proteasome are among the most significantly upregulated pathways in cells treated with CLA 18:3, suggesting increased ER stress and activation of unfolded protein response. Significantly, using click chemistry, we observed increased protein adduction by oxidized lipids in cells treated with an alkynylated CLA 18:2 probe. 
These results suggest that protein damage by lipid electrophiles is a key step in ferroptosis.
This paper aims to establish a link between aggregate organizational resilience capabilities and ... more This paper aims to establish a link between aggregate organizational resilience capabilities and managerial risk perception aspects during a major global crisis. We argue that a multi-theory perspective, dynamic capability at an organizational level and enactment theory at a managerial level allow us to better understand how the sensemaking process within managerial risk perception assists organizational resilience. We draw from in-depth interviews with 40 managers across the UK's food industry, which has been able to display resilience during the pandemic. In sensing supply chain risks (SCRs), managers within both authority-based and consensus-based organizational structures utilize risk-capture heuristics and enact actions related to effective communications, albeit at different information costs. In seizing, we found that managers adhere to distinct heuristics that are idiosyncratic to their organizational structures. Through limited horizontal communication channels, authority-based structures adhere to rudimentary how-to heuristics, whereas consensus-based structures use obtainable how-to heuristics. We contribute to the organizational resilience and dynamic capabilities literature by identifying assessment as an additional step prior to transforming, which depicts a retention process to inform future judgements. Our study presents a novel framework of organizational resilience to SCRs during equivocal environments, by providing a nuanced understanding of the construction of dynamic capabilities through sensemaking.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this study, we investigate robustness against covariate drift in spoken language understanding... more In this study, we investigate robustness against covariate drift in spoken language understanding (SLU). Covariate drift can occur in SLU when there is a drift between training and testing regarding what users request or how they request it. To study this we propose a method that exploits natural variations in data to create a covariate drift in SLU datasets. Experiments show that a state-of-the-art BERT-based model suffers performance loss under this drift. To mitigate the performance loss, we investigate distributionally robust optimization (DRO) for finetuning BERT-based models. We discuss some recent DRO methods, propose two new variants and empirically show that DRO improves robustness under drift.
With the expanding role of voice-controlled devices, bootstrapping spoken language understanding ... more With the expanding role of voice-controlled devices, bootstrapping spoken language understanding models from little labeled data becomes essential. Semi-supervised learning is a common technique to improve model performance when labeled data is scarce. In a real-world production system, the labeled data and the online test data often may come from different distributions. In this work, we use semi-supervised learning based on pseudolabeling with an auxiliary task on incoming unlabeled noisy data, which is closer to the test distribution. We demonstrate empirically that our approach can mitigate negative effects arising from training with non-representative labeled data as well as the negative impacts of noises in the data, which are introduced by pseudo-labeling and automatic speech recognition.
An entry from the Cambridge Structural Database, the world's repository for small molecule cr... more An entry from the Cambridge Structural Database, the world's repository for small molecule crystal structures. The entry contains experimental data from a crystal diffraction study. The deposited dataset for this entry is freely available from the CCDC and typically includes 3D coordinates, cell parameters, space group, experimental conditions and quality measures.
An entry from the Cambridge Structural Database, the world's repository for small molecule cr... more An entry from the Cambridge Structural Database, the world's repository for small molecule crystal structures. The entry contains experimental data from a crystal diffraction study. The deposited dataset for this entry is freely available from the CCDC and typically includes 3D coordinates, cell parameters, space group, experimental conditions and quality measures.
An entry from the Cambridge Structural Database, the world's repository for small molecule cr... more An entry from the Cambridge Structural Database, the world's repository for small molecule crystal structures. The entry contains experimental data from a crystal diffraction study. The deposited dataset for this entry is freely available from the CCDC and typically includes 3D coordinates, cell parameters, space group, experimental conditions and quality measures.
An entry from the Cambridge Structural Database, the world's repository for small molecule cr... more An entry from the Cambridge Structural Database, the world's repository for small molecule crystal structures. The entry contains experimental data from a crystal diffraction study. The deposited dataset for this entry is freely available from the CCDC and typically includes 3D coordinates, cell parameters, space group, experimental conditions and quality measures.
International Journal of Production Economics, 2021
The management of seafood processing by-products (SPBPs) is an interesting but underexplored topi... more The management of seafood processing by-products (SPBPs) is an interesting but underexplored topic in the circular economy (CE) research stream. The extant CE literature is mainly devoted to the topic's theoretical aspects and largely neglects the linkages between theory and practice, particularly in developing countries. This paper aims to empirically investigate CE implementation and its associated drivers and barriers in the context of SPBP management in a developing country. A multiple-case design is used on a sample of five firms that engage in SPBP treatment in Vietnam. We find evidence of circular practices in SPBP management that aim at cascading use and higher value creation. We also delineate eight drivers and 14 barriers rooted in four clusters: regulatory, sociocognitive, economic and supply chain, and technological factors. In addition to generic factors, we identify three exclusive drivers and five unique barriers specific to our cases. The findings are then interpreted through the lens of extended institutional theory to derive a holistic framework that captures the dynamic influences of various factors on CE diffusion. Our framework includes two addons: institutional logic and uncertainty. 'Legitimacy-embedded efficiency' is established as a shared logic of CE, whereby economic growth is achieved in harmony with environmental protection via the optimal use of resources. Uncertainty moderates the relative influences of legitimacy and efficiencyrelated factors on CE diffusion. Our practical contribution is to offer an actionable guide for key stakeholders of the SPBP supply chain, including local authorities in the transition from lowefficiency practices to novel circular ones.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
This paper addresses the question as to what degree a BERT-based multilingual Spoken Language Und... more This paper addresses the question as to what degree a BERT-based multilingual Spoken Language Understanding (SLU) model can transfer knowledge across languages. Through experiments we will show that, although it works substantially well even on distant language groups, there is still a gap to the ideal multilingual performance. In addition, we propose a novel BERT-based adversarial model architecture to learn language-shared and language-specific representations for multilingual SLU. Our experimental results prove that the proposed model is capable of narrowing the gap to the ideal multilingual performance.
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021
In Natural Language Understanding (NLU), to facilitate Cross-Lingual Transfer Learning (CLTL), es... more In Natural Language Understanding (NLU), to facilitate Cross-Lingual Transfer Learning (CLTL), especially CLTL between distant languages, we integrate CLTL with Machine Translation (MT), and thereby propose a novel CLTL model named Translation Aided Language Learner (TALL). TALL is constructed as a standard transformer, where the encoder is a pre-trained multilingual language model. The training of TALL includes an MT-oriented pre-training and an NLU-oriented fine-tuning. To make use of unannotated data, we implement the recently proposed Unsupervised Machine Translation (UMT) technique in the MToriented pre-training of TALL. The experimental results show that the application of UMT enables TALL to consistently achieve better CLTL performance than our baseline model, which is the pre-trained multilingual language model serving as the encoder of TALL, without using more annotated data, and the performance gain is relatively prominent in the case of distant languages.
International Journal of Production Economics, 2021
This is a PDF file of an article that has undergone enhancements after acceptance, such as the ad... more This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
This paper provides the first experimental study on the impact of using domain-specific represent... more This paper provides the first experimental study on the impact of using domain-specific representations on a BERT-based multi-task spoken language understanding (SLU) model for multi-domain applications. Our results on a real-world dataset covering three languages indicate that by using domain-specific representations learned adversarially, model performance can be improved across all of the three SLU subtasks domain classification, intent classification and slot filling. Gains are particularly large for domains with limited training data.
In deployed real-world spoken language understanding (SLU) applications, data continuously flows ... more In deployed real-world spoken language understanding (SLU) applications, data continuously flows into the system. This leads to distributional differences between training and application data that can deteriorate model performance. While regularly retraining the deployed model with new data helps mitigating this problem, it implies significant computational and human costs. In this paper, we develop a method, which can help guiding decisions on whether a model is safe to keep in production without notable performance loss or needs to be retrained. Towards this goal, we build a performance drop regression model for an SLU model that was trained offline to detect a potential model drift in the production phase. We present a wide range of experiments on multiple real-world datasets, indicating that our method is useful for guiding decisions in the SLU model development cycle and to reduce costs for model retraining.
Although data imbalance is becoming more and more common in real-world Spoken Language Understanding (SLU) applications, it has not been studied extensively in the literature. To the best of our knowledge, this paper presents the first systematic study on handling data imbalance for SLU. In particular, we discuss the application of existing data balancing techniques for SLU and propose a multi-task SLU model for intent classification and slot filling. Aiming to avoid over-fitting, in our model, methods for data balancing are leveraged indirectly via an auxiliary task which makes use of a class-balanced batch generator and (possibly) synthetic data. Our results on a real-world dataset indicate that i) our proposed model can boost performance on low frequency intents significantly while avoiding a potential performance decrease on the head intents, ii) synthetic data are beneficial for bootstrapping new intents when realistic data are not available, but iii) once a certain amount of realistic data becomes available, using synthetic data in the auxiliary task only yields better performance than adding them to the primary task training data, and iv) in a joint training scenario, balancing the intent distribution individually improves not only intent classification but also slot filling performance.
High-throughput, quantitative approaches have enabled the discovery of fundamental principles describing bacterial physiology. These principles provide a foundation for predicting the behavior of biological systems, a widely held aspiration. However, these approaches are often exclusively applied to the best-known model organism, E. coli. In this report, we investigate to what extent quantitative principles discovered in Gram-negative E. coli are applicable to Gram-positive B. subtilis. We found that these two extremely divergent bacterial species employ deeply similar strategies in order to coordinate growth, cell size, and the cell cycle. These similarities mean that the quantitative physiological principles described here can likely provide a beachhead for others who wish to understand additional, less-studied prokaryotes.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
A typical cross-lingual transfer learning approach for boosting model performance on a resource-poor language is to pre-train the model on all available supervised data from another resource-rich language. However, in large-scale systems, this leads to high training times and computational requirements. In addition, characteristic differences between the source and target languages raise a natural question of whether source-language data selection can improve the knowledge transfer. In this paper, we address this question and propose a simple but effective language-model-based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. The experimental results show that with data selection i) the source data amount and hence training time is reduced significantly and ii) model performance is improved.
Bacillus subtilis and Escherichia coli are evolutionarily divergent model organisms that have elucidated fundamental differences between Gram-positive and Gram-negative bacteria, respectively. Despite their differences in cell cycle control at the molecular level, both organisms follow the same phenomenological principle for cell size homeostasis known as the adder. We thus asked to what extent B. subtilis and E. coli share common physiological principles in coordinating growth and the cell cycle. To answer this question, we measured physiological parameters of B. subtilis under various steady-state growth conditions with and without translation inhibition at both population and single-cell level. These experiments revealed core shared physiological principles between B. subtilis and E. coli. Specifically, we show that both organisms maintain an invariant cell size per replication origin at initiation, with and without growth inhibition, and even during nutrient shifts at the single-cell level....
A series of boron-functionalized BODIPY dyes with cyano groups were prepared from their corresponding BF2 derivatives using SnCl4/TMSCN at room temperature for 10 min. Replacement of the fluorines by cyano groups reduces the B–N bond lengths, decreases the charge on boron, and causes characteristic 11B NMR chemical shifts. The 4,4′-dicyano-BODIPYs show significantly enhanced stability to acidic conditions (excess TFA) and, with one exception, enhanced fluorescence quantum yields. Furthermore, the B(CN)2-BODIPYs were non-cytotoxic to HEp2 cells, both in the dark and upon exposure to light (1.5 J/cm²), and rapidly accumulated within cells, localizing mainly in the lysosomes, ER and Golgi.
Papers by Quynh Do