Variability in red blood cell volumes (red cell distribution width, RDW) increases with age and is strongly predictive of mortality, incident coronary heart disease, and cancer. We investigated inherited genetic variation associated with RDW in 116,666 UK Biobank volunteers. A large proportion of RDW is explained by genetic variants (29%), especially in the older group (33.8% in those aged 60+ versus 28.4% in those under 50). RDW was associated with 194 independent genetic signals; 71 are known for conditions including autoimmune disease, certain cancers, BMI, Alzheimer's disease, longevity, age at menopause, bone density, myositis, Parkinson's disease, and age-related macular degeneration. Pathway analysis showed enrichment for telomere maintenance, ribosomal RNA, and apoptosis. The majority of RDW-associated signals were intronic (119 of 194), including SNP rs6602909, located in an intron of the oncogene GAS6; this SNP is also an eQTL for GAS6 in whole blood. RDW-associated exonic genetic signals in...
Genomes encompass all the information necessary to specify the development and function of an organism. In addition to genes, genomes also contain a myriad of functional elements that control various steps in gene expression. A major class of these elements functions only when transcribed into RNA, serving as binding sites for RNA-binding proteins (RBPs), which control post-transcriptional processes including splicing, cleavage and polyadenylation, RNA editing, RNA localization, stability, and translation. Despite the importance of these functional RNA elements encoded in the genome, they have been much less studied than genes and DNA elements. Here, we describe the mapping and characterization of RNA elements recognized by a large collection of human RBPs in K562 and HepG2 cells. These data expand the catalog of functional elements encoded in the human genome by adding a large set of elements that function at the RNA level through interaction with RBPs.
Epidemiologic data have linked obesity to a higher risk of pancreatic cancer, but the underlying mechanisms are poorly understood. To allow detailed mechanistic studies in a relevant model mimicking diet-induced obesity and pancreatic cancer, a high-fat, high-calorie diet (HFCD) was given to P48+/Cre;LSL-KRASG12D (KC) mice carrying a pancreas-specific oncogenic Kras mutation. The mice were randomly allocated to the HFCD or a control diet (CD). Cohorts were sacrificed at 3, 6, and 9 months, and tissues were harvested for further analysis. Compared to CD-fed mice, HFCD-fed animals gained significantly more weight. Importantly, cancer incidence was markedly increased in HFCD-fed KC mice, particularly in males. In addition, KC mice fed the HFCD showed more extensive inflammation and fibrosis, and more advanced PanIN lesions in the pancreas, compared to age-matched CD-fed animals. Interestingly, we found that the HFCD reduced autophagic flux in PanIN lesions in KC mice. Furthe...
Substantial evidence suggests that the phasic activities of dopamine neurons in the midbrain of primates report prediction errors in the delivery of reward. Recent recordings from these neurons in a task involving uncertain reward delivery present a crucial challenge to this hypothesis, and pose questions regarding the effects of both external and representational uncertainty on dopamine activity. Here, we analyse this issue, showing that the apparently anomalous activities are in fact what is expected under a standard prediction error account in light of different scalings of positive and negative prediction errors. We also study the implications of certain forms of representational noise on temporal difference learning.
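The asymmetric-scaling account can be illustrated with a minimal temporal-difference sketch. This is not the paper's model: the reward probability, learning rates, and function names below are illustrative assumptions. The point it shows is that when positive prediction errors are weighted more heavily than negative ones, the learned value no longer settles at the true mean reward, so the average recorded error need not vanish even at steady state.

```python
import random

# Minimal TD(0) sketch with asymmetric scaling of prediction errors.
# All parameter values are illustrative assumptions, not the paper's.
def asymmetric_td(p_reward=0.5, alpha_pos=0.1, alpha_neg=0.02,
                  n_trials=5000, seed=0):
    rng = random.Random(seed)
    v = 0.0          # learned value of a reward-predicting cue
    deltas = []
    for _ in range(n_trials):
        r = 1.0 if rng.random() < p_reward else 0.0
        delta = r - v                      # prediction error at reward time
        # asymmetry: positive errors update faster than negative ones
        v += (alpha_pos if delta > 0 else alpha_neg) * delta
        deltas.append(delta)
    return v, sum(deltas[-1000:]) / 1000.0

v, mean_delta = asymmetric_td()
# v settles near alpha_pos / (alpha_pos + alpha_neg), above the true
# mean reward of 0.5, so the average error at steady state is negative.
```

Setting the steady-state condition (expected update = 0) gives v = alpha_pos / (alpha_pos + alpha_neg), which depends only on the ratio of the two learning rates, not on the reward probability's contribution to the mean alone.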
Multi-armed bandits may be viewed as decompositionally-structured Markov decision processes (MDPs) with potentially very large state sets. A particularly elegant methodology for computing optimal policies was developed over twenty years ago by Gittins [Gittins & Jones, 1974]. Gittins' approach reduces the problem of finding optimal policies for the original MDP to a sequence of low-dimensional stopping problems whose solutions determine the optimal policy through the so-called "Gittins indices." Katehakis and Veinott [Katehakis & Veinott, 1987] have shown that the Gittins index for a process in state i may be interpreted as a particular component of the maximum-value function associated with the "restart-in-i" process, a simple MDP to which standard solution methods for computing optimal policies, such as successive approximation, apply. This paper explores the problem of learning the Gittins indices on-line without the aid of a process model; it suggests utilizing process-state-specific Q-learning agents to solve their respective restart-in-state-i subproblems, and includes an example in which the on-line reinforcement learning approach is applied to a problem of stochastic scheduling, one instance drawn from a wide class of problems that may be formulated as bandit problems.
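A minimal sketch of the restart-in-i idea follows. The chain, rewards, discount, and the (1 - gamma) index normalization are all illustrative assumptions, not the paper's experiment: for a fixed state i, Q-learning is run on a two-action problem whose actions are "continue" (follow the original chain) and "restart" (behave as if the chain were re-entered at state i).

```python
import random

# Hedged sketch of learning a Gittins-style index via the "restart-in-i"
# construction. The chain, rewards, discount, and the (1 - gamma)
# normalization of the index are illustrative assumptions.
P = [[0.7, 0.3], [0.4, 0.6]]   # transition matrix of a 2-state chain
R = [1.0, 0.2]                  # reward collected in each state
GAMMA = 0.9

def step(state, rng):
    return 0 if rng.random() < P[state][0] else 1

def restart_in_i_q(i, steps=20000, alpha=0.05, seed=1):
    rng = random.Random(seed)
    # Q[s][a]: a = 0 -> continue from s; a = 1 -> restart (act as if in i)
    Q = [[0.0, 0.0] for _ in P]
    s = i
    for _ in range(steps):
        a = rng.randrange(2)             # explore uniformly
        src = s if a == 0 else i         # restarting behaves like state i
        r, s2 = R[src], step(src, rng)
        Q[s][a] += alpha * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2
    # one common normalization: index ~ (1 - gamma) * restart value at i
    return (1 - GAMMA) * max(Q[i])

idx = restart_in_i_q(0)
```

A sanity check under these assumptions: for this chain, restarting at state 0 is never worse than continuing, so V(0) = R[0] / (1 - GAMMA) and the learned index for state 0 approaches R[0] = 1.0.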
Advances in Neural Information Processing Systems, 1994
We describe the relationship between certain reinforcement learning (RL) methods based on dynamic programming (DP) and a class of unorthodox Monte Carlo methods for solving systems of linear equations proposed in the 1950's. These methods recast the solution of the ...
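The 1950s-era Monte Carlo idea alluded to here can be sketched as follows (the matrix, vector, and stopping probability are arbitrary illustrative choices): one component of the solution of x = Hx + b is estimated by averaging importance-weighted random walks, which is valid when the spectral radius of H is below 1.

```python
import random

# Hedged sketch of a 1950s-style Monte Carlo linear solver
# (Ulam / von Neumann flavor). H, b, and the stopping probability
# are made-up illustrative values; rho(H) < 1 here.
H = [[0.2, 0.3], [0.1, 0.4]]
b = [1.0, 2.0]
N = len(b)

def mc_solve_component(i, n_walks=200000, seed=2):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        s, w, acc = i, 1.0, 0.0
        while True:
            acc += w * b[s]                 # contribution of (H^k b)_i at step k
            if rng.random() < 0.5:          # stop with probability 1/2
                break
            nxt = rng.randrange(N)          # uniform proposal for next state
            # importance weight: H[s][nxt] / (P(continue) * P(next state))
            w *= H[s][nxt] / (0.5 * (1.0 / N))
            s = nxt
        total += acc
    return total / n_walks

est = mc_solve_component(0)   # estimates ((I - H)^{-1} b)[0] = 1.2 / 0.45
```

Each walk unrolls the Neumann series x = b + Hb + H^2 b + ..., which is the same fixed-point expansion that successive approximation (and, by analogy, DP-based value iteration) computes deterministically.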
International Conference on Machine Learning, 2003
Given a Markov decision process (MDP) with expressed prior uncertainties in the process transition probabilities, we consider the problem of computing a policy that optimizes expected total (finite-horizon) reward. Implicitly, such a policy would effectively resolve the "exploration-versus-exploitation tradeoff" faced, for example, by an agent that seeks to optimize total reinforcement obtained over the entire duration of its interaction with an uncertain world. A Bayesian formulation leads to an associated MDP defined over a set of generalized process "hyperstates" whose cardinality grows exponentially with the planning horizon. Here we retain the full Bayesian framework, but sidestep intractability by applying techniques from reinforcement learning theory. We apply our resulting actor-critic algorithm to a problem of "optimal probing," in which the task is to identify unknown transition probabilities of an MDP using online experience.
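The hyperstate blow-up can be made concrete with a small enumeration (the 2-state, 2-action setup is an arbitrary illustration): a hyperstate pairs the physical state with the Dirichlet transition counts that summarize the posterior, and the number of distinct reachable hyperstates grows rapidly with the horizon.

```python
from itertools import product

# Why exact Bayesian planning blows up: a "hyperstate" pairs the physical
# state with the Dirichlet transition counts summarizing the posterior.
# The 2-state, 2-action setup below is an arbitrary illustration.
N_STATES, N_ACTIONS = 2, 2

def reachable_hyperstates(horizon):
    # counts: flat tuple of transition counts indexed by (s, a, s')
    init = (0,) * (N_STATES * N_ACTIONS * N_STATES)
    frontier = {(0, init)}                 # start in state 0, no data seen
    sizes = []
    for _ in range(horizon):
        nxt = set()
        for s, counts in frontier:
            for a, s2 in product(range(N_ACTIONS), range(N_STATES)):
                c = list(counts)
                c[(s * N_ACTIONS + a) * N_STATES + s2] += 1
                nxt.add((s2, tuple(c)))
        frontier = nxt
        sizes.append(len(frontier))
    return sizes

sizes = reachable_hyperstates(3)
# frontier sizes grow rapidly with the horizon; merging order-insensitive
# count vectors trims the naive 4^k growth only slightly
```

Because the counts are order-insensitive, some action-observation histories merge into the same hyperstate, but the merged set still grows fast enough with the horizon to make exact enumeration impractical.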
Given a Markov chain with uncertain transition probabilities modelled in a Bayesian way, we investigate a technique for analytically approximating the mean transition frequency counts over a finite horizon. Conventional techniques for addressing this problem either require the enumeration of a set of generalized process "hyperstates" whose cardinality grows exponentially with the terminal horizon, or are limited to the two-state case and expressed in terms of hypergeometric series. Our approach makes use of a diffusion approximation technique for modelling the evolution of information state components of the hyperstate process. Interest in this problem stems from a consideration of the policy evaluation step of policy iteration algorithms applied to Markov decision processes with uncertain transition probabilities.
International Joint Conference on Neural Networks, 1989
Summary form only given, as follows. An attempt is described to model the note-to-note tonal transitions in the music of J.S. Bach with a two-layer neural net. The learning phase employs the standard backpropagation algorithm, along with several minor modifications to improve performance. In the simulation or 'composition' phase, where the goal is the production of a new piece of music possessing Bach-like qualities, a mixture of net realizations is proposed as a means of avoiding trivial cycling.
Papers by Michael Duff