Everyone agrees that real cognition requires much more than static pattern recognition. In particular, it requires the ability to learn sequences of patterns (or actions). But learning sequences really means being able to learn multiple sequences, one after the other, without the most recently learned ones erasing the previously learned ones. If catastrophic interference is a problem for the sequential learning of individual patterns, the problem is amplified many times over when multiple sequences of patterns have to be learned consecutively, because each new sequence consists of many linked patterns. In this paper we present a connectionist architecture that would seem to solve the problem of multiple sequence learning using pseudopatterns.
Modeling Language, Cognition and Action - Proceedings of the Ninth Neural Computation and Psychology Workshop, 2005
Children aged 3½ exhibit less of a backwards blocking effect than those aged 4½; only the latter are sensitive to probabilities. The account originally proposed is that children develop a mechanism for Bayesian structure learning. This account is problematic because it evades the explanation of the origins of the initial core of knowledge used by the posited Bayesian mechanism. I propose here an alternative explanation: children's differential performance stems from a memory limitation, with retroactive interference stronger in younger children but adult-like in older children. This claim is supported by simulations with Ans and Rousset's (1997) memory self-refreshing neural network architecture.
From Associations to Rules - Connectionist Models of Behavior and Cognition - Proceedings of the Tenth Neural Computation and Psychology Workshop, 2008
Associative models, such as the Rescorla-Wagner model, correctly predict how some experimental manipulations give rise to illusory correlations. However, they predict that outcome-density effects (and illusory correlations, in general) are a preasymptotic bias that vanishes as learning proceeds, and only predict positive illusory correlations. Behavioural data showing illusory correlations that persist after extensive training and showing persistent negative illusory correlations exist but have been considered as anomalies. We investigated what the simplest connectionist architecture should comprise in order to encompass these results. Though the phenomenon involves the acquisition of hetero-associative relationships, a simple hetero-associator did not suffice. An auto-hetero-associator was needed in order to simulate the behavioural data. This indicates that the structure of the inputs contributes to the outcome-density effect.
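The Rescorla-Wagner model referred to above adjusts the associative strength of every cue present on a trial by a fraction of the prediction error. A minimal, generic sketch of that update rule is given below to make the "preasymptotic bias" claim concrete; the null-contingency design, trial counts and learning-rate values are illustrative assumptions, not the authors' simulations.

```python
import numpy as np

def rescorla_wagner(trials, alpha=0.3, beta=1.0, n_cues=2):
    """Rescorla-Wagner rule: dV = alpha * beta * (lambda - sum of V over the cues present)."""
    V = np.zeros(n_cues)                      # associative strength of each cue
    history = []
    for present, outcome in trials:           # present: boolean mask; outcome: lambda (0 or 1)
        prediction = V[present].sum()
        V[present] += alpha * beta * (outcome - prediction)
        history.append(V.copy())
    return np.array(history)

# Null contingency with a high outcome density: cue 0 (target) is present on
# half the trials, cue 1 (context) is always present, and the outcome occurs
# with probability .8 regardless of the target cue.
rng = np.random.default_rng(0)
trials = [(np.array([rng.random() < 0.5, True]), float(rng.random() < 0.8))
          for _ in range(400)]
history = rescorla_wagner(trials)
# Under the model, the target cue's strength is transiently positive early on
# and tends back toward zero with extended training: the preasymptotic bias.
print(history[19, 0], history[399, 0])
```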
People often believe that they exert control on uncontrollable outcomes, a phenomenon that has been called illusion of control. Psychologists tend to attribute this illusion to personality variables. However, we present simulations showing that the illusion of control can be explained at a simpler level of analysis. In brief, if a person desires an outcome and tends to act as
Many theories of contingency learning assume (either explicitly or implicitly) that predicting whether an outcome will occur should be easier than making a causal judgment. Previous research suggests that outcome predictions would depart from normative standards less often than causal judgments, which is consistent with the idea that the latter are based on more numerous and complex processes. However, only indirect evidence exists for this view. The experiment presented here specifically addresses this issue by allowing for a fair comparison of causal judgments and outcome predictions, both collected at the same stage with identical rating scales. Cue density, a parameter known to affect judgments, is manipulated in a contingency learning paradigm. The results show that, if anything, the cue-density bias is stronger in outcome predictions than in causal judgments. These results contradict key assumptions of many influential theories of contingency learning.
A general class of models (also known as random- or mixed-effect generalized linear models) that takes into account the hierarchical nature (i.e., the non-independence) of data.
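To make the description above concrete, here is a minimal sketch of a random-intercept linear mixed model fitted with statsmodels; the data are simulated and the variable names (`y`, `x`, `subject`) are hypothetical, so this only illustrates the model class, not an analysis from the source.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical hierarchical data: trials nested within participants, so
# observations from the same participant are not independent.
rng = np.random.default_rng(1)
n_subj, n_trials = 30, 40
subject = np.repeat(np.arange(n_subj), n_trials)
random_intercepts = rng.normal(0.0, 1.0, n_subj)[subject]   # participant-level deviation
x = rng.normal(size=n_subj * n_trials)                       # trial-level predictor
y = 2.0 + 0.5 * x + random_intercepts + rng.normal(0.0, 1.0, n_subj * n_trials)
data = pd.DataFrame({"y": y, "x": x, "subject": subject})

# Random-intercept mixed model: the grouping factor absorbs the
# participant-level variance instead of treating all trials as independent.
model = smf.mixedlm("y ~ x", data, groups=data["subject"])
print(model.fit().summary())
```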
While there is growing evidence that some dyslexic children suffer from a deficit in simultaneously processing multiple visually displayed elements, the precise nature of the deficit remains largely unclear. The aim of the present study is to investigate possible cognitive impairments at the source of this visual processing deficit in dyslexic children.
Attractors of nonlinear neural systems are at the core of the memory self-refreshing mechanism of human memory models that suppose memories are dynamically maintained in a distributed network [4, 27-32]. Are humans able to learn never seen items from attractor patterns generated by a highly distributed artificial neural network? First, an opposition method was implemented to ensure that the attractors are not the items used to train the network, the source items: attractors were selected to be more similar (both at the exemplar and the centroid level) to some control items than to the source items. In spite of this very severe selection, blank networks trained only on selected attractors performed better at test on the never seen source items than on the never seen control items. The results of two behavioural experiments using the opposition method show that humans exhibit more familiarity with the never seen source items than with the never seen control items, just as networks do. Thus, humans are sensitive to the particular type of information that allows distributed artificial neural networks to dynamically maintain their memory, and this information does not amount to the exemplars used to train the network that produced the attractors.
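For intuition about attractors obtained from random input, the sketch below uses a classic Hopfield-style autoassociator as a deliberately simple stand-in for the highly distributed network used in the study (it does not reproduce Ans and Rousset's architecture or the opposition method): a random state is iterated through the trained network until it settles, and the resulting attractor still carries information about the source items.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian weight matrix of a Hopfield autoassociator (patterns coded as +1/-1)."""
    n_units = patterns.shape[1]
    W = patterns.T @ patterns / n_units
    np.fill_diagonal(W, 0.0)
    return W

def attractor_from_random_seed(W, rng, max_steps=100):
    """Start from a random state and update synchronously until the state settles."""
    state = rng.choice([-1.0, 1.0], size=W.shape[0])
    for _ in range(max_steps):
        new_state = np.sign(W @ state)
        new_state[new_state == 0] = 1.0
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state

rng = np.random.default_rng(2)
source_items = rng.choice([-1.0, 1.0], size=(5, 64))   # hypothetical source items
W = train_hopfield(source_items)
attractor = attractor_from_random_seed(W, rng)
# Overlap of the attractor with each source item: even when it matches none of
# them exactly, it reflects the structure the network extracted from them.
print((attractor == source_items).mean(axis=1))
```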
While retroactive interference (RI) is a well-known phenomenon in humans, the differential effect of the structure of the learning material has only seldom been addressed. Mirman and Spivey (2001, Connection Science, 13: 257-275) reported behavioural results that show more RI for subjects exposed to 'Structured' items than for those exposed to 'Unstructured' items. These authors claimed that two complementary memory systems functioning on radically different neural mechanisms are required to account for the behavioural results they reported. Using the same paradigm but controlling for proactive interference, we found the opposite pattern of results, that is, more RI for subjects exposed to 'Unstructured' items than for those exposed to 'Structured' items (experiment 1). Two additional experiments showed that this structure effect on RI is a genuine one. Experiment 2 confirmed that the design of experiment 1 forced the subjects in the 'Structured' condition to learn the items at the exemplar level, thus allowing for a close match between the two to-be-compared conditions (as 'Unstructured' condition items can be learned only at the exemplar level). Experiment 3 verified that the subjects in the 'Structured' condition could generalize to novel items. Simulations conducted with a three-layer neural network, that is, a single-memory system, produced a pattern of results that mirrors the structure effect reported here. By construction, Mirman and Spivey's architecture cannot simulate this behavioural structure effect. The results are discussed within the framework of catastrophic interference in distributed neural networks, with an emphasis on the relevance of these networks to the modelling of human memory.
While humans forget gradually, highly distributed connectionist networks forget catastrophically: newly learned information often completely erases previously learned information. This is not just implausible cognitively, but disastrous practically. However, it is not easy in connectionist cognitive modelling to keep away from highly distributed neural networks, if only because of their ability to generalize. A realistic and effective system that solves the problem of catastrophic interference in sequential learning of 'static' (i.e. non-temporally ordered) patterns has been proposed recently. The basic principle is to learn new external patterns interleaved with internally generated 'pseudopatterns' (generated from random activation) that reflect the previously learned information. However, to be credible, this self-refreshing mechanism for static learning has to encompass our human ability to learn serially many temporal sequences of patterns without catastrophic forgetting. Temporal sequence learning is arguably more important than static pattern learning in the real world. In this paper, we develop a dual-network architecture in which self-generated pseudopatterns reflect (non-temporally) all the sequences of temporally ordered items previously learned. Using these pseudopatterns, several self-refreshing mechanisms that eliminate catastrophic forgetting in sequence learning are described and their efficiency is demonstrated through simulations. Finally, an experiment is presented that evidences a close similarity between human and simulated behaviour. (For a review of catastrophic interference, see French 1999.) The heart of the problem is that the very property that gives connectionist networks their remarkable abilities to generalize and to degrade gracefully in the presence of incomplete information, namely a single set of weights to encode information, is also the root cause of catastrophic interference. When such a system learns a new set of items, a large majority of the weights that were adjusted for previously learned items will be modified, which frequently results in a severe loss of the old information. This is the 'sensitivity-stability dilemma' or 'stability-plasticity dilemma' (Grossberg 1987, Carpenter and Grossberg). While it is true that memory models that use non-overlapping, or sparsely distributed, representations face this dilemma to a considerably lesser extent, generalization often requires a highly distributed system; and along with this high degree of distribution comes catastrophic interference.
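The self-refreshing principle described above can be illustrated with a deliberately minimal single-network sketch: pseudopatterns are obtained by pushing random inputs through the already-trained network and recording its outputs, and new learning is interleaved with them. The pattern sets, network size and use of scikit-learn's MLPRegressor are illustrative assumptions; the dual-network architecture for temporal sequences developed in the paper is not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)

def random_patterns(n, dim_in=16, dim_out=8):
    """Hypothetical random binary input-output pattern sets."""
    return (rng.integers(0, 2, (n, dim_in)).astype(float),
            rng.integers(0, 2, (n, dim_out)).astype(float))

X_old, Y_old = random_patterns(20)      # first set of 'static' patterns
X_new, Y_new = random_patterns(20)      # set to be learned afterwards

net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
net.fit(X_old, Y_old)                   # stage 1: learn the old patterns

# Pseudopatterns: random inputs fed to the trained network; the network's own
# outputs stand in for the information it has stored about the old patterns.
X_pseudo = rng.integers(0, 2, (200, X_old.shape[1])).astype(float)
Y_pseudo = net.predict(X_pseudo)

# Stage 2: learn the new patterns interleaved with the pseudopatterns rather
# than on their own, which is what protects the old information.
X_mix, Y_mix = np.vstack([X_new, X_pseudo]), np.vstack([Y_new, Y_pseudo])
for _ in range(500):
    net.partial_fit(X_mix, Y_mix)

print("error on old patterns after new learning:",
      np.mean((net.predict(X_old) - Y_old) ** 2))
```

Training a copy of the network on the new patterns alone typically yields a much larger error on the old set; that contrast is the catastrophic interference the interleaving is meant to prevent.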
Although normatively irrelevant to the relationship between a cue and an outcome, outcome density (i.e. its base-rate probability) affects people's estimation of causality. The process by which causality comes to be incorrectly estimated is of importance to an integrative theory of causal learning. One potential explanation is that outcome density induces a judgement bias. An alternative explanation is explored here, according to which the incorrect estimation of causality is grounded in the processing of cue-outcome information during learning. A first neural network simulation shows that, in the absence of a deep processing of cue information, cue-outcome relationships are acquired but causality is correctly estimated. The second simulation shows how an incorrect estimation of causality may emerge from the active processing of both cue and outcome information. In an experiment inspired by the simulations, the role of a deep processing of cue information was put to the test. In addition to an outcome-density manipulation, a shallow cue manipulation was introduced: cue information was either still displayed (concurrent) or no longer displayed (delayed) when outcome information was given. Behavioural and simulation results agree: the outcome-density effect was maximal in the concurrent condition. The results are discussed with respect to the extant explanations of the outcome-density effect within the causal learning framework.
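The premise that outcome density is normatively irrelevant can be made concrete with the standard deltaP contingency index, deltaP = P(outcome | cue) - P(outcome | no cue); the trial counts below are hypothetical and only illustrate how two conditions can differ in outcome density while sharing a zero contingency.

```python
def delta_p(a, b, c, d):
    """Contingency from a 2x2 table of trial counts:
    a = cue & outcome, b = cue & no outcome, c = no cue & outcome, d = no cue & no outcome."""
    return a / (a + b) - c / (c + d)

# Two hypothetical null-contingency conditions that differ only in outcome density.
print(delta_p(a=10, b=30, c=10, d=30))   # low density:  0.25 - 0.25 = 0.0
print(delta_p(a=30, b=10, c=30, d=10))   # high density: 0.75 - 0.75 = 0.0
# Both contingencies are exactly zero, yet causality judgements are typically
# higher in the high-density condition: the outcome-density effect.
```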
This research shows that the motivation to possess a desired characteristic (or to avoid an undesired one) results in self-perceptions that guide people's use of base rates in the Lawyer-Engineer problem. In four studies, participants induced to believe (or recall, Exp. 2) that a rational cognitive style is success-conducive (or that an intuitive cognitive style is failure-conducive) subsequently viewed themselves as more rational and relied more on base rates in their probability estimates than those induced to believe that a rational cognitive style is failure-conducive (or that an intuitive cognitive style is success-conducive). These findings show that the desired self had an influence on reasoning in the self-unrelated Lawyer-Engineer task, since the use of base rates was mediated by changes in participants' perceptions of their own rationality. These findings therefore show that the desired self, through the working self-concept it entails, constitutes another factor influencing people's use of distinct modes of reasoning.
Building on human memory models that consider LTM to be similar to a distributed network, and informed by recent solutions to catastrophic forgetting that suppose memories are dynamically maintained in a dual architecture through a memory self-refreshing mechanism, we checked whether false memories of never seen (target) items can be created in humans by exposure to "pseudo-patterns" generated from random input in an artificial neural network (previously trained on the target items). In a behavioral experiment using an opposition method it is shown that the answer is yes: though the pseudo-patterns presented to the participants were selected so as to resemble the control items more than the target items (both at the exemplar and the prototype level), the participants exhibited more familiarity for the target items previously learned by the artificial neural network. This behavioral result, analogous to the one found in simulations, indicates that humans, like distributed neural networks, are able to make use of the information the memory self-refreshing mechanism is based upon. The implications of these findings are discussed in the framework of memory consolidation.