Habit, Choice, and Addiction (Vandaele & Ahmed, 2021)
REVIEW ARTICLE
Habit, choice, and addiction
Y. Vandaele1 and S. H. Ahmed2,3
Addiction has been suggested to emerge from the progressive dominance of habits over goal-directed behaviors. However, it is
generally assumed that habits do not persist in choice settings. Therefore, it is unclear how drug habits may persist in real-world
scenarios where choice predominates. Here, we discuss the poor translational validity of the habit construct, which impedes our
ability to determine its role in addiction. New evidence of habitual behavior in a drug choice setting is then described and
discussed. Interestingly, habitual preference did not promote drug choice but instead favored abstinence. Here, we propose several
clues to reconcile these unexpected results with the habit theory of addiction, and we highlight the need for experimental research
to face the complexity of drug addicts’ decision-making environments by investigating drug habits in the context of choice and in
the presence of cues. On a theoretical level, we need to consider more complex frameworks, taking into account continuous
interactions between goal-directed and habitual systems, and alternative decision-making models more representative of
real-world conditions.
INTRODUCTION
Tobacco, alcohol, and substance use disorders, which will be referred to as addiction in the present review, are all driven by a
transition toward compulsive drug use characterized by a loss of control over drug intake, persistent drug use despite dreadful
consequences, and frequent episodes of relapse. Among recreational users, only a subset ultimately lose control over drug use
and develop an addiction. To explain this transition, several, often overlapping, theories have been proposed [1]. Among them, the
influential but controversial habit theory of addiction posits that the transition to addiction emerges from the progressive
development and dominance of drug habits over goal-directed control [2, 3]. Although drug habits appear omnipresent in any
form of addiction, whether formation or expression of drug habits contributes to the transition to addiction remains a matter of
debate.
The involvement of automatic processes in addiction was suggested 30 years ago in the seminal work of Tiffany [4]. Several
diagnostic criteria for substance use disorder (SUD) are consistent with the concept of drug habit; notably, the persistence of
drug use when it is no longer pleasurable and despite negative consequences, the high reactivity to drug-associated cues and
context, and the fact that addictive behaviors appear out of voluntary control [1, 5, 6]. Habits are defined as automatic responses
elicited by antecedent stimuli without deliberation or representation of the consequences of one’s action. Because habits do not
depend on the response–outcome association underlying goal-directed behavior, they are generally operationalized as an absence
of goal-directed behavior; that is, actions not affected by a reduction of the outcome value and/or by a degradation of the
response–outcome contingency are under habitual control (Box 1) [7, 8]. Although these tests typically answer a yes-or-no
question, habit and goal-directed systems likely control behavior along a continuum, and the balance between these two systems
would be shifted toward habit in SUD.
However, the relation between drug use and habit remains controversial in humans, with mixed results and significant
discrepancies [9, 10]. Furthermore, although the literature in rodents converges to show that drug exposure promotes habit,
how drug habits favor further drug use and, ultimately, the transition to addiction remains unclear. In this review, we try to
address this question by reviewing behavioral evidence supporting the habit theory of addiction in rodents and discussing
important limitations, notably the absence of habit in choice settings. We then present new evidence of habitual behavior in a
drug choice setting and propose several clues to explain our unexpected results in the light of the habit theory of addiction. We
propose new perspectives on this theory that embrace the complexity of the decision-making environment of drug addicts and of
interactions between decision-making processes.

DRUGS PROMOTE HABIT
A large number of studies in rodents show that drugs of abuse promote habit. Following drug self-administration training, drugs
can be devalued using either sensory-specific satiety or conditioned taste aversion (CTA) before responding for the drug is tested
under extinction (Box 1). Using this procedure, it was shown that responding for ethanol [11–17], cocaine [18, 19], and nicotine
[20, 21] becomes habitual after various lengths of training. In some studies, the transition to habit was faster for the drug
compared to a nondrug reward, suggesting stronger facilitation of habit formation for drug seeking [11, 13, 15, 18, 21].
Interestingly, studies in which rats are trained to self-administer cocaine or heroin in a seeking-taking schedule (e.g.,
heterogeneous chains; seeking RI30, taking FR1 on
1 Department of Psychiatry, Lausanne University Hospital, Lausanne, Switzerland; 2 Institut des Maladies Neurodégénératives, Université de Bordeaux, Bordeaux, France and 3 Institut des Maladies Neurodégénératives, CNRS, Bordeaux, France
Correspondence: Y. Vandaele ([email protected])
Fig. 1 Habitual preference for saccharin in a drug choice setting. A–C Responding for saccharin is not reduced following saccharin
devaluation by specific satiety. A Rats’ performance on the cocaine and saccharin levers did not differ between the devalued group (D; white)
and the non-devalued group (ND; blue) across 1 min time bins in the extinction test. *p < 0.05 Coc vs. Sacch. B The total number of lever
presses was higher on the saccharin lever compared to the cocaine lever but was not affected by devaluation. *p < 0.05 Coc vs. Sacch.
C Saccharin was correctly devalued as measured by a reduction in posttest consumption of saccharin in the D group compared to the ND
group. D–F Preference for saccharin is also insensitive to saccharin devaluation by CTA. D, E Rats responded more on the saccharin lever
compared to the cocaine lever but did not differ as a function of devaluation. *p < 0.05 Coc vs. Sacch. F Devaluation of saccharin was
confirmed during the test of consumption immediately after the extinction session. Adapted from [50].
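The logic of the outcome-devaluation test summarized in this caption can be illustrated with a minimal sketch. The lever-press counts below are hypothetical and are not data from [50]; the function names and the 0.2 cutoff are assumptions made for illustration only, and published studies rely on group statistics rather than a fixed threshold.

```python
from statistics import mean


def devaluation_index(devalued, non_devalued):
    """Relative reduction in responding in the devalued (D) group
    compared to the non-devalued (ND) group.
    Near 0: responding is insensitive to devaluation (habit-like).
    Near 1: responding is strongly reduced (goal-directed-like)."""
    nd_mean = mean(non_devalued)
    return (nd_mean - mean(devalued)) / nd_mean


def classify(index, threshold=0.2):
    """Label responding using an arbitrary, assumed cutoff."""
    return "goal-directed" if index > threshold else "habitual"


# Hypothetical lever presses per rat during the extinction test
nd_presses = [40, 36, 44, 40]  # non-devalued group
d_presses = [38, 41, 37, 40]   # devalued group

idx = devaluation_index(d_presses, nd_presses)
print(round(idx, 3), classify(idx))
```

In this toy case responding barely differs between groups, so the behavior is labeled habitual, provided that a separate consumption test confirms the outcome was in fact devalued (as in panels C and F).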
experiencing the devalued outcome during ITI and water trials did not reverse preference toward the still valued drug option by
reengaging goal-directed control, indicating that preference for water was habitual and inflexible.
A progressive reversal of preference toward the drug was observed across nine cycles of water restriction and satiation,
indicating that preference can only change after repeated training with the novel water value. These results could be well explained
in the context of model-based (MB) and model-free (MF) control, used as proxies for goal-directed and habitual control, respectively
(Box 2) [48, 52–54]. The slow reversal of preference observed in our study is what would be expected under MF control, which
depends on iterative and retrospective learning of an action’s values in a given “state”. Thus, rats may have learned to compute
the actions’ values from the start of the session, based on their motivational state. In other words, rats learn to select water when
thirsty, and cocaine when sated, without relying on the expected current value of these two rewards. To test this hypothesis, rats
were tested again with 1 h water access before the session but not during ITI (1h-Ø; Fig. 2A). Although this condition moderately
decreased consumption during water trials, the preference for cocaine increased to 50% and was significantly higher than
cocaine preference before devaluation training under the same conditions. These results suggest that during devaluation training,
rats learn to use their motivational state as a discriminative cue to predict the most valuable option, under MF control. Alternatively,
since rats became sensitive to the altered outcome value in the presence of an altered interoceptive state (water satiation), it
could be argued that rats progressively learned to reengage MB goal-directed control. Yet, rats maintained their preference for
water following quinine-induced devaluation, despite a significant suppression of water consumption (Fig. 2A, B), indicating that rats
cannot flexibly adjust their preference in response to outcome devaluation using another modality (e.g., taste instead of
motivational state). A more parsimonious hypothesis is that rats learned instead to select options according to their motivational
state under MF control (i.e., select water when thirsty), without relying on the outcome value per se.

Fig. 2 Inflexible preference for the alternative nondrug reward in a drug choice setting is under habitual, model-free control. Water-
restricted rats offered a choice between water and cocaine expressed a robust preference for water (black; baseline preference under water
deprivation). Water was then partially devalued with 1 h (1h-Ø, pink) and 2 h free-water access (2h-Ø, purple) before the choice session. Water
preference was not affected (A) but there was moderate suppression of water consumption (B). Thus, free-water access was also introduced
during each intertrial interval (ITI) of choice sessions in addition to the hour of water presession access (white; 1 h + ITI, Free-Water FW).
Although this condition drastically suppressed water consumption from the first FW session (B), nine sessions were needed to observe a
complete reversal of preference (A). Following this devaluation training, 1 h water access was sufficient to raise cocaine preference to 50% in a
second 1h-Ø choice session (pink). Finally, devaluation of water by taste adulteration with quinine (blue) only moderately affected preference
(A) despite a strong suppression of water consumption (B). Adapted from [51].

Possible explanations
The results described above are surprising since responding for the nondrug reward was habitual despite choice and
reinforcement. In the following subsection, we will discuss possible explanations for these unexpected results.
Both experiments included prior training in the discrete-trial choice schedule to assess preference under baseline conditions. In
this procedure, the lever insertion and retraction at each trial constitute salient cues predicting reward availability and delivery,
respectively. By reducing uncertainty about reward delivery and alleviating the need for attentional monitoring, these cues can
promote the rapid development of habit [47, 55, 56]. Indeed, arbitration between MF and MB control has been suggested to
rely on the relative uncertainty of predictions from each system [52, 57]. In procedures involving discrete trials, the low uncertainty
about MF predictions derived from the lever cues through reinforcement learning is hypothesized to favor habit. This could
explain why habitual responding for sucrose is observed after only five sessions whereas 8 weeks of training are not sufficient to
observe habit when these cues are not available [11, 55]. Therefore, habitual preference in the two studies described above
may be promoted by the structure of the discrete-trial choice procedure. It is noteworthy that studies showing goal-directed
choice between two nondrug rewards use self-paced random-ratio or -interval schedules, in the absence of reward-predictive cues
and thus, under conditions of higher reward uncertainty [34, 42, 44, 58].
The strong initial preference for the alternative nondrug reward in our studies indicates a large difference in outcome values [50, 51].
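The model-free account of the slow preference reversal discussed above can be illustrated with a minimal sketch. This is not the model used in [51]: the reward values, the learning rate, the greedy choice rule, the reduction of one session to a single value update, and the assumption that values learned when thirsty initially generalize to the sated state are all hypothetical simplifications chosen for illustration.

```python
ALPHA = 0.1  # learning rate (assumed)

# Hypothetical outcome values experienced in each motivational state
REWARD = {
    "thirsty": {"water": 1.0, "cocaine": 0.5},
    "sated": {"water": 0.0, "cocaine": 0.5},
}


def greedy(values):
    """Return the option with the highest value."""
    return max(values, key=values.get)


def sessions_until_reversal(q, state, max_sessions=50):
    """Model-free control: choose from cached values, then update the
    chosen action's value from the experienced outcome. Returns the first
    session in which cocaine is chosen, or None if it never is."""
    for session in range(1, max_sessions + 1):
        action = greedy(q[state])
        if action == "cocaine":
            return session
        reward = REWARD[state][action]
        q[state][action] += ALPHA * (reward - q[state][action])
    return None


# Cached values learned under water restriction; the sated state starts
# with the same values (assumed generalization across states).
q = {
    "thirsty": {"water": 1.0, "cocaine": 0.5},
    "sated": {"water": 1.0, "cocaine": 0.5},
}

n = sessions_until_reversal(q, "sated")
print("MF agent first chooses cocaine in session", n)
print("MF choice when thirsty again:", greedy(q["thirsty"]))

# A model-based agent reads out the current outcome values directly,
# so its choice reverses without any retraining.
mb_choice = greedy(REWARD["sated"])
print("MB choice when sated:", mb_choice)
```

Under these assumptions the cached value of water must decay over several sessions before choice reverses in the sated state, while the values attached to the thirsty state are untouched. The agent therefore ends up selecting water when thirsty and cocaine when sated without consulting the current outcome value, which is the state-dependent pattern described in the text.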
Fig. 3 Rats are oblivious to the cocaine option during self-initiated choice. A Rats are required to nosepoke in a hole under a fixed ratio 10
to trigger the presentation of two levers. Two consecutive presses on the left or right lever result in the delivery of saccharin or an intravenous
infusion of cocaine, respectively. B In this procedure, rats expressed a strong preference for saccharin. Interestingly, this preference was
exclusive for a majority of rats (right panel). C Analysis of choice patterns reveals that rats choosing saccharin exclusively did so in bouts of
varying lengths separated by pauses, during which they did not self-initiate any trial for cocaine, despite transient saccharin devaluation by
sensory-specific satiety. This behavior represents an opportunity cost because the duration of pauses is sufficient to earn several cocaine
injections (right panel). Adapted from [117].
into account (1) the continuous arbitration between goal-directed and habitual systems, (2) the hierarchical decision-making
architectures combining these two systems and (3) alternative sequential decision-making models suggesting that individuals
may consider one option at a time when making decisions. Although much remains to be done, our hope is that this review
opens up new perspectives to determine the role of habit and choice in addiction.

FUNDING AND DISCLOSURE
This work was supported by the French Research Council (CNRS), the Université de Bordeaux, the French National Agency
(ANR-2010-BLAN-1404-01), the Ministère de l’Enseignement Supérieur et de la Recherche (MESR), the Fondation pour la
Recherche Médicale (FRM DPA20140629788), and the Peter und Traudl Engelhorn foundation. The authors declare no competing
interests.

ACKNOWLEDGEMENTS
We thank Christophe Bernard, Mathieu Louvet, and Eric Wattelet for administrative assistance. We also thank Dr. Patricia Janak
for her helpful comments on a previous version of the review, and Emma Chaloux-Pinette for proofreading the paper.

AUTHOR CONTRIBUTIONS
YV drafted the first version of the paper; SHA and YV revised and edited the paper; YV and SHA approved the final version of the
paper.

ADDITIONAL INFORMATION
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

REFERENCES
1. Redish AD, Jensen S, Johnson A. A unified framework for addiction: vulnerabilities in the decision process. Behav Brain Sci. 2008;31:415–87.
2. Everitt BJ, Robbins TW. Drug addiction: updating actions to habits to compulsions ten years on. Annu Rev Psychol. 2016;67:23–50.
3. Everitt BJ, Robbins TW. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat Neurosci. 2005;8:1481–9.
4. Tiffany ST. A cognitive model of drug urges and drug-use behavior: role of automatic and nonautomatic processes. Psychol Rev. 1990;97:147–68.
5. Heather N. Is the concept of compulsion useful in the explanation or description of addictive behaviour and experience? Addict Behav Rep. 2017;6:15–38.
6. Ostlund SB, Balleine BW. On habits and addiction: an associative analysis of compulsive drug seeking. Drug Discov Today Dis Model. 2008;5:235–45.
7. Dickinson A, Balleine B. Motivational control of instrumental action. Anim Learn Behav. 1994;22:1–18.
8. Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998;37:407–19.
9. Hogarth L, Lam-Cassettari C, Pacitti H, Currah T, Mahlberg J, Hartley L, et al. Intact goal-directed control in treatment-seeking drug users indexed by outcome-devaluation and Pavlovian to instrumental transfer: critique of habit theory. Eur J Neurosci. 2018;50:2513–25.
10. Hogarth L. Addiction is driven by excessive goal-directed drug choice under negative affect: translational critique of habit and compulsion theory.