Regulating ChatGPT and other Large Generative AI Models
Philipp Hacker, Andreas Engel, Marco Mauer
ABSTRACT
Large generative AI models (LGAIMs), such as ChatGPT, GPT-4 or Stable Diffusion, are rapidly transforming the way we communicate, illustrate, and create. However, AI regulation, in the EU and beyond, has primarily focused on conventional AI models, not LGAIMs. This paper will situate these new generative models in the current debate on trustworthy AI regulation, and ask how the law can be tailored to their capabilities. After laying technical foundations, the legal part of the paper proceeds in four steps, covering (1) direct regulation, (2) data protection, (3) content moderation, and (4) policy proposals. It suggests a novel terminology to capture the AI value chain in LGAIM settings by differentiating between LGAIM developers, deployers, professional and non-professional users, as well as recipients of LGAIM output. We tailor regulatory duties to these different actors along the value chain and suggest strategies to ensure that LGAIMs are trustworthy and deployed for the benefit of society at large. Rules in the AI Act and other direct regulation must match the specificities of pre-trained models. The paper argues for three layers of obligations concerning LGAIMs (minimum standards for all LGAIMs; high-risk obligations for high-risk use cases; collaborations along the AI value chain). In general, regulation should focus on concrete high-risk applications, and not the pre-trained model itself, and should include (i) obligations regarding transparency and (ii) risk management. Non-discrimination provisions (iii) may, however, apply to LGAIM developers. Lastly, (iv) the core of the DSA’s content moderation rules should be expanded to cover LGAIMs. This includes notice and action mechanisms, and trusted flaggers.
CCS CONCEPTS
• Social and professional topics → Computing / technology policy; Government / technology policy; Governmental regulations;
Additional Keywords and Phrases: LGAIMs, LGAIM regulation, general-purpose AI systems, GPAIS, foundation models, large language models, LLMs, AI regulation, AI Act, direct AI regulation, data protection, GDPR, Digital Services Act, content moderation;
ACM Reference Format:
Philipp Hacker, Andreas Engel, and Marco Mauer. 2023. Regulating ChatGPT and other Large Generative AI Models. In 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’23), June 12–15, 2023, Chicago, IL, USA. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3593013.3594067
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
FAccT ’23, June 12–15, 2023, Chicago, IL, USA
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0192-4/23/06. https://doi.org/10.1145/3593013.3594067
1 INTRODUCTION
Large generative AI models (LGAIMs) are rapidly transforming the way we communicate, create, and work. Their consequences are bound to affect all sectors of society, from business development to medicine, from education to research, and from coding to entertainment and the arts. LGAIMs harbor great potential, but also carry significant risk. Today, they are relied upon by millions of users to generate human-level text (e.g., GPT-4, ChatGPT, Luminous, Bard, Bing), images (e.g., Stable Diffusion, DALL·E 2), videos (e.g., Synthesia), or audio (e.g., MusicLM), while further alternatives are already in the pipeline [1-3]. Soon, they may be part of employment tools ranking and replying to job candidates, or of hospital administration systems drafting letters to patients based on case files. Freeing up time for professionals to focus on substantive matters–for example, actual patient treatment–, such multi-modal decision engines may contribute to a more effective, and more just, allocation of resources. However, errors will be costly, and risks ranging from discrimination and privacy to disrespectful content need to be adequately addressed [4-6]. Already now, LGAIMs’ unbridled capacities may be harnessed to take manipulation, fake news, and harmful speech to an entirely new level [7-11]. As a result, the debate on how (not) to regulate LGAIMs is becoming increasingly intense [12-22].
In this paper, we argue that regulation, and EU regulation in particular, is not only ill-prepared for the advent of this new generation of AI models, but also sets the wrong focus by quarreling mainly about direct regulation in the AI Act at the expense of the, arguably, more pressing content moderation concerns under the Digital Services Act (DSA). AI regulation, in the EU and beyond, has primarily focused on conventional AI models, not on the new generation whose birth we are witnessing today. The paper will situate these new generative models in the current debate on trustworthy AI regulation, and ask what novel tools might be needed to tailor current and future law to their capabilities. Inter alia, we suggest that the terminology and obligations in the AI Act and other pertaining regulation be further differentiated to better capture the realities of the evolving AI value chain. Some of these observations also apply to traditional AI systems; however, generative models are special in so far as they create output designed for communication or speech–and thus raise important and novel questions concerning the regulation of AI-enabled communication, which we analyze through the lens of the DSA and, in the technical report, non-discrimination law and the GDPR.
To do so, the paper proceeds in five steps. First, we cover technical foundations of LGAIMs, and typical scenarios of their use, to the extent that they are necessary for the ensuing legal discussion. Second, we critique the EU AI Act, which seeks to directly address risks posed by AI systems. The versions adopted by the Council (Art. 4a-c AI Act1) and the European Parliament (Art. 28-28b AI Act EP Version2) contain provisions to explicitly regulate LGAIMs, even if their providers are based outside of the EU [14, cf. also 23]. These proposals, however, arguably fail to fully accommodate the capacities and broad applicability of LGAIMs, particularly concerning the obligation for an encompassing risk management system covering all possible high-risk purposes (Art. 9 AI Act; Art. 28b(1)(a) AI Act EP Version) [12, pp. 6-10, 24, pp. 13, 51 et seqq.]. Third, we briefly touch on non-discrimination and data protection law (more detail in the Technical Report). Fourth, we turn to content moderation [see, e.g., 25, 26, 27]. Recent experiments have shown that ChatGPT, despite innate protections [28], may be harnessed to produce hate speech campaigns at scale, including the code needed for maximum proliferation [8]. Furthermore, the speed and syntactical accuracy of LGAIMs make them the perfect tool for the mass creation of highly polished, seemingly fact-loaded, yet deeply twisted fake news [7, 17]. In combination with the factual dismantling of content moderation on platforms such as Twitter, a perfect storm is gathering for the next global election cycle. We show that the EU’s prime instrument to combat harmful speech, the DSA [29, 30], does not apply to LGAIMs, creating a dangerous regulatory loophole. Finally, the paper argues for three layers of obligations concerning LGAIMs (minimum standards for all LGAIMs; high-risk obligations for high-risk use cases; collaborations along the AI value chain; cf. now also Art. 28 and 28b AI Act EP Version) and makes four specific policy proposals to ensure that LGAIMs are trustworthy and deployed for the benefit of society at large: direct regulation of LGAIM deployers and users, including (i) transparency and (ii) risk management; (iii) the application of non-discrimination provisions to LGAIM developers; and (iv) specific content moderation rules for LGAIMs. We conclude with a brief assessment concerning the vice and virtue of technology-neutral regulation.
1 Unless otherwise noted, all references to the AI Act are to the general approach adopted by the EU Council on Dec. 6, 2022, available under https://data.consilium.europa.eu/doc/document/ST-14954-2022-INIT/en/pdf; we have been able to incorporate policy developments until May 12, 2023.
2 DRAFT Compromise Amendments on the Draft Report, Proposal for a regulation of the European Parliament and of the Council, Brando Benifei & Ioan-Dragoş Tudorache (May 9, 2023), https://www.europarl.europa.eu/meetdocs/2014_2019/plmrep/COMMITTEES/CJ40/DV/2023/05-11/ConsolidatedCA_IMCOLIBE_AI_ACT_EN.pdf (= AI Act EP Version).
Due to space constraints, we cannot address all social and regulatory concerns regarding LGAIMs and have to bracket, for example, questions of IP law, power dynamics [31, 32], a deeper exploration of the comparative advantages of technology-neutral and technology-specific regulation [33], or the use of LGAIMs in military contexts [34, 35].
2 TECHNICAL FOUNDATIONS OF LARGE GENERATIVE AI MODELS AND EXEMPLARY USAGE SCENARIOS
The AI models covered by this contribution are often referred to as ‘foundation models’ [36], ‘large language models’ (LLMs) [37] or ‘large generative AI models’ (LGAIMs–the term adopted in this article) [38]. Although the emergence of these models in recent years constitutes a significant technical advance (for foundations, see [39-43]), they harness, to a reasonable extent, existing technologies in a vastly increased scale and scope. LGAIMs are usually trained with several billion, if not hundreds of billions, parameters [43, 44], requiring large amounts of training data and computing power [45]. While there are ongoing research efforts to make training language models, and, in particular, transformers, more efficient [46, 47], the energy required to train models this large has triggered concerns from a climate policy perspective [24, 48-52] (see also Part 6).
Hence, LGAIMs “are advanced machine learning models that are trained to generate new data, such as text, images, or audio” (Prompt 1, see Annex H1). This “makes them distinct from other AI models [. . . only] designed to make predictions or classifications” (Prompt 2) or to fulfil other specific functions. This increased scope of application is one of the reasons for the large amount of data and compute required to train them. LGAIMs employ a variety of techniques [28, 53] that aim at allowing them “to find patterns and relationships in the data on its own, without being [explicitly] told what to look for. Once the model has learned these patterns, it can generate new examples that are similar to the training data” (Prompt 3). In simple terms, training data are represented as probability distributions. By sampling from and mixing them, the model can generate content beyond the training data set–thus something new, as some commentators put it [54, 55]. LGAIMs can often digest human text input [56, 57] and produce an output (text; image; audio; video) based on it. The vast amounts of data required imply that developers of LGAIMs must often rely on training data that is openly available on the internet, which can hardly be considered perfect from a data quality perspective [58]. The content generated by these models can, therefore, be biased, prejudiced, or harmful [15, 59]. To avoid or at least mitigate this issue, model developers need to use proper curating techniques [60, 61]. OpenAI, controversially, hired a large content moderation team in Kenya [62].
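To make the sampling intuition concrete, consider the following toy sketch (our own illustration, not drawn from any actual LGAIM implementation): it estimates a simple next-word distribution from a three-sentence corpus and then samples a sequence that need not occur verbatim in the training data. Production models replace the counting step with billions of learned transformer parameters, but the basic move of generating output by sampling from learned conditional distributions is the same.

# Toy illustration: learn a next-word distribution from a tiny corpus and sample from it.
# A deliberately minimal stand-in for the idea of representing training data as
# probability distributions and sampling new content; not how LGAIMs are actually built.
import random
from collections import Counter, defaultdict

corpus = [
    "the model writes a short invitation text",
    "the model writes a short product description",
    "a user prompts the model",
]

# Count, for each word, how often each successor word is observed.
successors = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, following in zip(words, words[1:]):
        successors[current][following] += 1

def sample(start: str, max_len: int = 8) -> str:
    """Sample a new word sequence from the estimated conditional distributions."""
    out = [start]
    for _ in range(max_len):
        options = successors.get(out[-1])
        if not options:
            break
        words, counts = zip(*options.items())
        out.append(random.choices(words, weights=counts, k=1)[0])
    return " ".join(out)

print(sample("the"))  # e.g., "the model writes a short product description"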
“[L]arge generative models can generate synthetic content that is difficult to distinguish from real content, making it challenging to differentiate between real and fake information. [. . . T]he sheer volume of content generated by these models can make it difficult to manually review and moderate all of the generated content” (Prompt 4). For as much as we know [28], and according to ChatGPT itself, the creators of ChatGPT sought to address this problem by using “a combination of techniques to detect and remove inappropriate content. This process includes pre-moderation, where a team of human moderators review and approve content before it is made publicly available. Additionally, ChatGPT uses filtering, which involves using natural language processing and machine learning algorithms to detect and remove offensive or inappropriate content. This is done by training a machine learning model on a dataset of examples of inappropriate content, and then using this model to identify similar content in new inputs” (Prompt 5). While we cannot perfectly verify these claims due to lack of transparency on OpenAI’s side, it seems that ChatGPT relied or relies on humans that train an automatic content moderation system to prevent the output from becoming abusive [62].
Even (idealized) automated and perfect detection of abusive content would only solve half the problem, though. What remains is the danger of creating “fake news” that are hard to spot [17]. Regulation arguably needs to tackle these challenges. To better highlight them, for the discussion that follows, we will consider the following two lead examples: in a business context, one might think of a sportswear manufacturer (e.g., adidas or Nike) that wants to use the potential of a LGAIM specifically for the design of clothing. For this purpose, adidas might use a pre-trained model provided by a developer (e.g., Stability AI), while another entity, the deployer, would fine-tune the model according to adidas’ requirements (and possibly host it on a cloud platform). As a second exemplary use case, in a private setting, one could think of a young parent that uses an AI text generator to generate a funny (and suitable) invitation text for her daughter’s birthday party. To do so, (s)he might consult Aleph Alpha’s Luminous or ChatGPT and ask the chatbot to come up with an appropriate suggestion.
3 DIRECT REGULATION OF THE AI VALUE CHAIN: THE EUROPEAN AI ACT
On May 13, 2022, the French Council presidency circulated an amendment to the draft AI Act, Art. 4a-4c, on what the text calls “general-purpose AI systems” (GPAIS). This novel passage has come to form the nucleus of direct regulation of LGAIMs. It was fiercely contested in the EP [63-65] and will be a key point of debate for the final version of the AI Act. The general approach adopted by the Council on December 6, 2022, defines GPAIS as systems “intended by the provider to perform generally applicable functions such as image and speech recognition, audio and video generation, pattern detection, question answering, translation and others; a general purpose AI system may be used in a plurality of contexts and be integrated in a plurality of other AI systems” (Art. 3(1b) AI Act). Under the Council version, GPAIS are subjected to the high-risk obligations (e.g., Art. 8 to 15 AI Act) if they may be used as high-risk systems or as components thereof (Art. 4b(1)(1) and 4b(2) AI Act).
3.1 Critique of the GPAIS AI Act Rules
The AI Act heroically strives to keep pace with the accelerating dynamics in the AI technology space. However, in our view, the recently introduced rules on GPAIS fail to do justice to the peculiarities of large AI models, and particularly LGAIMs, for three reasons.
3.1.1 Toward a Definition of GPAIS. First, the definition in Art. 3(1b) AI Act is significantly over-inclusive. Rules on GPAIS were inspired by the surge in the release of and literature on foundation models and LGAIMs. As seen in Part 2, LGAIMs operate with large numbers of parameters, training data, and compute. Significantly, they generally operate on a wider range of problems than traditional models do [43]. Conceptually, their “generality” may refer to their ability (e.g., language versus vision, or combinations in multimodal models); domain of use cases (e.g., educational versus economic); breadth of tasks covered (e.g., summarizing versus completing text); or versatility of output (e.g., black and white versus multicolored image) [14]. GPAIS, in our view, must necessarily display significant generality in ability, tasks, or outputs, beyond the mere fact that they might be integrated into various use cases (which also holds true for extremely simple algorithms). The broad definition of GPAIS in the AI Act (Council general approach) clashes with this understanding, however. According to that rule, every simple image or speech recognition system seems to qualify, irrespective of the breadth of its capabilities; rightly, this only corresponds to a minority position in the technical GPAIS literature [14, 66].
3.1.2 Risk Management for GPAIS. Second, even a narrower definition would not avoid other problems. Precisely because large AI models are so versatile, providers will generally not be able to avail themselves of the exception in Art. 4c(1) AI Act: by excluding all high-risk uses, they would not act in good faith, as they would have to know that the system, once released, may and likely will be used for at least one high-risk application. For example, language models may be used to summarize or rate medical patient files, or student, job, credit or insurance applications (Annexes II, Section A, No. 12, 13 and III No. 3-5 AI Act). Unless any misuse can be verifiably technically excluded, LGAIMs will therefore generally count as high-risk systems under the proposed provision.
This, however, entails that they have to abide by the high-risk obligations, in particular the establishment of a comprehensive risk management system, according to Art. 9 AI Act. Setting up such a system seems to border on the impossible, given LGAIMs’ versatility. It would compel LGAIM providers to identify and analyze all “known and foreseeable risks most likely to occur to health, safety and fundamental rights” concerning all possible high-risk uses of the LGAIM (Art. 9(2)(a), 4b(6) AI Act). On this basis, mitigation strategies for all these risks have to be developed and implemented (Art. 9(2)(d) and (4) AI Act). Providers of LGAIMs such as GPT-4 would, therefore, have to analyze the risks for every single, possible application in every single high-risk case contained in Annexes II and III concerning health, safety and all possible fundamental rights.
Similarly, performance, robustness, and cybersecurity tests will have to be conducted concerning all possible high-risk uses (Art. 15(1), 4b(6) AI Act). This seems not only almost prohibitively costly but also hardly feasible. The entire analysis would have to be based on an abstract, hypothetical investigation, and coupled with–again hypothetical–risk mitigation measures that will, in many cases, depend on the concrete deployment, which by definition has not been implemented at the moment of analysis. What is more, many of these possible use cases will, in the end, not even be realized. Hence, such a rule would likely create “much ado about nothing”, in other words: a waste of resources.
3.1.3 Adverse Consequences for Competition. Third, the current GPAIS rules would likely have significantly adverse consequences for the competitive environment surrounding LGAIMs. The AI Act definition specifically includes open source developers as LGAIM providers, of which there are several.3 Some of these will explore LGAIMs not for commercial, but for philanthropic or research reasons. While, according to its Art. 2(7), the AI Act shall not apply to any (scientific, see Recital 12b AI Act) research and development activity regarding AI systems, this research exemption arguably does not apply anymore once the system is released into the wild (cf. Recital 12b AI Act).
3 See, e.g., https://www.kdnuggets.com/2022/09/john-snow-top-open-source-large-language-models.html.
As a result, all entities–large or small–developing LGAIMs and placing them on the market will have to comply with the same stringent high-risk obligations, and be subject to the same liability risks under the new product liability framework [24]. Given the difficulty of complying with the AI Act’s GPAIS rules, it can be expected that only large, deep-pocketed players (such as Google, Meta, Microsoft/OpenAI) may field the costs to release an approximately AI Act-compliant LGAIM. For open source developers and many SMEs, compliance will likely be prohibitively costly. Hence, the AI Act may have the unintended consequence of spurring further anti-competitive concentration in the LGAIM development market. Similar effects have already been established concerning the GDPR [67]. In this sense, the AI Act threatens to undermine the efforts of the Digital Markets Act to infuse workable competition into the core of the digital and platform economy.
3.1.4 Critique of the European Parliament proposal. In the EP, the question of how to regulate large generative AI models significantly delayed the formulation of the EP position on the AI Act. After a lengthy debate, a compromise was reached in late April/early May 2023.4 The compromise foresees three layers of obligations that apply to generative AI systems [65, 68]. The first layer will apply to the providers (=developers) of a subset of GPAIS denominated “foundation models” (Art. 28b(1)-(3) AI Act EP Version) and generative AI (Article 28b(4) AI Act EP Version). Referring to a well-known term in the computer science community [see, e.g., 36, 69], the EP version defines foundation models as an AI system “that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks” (Art. 3(1c) AI Act EP Version) [cf. also 36, at 3]. The focus on generality of output and tasks is indeed better suited to capture the specifics of large generative AI models than the vague definition of GPAIS (see Section 3.1.1). In line with suggestions made in this paper, the general obligations for all foundation models include data governance measures, particularly with a view to the mitigation of bias (Art. 28b(2)(b) AI Act EP Version; see Section 4). Furthermore, appropriate levels of performance, interpretability, corrigibility, safety and cybersecurity must be maintained throughout the model’s lifecycle. These requirements have to be tested for, documented, and verified by independent experts, Art. 28b(2)(c) AI Act EP Version. Crucially, however, all foundation models also need to implement risk assessments, risk mitigation measures, and risk management strategies with a view to reasonably foreseeable risks to health, safety, fundamental rights, the environment, democracy and the rule of law, again with the involvement of independent experts, Art. 28b(2)(a) AI Act EP Version. Effectively, this requirement is tantamount to classifying foundation models as high-risk per se.
4 See note 2.
A crucial element of the minimum standards for generative AI is contained in the “ChatGPT Rule”, Art. 28b(4) AI Act EP Version. It contains three main elements. (i) The transparency obligation concerning the use of AI is a step in the right direction. It addresses obligations of providers towards users of AI systems. In our view, additionally, obligations of users towards recipients are warranted in some instances to fight the spread of fake news and misinformation (see Section 6.1). (ii) The rule on preventing a breach of EU law, however, arguably does not go far enough. Here, the compliance mechanisms of the DSA should be transferred much more specifically, for example through clear, mandatory notice and action procedures and trusted flaggers (see Section 6.4). (iii) The disclosure of copyrighted material contained in training data may indeed help authors and creators enforce their rights. However, even experts often disagree on whether certain works are copyrightable at all. What must be avoided is that developers who have, e.g., processed 20 million images now have to conduct a full-scale legal due diligence on these 20 million images to decide for themselves whether they are copyrightable or not. Hence, it must be sufficient to disclose, even in an over-inclusive manner, works which may be copyrightable, including those for which it is not clear whether they are ultimately copyrightable or not. Otherwise, again, practically prohibitive due diligence costs will arise. The individual author must then decide, when she discovers her work, whether she thinks it is protected by copyright or not.
The second level refers to “new providers” which significantly modify the AI system, Art. 28(1)(b) and (ba) AI Act EP Version. This new provider, which is called deployer in our paper (see Section 3.2.1), assumes the obligations of the former provider upon substantial modification; the new provider takes on this role (Art. 28(1) and (2)(1) AI Act EP Version). A third level of requirements relates to the AI value chain (Art. 28(2)(2) and (2a) AI Act EP Version), in line with suggestions made below in this paper (see Section 3.2.2).
In our view, while containing steps in the right direction, this proposal would be ultimately unconvincing as it effectively treats foundation models as high-risk applications (cf. Art. 28b(1)(a) and (f) AI Act EP Version). Of course, as noted and discussed in detail below (Part 5), AI output may be misused for harmful speech and acts (as almost any technology). But not only does this seem to be the exception rather than the rule; the argument concerning adverse competitive consequences also applies equally here. Under the EP version, risk assessment, mitigation, and management still remain focused on the model itself rather than the use-case specific application (Art. 28b(2)(a) and (f) AI Act EP Version), even though Recital 58a acknowledges that risks related to AI systems can stem from their specific use. Again, this leads to the onerous assessment and mitigation of hypothetical risks that may never materialize–instead of managing risks at the application level where the concrete deployment can be considered.
3.2 Proposal: Focus on Deployers and Users
This critique does not imply, of course, that LGAIMs should not be regulated at all. However, in our view, a different approach is warranted. Scholars have noted that the regulatory focus should shift [12, 13] and move towards LGAIM deployers and users, i.e., those calibrating LGAIMs for and using them in concrete high-risk applications. While some general rules, such as data governance, non-discrimination and cybersecurity provisions, should indeed apply to all foundation models (see Section 4), the bulk of the high-risk obligations of the AI Act should be triggered for specific use cases only and target primarily deployers and professional users.
3.2.1 Terminology: Developers, Deployers, Users, and Recipients. Lilian Edwards, for example, has rightly suggested to differentiate between developers of GPAIS, deployers, and end users [12, see also 24]. In the following, we take this beginning differentiation in the AI value chain one step further. In many scenarios, there will be at least four entities involved, in different roles [cf. 70]. We suggest that the terminology in the AI Act and other pertaining regulation must be adapted to the evolving AI value chain in the following way.
Developer: this is the entity originally creating and (pre-)training the model. In the AI Act, this entity is called the provider (under some further conditions, see Art. 3(2)). Real-world examples would be OpenAI, Stability, or Google.
Deployer: this is the entity fine-tuning the model for a specific use case. The AI Act EP Version also uses the term, albeit in a slightly different manner, covering any person or entity using an AI system under its authority, except where the AI system is used in the course of a personal non-professional activity (Art. 3(4) AI Act EP Version); for the purposes of the AI Act EP Version, a deployer can be a (new) provider, Art. 28(2)(1). Note that there could be several deployers (working jointly or consecutively), leading to a true AI value chain similar to OEM value chains. Alternatively, the developer could simultaneously act as a deployer (vertical integration)–just as for the purposes of the AI Act EP Version, a deployer can be a (new) provider, Art. 28(2)(1).
User: this is the entity actually generating output from an LGAIM, e.g. via prompts, and putting it to use. The user may harness the output in a professional or a non-professional capacity [71, 72]. Potential real-world examples of professional users would be the clothing and sportswear manufacturer from the first lead example, or any other entity from the groups of professional users just listed. Note that any individual making professionally motivated comments online would also count as a professional user in this respect. Finally, some exceptions from the EU consumer definition are in order: for example, employees5 (and students, for that matter) should presumptively count as professional users when applying LGAIMs for job- or education-related tasks. Particularly concerning negative externalities of AI output, it should not matter whether users are pursuing a dependent or independent professional activity (e.g., Art. 29 AI Act). By contrast, the AI Act largely exempts non-professional users (cf. Art. 2(8) AI Act; the AI Act EP Version contains no general exemption, but excludes non-professionals from the definition of deployers, Art. 3(4)). The parent from the lead example using ChatGPT for the birthday party would fall into this category.
5 But see German Constitutional Court, Order of November 23, 2006, Case 1 BvR 1909/06: employees are consumers in the sense of the EU consumer law.
Recipient: this is the entity consuming the product offered by the user.
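Purely for illustration, this role taxonomy could be captured in machine-readable compliance documentation along the value chain. The following sketch is our own, hypothetical shorthand (nothing in the AI Act or the EP version prescribes such a format); the duty lists merely mirror, in abbreviated form, the allocation of obligations proposed in this paper.

# Hypothetical sketch: encoding the value-chain roles (developer, deployer,
# professional/non-professional user, recipient) with an indicative duty allocation.
from dataclasses import dataclass, field
from enum import Enum, auto

class Role(Enum):
    DEVELOPER = auto()            # pre-trains the model (AI Act: "provider")
    DEPLOYER = auto()             # fine-tunes/hosts it for a concrete use case
    PROFESSIONAL_USER = auto()    # prompts the model in a professional capacity
    NON_PROFESSIONAL_USER = auto()
    RECIPIENT = auto()            # consumes the output (e.g., reads the generated text)

@dataclass
class Actor:
    name: str
    role: Role
    duties: list[str] = field(default_factory=list)

value_chain = [
    Actor("pre-training lab", Role.DEVELOPER,
          ["data governance / non-discrimination audits", "cybersecurity", "transparency reporting"]),
    Actor("fine-tuning service", Role.DEPLOYER,
          ["use-case-specific risk management (Art. 9 AI Act)", "performance and robustness testing"]),
    Actor("sportswear manufacturer", Role.PROFESSIONAL_USER,
          ["disclosure of AI-generated content"]),
    Actor("parent drafting a birthday invitation", Role.NON_PROFESSIONAL_USER, []),
    Actor("reader of the generated text", Role.RECIPIENT, []),
]

for actor in value_chain:
    print(f"{actor.role.name:>22}: {actor.name} -> {actor.duties}")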
With this terminology in place, regulatory obligations can be allocated to different types of actors in more nuanced ways. While developers should, to a certain extent, be subject to non-discrimination law and certain data governance provisions (Section 4), we suggest that the focus of regulatory duties should lie on deployers and users, for example concerning risk management systems (Art. 9 AI Act) or performance and robustness thresholds (Art. 15 AI Act) (see also below, Part 6).
3.2.2 The AI Value Chain. Such a shift of the regulatory focus on deployers and users, however, entails several follow-up problems that need to be addressed [12]. First, deployers and users may be much smaller and less technologically sophisticated than LGAIM developers. This is not a sufficient reason to exempt them from regulation and liability, but it points to the importance of designing a feasible allocation of responsibilities along the AI value chain. Recent proposals discussed in the EP point in this direction as well (see Section 3.1.4). Obligations must be structured in such a way that deployers and users can reasonably be expected to comply with them, both by implementing the necessary technological adjustments and by absorbing the compliance costs. Second, many of the AI Act’s high-risk obligations refer to the training and modeling phase conducted, at least partially, by the LGAIM developers. Typically, LGAIM developers will pre-train a large model, which may then be fine-tuned by deployers, potentially in collaboration with developers [73, 74], while users ultimately make the decision what the AI system is used for specifically (e.g. commercial use for design or private use for generating an invitation text). To meet the AI Act requirements concerning training data (Art. 10), documentation and record-keeping (Art. 11 and 12), transparency and human oversight (Art. 13 and 14), performance, robustness and cybersecurity (Art. 15), and to establish the comprehensive risk management system (Art. 9), any person responsible will need to have access to the developer’s and deployer’s data and expertise. This unveils a regulatory dilemma: focusing exclusively on developers entails potentially excessive and inefficient compliance obligations; focusing on deployers and users risks burdening those who cannot comply due to limited insight or resources. Third, and related to the first and second aspect, individual actors in the AI value chain may simply not have the all-encompassing knowledge and control that would be required if they were the sole addressees of regulatory duties [75]. This more abstract observation also shows that shared and overlapping responsibilities may be needed.
In our view, the only way forward is legally mandated collaboration between LGAIM providers, deployers and users with respect to the fulfillment of regulatory duties. More specifically, we suggest a combination of strategies known from pre-trial discovery, trade secrets law, and the GDPR. Under the current AI Act (Council general approach), such teamwork is encouraged in Art. 4b(5): providers “shall” cooperate with and provide necessary information to users. A key issue, also mentioned in the Article, is access to information potentially protected as trade secrets or intellectual property (IP) rights [13, 76]. To be workable, this obligation needs further concretization; the same holds true for more recent proposals by the EP in this direction [77]; Art. 10(6a) AI Act EP Version only explicitly addresses a situation where such cooperation does not take place.
The problem of balancing collaboration and disclosure with the protection of information is not limited to the AI Act. In our view, it has an internal and external dimension. Internally, i.e., in the relationship between the party requesting and the party granting access, access rights are often countered, by the granting party, by reference to supposedly unsurmountable trade secrets or IP rights [78-80]. The liability directives proposed by the EU Commission, for example, contain elaborate evidence disclosure rules pitting the compensation interests of injured persons against the secrecy interests of AI developers and deployers [24, 81, 82]. Extensive literature and practical experience concerning this problem exists in the realm of the US pretrial discovery system [83-87]. Under this mechanism, partially adopted by the proposed EU evidence disclosure rules [24], injured persons may seek access to documents and information held by the potential defendant before even launching litigation. This, in turn, may lead to non-meritorious access requests by competitors. Similarly, in the AI value chain, developers, deployers and users may indeed not only be business partners but also be (potential) competitors.
Hence, deployers’ and users’ access must be limited. Conversely, some flow of information must be rendered possible to operationalize compliance with high-risk obligations by deployers. To guard against abuse, we suggest a range of measures. It may be worthwhile to introduce provisions inspired by the US pretrial discovery system [80, 83, 88] and the proposed EU evidence disclosure mechanism (Art. 3(4) AI Liability Directive, protective order). Hence, courts should be empowered to issue protective orders, which endow nondisclosure agreements with further weight and subject them to potential administrative penalties. The order may also exempt certain trade secrets from disclosure or allow access only under certain conditions (see F.R.C.P. Rule 26(c)(1)(G)). Furthermore, the appointment of a special master may, ultimately, strike a balance between information access and the undue appropriation of competitive advantage (cf. F.R.C.P. Rule 53(a)) [88]. With these safeguards in place, LGAIM developers should be compelled, and not merely encouraged, to cooperate with deployers and users concerning AI Act compliance if they have authorized the deployment.
Concerning the external dimension, the question arises of who should be responsible for fulfilling pertinent duties and be ultimately liable, regarding administrative fines and civil damages, if high-risk rules are violated. Here, we may draw inspiration from Art. 26 GDPR (see also [12]): this mechanism could, mutatis mutandis, be transferred to the AI value chain. Collaboration should be documented in writing to facilitate ex post accountability. Disclosing the core parts of the document, sparing trade secrets, should help potential plaintiffs choosing the right party for following disclosure of evidence requests under the AI liability regime. Finally, joint and several liability ensures collaboration and serves the compensation interests of injured persons. Internally, parties held liable by injured persons can then turn around and seek reimbursement from others in the AI value chain. For example, if the developers essentially retain control via an API distribution model, the internal liability burden will often fall on them. Developers’ and deployers’ liability, however, must end where their influence over the deployed model ends. Beyond this point, only the users should be the subject of regulation and civil liability (and vice versa, for example in control-via-API cases): incentives for action only make sense where the person incentivized is actually in a position to act [89, 90]. In the GDPR setting, this was effectively decided by the CJEU in the Fashion ID case (CJEU, C-40/17, para. 85). The sole responsibility of the users for certain areas should then also be included in the disclosed agreement to inform potential plaintiffs and foreclose non-meritorious claims against the developer and deployer. Such a system, in our view, would strike an adequate balance of interests and power between LGAIM developers, deployers, users, and affected persons.
The EP version of the AI Act now rightly contains rules on the AI value chain [68]. However, these need to be rendered more specific, as laid out in the preceding sections, to function effectively. Ultimately, allocating responsibility and liability along the value chain is crucial if the AI Act seeks to maintain its spirit of a technology-specific instrument that does not, however, regulate models per se, but primarily models in concrete use cases.
4 NON-DISCRIMINATION LAW AND THE GDPR
Some rules will have to apply directly to LGAIMs and LGAIM developers, however (see Section 6). A clear candidate for such rules is non-discrimination law. Generally, it applies, in the US as well as the EU, in a technology-neutral way [91-95]. Importantly, however, it only covers certain enumerated areas of activity, such as employment, education, or publicly available offers of goods and services [91, 96]. This begs the question whether general-purpose systems may be affected by non-discrimination provisions even before they have been deployed in specific use cases. While a detailed discussion transcends the scope of this paper (see Technical Report), it seems convincing to consider adequate non-discrimination rules a crucial element of any future regulatory perimeter for LGAIMs (Section 6.3).
A third major challenge for any AI model is GDPR compliance. Its relevance for LGAIMs in particular was illustrated by the temporary limitation on the processing of Italian users’ data in April 2023. Overall, this measure by the Italian Data Protection Authority rightly points to the legitimate interests, and rights, of data subjects to be informed about how their personal data is used in training and fine-tuning generative AI models (for a more detailed discussion, see the technical report). It should be taken as a welcome wake-up call to the community of developers to share crucial information–on training, personal data, and pertinent risks–with the general public, instead of guarding secrets under the misnomer of OpenAI et al.
5 GENERATIVE MODEL CONTENT MODERATION: THE EUROPEAN DIGITAL SERVICES ACT
The fourth large regulatory frontier concerning LGAIMs is content moderation. Generative models, as virtually any novel technology, may be used for better (think: birthday cards) or worse purposes (think: shitstorm) [97]. The developers of ChatGPT, specifically, anticipated the potential for abuse and trained an internal AI moderator, with controversial help from Kenyan contractors [62], to detect and block harmful content [12]. AI research has made progress in this area recently [98-100]. OpenAI has released a content filtering mechanism which users may apply to analyze and flag potentially problematic content along several categories (violence; hate; sexual content etc.).6 Other large generative models have similar functionalities. However, actors intent on using ChatGPT, and other models, to generate fake or harmful content will find ways to prompt them to do just that. Prompt engineering is becoming a new art to elicit any content from LGAIMs [101] and fake news is harder to detect than hate speech, even though industry efforts are underway via increased model and source transparency [102]. As could be expected, DIY instructions for circumventing content filters are already populating YouTube and reddit,7 and researchers have already generated an entire hate-filled shitstorm, along with code for proliferation, using ChatGPT [8].
6 See https://platform.openai.com/docs/api-reference/moderations.
7 See, e.g., https://www.youtube.com/watch?v=qpKlnYLtPjc; https://www.reddit.com/r/OpenAI/comments/zjyrvw/a_tutorial_on_how_to_use_chatgpt_to_make_any/.
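For illustration, the content filter referenced in footnote 6 is exposed as a simple web endpoint. The following sketch is ours, based on OpenAI’s public API documentation at the time of writing; the endpoint, field names and category labels are assumptions that may change, and the snippet presupposes an API key in the OPENAI_API_KEY environment variable.

# Sketch: screening one piece of text via OpenAI's moderation endpoint (see footnote 6).
import os
import requests

def moderate(text: str) -> dict:
    response = requests.post(
        "https://api.openai.com/v1/moderations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={"input": text},
        timeout=30,
    )
    response.raise_for_status()
    result = response.json()["results"][0]
    # result["flagged"] is an overall verdict; result["categories"] breaks it down
    # into labels such as hate, violence, or sexual content.
    return {"flagged": result["flagged"], "categories": result["categories"]}

if __name__ == "__main__":
    verdict = moderate("example text to be screened before posting")
    print(verdict["flagged"])     # True or False
    print(verdict["categories"])  # e.g., {'hate': False, 'violence': False, ...}

Deployers and professional users could run such a check on generated output before publication; as the discussion above shows, however, determined actors can and do circumvent filters of this kind.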
To stem the tide of fake news and hateful content, the EU has recently enacted the DSA. However, LGAIMs were not in the focus of public attention at the time when the DSA was being drafted. Hence, the DSA was designed to mitigate illegal content on social networks, built by human actors or the occasional Twitter bot, not to counter LGAIMs. The problem lies not in the territorial applicability of the provision: the DSA, like the AI Act, covers services offered to users in the EU, irrespective of where the providers have their place of establishment (Art. 2(1), 3(d) and (e) DSA).
Yet, the DSA seems outdated at the moment of its enactment due to two crucial limitations in its scope of application. First, it covers only so-called intermediary services (Art. 2(1) and (2) DSA). Art. 3(g) DSA defines them as “mere conduit” (e.g., Internet access providers), “caching” or “hosting” services (e.g., social media platforms, see also Recital 28 DSA). Arguably, however, LGAIMs do not fall into any of these categories. Clearly, they are not comparable to access or caching service providers, which power Internet connections. Hosting services, in turn, are defined as providers storing information provided by, and at the request of, a user (Art. 3(g)(iii) DSA) [see also 103]. While users do request information from LGAIMs via prompts, they can hardly be said to provide this information. Rather, other than in traditional social media constellations, it is the LGAIM, not the user, who produces the text. To the contrary, CJEU jurisprudence shows that even platforms merely storing user-generated content may easily lose their status as hosting providers, and concomitant liability privileges under the DSA (and its predecessor in this respect, the E-Commerce Directive), if they “provide assistance” and thus leave their “neutral position”, which may even mean merely promoting user-generated content (CJEU, Case C-324/09, L’Oréal para 116). A fortiori, systems generating the content themselves cannot reasonably be qualified as hosting service providers. Hence, the DSA does not apply.
This does not imply that LGAIM content generation is not covered by content liability laws. Rather, its output may be covered by speech regulation, similar to comments made by human users online. However, this branch of the law is largely left to Member State tort law, with the exception of Art. 82 GDPR in the case of processing personal data of victims, which seems rather far-fetched in LGAIM constellations. Not only does such direct speech regulation vary considerably between Member States [104], it also often lacks precisely the instruments the DSA has introduced to facilitate the rapid yet procedurally adequate removal of harmful speech and fake news from the online world: notice and action mechanisms flanked by procedural safeguards; trusted flaggers; obligatory dispute resolution; and comprehensive compliance and risk management regimes for large platforms.
The risk of a regulatory loophole might be partially closed, one might object, by the applicability of the DSA to LGAIM-generated posts that human users, or bots, publish on social networks. Here, the DSA generally applies, as Twitter et al. qualify as hosting service providers. However, a second important gap looms: Recital 14 DSA specifies that the main part of the regulation does not cover “private messaging services.” While the notice and action mechanism applies to all hosting services, instruments like trusted flaggers, obligatory dispute resolution, and risk management systems are reserved for the narrower group of “online platforms” [105]. To qualify, these entities must disseminate information to the public (Art. 3(g)(iii), (k) DSA). According to Recital 14 DSA, closed groups on WhatsApp and Telegram, on which problematic content particularly proliferates, are explicitly excluded from the DSA’s online platform regulation (Art. 19 et seqq. DSA) as messages are not distributed to the general public. With the right lines of code, potentially supplied by an LGAIM as well [8], malicious actors posting content in such groups may therefore fully escape the ambit, and the enforcement tools, of the DSA.
Hence, the only action to which the full range of the DSA mechanisms continues to apply is the posting of LGAIM-generated content on traditional social networks. However, at this point in time, Pandora’s box has already been opened. Misinformation may also be spread effectively and widely via interpersonal communication. Even if the EU legislator has decided to exclude closed groups from the scope of the DSA [106], this balance needs to be reassessed in the context of readily available LGAIM output, which exacerbates risks. Even the most stringent application of DSA enforcement mechanisms, potentially coupled with GDPR provisions on erasure of data (Art. 17(2) and 19 GDPR), cannot undo the harm done, and often cannot prevent the forward replication of problematic content [107]. Overall, current EU law, despite the laudable efforts in the DSA to mitigate the proliferation of fake news and hate speech, fails to adequately address the dark side of LGAIMs.
6 POLICY PROPOSALS
The preceding discussion has shown that regulation of LGAIMs is necessary, but must be better tailored to the concrete risks they entail. Hence, we suggest a shift away from the wholesale AI Act regulation envisioned in the general approach of the Council of the EU toward specific regulatory duties and content moderation. Importantly, regulatory compliance must be feasible for LGAIM developers large and small to avoid a winner-takes-all scenario and further market concentration [67]. This is crucial not only for innovation and consumer welfare [29, 108, 109], but also for environmental sustainability. While the carbon footprint of IT and AI is significant and steadily rising [48-52], and training of LGAIMs is particularly resource intensive [110], large models may ultimately create fewer greenhouse gas emissions than their smaller brethren if they can be adapted to multiple uses.
Against this background, we envision three layers of obligations for LGAIMs: the first set of minimum standards for all LGAIMs; a second set of specific high-risk rules applying only to LGAIMs used in concrete high-risk use cases; and the third set of rules governing collaboration along the AI value chain (see Section 3.2.2) to enable effective compliance with the first two sets of rules.
Concerning minimum standards, first and foremost, the EU acquis applies to developers of LGAIMs as well, putting the GDPR and non-discrimination law (Section 4 and Technical Report), as well as product liability [24], center stage. In addition, transparency rules, now also proposed by the EP [65], must apply (see below, Section 6.1). Furthermore, specific risks of such outstanding relevance that they should be addressed at the upstream level, rather than delegated to deployers in specific use cases, must be allocated to developers as part of the minimum standards. This concerns, in our view, selected data governance duties (Art. 10 AI Act, see Section 4) and rules on the ever more important issue of cybersecurity (Art. 15 AI Act).
Finally, sustainability rules [24] as well as content moderation (see below, Section 6.4) should also form part of the minimum standards applicable to all LGAIMs.
In the following, we make four concrete, workable suggestions for LGAIM regulation on the first and second level: (i) transparency obligations (first and second level); (ii) mandatory yet limited risk management (second level); (iii) non-discrimination data audits; and (iv) expanded content moderation.
6.1 Transparency
The AI Act contains a wide range of disclosure obligations (Art. 11, Annex IV AI Act) that apply, however, only to high-risk systems. In our view, given the vast potential and growing relevance of LGAIMs for many sectors of society, LGAIMs should — irrespective of their categorization as high-risk or non-high-risk — be subject to two distinct transparency duties.
6.1.1 Transparency requirements for developers and deployers. First, LGAIM developers and deployers should be required to report on the provenance and curation of the training data, the model’s performance metrics, and any incidents and mitigation strategies concerning harmful content. Ideally, to the extent technically feasible [48, p. 28, Annex A], they should also disclose the model’s greenhouse gas (GHG) emissions, to allow for comparison and analysis by regulatory agencies, watchdog organizations, and other interested parties. This information could also serve as the basis for an AI Sustainability Impact Assessment [24, p. 65 f., see also 111].
6.1.2 Transparency requirements for users. Second, professional users should be obligated to disclose which parts of their publicly available content were generated by LGAIMs, or adapted based on their output. Specifically, this entails that in the adidas example, adidas needs to adequately inform users that the design was generated using, e.g., Stable Diffusion. While the added value of such information may be limited in sales cases, such information is arguably crucial in any cases involving content in the realm of journalism, academic research, or education. Here, the recipients will benefit from insight into the generation pipeline. They may use such a disclosure as a warning signal and engage in additional fact checking or at least take the content cum grano salis. Eventually, we imagine differentiating between specific use cases in which AI output transparency vis-à-vis recipients is warranted (e.g., journalism, academic research or education) and others where, based on further analysis and market scrutiny, such disclosures may not be warranted (certain sales, production and B2B scenarios, for example). For the time being, however, we would advocate a general disclosure obligation for professional users to generate further information and insight into the reception of such disclosures by other market participants or recipients.
Conversely, we submit that non-professional users should not be required to inform about the use of AI. In the birthday example, hence, a parent would not need to inform the other parents that the invitation or the entire design of the birthday party was rendered possible by, e.g., Aleph Alpha’s Luminous or ChatGPT. One might push back against this in cases involving the private use of social media, particularly harmful content generated with the help of LGAIMs. However, any rule to disclose AI-generated content would likely be disregarded by malicious actors seeking to post harmful content. Eventually, however, one might consider including social media scenarios into the domain of application of the transparency rule if AI detection tools are sufficiently reliable. In these cases, malicious posts could be uncovered, and actors would face not only the traditional civil and criminal charges, but additionally AI Act enforcement, which could be financially significant (administrative fines) and hence create even greater incentives to comply with the transparency rule, or refrain from harmful content propagation.
The enforcement of any user-focused transparency rule being arduous, it must be supported by technical measures such as digital rights management and watermarks imprinted by the model [112]. The EP is currently pondering a watermark obligation for generative AI [111]. Importantly, more interdisciplinary research is necessary to develop markings that are easy to use and recognize, but hard to remove by average users [113]. This should be coupled with research on AI-content detection to highlight such output where watermarks fail [99, 114].8
8 See also https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/.
6.2 Risk Management and Staged Release
As mentioned, one major obstacle to the effective application of the AI Act to LGAIMs proper is comprehensive risk management. Here, novel approaches are needed. Scholars have rightly suggested that powerful models should be released consciously, trading off the added benefit of public scrutiny with the added risk of misuse in the case of full public releases [69, 115]. Additional factors, such as the balance of power among developers, must also be considered [115]. In our view, a limited, staged release, coupled with access only for security researchers and selected stakeholders, may often be preferable [see also 9, 69, 116, 117]. This adds a nuanced, community-based risk management strategy by way of codes of conduct to the regulatory mix [cf. also 117]. Regulatory oversight could be added by way of “regulated self-regulation;” an approach with potentially binding effect of the code of conduct, à la Art. 40 GDPR, seems preferable to the purely voluntary strategy envisioned in Art. 69 AI Act.
Importantly, the full extent of the high-risk section of the AI Act, including formal risk management, should only apply if and when a particular LGAIM (or GPAIS) is indeed used for high-risk purposes (see Part 3.2). This strategy aligns with a general principle of product safety law [13]: not every screw and bolt must be manufactured to the highest standards. For example, stringent product safety regulations for producing aeronautics material apply only if screws are used for spaceships9–but not if they are sold in the local DIY store for generic use. The same principle should be applied to LGAIMs.
9 See, e.g., product standards, aerospace series, DIN EN 4845–4851 (December 2022) on screws.
6.3 Non-Discrimination and Training Data
Furthermore, we suggest that, as an exception to the focus on LGAIM deployers, certain data curation duties, for example representativeness and approximate balance between protected groups (cf. Art. 10 AI Act), should apply to LGAIM developers. Discrimination, arguably, is too important a risk to be delegated to the user stage and must be tackled during development and deployment. Wherever possible, discrimination by AI systems should be addressed at its roots (often the training data) and not propagated down the ML pipeline or AI value chain. After all, discriminatory output should, in our view, be avoided in all use cases, even on birthday cards. The regulatory burden, however, must be adapted to the abstract risk level and the compliance capacities (i.e., typically the size) of the company. For example, LGAIM developers should have to pro-actively audit the training data set for misrepresentations of protected groups, in ways proportionate to their size and the type of training material (curated data vs. Twitter feeds scraped from the Internet), and implement feasible mitigation measures. At the very least, real-world training data ought to be complemented with synthetic data to balance historical and societal biases contained in online sources. For example, content concerning professions historically reserved for one gender (nurse; doctor) could be automatically copied and any female first names or images exchanged by male ones, and vice versa, creating a training corpus with more gender-neutral professions for text and image generation.
stage and must be tackled during development and deployment. Wherever possible, discrimination by AI systems should be addressed at its roots (often the training data) and not propagated down the ML pipeline or AI value chain. After all, discriminatory output should, in our view, be avoided in all use cases, even on birthday cards. The regulatory burden, however, must be adapted to the abstract risk level and the compliance capacities (i.e., typically the size) of the company. For example, LGAIM developers should have to pro-actively audit the training data set for misrepresentations of protected groups, in ways proportionate to their size and the type of training material (curated data vs. Twitter feeds scraped from the Internet), and implement feasible mitigation measures. At the very least, real-world training data ought to be complemented with synthetic data to balance historical and societal biases contained in online sources. For example, content concerning professions historically reserved for one gender (nurse; doctor) could be automatically copied and any female first names or images exchanged for male ones, and vice versa, creating a training corpus with more gender-neutral professions for text and image generation.
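To illustrate what such an audit-and-augment duty could look like in practice, the following sketch is purely illustrative and not part of the legal argument: the name lists, the keyword matching, and all function names are our assumptions, not a prescribed compliance tool. It counts female- vs. male-coded first names in training sentences mentioning a given profession and produces counterfactual copies with the names swapped.

# Minimal sketch of the audit-and-augment idea described above. Name lists,
# keyword matching, and function names are illustrative assumptions only.
import re
from collections import Counter

FEMALE_NAMES = {"anna", "maria", "sophie"}   # assumed curated name lists
MALE_NAMES = {"peter", "james", "lukas"}
SWAP = {"anna": "peter", "peter": "anna",
        "maria": "james", "james": "maria",
        "sophie": "lukas", "lukas": "sophie"}

def audit_profession_balance(corpus, profession):
    """Count female- vs. male-coded first names in sentences mentioning a profession."""
    counts = Counter({"female": 0, "male": 0})
    for sentence in corpus:
        tokens = [t.lower() for t in re.findall(r"\w+", sentence)]
        if profession in tokens:
            counts["female"] += sum(t in FEMALE_NAMES for t in tokens)
            counts["male"] += sum(t in MALE_NAMES for t in tokens)
    return counts

def _swap_token(match):
    word = match.group(0)
    repl = SWAP.get(word.lower())
    if repl is None:
        return word
    return repl.capitalize() if word[0].isupper() else repl

def counterfactual_copies(corpus, profession):
    """Create copies of profession-related sentences with gendered first names swapped."""
    copies = []
    for sentence in corpus:
        if profession in sentence.lower():
            swapped = re.sub(r"\w+", _swap_token, sentence)
            if swapped != sentence:
                copies.append(swapped)
    return copies

corpus = ["Anna works as a nurse in Hamburg.", "Peter is a doctor."]
print(audit_profession_balance(corpus, "nurse"))   # Counter({'female': 1, 'male': 0})
print(counterfactual_copies(corpus, "nurse"))      # ['Peter works as a nurse in Hamburg.']

A real audit would of course rely on far richer name and image resources and on expert review; the point is merely that both the audit and the synthetic counterbalancing described above can be operationalized with modest tooling.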
6.4 Content Moderation

One of the biggest challenges for LGAIMs is, arguably, their potential misuse for disinformation, manipulation, and harmful speech. In our view, the DSA rules conceived for traditional social networks must be expanded and adapted accordingly.

6.4.1 Selective expansion of the DSA to LGAIMs. The EP has partially addressed this challenge by stipulating that foundation models must not violate EU law [76]. In our view, however, regulation should go one step further by selectively expanding DSA rules to LGAIM developers and deployers. LGAIMs, and society, would benefit from mandatory notice and action mechanisms, trusted flaggers, and comprehensive audits for models with particularly many users. The regulatory loophole is particularly virulent for LGAIMs offered as standalone software, as is currently the case. In the future, one may expect an increasing integration into platforms of various kinds, such as search engines or social networks, as evidenced by LGAIM development or acquisition by Microsoft, Meta, or Google. While the DSA would then technically apply, it would still have to be updated to ensure that LGAIM-generated content is covered just like user-generated content. In particular, as LGAIM output is currently particularly susceptible to being used for the spread of misinformation, it seems advisable to require LGAIM-generated content to be flagged as such – if technically feasible. Doctrinally, this could be achieved via an amendment of the DSA or of Art. 29 AI Act, which already contains notification duties in its para. 4 (see Part 4). Given the current political process in the EU, the latter option seems more realistic.
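As a purely technical illustration of such flagging (our sketch, not a requirement of the DSA or the AI Act; the generate() stub and all field names are assumed for exposition), a deployer could attach machine-readable provenance metadata to every generated output, which downstream platforms could then surface to recipients or combine with watermarking and detection tools [cf. 112-116].

# Minimal sketch of machine-readable provenance labelling at the point of generation.
# The generate() stub, field names, and sidecar format are illustrative assumptions.
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ProvenanceLabel:
    model_id: str          # deployed LGAIM version (hypothetical identifier)
    deployer: str          # entity operating the system
    generated_at: str      # ISO 8601 timestamp
    content_sha256: str    # hash binding the label to the exact output

def generate(prompt: str) -> str:
    """Stand-in for an LGAIM call; returns placeholder text only."""
    return f"[model output for: {prompt}]"

def generate_with_label(prompt: str, model_id: str, deployer: str) -> dict:
    text = generate(prompt)
    label = ProvenanceLabel(
        model_id=model_id,
        deployer=deployer,
        generated_at=datetime.now(timezone.utc).isoformat(),
        content_sha256=hashlib.sha256(text.encode("utf-8")).hexdigest(),
    )
    # The label travels with the content, e.g. as a sidecar JSON object or an HTTP header.
    return {"content": text, "provenance": asdict(label)}

print(json.dumps(generate_with_label("party invitation", "example-lgaim-1", "ExampleCorp"), indent=2))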
6.4.2 Implementation in practice. What could DSA-style content moderation, applied to ChatGPT et al., look like in practice? We envision it having two components, which would combine centralized and decentralized monitoring within a notice-and-action mechanism (cf. Art. 16 DSA).

The first component harnesses the wisdom of the crowd, as it were, to correct LGAIM output. Users should be enabled to flag problematic content and give notice. A special status should be given to a specific group of users, trusted flaggers (cf. Art. 22 DSA), who could be private individuals, technology-savvy NGOs, or volunteer coders. After registering with the competent authority, they would essentially function as a decentralized content monitoring team. They could experiment with different prompts and see if they manage to generate harmful or otherwise problematic content. They could also scan the internet for tools to circumvent content moderation policies and instruments at LGAIMs.10 If they find something, trusted flaggers would send a notice containing the prompt and the output to a content moderation check-in point of the respective LGAIM system, which would forward the notice to developers and/or deployers.

10 See Fn. 7.

Here, the second component enters the scene, geared toward tech engineers working with developers or deployers. They would have to respond to notices; those submitted by trusted flaggers would have to be prioritized by the content moderation team. Their job, essentially, is to modify the AI system, or to block its output, so that the flagged prompt does not generate problematic output anymore, and to generally search for ways to block easy workarounds likely tried by malicious actors. Furthermore, if the LGAIM system is large enough, they would be tasked with establishing a more comprehensive compliance system (cf. Art. 34-35 DSA). Overall, such a combination of centralized and decentralized monitoring could prove more effective and efficient than current systems relying essentially on goodwill to handle the expected flood of hate speech, fake news and other problematic content generated by LGAIMs.
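The division of labor between the two components can be pictured as a simple queued workflow. The sketch below is our illustration only: class names, fields, and the priority rule are assumptions rather than DSA requirements; it merely shows how notices from registered trusted flaggers could be prioritized at the check-in point before reaching the engineering team.

# Illustrative sketch of the two-component notice-and-action workflow described above.
# Class names, fields, and the priority rule are assumptions for exposition only.
import heapq
import itertools
from dataclasses import dataclass

@dataclass
class Notice:
    prompt: str            # prompt that produced the problematic output
    output: str            # the flagged LGAIM output
    sender: str
    trusted_flagger: bool  # registered with the competent authority (cf. Art. 22 DSA)

class ModerationQueue:
    """Central check-in point; notices from trusted flaggers are handled first."""
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # stable FIFO order within each priority class

    def submit(self, notice: Notice) -> None:
        priority = 0 if notice.trusted_flagger else 1
        heapq.heappush(self._heap, (priority, next(self._counter), notice))

    def next_notice(self) -> Notice:
        # The engineering team pulls the highest-priority notice, then fixes or blocks
        # the offending generation path and reports back (cf. Art. 16 DSA).
        return heapq.heappop(self._heap)[2]

queue = ModerationQueue()
queue.submit(Notice("prompt A", "harmful text", "user123", trusted_flagger=False))
queue.submit(Notice("prompt B", "disinformation", "ngo-flagger", trusted_flagger=True))
print(queue.next_notice().sender)  # 'ngo-flagger' is processed first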
6.5 Outlook: Technology-specific vs. technology-neutral regulation

Overall, we have added several policy proposals. As a matter of regulatory technique, the legislator should, in our view, strive to shift its strategy from technology-specific regulation–which will often be outdated before it is eventually enacted–toward more technology-neutral regulation wherever possible. Due to space constraints, we cannot elaborate on this point. However, future analysis may show that non-discrimination law, formulated in a technology-neutral way, continues to grapple with various challenges, but arguably does a better job of capturing the dynamics of LGAIM development than the AI Act or the DSA, at least in the way they are currently enacted and proposed (see also the Technical Report). While technology-neutral regulation must be tailored, via agency decisions, regulatory guidelines, and court judgments, to specific technologies, such "small-scale" adaptations are, arguably, often faster to produce than changes to a formal, technology-specific legislative act. For example, to extend the DSA to LGAIMs in specific ways, one would have to update the DSA or include a reference in the AI Act. Both modifications require concurring decisions by the EP and the Council (Art. 289 TFEU). In non-discrimination law, by contrast, all that is needed, in principle, is an adequate interpretation of existing law by agencies and courts. Their decisions, at least in lower courts, can potentially be rendered faster and be used more flexibly to carve out (preliminary11) safe harbors for developers,
deployers, and users, and to establish red lines to protect affected persons.

11 Ultimately, we agree that it may take a substantial amount of time for final decisions to emerge from the court system. Only these can deliver a higher degree of legal certainty. However, even lower-court judgments or agency decisions may, arguably, indicate useful directions and, at least, be used to model compliance tools accordingly, even if policies may have to be revised if decisions are reversed in higher instances.

7 CONCLUSION

Scholars and regulators have long suggested that technology-neutral laws may be better prepared to tackle emerging risks, given the rapid pace of innovation in machine learning [118-120]. While this claim, arguably, cannot be generally affirmed or refuted, LGAIMs offer a cautionary example for regulation focused specifically on certain technologies. As our study shows, technology-neutral laws sometimes fare better because technology-specific regulation (on platforms; AI systems) may be outdated before (AI Act, AI liability regime) or at the moment of its enactment (DSA). Overall, we add several policy proposals to the emerging regulatory landscape surrounding LGAIMs.

To start with, we argue for a new, differentiated terminology to capture the relevant actors in the AI value chain, in LGAIM settings and beyond. These include: LGAIM developers, deployers, professional and non-professional users, as well as recipients of LGAIM output. Such a nuanced understanding is necessary to allocate regulatory duties to specific actors and activities in the AI value chain. The general approach adopted by the Council of the EU failed to address the specificities of the LGAIM value chain. Rules in the AI Act and other direct regulation must match the specificities of pre-trained models.

More concretely, we propose three layers of rules applicable to LGAIMs. The first layer applies directly to all LGAIMs. It comprises existing, technology-neutral regulation such as the GDPR or non-discrimination provisions. Arguably, a version of Art. 10 AI Act and of the cybersecurity rules in Art. 15 AI Act should also apply to LGAIM developers. Furthermore, sustainability and content moderation instruments also form part of this first layer. Art. 28b AI Act EP Version represents an imperfect step in this direction.

On the second layer, we suggest generally singling out concrete high-risk applications, and not the pre-trained model itself, as the object of high-risk obligations. For example, it seems inefficient and practically infeasible to compel the developers of ChatGPT et al. to draw up a comprehensive risk management system covering, and mitigating, all the hypothetical risks to health, safety and fundamental rights such LGAIMs may pose – as the AI Act EP Version still does (Art. 28b(1)(a) and (f)). Rather, if used for a concrete high-risk purpose (e.g., summarizing or grading résumés in employment decisions), the specific deployer and user should have to comply with the AI Act's high-risk obligations, including the risk management system.

The devil, however, is in the detail: providers need to cooperate with deployers to comply with even such narrower regulatory requirements. Hence, a third layer mandating collaboration between actors in the AI value chain for compliance purposes is necessary. Here, we suggest drawing on experience from the US pretrial discovery system and Art. 26 GDPR to balance interests in the access to information with trade secret protection. Art. 28(2a) AI Act EP Version has partly taken up this proposal.

The last section makes concrete policy proposals. For example, detailed transparency obligations are warranted. This concerns both LGAIM developers and deployers (performance metrics; harmful speech issues arising during pre-training) as well as users (disclosure of the use of LGAIM-generated content).

Finally, the core of the DSA's content moderation rules should be expanded to cover LGAIMs. Art. 28b(4)(b) on generative AI (Art. 28b(4) AI Act EP Version) moves in this direction. More specifically, however, rules must also include notice and action mechanisms, trusted flaggers, and, for very large LGAIM developers, comprehensive risk management systems and audits concerning content regulation. Arguably, it is insufficient to tackle AI-generated hate speech and fake news ex post, once they are posted to social media. At this point, their effect will be difficult to stop. Rather, AI generation itself must be moderated by an adequate combination of AI tools, developer and user interventions, and law.

In all areas, regulators and lawmakers need to act fast to keep pace with the unchained dynamics of GPT-4 et al. Updating regulation is necessary both to maintain the civility of online discourse and to create a level playing field for developing and deploying the next generation of AI models, in the EU and beyond.

ACKNOWLEDGMENTS

Passages taken over from ChatGPT are found in the section on technical foundations, all italicized and marked with quotation marks, and referenced by the prompt used. They were all collected on January 17, 2023. We deem them factually correct unless otherwise noted. This paper benefitted from comments by Johannes Otterbach, Sandra Wachter, and audiences at AI Campus Berlin, Bucerius Law School (Hamburg), Magdalen College (Oxford), University of Hamburg, and Weizenbaum Institute of the Connected Society. All errors remain entirely our own.

REFERENCES
[1] Glaese, A., McAleese, N., Trębacz, M., Aslanides, J., Firoiu, V., Ewalds, T., Rauh, M., Weidinger, L., Chadwick, M. and Thacker, P. Improving alignment of dialogue agents via targeted human judgements. arXiv preprint arXiv:2209.14375 (2022).
[2] Shuster, K., Xu, J., Komeili, M., Ju, D., Smith, E. M., Roller, S., Ung, M., Chen, M., Arora, K. and Lane, J. Blenderbot 3: a deployed conversational agent that continually learns to responsibly engage. arXiv preprint arXiv:2208.03188 (2022).
[3] Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Castagné, R., Luccioni, A. S., Yvon, F. and Gallé, M. Bloom: A 176b-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100 (2022).
[4] Zuiderveen Borgesius, F. J. Strengthening legal protection against discrimination by algorithms and artificial intelligence. The International Journal of Human Rights, 24, 10 (2020), 1572-1593.
[5] Lee, D. and Yoon, S. N. Application of artificial intelligence-based technologies in the healthcare industry: Opportunities and challenges. International Journal of Environmental Research and Public Health, 18, 1 (2021), 271.
[6] Aung, Y. Y., Wong, D. C. and Ting, D. S. The promise of artificial intelligence: a review of the opportunities and challenges of artificial intelligence in healthcare. British Medical Bulletin, 139, 1 (2021), 4-15.
[7] Marcus, G. A Skeptical Take on the A.I. Revolution. The Ezra Klein Show, The New York Times, 2023.
[8] Beuth, P. Wie sich ChatGPT mit Worten hacken lässt. Der Spiegel, 2023.
[9] Bergman, A. S., Abercrombie, G., Spruit, S., Hovy, D., Dinan, E., Boureau, Y.-L. and Rieser, V. Guiding the release of safer E2E conversational AI through value sensitive design. Association for Computational Linguistics, 2022.
[10] Mirsky, Y., Demontis, A., Kotak, J., Shankar, R., Gelei, D., Yang, L., Zhang, X., Pintor, M., Lee, W. and Elovici, Y. The threat of offensive AI to organizations. Computers & Security (2022), 103006.
[11] Satariano, A. and Mozur, P. The People Onscreen Are Fake. The Disinformation Is Real. 2023.
[12] Edwards, L. Regulating AI in Europe: four problems and four solutions (2022).
[13] Hacker, P., Engel, A. and List, T. Understanding and regulating ChatGPT, and other large generative AI models. 2023.
[14] Gutierrez, C. I., Aguirre, A., Uuk, R., Boine, C. C. and Franklin, M. A Proposal for a Definition of General Purpose Artificial Intelligence Systems. Working Paper, https://ssrn.com/abstract=4238951 (2022).
[15] Heikkilä, M. The EU wants to regulate your favorite AI tools. 2023.
[16] KI-Bundesverband. Large European AI Models (LEAM) as Leuchtturmprojekt für Europa. 2023.
[17] Goldstein, J. A., Sastry, G., Musser, M., DiResta, R., Gentzel, M. and Sedova, K. Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. arXiv preprint arXiv:2301.04246 (2023).
[18] Chee, F. Y. and Mukherjee, S. Exclusive: ChatGPT in spotlight as EU's Breton bats for tougher AI rules. Reuters, 2023.
[19] Smith, B. Meeting the AI moment: advancing the future through responsible AI. 2023.
[20] Lieu, T. I'm a Congressman Who Codes. A.I. Freaks Me Out. 2023.
[21] An Act drafted with the help of ChatGPT to regulate generative artificial intelligence models like ChatGPT. 2023.
[22] Helberger, N. and Diakopoulos, N. ChatGPT and the AI Act. Internet Policy Review, 12, 1 (2023).
[23] Veale, M. and Borgesius, F. Z. Demystifying the Draft EU Artificial Intelligence Act—Analysing the good, the bad, and the unclear elements of the proposed approach. Computer Law Review International, 22, 4 (2021), 97-112.
[24] Hacker, P. The European AI Liability Directives - Critique of a Half-Hearted Approach and Lessons for the Future. Working Paper, https://arxiv.org/abs/2211.13960 (2022).
[25] Douek, E. Content Moderation as Systems Thinking. Harv. L. Rev., 136 (2022), 526.
[26] De Gregorio, G. Democratising online content moderation: A constitutional framework. Computer Law & Security Review, 36 (2020), 105374.
[27] Heldt, A. P. EU Digital Services Act: The white hope of intermediary regulation. Palgrave, 2022.
[28] Meyer, P. ChatGPT: How Does It Work Internally? 2022.
[29] Eifert, M., Metzger, A., Schweitzer, H. and Wagner, G. Taming the giants: The DMA/DSA package. Common Market Law Review, 58, 4 (2021), 987-1028.
[30] Laux, J., Wachter, S. and Mittelstadt, B. Taming the few: Platform regulation, independent audits, and the risks of capture created by the DMA and DSA. Computer Law & Security Review, 43 (2021), 105613.
[31] Kasy, M. and Abebe, R. Fairness, equality, and power in algorithmic decision-making. 2021.
[32] Barabas, C., Doyle, C., Rubinovitz, J. and Dinakar, K. Studying up: reorienting the study of algorithmic fairness around issues of power. 2020.
[33] Koops, E. Should ICT Regulation Be Technology-Neutral? TMC Asser Press, 2006.
[34] Bhuta, N., Beck, S. and Geiß, R. Autonomous weapons systems: law, ethics, policy. Cambridge University Press, 2016.
[35] Sassoli, M. Autonomous weapons and international humanitarian law: Advantages, open technical questions and legal issues to be clarified. International Law Studies, 90, 1 (2014), 1.
[36] Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A. and Brunskill, E. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021).
[37] Ganguli, D., Hernandez, D., Lovitt, L., Askell, A., Bai, Y., Chen, A., Conerly, T., Dassarma, N., Drain, D. and Elhage, N. Predictability and surprise in large generative models. ACM Conference on Fairness, Accountability, and Transparency (2022), 1747-1764.
[38] Hoffmann, J., Borgeaud, S., Mensch, A., Buchatskaya, E., Cai, T., Rutherford, E., Casas, D. d. L., Hendricks, L. A., Welbl, J. and Clark, A. Training compute-optimal large language models. arXiv preprint arXiv:2203.15556 (2022).
[39] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł. and Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems, 30 (2017).
[40] Devlin, J., Chang, M.-W., Lee, K. and Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
[41] Radford, A., Narasimhan, K., Salimans, T. and Sutskever, I. Improving language understanding by generative pre-training (2018).
[42] Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V. and Zettlemoyer, L. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019).
[43] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G. and Askell, A. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33 (2020), 1877-1901.
[44] Kim, B., Kim, H., Lee, S.-W., Lee, G., Kwak, D., Jeon, D. H., Park, S., Kim, S., Kim, S. and Seo, D. What changes can large-scale language models bring? Intensive study on HyperCLOVA: Billions-scale Korean generative pretrained transformers. arXiv preprint arXiv:2109.04650 (2021).
[45] Bienert, J. and Klös, H.-P. Große KI-Modelle als Basis für Forschung und wirtschaftliche Entwicklung. IW-Kurzbericht, 2022.
[46] Dao, T., Fu, D. Y., Ermon, S., Rudra, A. and Ré, C. Flashattention: Fast and memory-efficient exact attention with IO-awareness. arXiv preprint arXiv:2205.14135 (2022).
[47] Geiping, J. and Goldstein, T. Cramming: Training a Language Model on a Single GPU in One Day. arXiv preprint arXiv:2212.14034 (2022).
[48] OECD. Measuring the Environmental Impacts of AI Compute and Applications: The AI Footprint. 2022.
[49] Freitag, C., Berners-Lee, M., Widdicks, K., Knowles, B., Blair, G. S. and Friday, A. The real climate and transformative impact of ICT: A critique of estimates, trends, and regulations. Patterns, 2, 9 (2021), 100340.
[50] ACM, T. P. C. ACM TechBrief: Computing and Climate Change. 2021.
[51] Cowls, J., Tsamados, A., Taddeo, M. and Floridi, L. The AI gambit: leveraging artificial intelligence to combat climate change—opportunities, challenges, and recommendations. AI & Society (2021), 1-25.
[52] Taddeo, M., Tsamados, A., Cowls, J. and Floridi, L. Artificial intelligence and the climate emergency: Opportunities, challenges, and recommendations. One Earth, 4, 6 (2021), 776-779.
[53] Balestriero, R., Ibrahim, M., Sobal, V., Morcos, A., Shekhar, S., Goldstein, T., Bordes, F., Bardes, A., Mialon, G. and Tian, Y. A Cookbook of Self-Supervised Learning. arXiv preprint arXiv:2304.12210 (2023).
[54] Ananthaswamy, A. The Physics Principle That Inspired Modern AI Art. 2023.
[55] Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. and Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. PMLR, 2015.
[56] Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H. and Neubig, G. Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys, 55 (2021), 1-35.
[57] Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K. and Ray, A. Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155 (2022).
[58] Luccioni, A. S. and Viviano, J. D. What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus. arXiv preprint arXiv:2105.02732 (2021).
[59] Nadeem, M., Bethke, A. and Reddy, S. StereoSet: Measuring stereotypical bias in pretrained language models. arXiv preprint arXiv:2004.09456 (2020).
[60] Zhao, Z., Wallace, E., Feng, S., Klein, D. and Singh, S. Calibrate before use: Improving few-shot performance of language models. PMLR, 2021.
[61] Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A. and McKinnon, C. Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073 (2022).
[62] Perrigo, B. OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. 2023.
[63] Bertuzzi, L. Leading MEPs exclude general-purpose AI from high-risk categories – for now. 2022.
[64] Bertuzzi, L. AI Act: EU Parliament's crunch time on high-risk categorisation, prohibited practices. 2023.
[65] Bertuzzi, L. AI Act: MEPs close in on rules for general purpose AI, foundation models. 2023.
[66] Bennett, C. C. and Hauser, K. Artificial intelligence framework for simulating clinical decision-making: A Markov decision process approach. Artificial Intelligence in Medicine, 57, 1 (2013), 9-19.
[67] Geradin, D., Karanikioti, T. and Katsifis, D. GDPR Myopia: how a well-intended regulation ended up favouring large online platforms. European Competition Journal, 17, 1 (2021), 47-92.
[68] Bertuzzi, L. MEPs seal the deal on Artificial Intelligence Act. 2023.
[69] Liang, P., Bommasani, R., Creel, K. and Reich, R. The time is now to develop community norms for the release of foundation models. 2022.
[70] Bornstein, M., Appenzeller, G. and Casado, M. Who Owns the Generative AI Platform? 2023.
[71] Stuyck, J. Consumer Concepts in EU Secondary Law. De Gruyter, 2015.
[72] Micklitz, H.-W., Stuyck, J., Terryn, E. and School, I. C. Consumer law. Hart, London, 2010.
[73] Blaschke, T. and Bajorath, J. Fine-tuning of a generative neural network for designing multi-target compounds. Journal of Computer-Aided Molecular Design, 36, 5 (2022), 363-371.
[74] Ziegler, D. M., Stiennon, N., Wu, J., Brown, T. B., Radford, A., Amodei, D., Christiano, P. and Irving, G. Fine-tuning language models from human preferences. arXiv preprint arXiv:1909.08593 (2019).
[75] Widder, D. G. and Nafus, D. Dislocated Accountabilities in the AI Supply Chain: Modularity and Developers' Notions of Responsibility. arXiv preprint arXiv:2209.09780 (2022).
[76] Meyers, J. M. Artificial intelligence and trade secrets. Landslide, 11 (2018), 17.
[77] Bertuzzi, L. Leading EU lawmakers propose obligations for General Purpose AI. 2023.
[78] Drexl, J., Hilty, R., Desaunettes-Barbero, L., Globocnik, J., Gonzalez Otero, B., Hoffmann, J., Kim, D., Kulhari, S., Richter, H. and Scheuerer, S. Artificial Intelligence and Intellectual Property Law - Position Statement of the Max Planck Institute for Innovation and Competition of 9 April 2021 on the Current Debate. Max Planck Institute for Innovation & Competition Research Paper, 21-10 (2021).
[79] Calvin, N. and Leung, J. Who owns artificial intelligence? A preliminary analysis of corporate intellectual property strategies and why they matter. Future of Humanity Institute, February (2020).
[80] Deeks, A. The judicial demand for explainable artificial intelligence. Columbia Law Review, 119, 7 (2019), 1829-1850.
[81] Spindler, G. Die Vorschläge der EU-Kommission zu einer neuen Produkthaftung und zur Haftung von Herstellern und Betreibern Künstlicher Intelligenz. Computer und Recht (2022), 689-704.
[82] Wagner, G. Liability Rules for the Digital Age - Aiming for the Brussels Effect. European Journal of Tort Law (forthcoming) (2023), https://ssrn.com/abstract=4320285.
[83] McKown, J. R. Discovery of Trade Secrets. Santa Clara Computer & High Tech. LJ, 10 (1994), 35.
[84] Roberts, J. Too little, too late: Ineffective assistance of counsel, the duty to investigate, and pretrial discovery in criminal cases. Fordham Urb. LJ, 31 (2003), 1097.
[85] Shepherd, G. B. An empirical study of the economics of pretrial discovery. International Review of Law and Economics, 19, 2 (1999), 245-263.
[86] Subrin, S. N. Discovery in Global Perspective: Are We Nuts. DePaul L. Rev., 52 (2002), 299.
[87] Kötz, H. Civil justice systems in Europe and the United States. Duke J. Comp. & Int'l L., 13 (2003), 61.
[88] Daniel, P. F. Protecting Trade Secrets from Discovery. Tort & Ins. LJ, 30 (1994), 1033.
[89] Shavell, S. Foundations of Economic Analysis of Law. Harvard U Press, 2004.
[90] Shavell, S. On liability and insurance. Bell Journal of Economics, 13 (1982), 120-132.
[91] Hacker, P. Teaching fairness to artificial intelligence: existing and novel strategies against algorithmic discrimination under EU law. Common Market Law Review, 55, 4 (2018), 1143-1186.
[92] Adams-Prassl, J., Binns, R. and Kelly-Lyth, A. Directly Discriminatory Algorithms. The Modern Law Review (2022).
[93] Wachter, S. The Theory of Artificial Immutability: Protecting Algorithmic Groups Under Anti-Discrimination Law. arXiv preprint arXiv:2205.01166 (2022).
[94] Wachter, S., Mittelstadt, B. and Russell, C. Why fairness cannot be automated: Bridging the gap between EU non-discrimination law and AI. Computer Law & Security Review, 41 (2021), 105567.
[95] Barocas, S. and Selbst, A. D. Big data's disparate impact. California Law Review (2016), 671-732.
[96] Wachter, S. Affinity profiling and discrimination by association in online behavioral advertising. Berkeley Tech. LJ, 35 (2020), 367.
[97] Brundage, M., Avin, S., Clark, J., Toner, H., Eckersley, P., Garfinkel, B., Dafoe, A., Scharre, P., Zeitzoff, T. and Filar, B. The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228 (2018).
[98] Kiela, D., Firooz, H., Mohan, A., Goswami, V., Singh, A., Fitzpatrick, C. A., Bull, P., Lipstein, G., Nelli, T. and Zhu, R. The hateful memes challenge: Competition report. PMLR, 2021.
[99] Kiela, D., Firooz, H., Mohan, A., Goswami, V., Singh, A., Ringshia, P. and Testuggine, D. The hateful memes challenge: Detecting hate speech in multimodal memes. Advances in Neural Information Processing Systems, 33 (2020), 2611-2624.
[100] Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F. and Choi, Y. Defending against neural fake news. Advances in Neural Information Processing Systems, 32 (2019).
[101] Seeha, S. Prompt Engineering and Zero-Shot/Few-Shot Learning [Guide]. 2022.
[102] Deb, M., Deiseroth, B., Weinbach, S., Schramowski, P. and Kersting, K. AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation. arXiv preprint arXiv:2301.08110 (2023).
[103] European Commission, Directorate-General for Communications Networks, Content and Technology, Hoboken, J., Quintais, J., Poort, J. and Eijk, N. Hosting intermediary services and illegal content online: an analysis of the scope of article 14 ECD in light of developments in the online service landscape: final report. Publications Office, 2019.
[104] Brüggemeier, G., Ciacchi, A. C. and O'Callaghan, P. Personality rights in European tort law. Cambridge University Press, 2010.
[105] Wilman, F. The Digital Services Act (DSA) - An Overview. Available at SSRN 4304586 (2022).
[106] Gerdemann, S. and Spindler, G. Das Gesetz über digitale Dienste (Digital Services Act) (Part 2). Gewerblicher Rechtsschutz und Urheberrecht (2023), 115-125.
[107] Korenhof, P. and Koops, B.-J. Gender Identity and Privacy: Could a Right to Be Forgotten Help Andrew Agnes Online? Working Paper, https://ssrn.com/abstract=2304190 (2014).
[108] Lianos, I. and Motchenkova, E. Market dominance and search quality in the search engine market. Journal of Competition Law & Economics, 9, 2 (2013), 419-455.
[109] Geroski, P. A. and Pomroy, R. Innovation and the evolution of market structure. The Journal of Industrial Economics (1990), 299-314.
[110] Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L.-M., Rothchild, D., So, D., Texier, M. and Dean, J. Carbon emissions and large neural network training. arXiv preprint arXiv:2104.10350 (2021).
[111] Bertuzzi, L. AI Act: MEPs want fundamental rights assessments, obligations for high-risk users. 2023.
[112] Grinbaum, A. and Adomaitis, L. The Ethical Need for Watermarks in Machine-Generated Language. arXiv preprint arXiv:2209.03118 (2022).
[113] Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I. and Goldstein, T. A Watermark for Large Language Models. arXiv preprint arXiv:2301.10226 (2023).
[114] Mitchell, E., Lee, Y., Khazatsky, A., Manning, C. D. and Finn, C. DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature. arXiv preprint arXiv:2301.11305 (2023).
[115] Solaiman, I. The Gradient of Generative AI Release: Methods and Considerations. arXiv preprint arXiv:2302.04844 (2023).
[116] Solaiman, I., Brundage, M., Clark, J., Askell, A., Herbert-Voss, A., Wu, J., Radford, A., Krueger, G., Kim, J. W. and Kreps, S. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203 (2019).
[117] Crootof, R. Artificial intelligence research needs responsible publication norms. Lawfare Blog (2019).
[118] Hoffmann-Riem, W. Innovation und Recht - Recht und Innovation: Recht im Ensemble seiner Kontexte. Mohr Siebeck, 2016.
[119] Bennett Moses, L. Regulating in the face of sociotechnical change (2016).
[120] Bennett Moses, L. Recurring dilemmas: The law's race to keep up with technological change. U. Ill. JL Tech. & Pol'y (2007), 239.

PROMPTS
Prompt 1: What are large generative AI models?
Prompt 2: What distinguishes large generative AI models from other AI systems?
Prompt 3: Can you explain the technical foundations of large generative models in simple terms, so that an inexperienced reader understands it?
Prompt 4: What are the objectives, what are the obstacles when it comes to content moderation within large generative AI models?
Prompt 5: How does content moderation work at ChatGPT?