The Evolution of Lua: Roberto Ierusalimschy Luiz Henrique de Figueiredo Waldemar Celes


The Evolution of Lua

Roberto Ierusalimschy
Department of Computer Science, PUC-Rio, Rio de Janeiro, Brazil
[email protected]

Luiz Henrique de Figueiredo
IMPA–Instituto Nacional de Matemática Pura e Aplicada, Brazil
[email protected]

Waldemar Celes
Department of Computer Science, PUC-Rio, Rio de Janeiro, Brazil
[email protected]

Abstract

We report on the birth and evolution of Lua and discuss how it moved from a simple configuration language to a versatile, widely used language that supports extensible semantics, anonymous functions, full lexical scoping, proper tail calls, and coroutines.

Categories and Subject Descriptors K.2 [HISTORY OF COMPUTING]: Software; D.3 [PROGRAMMING LANGUAGES]

1. Introduction

Lua is a scripting language born in 1993 at PUC-Rio, the Pontifical Catholic University of Rio de Janeiro in Brazil. Since then, Lua has evolved to become widely used in all kinds of industrial applications, such as robotics, literate programming, distributed business, image processing, extensible text editors, Ethernet switches, bioinformatics, finite-element packages, web development, and more [2]. In particular, Lua is one of the leading scripting languages in game development.

Lua has gone far beyond our most optimistic expectations. Indeed, while almost all programming languages come from North America and Western Europe (with the notable exception of Ruby, from Japan) [4], Lua is the only language created in a developing country to have achieved global relevance.

From the start, Lua was designed to be simple, small, portable, fast, and easily embedded into applications. These design principles are still in force, and we believe that they account for Lua’s success in industry. The main characteristic of Lua, and a vivid expression of its simplicity, is that it offers a single kind of data structure, the table, which is the Lua term for an associative array [9]. Although most scripting languages offer associative arrays, in no other language do associative arrays play such a central role. Lua tables provide simple and efficient implementations for modules, prototype-based objects, class-based objects, records, arrays, sets, bags, lists, and many other data structures [28].

In this paper, we report on the birth and evolution of Lua. We discuss how Lua moved from a simple configuration language to a powerful (but still simple) language that supports extensible semantics, anonymous functions, full lexical scoping, proper tail calls, and coroutines. In §2 we give an overview of the main concepts in Lua, which we use in the other sections to discuss how Lua has evolved. In §3 we relate the prehistory of Lua, that is, the setting that led to its creation. In §4 we relate how Lua was born, what its original design goals were, and what features its first version had. A discussion of how and why Lua has evolved is given in §5. A detailed discussion of the evolution of selected features is given in §6. The paper ends in §7 with a retrospective of the evolution of Lua and in §8 with a brief discussion of the reasons for Lua’s success, especially in games.

2. Overview

In this section we give a brief overview of the Lua language and introduce the concepts discussed in §5 and §6. For a complete definition of Lua, see its reference manual [32]. For a detailed introduction to Lua, see Roberto’s book [28]. For concreteness, we shall describe Lua 5.1, which is the current version at the time of this writing (April 2007), but most of this section applies unchanged to previous versions.

Syntactically, Lua is reminiscent of Modula and uses familiar keywords. To give a taste of Lua’s syntax, the code below shows two implementations of the factorial function, one recursive and another iterative. Anyone with a basic knowledge of programming can probably understand these examples without explanation.

    function factorial(n)
      if n == 0 then
        return 1
      else
        return n*factorial(n-1)
      end
    end

    function factorial(n)
      local a = 1
      for i = 1,n do
        a = a*i
      end
      return a
    end

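The two factorial definitions above share one name, so loading both into the same program would leave only the second one bound to `factorial`. As a minimal sketch for trying them side by side, the version below renames them; the names `factorial_rec` and `factorial_iter` are chosen here for illustration and do not come from the paper.

```lua
-- Recursive version (factorial_rec is an illustrative name).
function factorial_rec(n)
  if n == 0 then
    return 1
  else
    return n * factorial_rec(n - 1)
  end
end

-- Iterative version (factorial_iter is an illustrative name).
function factorial_iter(n)
  local a = 1
  for i = 1, n do
    a = a * i
  end
  return a
end

-- The two versions should agree on small non-negative inputs.
for n = 0, 10 do
  assert(factorial_rec(n) == factorial_iter(n))
end

print(factorial_rec(5), factorial_iter(5))  -- both values are 120
```

Note that the loop in the iterative version does not run at all for n = 0, so it returns the initial value 1, matching the base case of the recursive version.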
Semantically, Lua has many similarities with Scheme, even though these similarities are not immediately clear because the two languages are syntactically very different. The influence of Scheme on Lua has gradually increased during Lua’s evolution: initially, Scheme was just a language in the background, but later it became increasingly important as a source of inspiration, especially with the introduction of anonymous functions and full lexical scoping.

Like Scheme, Lua is dynamically typed: variables do not have types; only values have types. As in Scheme, a variable in Lua never contains a structured value, only a reference to one. As in Scheme, a function name has no special status in Lua: it is just a regular variable that happens to refer to a function value. Actually, the syntax for function definition ‘function foo() ··· end’ used above is just syntactic sugar for the assignment of an anonymous function to a variable: ‘foo = function () ··· end’. Like Scheme, Lua has first-class functions with lexical scoping. Actually, all values in Lua are first-class values: they can be assigned to global and local variables, stored in tables, passed as arguments to functions, and returned from functions.

One important semantic difference between Lua and Scheme (and probably the main distinguishing feature of Lua) is that Lua offers tables as its sole data-structuring mechanism. Lua tables are associative arrays [9], but with some important features. Like all values in Lua, tables are first-class values: they are not bound to specific variable names, as they are in Awk and Perl. A table can have any value as key and can store any value. Tables allow simple and efficient implementation of records (by using field names as keys), sets (by using set elements as keys), generic linked structures, and many other data structures. Moreover, we can use a table to implement an array by using natural numbers as indices. A careful implementation [31] ensures that such a table uses the same amount of memory that an array would (because it is represented internally as an actual array) and performs better than arrays in similar languages, as independent benchmarks show [1].

Lua offers an expressive syntax for creating tables in the form of constructors. The simplest constructor is the expression ‘{}’, which creates a new, empty table. There are also constructors to create lists (or arrays), such as

    {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"}

and to create records, such as

    {lat= -22.90, long= -43.23, city= "Rio de Janeiro"}

These two forms can be freely mixed. Tables are indexed using square brackets, as in ‘t[2]’, with ‘t.x’ as sugar for ‘t["x"]’.

The combination of table constructors and functions turns Lua into a powerful general-purpose procedural data-description language. For instance, a bibliographic database in a format similar to the one used in BibTeX [34] can be written as a series of table constructors such as this:

    article{"spe96",
      authors = {"Roberto Ierusalimschy",
                 "Luiz Henrique de Figueiredo",
                 "Waldemar Celes"},
      title = "Lua: an Extensible Extension Language",
      journal = "Software: Practice & Experience",
      year = 1996,
    }

Although such a database seems to be an inert data file, it is actually a valid Lua program: when the database is loaded into Lua, each item in it invokes a function, because ‘article{···}’ is syntactic sugar for ‘article({···})’, that is, a function call with a table as its single argument. It is in this sense that such files are called procedural data files.

We say that Lua is an extensible extension language [30]. It is an extension language because it helps to extend applications through configuration, macros, and other end-user customizations. Lua is designed to be embedded into a host application so that users can control how the application behaves by writing Lua programs that access application services and manipulate application data. It is extensible because it offers userdata values to hold application data and extensible semantics mechanisms to manipulate these values in natural ways. Lua is provided as a small core that can be extended with user functions written in both Lua and C. In particular, input and output, string manipulation, mathematical functions, and interfaces to the operating system are all provided as external libraries.

Other distinguishing features of Lua come from its implementation:

Portability: Lua is easy to build because it is implemented in strict ANSI C.¹ It compiles out-of-the-box on most platforms (Linux, Unix, Windows, Mac OS X, etc.), and runs with at most a few small adjustments in virtually all platforms we know of, including mobile devices (e.g., handheld computers and cell phones) and embedded microprocessors (e.g., ARM and Rabbit). To ensure portability, we strive for warning-free compilations under as many compilers as possible.

¹ Actually, Lua is implemented in “clean C”, that is, the intersection of C and C++. Lua compiles unmodified as a C++ library.

Ease of embedding: Lua has been designed to be easily embedded into applications. An important part of Lua is a well-defined application programming interface (API) that allows full communication between Lua code and external code. In particular, it is easy to extend Lua by exporting C functions from the host application. The API allows Lua to interface not only with C and C++, but also with other languages, such as Fortran, Java, Smalltalk, Ada, C# (.Net), and even with other scripting languages (e.g., Perl and Ruby).
Small size: Adding Lua to an application does not bloat it. The whole Lua distribution, including source code, documentation, and binaries for some platforms, has always fit comfortably on a floppy disk. The tarball for Lua 5.1, which contains source code, documentation, and examples, takes 208K compressed and 835K uncompressed. The source contains around 17,000 lines of C. Under Linux, the Lua interpreter built with all standard Lua libraries takes 143K. The corresponding numbers for most other scripting languages are more than an order of magnitude larger, partially because Lua is primarily meant to be embedded into applications and so its official distribution includes only a few libraries. Other scripting languages are meant to be used standalone and include many libraries.

Efficiency: Independent benchmarks [1] show Lua to be one of the fastest languages in the realm of interpreted scripting languages. This allows application developers to write a substantial fraction of the whole application in Lua. For instance, over 40% of Adobe Lightroom is written in Lua (that represents around 100,000 lines of Lua code).

Although these are features of a specific implementation, they are possible only due to the design of Lua. In particular, Lua’s simplicity is a key factor in allowing a small, efficient implementation [31].

3. Prehistory

Lua was born in 1993 inside Tecgraf, the Computer Graphics Technology Group of PUC-Rio in Brazil. The creators of Lua were Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes. Roberto was an assistant professor at the Department of Computer Science of PUC-Rio. Luiz Henrique was a post-doctoral fellow, first at IMPA and later at Tecgraf. Waldemar was a Ph.D. student in Computer Science at PUC-Rio. All three were members of Tecgraf, working on different projects there before getting together to work on Lua. They had different, but related, backgrounds: Roberto was a computer scientist interested mainly in programming languages; Luiz Henrique was a mathematician interested in software tools and computer graphics; Waldemar was an engineer interested in applications of computer graphics. (In 2001, Waldemar joined Roberto as faculty at PUC-Rio and Luiz Henrique became a researcher at IMPA.)

Tecgraf is a large research and development laboratory with several industrial partners. During the first ten years after its creation in May 1987, Tecgraf focused mainly on building basic software tools to enable it to produce the interactive graphical programs needed by its clients. Accordingly, the first Tecgraf products were drivers for graphical terminals, plotters, and printers; graphical libraries; and graphical interface toolkits. From 1977 until 1992, Brazil had a policy of strong trade barriers (called a “market reserve”) for computer hardware and software motivated by a nationalistic feeling that Brazil could and should produce its own hardware and software. In that atmosphere, Tecgraf’s clients could not afford, either politically or financially, to buy customized software from abroad: by the market reserve rules, they would have to go through a complicated bureaucratic process to prove that their needs could not be met by Brazilian companies. Added to the natural geographical isolation of Brazil from other research and development centers, those reasons led Tecgraf to implement from scratch the basic tools it needed.

One of Tecgraf’s largest partners was (and still is) Petrobras, the Brazilian oil company. Several Tecgraf products were interactive graphical programs for engineering applications at Petrobras. By 1993, Tecgraf had developed little languages for two of those applications: a data-entry application and a configurable report generator for lithology profiles. These languages, called DEL and SOL, were the ancestors of Lua. We describe them briefly here to show where Lua came from.

3.1 DEL

The engineers at Petrobras needed to prepare input data files for numerical simulators several times a day. This process was boring and error-prone because the simulation programs were legacy code that needed strictly formatted input files, typically bare columns of numbers with no indication of what each number meant, a format inherited from the days of punched cards. In early 1992, Petrobras asked Tecgraf to create at least a dozen graphical front-ends for this kind of data entry. The numbers would be input interactively, just by clicking on the relevant parts of a diagram describing the simulation, a much easier and more meaningful task for the engineers than editing columns of numbers. The data file, in the correct format for the simulator, would be generated automatically. Besides simplifying the creation of data files, such front-ends provided the opportunity to add data validation and also to compute derived quantities from the input data, thus reducing the amount of data needed from the user and increasing the reliability of the whole process.

To simplify the development of those front-ends, a team led by Luiz Henrique de Figueiredo and Luiz Cristovão Gomes Coelho decided to code all front-ends in a uniform way, and so designed DEL (“data-entry language”), a simple declarative language to describe each data-entry task [17]. DEL was what is now called a domain-specific language [43], but was then simply called a little language [10].

A typical DEL program defined several “entities”. Each entity could have several fields, which were named and typed. For implementing data validation, DEL had predicate statements that imposed restrictions on the values of entities. DEL also included statements to specify how data was to be input and output. An entity in DEL was essentially what is called a structure or record in conventional
programming languages. The important difference (and what made DEL suitable for the data-entry problem) is that entity names also appeared in a separate graphics metafile, which contained the associated diagram over which the engineer did the data entry. A single interactive graphical interpreter called ED (an acronym for ‘entrada de dados’, which means ‘data entry’ in Portuguese) was written to interpret DEL programs. All those data-entry front-ends requested by Petrobras were implemented as DEL programs that ran under this single graphical application.

DEL was a success both among the developers at Tecgraf and among the users at Petrobras. At Tecgraf, DEL simplified the development of those front-ends, as originally intended. At Petrobras, DEL allowed users to tailor data-entry applications to their needs. Soon users began to demand more power from DEL, such as boolean expressions for controlling whether an entity was active for input or not, and DEL became heavier. When users began to ask for control flow, with conditionals and loops, it was clear that ED needed a real programming language instead of DEL.

3.2 SOL

At about the same time that DEL was created, a team led by Roberto Ierusalimschy and Waldemar Celes started working on PGM, a configurable report generator for lithology profiles, also for Petrobras. The reports generated by PGM consisted of several columns (called “tracks”) and were highly configurable: users could create and position the tracks, and could choose colors, fonts, and labels; each track could have a grid, which also had its set of options (log/linear, vertical and horizontal ticks, etc.); each curve had its own scale, which had to be changed automatically in case of overflow; etc. All this configuration was to be done by the end-users, typically geologists and engineers from Petrobras working in oil plants and off-shore platforms. The configurations had to be stored in files, for reuse. The team decided that the best way to configure PGM was through a specialized description language called SOL, an acronym for Simple Object Language.

Because PGM had to deal with many different objects, each with many different attributes, the SOL team decided not to fix those objects and attributes into the language. Instead, SOL allowed type declarations, as in the code below:

    type @track{ x:number, y:number=23, id=0 }
    type @line{ t:@track=@track{x=8}, z:number* }
    T = @track{ y=9, x=10, id="1992-34" }
    L = @line{ t=@track{x=T.y, y=T.x}, z=[2,3,4] }

This code defines two types, track and line, and creates two objects, a track T and a line L. The track type contains two numeric attributes, x and y, and an untyped attribute, id; attributes y and id have default values. The line type contains a track t and a list of numbers z. The track t has as default value a track with x=8, y=23, and id=0. The syntax of SOL was strongly influenced by BibTeX [34] and UIL, a language for describing user interfaces in Motif [39].

The main task of the SOL interpreter was to read a report description, check whether the given objects and attributes were correctly typed, and then present the information to the main program (PGM). To allow the communication between the main program and the SOL interpreter, the latter was implemented as a C library that was linked to the main program. The main program could access all configuration information through an API in this library. In particular, the main program could register a callback function for each type, which the SOL interpreter would call to create an object of that type.

4. Birth

The SOL team finished an initial implementation of SOL in March 1993, but they never delivered it. PGM would soon require support for procedural programming to allow the creation of more sophisticated layouts, and SOL would have to be extended. At the same time, as mentioned before, ED users had requested more power from DEL. ED also needed further descriptive facilities for programming its user interface. Around mid-1993, Roberto, Luiz Henrique, and Waldemar got together to discuss DEL and SOL, and concluded that the two languages could be replaced by a single, more powerful language, which they decided to design and implement. Thus the Lua team was born; it has not changed since.

Given the requirements of ED and PGM, we decided that we needed a real programming language, with assignments, control structures, subroutines, etc. The language should also offer data-description facilities, such as those offered by SOL. Moreover, because many potential users of the language were not professional programmers, the language should avoid cryptic syntax and semantics. The implementation of the new language should be highly portable, because Tecgraf’s clients had a very diverse collection of computer platforms. Finally, since we expected that other Tecgraf products would also need to embed a scripting language, the new language should follow the example of SOL and be provided as a library with a C API.

At that point, we could have adopted an existing scripting language instead of creating a new one. In 1993, the only real contender was Tcl [40], which had been explicitly designed to be embedded into applications. However, Tcl had unfamiliar syntax, did not offer good support for data description, and ran only on Unix platforms. We did not consider LISP or Scheme because of their unfriendly syntax. Python was still in its infancy. In the free, do-it-yourself atmosphere that then reigned in Tecgraf, it was quite natural that we should try to develop our own scripting language. So, we started working on a new language that we hoped would be simpler to use than existing languages. Our original design decisions were: keep the language simple and small, and keep the implementation simple and portable. Because the new language was partially inspired by SOL (sun in Portuguese), a friend at Tecgraf (Carlos Henrique Levy) suggested the name ‘Lua’ (moon in Portuguese), and Lua was born. (DEL did not influence Lua as a language. The main influence of DEL on the birth of Lua was rather the realization that large parts of complex applications could be written using embeddable scripting languages.)

We wanted a lightweight, full language with data-description facilities. So we took SOL’s syntax for record and list construction (but not type declaration), and unified their implementation using tables: records use strings (the field names) as indices; lists use natural numbers. An assignment such as

    T = @track{ y=9, x=10, id="1992-34" }

which was valid in SOL, remained valid in Lua, but with a different meaning: it created an object (that is, a table) with the given fields, and then called the function track on this table to validate the object or perhaps to provide default values to some of its fields. The final value of the expression was that table.

Except for its procedural data-description constructs, Lua introduced no new concepts: Lua was created for production use, not as an academic language designed to support research in programming languages. So, we simply borrowed (even unconsciously) things that we had seen or read about in other languages. We did not reread old papers to remember details of existing languages. We just started from what we knew about other languages and reshaped that according to our tastes and needs.

We quickly settled on a small set of control structures, with syntax mostly borrowed from Modula (while, if, and repeat until). From CLU we took multiple assignment and multiple returns from function calls. We regarded multiple returns as a simpler alternative to reference parameters used in Pascal and Modula and to in-out parameters used in Ada; we also wanted to avoid explicit pointers (used in C). From C++ we took the neat idea of allowing a local variable to be declared only where we need it. From SNOBOL and Awk we took associative arrays, which we called tables; however, tables were to be objects in Lua, not attached to variables as in Awk.

One of the few (and rather minor) innovations in Lua was the syntax for string concatenation. The natural ‘+’ operator would be ambiguous, because we wanted automatic coercion of strings to numbers in arithmetic operations. So, we invented the syntax ‘..’ (two dots) for string concatenation.

A contentious point was the use of semicolons. We thought that requiring semicolons could be a little confusing for engineers with a Fortran background, but not allowing them could confuse those with a C or Pascal background. In typical committee fashion, we settled on optional semicolons.

Initially, Lua had seven types: numbers (implemented solely as reals), strings, tables, nil, userdata (pointers to C objects), Lua functions, and C functions. To keep the language small, we did not initially include a boolean type: as in Lisp, nil represented false and any other value represented true. Over 13 years of continuous evolution, the only changes in Lua types were the unification of Lua functions and C functions into a single function type in Lua 3.0 (1997) and the introduction of booleans and threads in Lua 5.0 (2003) (see §6.1). For simplicity, we chose to use dynamic typing instead of static typing. For applications that needed type checking, we provided basic reflective facilities, such as run-time type information and traversal of the global environment, as built-in functions (see §6.11).

By July 1993, Waldemar had finished the first implementation of Lua as a course project supervised by Roberto. The implementation followed a tenet that is now central to Extreme Programming: “the simplest thing that could possibly work” [7]. The lexical scanner was written with lex and the parser with yacc, the classic Unix tools for implementing languages. The parser translated Lua programs into instructions for a stack-based virtual machine, which were then executed by a simple interpreter. The C API made it easy to add new functions to Lua, and so this first version provided only a tiny library of five built-in functions (next, nextvar, print, tonumber, type) and three small external libraries (input and output, mathematical functions, and string manipulation).

Despite this simple implementation (or possibly because of it) Lua surpassed our expectations. Both PGM and ED used Lua successfully (PGM is still in use today; ED was replaced by EDG [12], which was mostly written in Lua). Lua was an immediate success in Tecgraf and soon other projects started using it. This initial use of Lua at Tecgraf was reported in a brief talk at the VII Brazilian Symposium on Software Engineering, in October 1993 [29].

The remainder of this paper relates our journey in improving Lua.

5. History

Figure 1 shows a timeline of the releases of Lua. As can be seen, the time interval between versions has been gradually increasing since Lua 3.0. This reflects our perception that

[Figure 1: timeline spanning 1993 to 2006, marking releases 1.0, 1.1, 2.1, 2.2, 2.4, 2.5, 3.0, 3.1, 3.2, 4.0, 5.0, and 5.1]

Figure 1. The releases of Lua.


                                1.0  1.1  2.1  2.2  2.4  2.5  3.0  3.1  3.2  4.0  5.0  5.1
constructors                    •    •    •    •    •    •    •    •    •    •    •    •
garbage collection              •    •    •    •    •    •    •    •    •    •    •    •
extensible semantics            ◦    ◦    •    •    •    •    •    •    •    •    •    •
support for OOP                 ◦    ◦    •    •    •    •    •    •    •    •    •    •
long strings                    ◦    ◦    ◦    •    •    •    •    •    •    •    •    •
debug API                       ◦    ◦    ◦    •    •    •    •    •    •    •    •    •
external compiler               ◦    ◦    ◦    ◦    •    •    •    •    •    •    •    •
vararg functions                ◦    ◦    ◦    ◦    ◦    •    •    •    •    •    •    •
pattern matching                ◦    ◦    ◦    ◦    ◦    •    •    •    •    •    •    •
conditional compilation         ◦    ◦    ◦    ◦    ◦    ◦    •    •    •    ◦    ◦    ◦
anonymous functions, closures   ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •    •    •    •
debug library                   ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •    •    •
multi-state API                 ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •    •
for statement                   ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •    •
long comments                   ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •
full lexical scoping            ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •
booleans                        ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •
coroutines                      ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •    •
incremental garbage collection  ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •
module system                   ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    ◦    •

                                1.0  1.1  2.1  2.2  2.4  2.5  3.0  3.1  3.2  4.0  5.0  5.1
libraries                       4    4    4    4    4    4    4    4    5    6    8    9
built-in functions              5    7    11   11   13   14   25   27   35   0    0    0
API functions                   30   30   30   30   32   32   33   47   41   60   76   79
vm type (stack × register)      S    S    S    S    S    S    S    S    S    S    R    R
vm instructions                 64   65   69   67   67   68   69   128  64   49   35   38
keywords                        16   16   16   16   16   16   16   16   16   18   21   21
other tokens                    21   21   23   23   23   23   24   25   25   25   24   26

Table 1. The evolution of features in Lua.

Lua was becoming a mature product and needed stability for the benefit of its growing community. Nevertheless, the need for stability has not hindered progress. Major new versions of Lua, such as Lua 4.0 and Lua 5.0, have been released since then.

The long times between versions also reflect our release model. Unlike other open-source projects, our alpha versions are quite stable and beta versions are essentially final, except for uncovered bugs.² This release model has proved to be good for Lua stability. Several products have been shipped with alpha or beta versions of Lua and worked fine. However, this release model did not give users much chance to experiment with new versions; it also deprived us of timely feedback on proposed changes. So, during the development of Lua 5.0 we started to release “work” versions, which are just snapshots of the current development of Lua. This move brought our current release model closer to the “Release Early, Release Often” motto of the open-source community.

² The number of bugs found after final versions were released has been consistently small: only 10 in Lua 4.0, 17 in Lua 5.0, and 10 in Lua 5.1 so far, none of them critical bugs.

In the remainder of this section we discuss some milestones in the evolution of Lua. Details on the evolution of several specific features are given in §6. Table 1 summarizes this evolution. It also contains statistics about the size of Lua, which we now discuss briefly.

The number of standard libraries has been kept small because we expect that most Lua functions will be provided by the host application or by third-party libraries. Until Lua 3.1, the only standard libraries were for input and output, string manipulation, mathematical functions, and a special library of built-in functions, which did not use the C API but directly accessed the internal data structures. Since then, we have added libraries for debugging (Lua 3.2), interfacing with the operating system (Lua 4.0), tables and coroutines (Lua 5.0), and modules (Lua 5.1).

The size of the C API changed significantly when it was redesigned in Lua 4.0. Since then, it has moved slowly toward completeness. As a consequence, there are no longer any built-in functions: all standard libraries are implemented on top of the C API, without accessing the internals of Lua.

The virtual machine, which executes Lua programs, was stack-based until Lua 4.0. In Lua 3.1 we added variants
for many instructions, to try to improve performance. However, this turned out to be too complicated for little performance gain and we removed those variants in Lua 3.2. Since Lua 5.0, the virtual machine is register-based [31]. This change gave the code generator more opportunities for optimization and reduced the number of instructions of typical Lua programs. (Instruction dispatch is a significant fraction of the time spent in the virtual machine [13].) As far as we know, the virtual machine of Lua 5.0 was the first register-based virtual machine to have wide use.

5.1 Lua 1

The initial implementation of Lua was a success in Tecgraf and Lua attracted users from other Tecgraf projects. New users create new demands. Several users wanted to use Lua as the support language for graphics metafiles, which abounded in Tecgraf. Compared with other programmable metafiles, Lua metafiles have the advantage of being based on a truly procedural language: it is natural to model complex objects by combining procedural code fragments with declarative statements. In contrast, for instance, VRML [8] must use another language (JavaScript) to model procedural objects.

The use of Lua for this kind of data description, especially large graphics metafiles, posed challenges that were unusual for typical scripting languages. For instance, it was not uncommon for a diagram used in the data-entry program ED to have several thousand parts described by a single Lua table constructor with several thousand items. That meant that Lua had to cope with huge programs and huge expressions. Because Lua precompiled all programs to bytecode for a virtual machine on the fly, it also meant that the Lua compiler had to run fast, even for large programs.

By replacing the lex-generated scanner used in the first

scripting languages (e.g., Tcl) were free made us realize that restrictions on commercial uses might even discourage academic uses, since some academic projects plan to go to market eventually. So, when the time came to release the next version (Lua 2.1), we chose to release it as unrestricted free software. Naively, we wrote our own license text as a slight collage and rewording of existing licenses. We thought it was clear that the new license was quite liberal. Later, however, with the spread of open-source licenses, our license text became a source of noise among some users; in particular, it was not clear whether our license was compatible with GPL. In May 2002, after a long discussion in the mailing list, we decided to release future versions of Lua (starting with Lua 5.0) under the well-known and very liberal MIT license [3]. In July 2002, the Free Software Foundation confirmed that our previous license was compatible with GPL, but we were already committed to adopting the MIT license. Questions about our license have all but vanished since then.

5.2 Lua 2

Despite all the hype surrounding object-oriented programming (which in the early 1990s had reached its peak) and the consequent user pressure to add object-oriented features to Lua, we did not want to turn Lua into an object-oriented language because we did not want to fix a programming paradigm for Lua. In particular, we did not think that Lua needed objects and classes as primitive language concepts, especially because they could be implemented with tables if needed (a table can hold both object data and methods, since functions are first-class values). Despite recurring user pressure, we have not changed our minds to this day: Lua does not force any object or class model onto the programmer. Several object models have been proposed and implemented by users; it is a frequent topic of discussion in our mailing
version by a hand-written one, we almost doubled the speed list. We think this is healthy.
of the Lua compiler on typical metafiles. We also modified On the other hand, we wanted to allow object-oriented
Lua’s virtual machine to handle a long constructor by adding programming with Lua. Instead of fixing a model, we de-
key-value pairs to the table in batches, not individually as in cided to provide flexible mechanisms that would allow the
the original virtual machine. These changes solved the initial programmer to build whatever model was suitable to the ap-
demands for better performance. Since then, we have always plication. Lua 2.1, released in February 1995, marked the in-
tried to reduce the time spent on precompilation. troduction of these extensible semantics mechanisms, which
In July 1994, we released a new version of Lua with those have greatly increased the expressiveness of Lua. Extensible
optimizations. This release coincided with the publication of semantics has become a hallmark of Lua.
the first paper describing Lua, its design, and its implementa- One of the goals of extensible semantics was to allow ta-
tion [15]. We named the new version ‘Lua 1.1’. The previous bles to be used as a basis for objects and classes. For that,
version, which was never publicly released, was then named we needed to implement inheritance for tables. Another goal
‘Lua 1.0’. (A snapshot of Lua 1.0 taken in July 1993 was was to turn userdata into natural proxies for application data,
released in October 2003 to celebrate 10 years of Lua.) not merely handles meant to be used solely as arguments to
Lua 1.1 was publicly released as software available in functions. We wanted to be able to index userdata as if they
source code by ftp, before the open-source movement got its were tables and to call methods on them. This would allow
current momentum. Lua 1.1 had a restrictive user license: it Lua to fulfill one of its main design goals more naturally:
was freely available for academic purposes but commercial to extend applications by providing scriptable access to ap-
uses had to be negotiated. That part of the license did not plication services and data. Instead of adding mechanisms
work: although we had a few initial contacts, no commer- to support all these features directly in the language, we de-
cial uses were ever negotiated. This and the fact that other cided that it would be conceptually simpler to define a more
general fallback mechanism to let the programmer intervene whenever Lua did not know how to proceed.

We introduced fallbacks in Lua 2.1 and defined them for the following operations: table indexing, arithmetic operations, string concatenation, order comparisons, and function calls.3 When one of these operations was applied to the “wrong” kind of values, the corresponding fallback was called, allowing the programmer to determine how Lua would proceed. The table indexing fallbacks allowed userdata (and other values) to behave as tables, which was one of our motivations. We also defined a fallback to be called when a key was absent from a table, so that we could support many forms of inheritance (through delegation). To complete the support for object-oriented programming, we added two pieces of syntactic sugar: method definitions of the form ‘function a:foo(· · ·)’ as sugar for ‘function a.foo(self,· · ·)’ and method calls of the form ‘a:foo(· · ·)’ as sugar for ‘a.foo(a,· · ·)’. In §6.8 we discuss fallbacks in detail and how they evolved into their later incarnations: tag methods and metamethods.

3 We also introduced fallbacks for handling fatal errors and for monitoring garbage collection, even though they were not part of extensible semantics.

Since Lua 1.0, we have provided introspective functions for values: type, which queries the type of a Lua value; next, which traverses a table; and nextvar, which traverses the global environment. (As mentioned in §4, this was partially motivated by the need to implement SOL-like type checking.) In response to user pressure for full debug facilities, Lua 2.2 (November 1995) introduced a debug API to provide information about running functions. This API gave users the means to write in C their own introspective tools, such as debuggers and profilers. The debug API was initially quite simple: it allowed access to the Lua call stack, to the currently executing line, and provided a function to find the name of a variable holding a given value. Following the M.Sc. work of Tomás Gorham [22], the debug API was improved in Lua 2.4 (May 1996) by functions to access local variables and hooks to be called at line changes and function calls.

With the widespread use of Lua at Tecgraf, many large graphics metafiles were being written in Lua as the output of graphical editors. Loading such metafiles was taking increasingly longer as they became larger and more complex.4 Since its first version, Lua precompiled all programs to bytecode just before running them. The load time of a large program could be substantially reduced by saving this bytecode to a file. This would be especially relevant for procedural data files such as graphics metafiles. So, in Lua 2.4, we introduced an external compiler, called luac, which precompiled a Lua program and saved the generated bytecode to a binary file. (Our first paper about Lua [15] had already anticipated the possibility of an external compiler.) The format of this file was chosen to be easily loaded and reasonably portable. With luac, programmers could avoid parsing and code generation at run time, which in the early days were costly. Besides faster loading, luac also allowed off-line syntax checking and protection from casual user changes. Many products (e.g., The Sims and Adobe Lightroom) distribute Lua scripts in precompiled form.

4 Surprisingly, a substantial fraction of the load time was taken in the lexer for converting real numbers from text form to floating-point representation. Real numbers abound in graphics metafiles.

During the implementation of luac, we started to restructure Lua’s core into clearly separated modules. As a consequence, it is now quite easy to remove the parsing modules (lexer, parser, and code generator), which currently represent 35% of the core code, leaving just the module that loads precompiled Lua programs, which is merely 3% of the core code. This reduction can be significant when embedding Lua in small devices such as mobile devices, robots and sensors.5

5 Crazy Ivan, a robot that won RoboCup in 2000 and 2001 in Denmark, had a “brain” implemented in Lua. It ran directly on a Motorola Coldfire 5206e processor without any operating system (in other words, Lua was the operating system). Lua was stored on a system ROM and loaded programs at startup from the serial port.

Since its first version, Lua has included a library for string-processing. The facilities provided by this library were minimal until Lua 2.4. However, as Lua matured, it became desirable to do heavier text processing in Lua. We thought that a natural addition to Lua would be pattern matching, in the tradition of Snobol, Icon, Awk, and Perl. However, we did not want to include a third-party pattern-matching engine in Lua because such engines tend to be very large; we also wanted to avoid copyright issues that could be raised by including third-party code in Lua.

As a student project supervised by Roberto in the second semester of 1995, Milton Jonathan, Pedro Miller Rabinovitch, Pedro Willemsens, and Vinicius Almendra produced a pattern-matching library for Lua. Experience with that design led us to write our own pattern-matching engine for Lua, which we added to Lua 2.5 (November 1996) in two functions: strfind (which originally only found plain substrings) and the new gsub function (a name taken from Awk). The gsub function globally replaced substrings matching a given pattern in a larger string. It accepted either a replacement string or a function that was called each time a match was found and was intended to return the replacement string for that match. (That was an innovation at the time.) Aiming at a small implementation, we did not include full regular expressions. Instead, the patterns understood by our engine were based on character classes, repetitions, and captures (but not alternation or grouping). Despite its simplicity, this kind of pattern matching is quite powerful and was an important addition to Lua.

That year was a turning point in the history of Lua because it gained international exposure. In June 1996 we published a paper about Lua in Software: Practice & Experience [30] that brought external attention to Lua, at least in
academic circles.6 In December 1996, shortly after Lua 2.5 was released, the magazine Dr. Dobb’s Journal featured an article about Lua [16]. Dr. Dobb’s Journal is a popular publication aimed directly at programmers, and that article brought Lua to the attention of the software industry. Among several messages that we received right after that publication was one sent in January 1997 by Bret Mogilefsky, who was the lead programmer of Grim Fandango, an adventure game then under development by LucasArts. Bret told us that he had read about Lua in Dr. Dobb’s and that they planned to replace their home-brewed scripting language with Lua. Grim Fandango was released in October 1998 and in May 1999 Bret told us that “a tremendous amount of the game was written in Lua” (his emphasis) [38].7 Around that time, Bret attended a roundtable about game scripting at the Game Developers’ Conference (GDC, the main event for game programmers) and at the end he related his experience with the successful use of Lua in Grim Fandango. We know of several developers who first learned about Lua at that event. After that, Lua spread by word of mouth among game developers to become a definitely marketable skill in the game industry (see §8).

6 In November 1997, that article won the First Prize (technological category) in the II Compaq Award for Research and Development in Computer Science, a joint venture of Compaq Computer in Brazil, the Brazilian Ministry of Science and Technology, and the Brazilian Academy of Sciences.

7 Grim Fandango mentioned Lua and PUC-Rio in its final credits. Several people at PUC-Rio first learned about Lua from that credit screen, and were surprised to learn that Brazilian software was part of a hit game. It has always bothered us that Lua is widely known abroad but has remained relatively unknown in Brazil until quite recently.

As a consequence of Lua’s international exposure, the number of messages sent to us asking questions about Lua increased substantially. To handle this traffic more efficiently, and also to start building a Lua community, so that other people could answer Lua questions, in February 1997 we created a mailing list for discussing Lua. Over 38,000 messages have been posted to this list since then. The use of Lua in many popular games has attracted many people to the list, which now has over 1200 subscribers. We have been fortunate that the Lua list is very friendly and at the same time very technical. The list has become the focal point of the Lua community and has been a source of motivation for improving Lua. All important events occur first in the mailing list: release announcements, feature requests, bug reports, etc.

The creation of a comp.lang.lua Usenet newsgroup was discussed twice in the list over all these years, in April 1998 and in July 1999. The conclusion both times was that the traffic in the list did not warrant the creation of a newsgroup. Moreover, most people preferred a mailing list. The creation of a newsgroup seems no longer relevant because there are several web interfaces for reading and searching the complete list archives.

5.3 Lua 3

The fallback mechanism introduced in Lua 2.1 to support extensible semantics worked quite well but it was a global mechanism: there was only one hook for each event. This made it difficult to share or reuse code because modules that defined fallbacks for the same event could not co-exist easily. Following a suggestion by Stephan Herrmann in December 1996, in Lua 3.0 (July 1997) we solved the fallback clash problem by replacing fallbacks with tag methods: the hooks were attached to pairs (event, tag) instead of just to events. Tags had been introduced in Lua 2.1 as integer labels that could be attached to userdata (see §6.10); the intention was that C objects of the same type would be represented in Lua by userdata having the same tag. (However, Lua did not force any interpretation on tags.) In Lua 3.0 we extended tags to all values to support tag methods. The evolution of fallbacks is discussed in §6.8.

Lua 3.1 (July 1998) brought functional programming to Lua by introducing anonymous functions and function closures via “upvalues”. (Full lexical scoping had to wait until Lua 5.0; see §6.6.) The introduction of closures was mainly motivated by the existence of higher-order functions, such as gsub, which took functions as arguments. During the work on Lua 3.1, there were discussions in the mailing list about multithreading and cooperative multitasking, mainly motivated by the changes Bret Mogilefsky had made to Lua 2.5 and 3.1 alpha for Grim Fandango. No conclusions were reached, but the topic remained popular. Cooperative multitasking in Lua was finally provided in Lua 5.0 (April 2003); see §6.7.

The C API remained largely unchanged from Lua 1.0 to Lua 3.2; it worked over an implicit Lua state. However, newer applications, such as web services, needed multiple states. To mitigate this problem, Lua 3.1 introduced multiple independent Lua states that could be switched at run time. A fully reentrant API would have to wait until Lua 4.0. In the meantime, two unofficial versions of Lua 3.2 with explicit Lua states appeared: one written in 1998 by Roberto Ierusalimschy and Anna Hester based on Lua 3.2 alpha for CGILua [26], and one written in 1999 by Erik Hougaard based on Lua 3.2 final. Erik’s version was publicly available and was used in the Crazy Ivan robot. The version for CGILua was released only as part of the CGILua distribution; it never existed as an independent package.

Lua 3.2 (July 1999) itself was mainly a maintenance release; it brought no novelties except for a debug library that allowed tools to be written in Lua instead of C. Nevertheless, Lua was quite stable by then and Lua 3.2 had a long life. Because the next version (Lua 4.0) introduced a new, incompatible API, many users just stayed with Lua 3.2 and never migrated to Lua 4.0. For instance, Tecgraf never migrated to Lua 4.0, opting to move directly to Lua 5.0; many products at Tecgraf still use Lua 3.2.
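
The motivation given for closures in Lua 3.1, namely higher-order functions such as gsub that take functions as arguments, can be illustrated with a minimal sketch. It is written in present-day Lua (5.1 or later, with full lexical scoping and string.gsub), not in the upvalue syntax of Lua 3.1, and the helper name make_expander is hypothetical.

```lua
-- Sketch only: make_expander is a hypothetical helper, not a library function.
-- It returns two closures that share the local variables 'vars' and 'count'.
local function make_expander(vars)
  local count = 0                         -- captured by both closures below

  local function expand(text)
    -- gsub calls the anonymous function once per match of "$name";
    -- the function's return value becomes the replacement text.
    local result = string.gsub(text, "%$(%w+)", function (name)
      count = count + 1
      return tostring(vars[name])
    end)
    return result                         -- drop gsub's second result (match count)
  end

  local function replaced()
    return count
  end

  return expand, replaced
end

local expand, replaced = make_expander({ lang = "Lua", year = 1993 })
print(expand("$lang was born in $year"))  --> Lua was born in 1993
print(replaced())                         --> 2
```

The replacement function needs access to vars and count, which are neither its parameters nor globals; this is the kind of access that closures provide.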
5.4 Lua 4

Lua 4.0 was released in November 2000. As mentioned above, the main change in Lua 4.0 was a fully reentrant API, motivated by applications that needed multiple Lua states. Since making the API fully reentrant was already a major change, we took the opportunity and completely redesigned the API around a clear stack metaphor for exchanging values with C (see §6.9). This was first suggested by Reuben Thomas in July 2000.

Lua 4.0 also introduced a ‘for’ statement, then a top item in the wish-list of most Lua users and a frequent topic in the mailing list. We had not included a ‘for’ statement earlier because ‘while’ loops were more general. However, users complained that they kept forgetting to update the control variable at the end of ‘while’ loops, thus leading to infinite loops. Also, we could not agree on a good syntax. We considered the Modula ‘for’ too restrictive because it did not cover iterations over the elements of a table or over the lines of a file. A ‘for’ loop in the C tradition did not fit with the rest of Lua. With the introduction of closures and anonymous functions in Lua 3.1, we decided to use higher-order functions for implementing iterations. So, Lua 3.1 provided a higher-order function that iterated over a table by calling a user-supplied function over all pairs in the table. To print all pairs in a table t, one simply said ‘foreach(t,print)’.

In Lua 4.0 we finally designed a ‘for’ loop, in two variants: a numeric loop and a table-traversal loop (first suggested by Michael Spalinski in October 1997). These two variants covered most common loops; for a really generic loop, there was still the ‘while’ loop. Printing all pairs in a table t could then be done as follows:8

    for k,v in t do
      print(k,v)
    end

8 With the introduction of ‘for’ iterators in Lua 5.0, this syntax was marked as obsolete and later removed in Lua 5.1.

The addition of a ‘for’ statement was a simple one but it did change the look of Lua programs. In particular, Roberto had to rewrite many examples in his draft book on Lua programming. Roberto had been writing this book since 1998, but he could never finish it because Lua was a moving target. With the release of Lua 4.0, large parts of the book and almost all its code snippets had to be rewritten.

Soon after the release of Lua 4.0, we started working on Lua 4.1. Probably the main issue we faced for Lua 4.1 was whether and how to support multithreading, a big issue at that time. With the growing popularity of Java and Pthreads, many programmers began to consider support for multithreading as an essential feature in any programming language. However, for us, supporting multithreading in Lua posed serious questions. First, to implement multithreading in C requires primitives that are not part of ANSI C; although Pthreads was popular, there were (and still there are) many platforms without this library. Second, and more important, we did not (and still do not) believe in the standard multithreading model, which is preemptive concurrency with shared memory: we still think that no one can write correct programs in a language where ‘a=a+1’ is not deterministic.

For Lua 4.1, we tried to solve those difficulties in a typical Lua fashion: we implemented only a basic mechanism of multiple stacks, which we called threads. External libraries could use those Lua threads to implement multithreading, based on a support library such as Pthreads. The same mechanism could be used to implement coroutines, in the form of non-preemptive, collaborative multithreading. Lua 4.1 alpha was released in July 2001 with support for external multithreading and coroutines; it also introduced support for weak tables and featured a register-based virtual machine, with which we wanted to experiment.

The day after Lua 4.1 alpha was released, John D. Ramsdell started a big discussion in the mailing list about lexical scoping. After several dozen messages, it became clear that Lua needed full lexical scoping, instead of the upvalue mechanism adopted since Lua 3.1. By October 2001 we had come up with an efficient implementation of full lexical scoping, which we released as a work version in November 2001. (See §6.6 for a detailed discussion of lexical scoping.) That version also introduced a new hybrid representation for tables that let them be implemented as arrays when appropriate (see §6.2 for further details). Because that version implemented new basic algorithms, we decided to release it as a work version, even though we had already released an alpha version for Lua 4.1.

In February 2002 we released a new work version for Lua 4.1, with three relevant novelties: a generic ‘for’ loop based on iterator functions, metatables and metamethods as a replacement for tags and fallbacks9 (see §6.8), and coroutines (see §6.7). After that release, we realized that Lua 4.1 would bring too many major changes; perhaps ‘Lua 5.0’ would be a better name for the next version.

9 The use of ordinary Lua tables for implementing extensible semantics had already been suggested by Stephan Herrmann in December 1996, but we forgot all about it until it was suggested again by Edgar Toernig in October 2000, as part of a larger proposal, which he called ‘unified methods’. The term ‘metatable’ was suggested by Rici Lake in November 2001.

5.5 Lua 5

The final blow to the name ‘Lua 4.1’ came a few days later, during the Lua Library Design Workshop organized by Christian Lindig and Norman Ramsey at Harvard. One of the main conclusions of the workshop was that Lua needed some kind of module system. Although we had always considered that modules could be implemented using tables, not even the standard Lua libraries followed this path. We then decided to take that step for the next version.
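
The idea that modules can be implemented using tables can be illustrated with a minimal sketch in present-day Lua. The module name vec and its functions are hypothetical and chosen only for illustration; the sketch shows the general technique of storing functions as fields of a table, not the module system that Lua eventually adopted.

```lua
-- Sketch only: 'vec' and its functions are hypothetical names.
-- A module is an ordinary table whose fields hold functions.
local vec = {}

function vec.new(x, y)
  return { x = x, y = y }
end

function vec.add(a, b)
  return vec.new(a.x + b.x, a.y + b.y)
end

function vec.dot(a, b)
  return a.x * b.x + a.y * b.y
end

-- Clients reach every function through the table, so the module
-- adds a single name ('vec') to the enclosing scope.
local p = vec.add(vec.new(1, 2), vec.new(3, 4))
print(p.x, p.y)                                --> 4    6
print(vec.dot(vec.new(1, 2), vec.new(3, 4)))   --> 11
```

Because the module is a plain value, it can be passed around, stored under another name, or extended with new fields like any other table.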
Packaging library functions inside tables had a big practical impact, because it affected any program that used at least one library function. For instance, the old strfind function was now called string.find (field ‘find’ in string library stored in the ‘string’ table); openfile became io.open; sin became math.sin; and so on. To make the transition easier, we provided a compatibility script that defined the old functions in terms of the new ones:

    strfind = string.find
    openfile = io.open
    sin = math.sin
    ...

Nevertheless, packaging libraries in tables was a major change. In June 2002, when we released the next work version incorporating this change, we dropped the name ‘Lua 4.1’ and named it ‘Lua 5.0 work0’. Progress to the final version was steady from then on and Lua 5.0 was released in April 2003. This release froze Lua enough to allow Roberto to finish his book, which was published in December 2003 [27].

Soon after the release of Lua 5.0 we started working on Lua 5.1. The initial motivation was the implementation of incremental garbage collection in response to requests from game developers. Lua uses a traditional mark-and-sweep garbage collector, and, until Lua 5.0, garbage collection was performed atomically. As a consequence, some applications might experience potentially long pauses during garbage collection.10 At that time, our main concern was that adding the write barriers needed to implement an incremental garbage collector would have a negative impact on Lua performance. To compensate for that we tried to make the collector generational as well. We also wanted to keep the adaptive behavior of the old collector, which adjusted the frequency of collection cycles according to the total memory in use. Moreover, we wanted to keep the collector simple, like the rest of Lua.

10 Erik Hougaard reported that the Crazy Ivan robot would initially drive off course when Lua performed garbage collection (which could take a half second, but that was enough). To stay in course, they had to stop both motors and pause the robot during garbage collection.

We worked on the incremental generational garbage collector for over a year. But since we did not have access to applications with strong memory requirements (like games), it was difficult for us to test the collector in real scenarios. From March to December 2004 we released several work versions trying to get concrete feedback on the performance of the collector in real applications. We finally received reports of bizarre memory-allocation behavior, which we later managed to reproduce but not explain. In January 2005, Mike Pall, an active member of the Lua community, came up with memory-allocation graphs that explained the problem: in some scenarios, there were subtle interactions between the incremental behavior, the generational behavior, and the adaptive behavior, such that the collector “adapted” for less and less frequent collections. Because it was getting too complicated and unpredictable, we gave up the generational aspect and implemented a simpler incremental collector in Lua 5.1.

During that time, programmers had been experimenting with the module system introduced in Lua 5.0. New packages started to be produced, and old packages migrated to the new system. Package writers wanted to know the best way to build modules. In July 2005, during the development of Lua 5.1, an international Lua workshop organized by Mark Hamburg was held at Adobe in San Jose. (A similar workshop organized by Wim Couwenberg and Daniel Silverstone was held in September 2006 at Océ in Venlo.) One of the presentations was about the novelties of Lua 5.1, and there were long discussions about modules and packages. As a result, we made a few small but significant changes in the module system. Despite our “mechanisms, not policy” guideline for Lua, we defined a set of policies for writing modules and loading packages, and made small changes to support these policies better. Lua 5.1 was released in February 2006. Although the original motivation for Lua 5.1 was incremental garbage collection, the improvement in the module system was probably the most visible change. On the other hand, that incremental garbage collection remained invisible shows that it succeeded in avoiding long pauses.

6. Feature evolution

In this section, we discuss in detail the evolution of some of the features of Lua.

6.1 Types

Types in Lua have been fairly stable. For a long time, Lua had only six basic types: nil, number, string, table, function, and userdata. (Actually, until Lua 3.0, C functions and Lua functions had different types internally, but that difference was transparent to callers.) The only real change happened in Lua 5.0, which introduced two new types: threads and booleans.

The type thread was introduced to represent coroutines. Like all other Lua values, threads are first-class values. To avoid creating new syntax, all primitive operations on threads are provided by a library.

For a long time we resisted introducing boolean values in Lua: nil was false and anything else was true. This state of affairs was simple and seemed sufficient for our purposes. However, nil was also used for absent fields in tables and for undefined variables. In some applications, it is important to allow table fields to be marked as false but still be seen as present; an explicit false value can be used for this. In Lua 5.0 we finally introduced boolean values true and false. Nil is still treated as false. In retrospect, it would probably have been better if nil raised an error in boolean expressions, as it does in other expressions. This would be more consistent with its role as proxy for undefined values. However, such a change would probably break many existing programs. LISP has similar problems, with the empty list representing both nil and false. Scheme explicitly represents false and treats the empty list as true, but some implementations of Scheme still treat the empty list as false.

6.2 Tables

Lua 1.1 had three syntactical constructs to create tables: ‘@()’, ‘@[]’, and ‘@{}’. The simplest form was ‘@()’, which created an empty table. An optional size could be given at creation time, as an efficiency hint. The form ‘@[]’ was used to create arrays, as in ‘@[2,4,9,16,25]’. In such tables, the keys were implicit natural numbers starting at 1. The form ‘@{}’ was used to create records, as in ‘@{name="John",age=35}’. Such tables were sets of key-value pairs in which the keys were explicit strings. A table created with any of those forms could be modified dynamically after creation, regardless of how it had been created. Moreover, it was possible to provide user functions when creating lists and records, as in ‘@foo[]’ or ‘@foo{}’. This syntax was inherited from SOL and was the expression of procedural data description, a major feature of Lua (see §2). The semantics was that a table was created and then the function was called with that table as its single argument. The function was allowed to check and modify the table at will, but its return values were ignored: the table was the final value of the expression.

In Lua 2.1, the syntax for table creation was unified and simplified: the leading ‘@’ was removed and the only constructor became ‘{· · ·}’. Lua 2.1 also allowed mixed constructors, such as

    grades{8.5, 6.0, 9.2; name="John", major="math"}

in which the array part was separated from the record part by a semicolon. Finally, ‘foo{· · ·}’ became sugar for ‘foo({· · ·})’. In other words, table constructors with functions became ordinary function calls. As a consequence, the function had to explicitly return the table (or whatever value it chose). Dropping the ‘@’ from constructors was a trivial change, but it actually changed the feel of the language, not merely its looks. Trivial changes that improve the feel of a language are not to be overlooked.

This simplification in the syntax and semantics of table constructors had a side-effect, however. In Lua 1.1, the equality operator was ‘=’. With the unification of table constructors in Lua 2.1, an expression like ‘{a=3}’ became ambiguous, because it could mean a table with either a pair ("a", 3) or a pair (1, b), where b is the value of the equality ‘a=3’. To solve this ambiguity, in Lua 2.1 we changed the equality operator from ‘=’ to ‘==’. With this change, ‘{a=3}’ meant a table with the pair ("a", 3), while ‘{a==3}’ meant

graf, this was not a fatal move: existing programs were easily converted with the aid of ad-hoc tools that we wrote for this task.

The syntax for table constructors has since remained mostly unchanged, except for an addition introduced in Lua 3.1: keys in the record part could be given by any expression, by enclosing the expression inside brackets, as in ‘{[10*x+f(y)]=47}’. In particular, this allowed keys to be arbitrary strings, including reserved words and strings with spaces. Thus, ‘{function=1}’ is not valid (because ‘function’ is a reserved word), but ‘{["function"]=1}’ is valid. Since Lua 5.0, it is also possible to freely intermix the array part and the record part, and there is no need to use semicolons in table constructors.

While the syntax of tables has evolved, the semantics of tables in Lua has not changed at all: tables are still associative arrays and can store arbitrary pairs of values. However, frequently in practice tables are used solely as arrays (that is, with consecutive integer keys) or solely as records (that is, with string keys). Because tables are the only data-structuring mechanism in Lua, we have invested much effort in implementing them efficiently inside Lua’s core. Until Lua 4.0, tables were implemented as pure hash tables, with all pairs stored explicitly. In Lua 5.0 we introduced a hybrid representation for tables: every table contains a hash part and an array part, and both parts can be empty. Lua detects whether a table is being used as an array and automatically stores the values associated to integer indices in the array part, instead of adding them to the hash part [31]. This division occurs only at a low implementation level; access to table fields is transparent, even to the virtual machine. Tables automatically adapt their two parts according to their contents.

This hybrid scheme has two advantages. First, access to values with integer keys is faster because no hashing is needed. Second, and more important, the array part takes roughly half the memory it would take if it were stored in the hash part, because the keys are implicit in the array part but explicit in the hash part. As a consequence, if a table is being used as an array, it performs as an array, as long as its integer keys are densely distributed. Moreover, no memory or time penalty is paid for the hash part, because it does not even exist. Conversely, if the table is being used as a record and not as an array, then the array part is likely to be empty. These memory savings are important because it is common for a Lua program to create many small tables (e.g., when tables are used to represent objects). Lua tables also handle sparse arrays gracefully: the statement ‘a={[1000000000]=1}’ creates a table with a single entry in its hash part, not an array with one billion elements.

Another reason for investing effort into an efficient im-
a table with the pair (1, b). plementation of tables is that we can use tables for all kinds
These changes made Lua 2.1 incompatible with Lua 1.1 of tasks. For instance, in Lua 5.0 the standard library func-
(hence the change in the major version number). Neverthe- tions, which had existed since Lua 1.1 as global variables,
less, since at that time virtually all Lua users were from Tec-
were moved to fields inside tables (see §5.5). More recently, Lua 5.1 brought a complete package and module system based on tables.

Tables play a prominent role in Lua’s core. On two occasions we have been able to replace special data structures inside the core with ordinary Lua tables: in Lua 4.0 for representing the global environment (which keeps all global variables) and in Lua 5.0 for implementing extensible semantics (see §6.8). Starting with Lua 4.0, global variables are stored in an ordinary Lua table, called the table of globals, a simplification suggested by John Belmonte in April 2000. In Lua 5.0 we replaced tags and tag methods (introduced in Lua 3.0) by metatables and metamethods. Metatables are ordinary Lua tables and metamethods are stored as fields in metatables. Lua 5.0 also introduced environment tables that can be attached to Lua functions; they are the tables where global names in Lua functions are resolved at run time. Lua 5.1 extended environment tables to C functions, userdata, and threads, thus replacing the notion of global environment. These changes simplified both the implementation of Lua and the API for Lua and C programmers, because globals and metamethods can be manipulated within Lua without the need for special functions.

6.3 Strings

Strings play a major role in scripting languages and so the facilities to create and manipulate strings are an important part of the usability of such languages.

The syntax for literal strings in Lua has had an interesting evolution. Since Lua 1.1, a literal string can be delimited by matching single or double quotes, and can contain C-like escape sequences. The use of both single and double quotes to delimit strings with the same semantics was a bit unusual at the time. (For instance, in the tradition of shell languages, Perl expands variables inside double-quoted strings, but not inside single-quoted strings.) While these dual quotes allow strings to contain one kind of quote without having to escape it, escape sequences are still needed for arbitrary text.

Lua 2.2 introduced long strings, a feature not present in classical programming languages, but present in most scripting languages.11 Long strings can run for several lines and do not interpret escape sequences; they provide a convenient way to include arbitrary text as a string, without having to worry about its contents. However, it is not trivial to design a good syntax for long strings, especially because it is common to use them to include arbitrary program text (which may contain other long strings). This raises the question of how long strings end and whether they may nest. Until Lua 5.0, long strings were wrapped inside matching ‘[[· · ·]]’ and could contain nested long strings. Unfortunately, the closing delimiter ‘]]’ could easily be part of a valid Lua program in an unbalanced way, as in ‘a[b[i]]’, or in other contexts, such as ‘<![CDATA[· · ·]]>’ from XML. So, it was hard to reliably wrap arbitrary text as a long string.

11 ‘Long string’ is a Lua term. Other languages use terms such as ‘verbatim text’ or ‘heredoc’.

Lua 5.1 introduced a new form for long strings: text delimited by matching ‘[===[· · ·]===]’, where the number of ‘=’ characters is arbitrary (including zero). These new long strings do not nest: a long string ends as soon as a closing delimiter with the right number of ‘=’ is seen. Nevertheless, it is now easy to wrap arbitrary text, even text containing other long strings or unbalanced ‘]= · · · =]’ sequences: simply use an adequate number of ‘=’ characters.

6.4 Block comments

Comments in Lua are signaled by ‘--’ and continue to the end of the line. This is the simplest kind of comment, and is very effective. Several other languages use single-line comments, with different marks. Languages that use ‘--’ for comments include Ada and Haskell.

We never felt the need for multi-line comments, or block comments, except as a quick way to disable code. There was always the question of which syntax to use: the familiar ‘/* · · · */’ syntax used in C and several other languages does not mesh well with Lua’s single-line comments. There was also the question of whether block comments could nest or not, always a source of noise for users and of complexity for the lexer. Nested block comments happen when programmers want to ‘comment out’ some block of code, to disable it. Naturally, they expect that comments inside the block of code are handled correctly, which can only happen if block comments can be nested.

ANSI C supports block comments but does not allow nesting. C programmers typically disable code by using the C preprocessor idiom ‘#if 0 · · · #endif’. This scheme has the clear advantage that it interacts gracefully with existing comments in the disabled code. With this motivation and inspiration, we addressed the need for disabling blocks of code in Lua (not the need for block comments) by introducing conditional compilation in Lua 3.0 via pragmas inspired by the C preprocessor. Although conditional compilation could be used for block comments, we do not think that it ever was. During work on Lua 4.0, we decided that the support for conditional compilation was not worth the complexity in the lexer and in its semantics for the user, especially after not having reached any consensus about a full macro facility (see §7). So, in Lua 4.0 we removed support for conditional compilation and Lua remained without support for block comments.12

12 A further motivation was that by that time we had found a better way to generate and use debug information, and so the pragmas that controlled this were no longer needed. Removing conditional compilation allowed us to get rid of all pragmas.

Block comments were finally introduced in Lua 5.0, in the form ‘--[[· · ·]]’. Because they intentionally mimicked the syntax of long strings (see §6.3), it was easy to modify the lexer to support block comments. This similarity also helped users to grasp both concepts and their syntax. Block
comments can also be used to disable code: the idiom is to surround the code between two lines containing ‘--[[’ and ‘--]]’. The code inside those lines can be re-enabled by simply adding a single ‘-’ at the start of the first line: both lines then become harmless single-line comments.

Like long strings, block comments could nest, but they had the same problems as long strings. In particular, valid Lua code containing unbalanced ‘]]’s, such as ‘a[b[i]]’, could not be reliably commented out in Lua 5.0. The new scheme for long strings in Lua 5.1 also applies to block comments, in the form of matching ‘--[===[· · ·]===]’, and so provides a simple and robust solution for this problem.

6.5 Functions

Functions in Lua have always been first-class values. A function can be created at run time by compiling and executing a string containing its definition.13 Since the introduction of anonymous functions and upvalues in Lua 3.1, programmers are able to create functions at run time without resorting to compilation from text.

13 Some people maintain that the ability to evaluate code from text at run time and within the environment of the running program is what characterizes scripting languages.

Functions in Lua, whether written in C or in Lua, have no declaration. At call time they accept a variable number of arguments: excess arguments are discarded and missing arguments are given the value nil. (This coincides with the semantics of multiple assignment.) C functions have always been able to handle a variable number of arguments. Lua 2.5 introduced vararg Lua functions, marked by a parameter list ending in ‘...’ (an experimental feature that became official only in Lua 3.0). When a vararg function was called, the arguments corresponding to the dots were collected into a table named ‘arg’. While this was simple and mostly convenient, there was no way to pass those arguments to another function, except by unpacking this table. Because programmers frequently want to just pass the arguments along to other functions, Lua 5.1 allows ‘...’ to be used in argument lists and on the right-hand side of assignments. This avoids the creation of the ‘arg’ table if it is not needed.

The unit of execution of Lua is called a chunk; it is simply a sequence of statements. A chunk in Lua is like the main program in other languages: it can contain both function definitions and executable code. (Actually, a function definition is executable code: an assignment.) At the same time, a chunk closely resembles an ordinary Lua function. For instance, chunks have always had exactly the same kind of bytecode as ordinary Lua functions. However, before Lua 5.0, chunks needed some internal magic to start executing. Chunks began to look like ordinary functions in Lua 2.2, when local variables outside functions were allowed as an undocumented feature (that became official only in Lua 3.1). Lua 2.5 allowed chunks to return values. In Lua 3.0 chunks became functions internally, except that they were executed right after being compiled; they did not exist as functions at the user level. This final step was taken in Lua 5.0, which broke the loading and execution of chunks into two steps, to provide host programmers better control for handling and reporting errors. As a consequence, in Lua 5.0 chunks became ordinary anonymous functions with no arguments. In Lua 5.1 chunks became anonymous vararg functions and thus can be passed values at execution time. Those values are accessed via the new ‘...’ mechanism.

From a different point of view, chunks are like modules in other languages: they usually provide functions and variables to the global environment. Originally, we did not intend Lua to be used for large-scale programming and so we did not feel the need to add an explicit notion of modules to Lua. Moreover, we felt that tables would be sufficient for building modules, if necessary. In Lua 5.0 we made that feeling explicit by packaging all standard libraries into tables. This encouraged other people to do the same and made it easier to share libraries. We now feel that Lua can be used for large-scale programming, especially after Lua 5.1 brought a package system and a module system, both based on tables.

6.6 Lexical scoping

From an early stage in the development of Lua we started thinking about first-class functions with full lexical scoping. This is an elegant construct that fits well within Lua’s philosophy of providing few but powerful constructs. It also makes Lua apt for functional programming. However, we could not figure out a reasonable implementation for full lexical scoping. Since the beginning Lua has used a simple array stack to keep activation records (where all local variables and temporaries live). This implementation had proved simple and efficient, and we saw no reason to change it. When we allow nested functions with full lexical scoping, a variable used by an inner function may outlive the function that created it, and so we cannot use a stack discipline for such variables.

Simple Scheme implementations allocate frames in the heap. Already in 1987, Dybvig [20] described how to use a stack to allocate frames, provided that those frames did not contain variables used by nested functions. His method requires that the compiler know beforehand whether a variable appears as a free variable in a nested function. This does not suit the Lua compiler because it generates code to manipulate variables as soon as it parses an expression; at that moment, it cannot know whether any variable is later used free in a nested function. We wanted to keep this design for implementing Lua, because of its simplicity and efficiency, and so could not use Dybvig’s method. For the same reason, we could not use advanced compiler techniques, such as data-flow analysis.

Currently there are several optimization strategies to avoid using the heap for frames (e.g., [21]), but they all need compilers with intermediate representations, which the Lua compiler does not use. McDermott’s proposal for stack frame allocation [36], which is explicitly addressed to inter-
preters, is the only one we know of that does not require intermediate representation for code generation. Like our current implementation [31], his proposal puts variables in the stack and moves them to the heap on demand, if they go out of scope while being used by a nested closure. However, his proposal assumes that environments are represented by association lists. So, after moving an environment to the heap, the interpreter has to correct only the list header, and all accesses to local variables automatically go to the heap. Lua uses real records as activation records, with local-variable access being translated to direct accesses to the stack plus an offset, and so cannot use McDermott’s method.

For a long time those difficulties kept us from introducing nested first-class functions with full lexical scoping in Lua. Finally, in Lua 3.1 we settled on a compromise that we called upvalues. In this scheme, an inner function cannot access and modify external variables when it runs, but it can access the values those variables had when the function was created. Those values are called upvalues. The main advantage of upvalues is that they can be implemented with a simple scheme: all local variables live in the stack; when a function is created, it is wrapped in a closure containing copies of the values of the external variables used by the function. In other words, upvalues are the frozen values of external variables.14 To avoid misunderstandings, we created a new syntax for accessing upvalues: ‘%varname’. This syntax made it clear that the code was accessing the frozen value of that variable, not the variable itself. Upvalues proved to be very useful, despite being immutable. When necessary, we could simulate mutable external variables by using a table as the upvalue: although we could not change the table itself, we could change its fields. This feature was especially useful for anonymous functions passed to higher-order functions used for table traversal and pattern matching.

14 A year later Java adopted a similar solution to allow inner classes. Instead of freezing the value of an external variable, Java insists that you can only access final variables in inner classes, and so ensures that the variable is frozen.

In December 2000, Roberto wrote in the first draft of his book [27] that “Lua has a form of proper lexical scoping through upvalues.” In July 2001 John D. Ramsdell argued in the mailing list that “a language is either lexically scoped or it is not; adding the adjective ‘proper’ to the phrase ‘lexical scoping’ is meaningless.” That message stirred us to search for a better solution and a way to implement full lexical scoping. By October 2001 we had an initial implementation of full lexical scoping and described it to the list. The idea was to access each upvalue through an indirection that pointed to the stack while the variable was in scope; at the end of the scope a special virtual machine instruction “closed” the upvalue, moving the variable’s value to a heap-allocated space and correcting the indirection to point there. Open closures (those with upvalues still pointing to the stack) were kept in a list to allow their correction and the reuse of open upvalues. Reuse is essential to get the correct semantics. If two closures, sharing an external variable, have their own upvalues, then at the end of the scope each closure will have its own copy of the variable, but the correct semantics dictates that they should share the variable. To ensure reuse, the algorithm that created closures worked as follows: for each external variable used by the closure, it first searched the list of open closures. If it found an upvalue pointing to that external variable, it reused that upvalue; otherwise, it created a new upvalue.

Edgar Toernig, an active member of the Lua community, misunderstood our description of lexical scoping. It turned out that the way he understood it was better than our original idea: instead of keeping a list of open closures, keep a list of open upvalues. Because the number of local variables used by closures is usually smaller than the number of closures using them (the first is statically limited by the program text), his solution was more efficient than ours. It was also easier to adapt to coroutines (which were being implemented at around the same time), because we could keep a separate list of upvalues for each stack. We added full lexical scoping to Lua 5.0 using this algorithm because it met all our requirements: it could be implemented with a one-pass compiler; it imposed no burden on functions that did not access external local variables, because they continued to manipulate all their local variables in the stack; and the cost to access an external local variable was only one extra indirection [31].

6.7 Coroutines

For a long time we searched for some kind of first-class continuations for Lua. This search was motivated by the existence of first-class continuations in Scheme (always a source of inspiration to us) and by demands from game programmers for some mechanism for “soft” multithreading (usually described as “some way to suspend a character and continue it later”).

In 2000, Maria Julia de Lima implemented full first-class continuations on top of Lua 4.0 alpha, as part of her Ph.D. work [35]. She used a simple approach because, like lexical scoping, smarter techniques to implement continuations were too complex compared to the overall simplicity of Lua. The result was satisfactory for her experiments, but too slow to be incorporated in a final product. Nevertheless, her implementation uncovered a problem peculiar to Lua. Since Lua is an extensible extension language, it is possible (and common) to call Lua from C and C from Lua. Therefore, at any given point in the execution of a Lua program, the current continuation usually has parts in Lua mixed with parts in C. Although it is possible to manipulate a Lua continuation (essentially by manipulating the Lua call stack), it is impossible to manipulate a C continuation within ANSI C.

At that time, we did not understand this problem deeply enough. In particular, we could not figure out what the exact restrictions related to C calls were. Lima simply forbade any C calls in her implementation. Again, that solution was
satisfactory for her experiments, but unacceptable for an official Lua version because the ease of mixing Lua code with C code is one of Lua’s hallmarks.

Unaware of this difficulty, in December 2001 Thatcher Ulrich announced in the mailing list:

    I’ve created a patch for Lua 4.0 that makes calls from Lua to Lua non-recursive (i.e., ‘stackless’). This allows the implementation of a ‘sleep()’ call, which exits from the host program [...], and leaves the Lua state in a condition where the script can be resumed later via a call to a new API function, lua_resume.

In other words, he proposed an asymmetric coroutine mechanism, based on two primitives: yield (which he called sleep) and resume. His patch followed the high-level description given in the mailing list by Bret Mogilefsky on the changes made to Lua 2.5 and 3.1 to add cooperative multitasking in Grim Fandango. (Bret could not provide details, which were proprietary.)

Shortly after this announcement, during the Lua Library Design Workshop held at Harvard in February 2002, there was some discussion about first-class continuations in Lua. Some people claimed that, if first-class continuations were deemed too complex, we could implement one-shot continuations. Others argued that it would be better to implement symmetric coroutines. But we could not find a proper implementation of any of these mechanisms that could solve the difficulty related to C calls.

It took us some time to realize why it was hard to implement symmetric coroutines in Lua, and also to understand how Ulrich’s proposal, based on asymmetric coroutines, avoided our difficulties. Both one-shot continuations and symmetric coroutines involve the manipulation of full continuations. So, as long as these continuations include any C part, it is impossible to capture them (except by using facilities outside ANSI C). In contrast, an asymmetric coroutine mechanism based on yield and resume manipulates partial continuations: yield captures the continuation up to the corresponding resume [19]. With asymmetric coroutines, the current continuation can include C parts, as long as they are outside the partial continuation being captured. In other words, the only restriction is that we cannot yield across a C call.

After that realization, and based on Ulrich’s proof-of-concept implementation, we were able to implement asymmetrical coroutines in Lua 5.0. The main change was that the interpreter loop, which executes the instructions for the virtual machine, ceased to be recursive. In previous versions, when the interpreter loop executed a CALL instruction, it called itself recursively to execute the called function. Since Lua 5.0, the interpreter behaves more like a real CPU: when it executes a CALL instruction, it pushes some context information onto a call stack and proceeds to execute the called function, restoring the context when that function returns. After that change, the implementation of coroutines became straightforward.

Unlike most implementations of asymmetrical coroutines, in Lua coroutines are what we call stackfull [19]. With them, we can implement symmetrical coroutines and even the call/1cc operator (call with current one-shot continuation) proposed for Scheme [11]. However, the use of C functions is severely restricted within these implementations.

We hope that the introduction of coroutines in Lua 5.0 marks a revival of coroutines as powerful control structures [18].

6.8 Extensible semantics

As mentioned in §5.2, we introduced extensible semantics in Lua 2.1 in the form of fallbacks as a general mechanism to allow the programmer to intervene whenever Lua did not know how to proceed. Fallbacks thus provided a restricted form of resumable exception handling. In particular, by using fallbacks, we could make a value respond to operations not originally meant for it or make a value of one type behave like a value of another type. For instance, we could make userdata and tables respond to arithmetic operations, userdata behave as tables, strings behave as functions, etc. Moreover, we could make a table respond to keys that were absent in it, which is fundamental for implementing inheritance. With fallbacks for table indexing and a little syntactic sugar for defining and calling methods, object-oriented programming with inheritance became possible in Lua.

Although objects, classes, and inheritance were not core concepts in Lua, they could be implemented directly in Lua, in many flavors, according to the needs of the application. In other words, Lua provided mechanisms, not policy, a tenet that we have tried to follow closely ever since.

The simplest kind of inheritance is inheritance by delegation, which was introduced by Self and adopted in other prototype-based languages such as NewtonScript and JavaScript. The code below shows an implementation of inheritance by delegation in Lua 2.1.

    function Index(a,i)
      -- a missing "parent" field means the chain has ended
      if i == "parent" then
        return nil
      end
      -- otherwise delegate the lookup to the parent table, if any
      local p = a.parent
      if type(p) == "table" then
        return p[i]
      else
        return nil
      end
    end
    setfallback("index", Index)

When a table was accessed for an absent field (be it an attribute or a method), the index fallback was triggered. Inheritance was implemented by setting the index fallback to follow a chain of “parents” upwards, possibly triggering
the index fallback again, until a table had the required field or the chain ended.

After setting that index fallback, the code below printed ‘red’ even though ‘b’ did not have a ‘color’ field:

    a=Window{x=100, y=200, color="red"}
    b=Window{x=300, y=400, parent=a}
    print(b.color)

There was nothing magical or hard-coded about delegation through a “parent” field. Programmers had complete freedom: they could use a different name for the field containing the parent, they could implement multiple inheritance by trying a list of parents, etc. Our decision not to hard-code any of those possible behaviors led to one of the main design concepts of Lua: meta-mechanisms. Instead of littering the language with lots of features, we provided ways for users to program the features themselves, in the way they wanted them, and only for those features they needed.

Fallbacks greatly increased the expressiveness of Lua. However, fallbacks were global handlers: there was only one function for each event that could occur. As a consequence, it was difficult to mix different inheritance mechanisms in the same program, because there was only one hook for implementing inheritance (the index fallback). While this might not be a problem for a program written by a single group on top of its own object system, it became a problem when one group tried to use code from other groups, because their visions of the object system might not be consistent with each other. Hooks for different mechanisms could be chained, but chaining was slow, complicated, error-prone, and not very polite. Fallback chaining did not encourage code sharing and reuse; in practice almost nobody did it. This made it very hard to use third-party libraries.

Lua 2.1 allowed userdata to be tagged. In Lua 3.0 we extended tags to all values and replaced fallbacks with tag methods. Tag methods were fallbacks that operated only on values with a given tag. This made it possible to implement independent notions of inheritance, for instance. No chaining was needed because tag methods for one tag did not affect tag methods for another tag.

The tag method scheme worked very well and lasted until Lua 5.0, when we replaced tags and tag methods by metatables and metamethods. Metatables are just ordinary Lua tables and so can be manipulated within Lua without the need for special functions. Like tags, metatables can be used to represent user-defined types with userdata and tables: all objects of the same “type” should share the same metatable. Unlike tags, metatables and their contents are naturally collected when no references remain to them. (In contrast, tags and their tag methods had to live until the end of the program.) The introduction of metatables also simplified the implementation: while tag methods had their own private representation inside Lua’s core, metatables use mainly the standard table machinery.

The code below shows the implementation of inheritance in Lua 5.0. The index metamethod replaces the index tag method and is represented by the ‘__index’ field in the metatable. The code makes ‘b’ inherit from ‘a’ by setting a metatable for ‘b’ whose ‘__index’ field points to ‘a’. (In general, index metamethods are functions, but we have allowed them to be tables to support simple inheritance by delegation directly.)

    a=Window{x=100, y=200, color="red"}
    b=Window{x=300, y=400}
    setmetatable(b,{ __index = a })
    print(b.color) --> red

6.9 C API

Lua is provided as a library of C functions and macros that allow the host program to communicate with Lua. This API between Lua and C is one of the main components of Lua; it is what makes Lua an embeddable language.

Like the rest of the language, the API has gone through many changes during Lua’s evolution. Unlike the rest of the language, however, the API design received little outside influence, mainly because there has been little research activity in this area.

The API has always been bi-directional because, since Lua 1.0, we have considered calling Lua from C and calling C from Lua equally important. Being able to call Lua from C is what makes Lua an extension language, that is, a language for extending applications through configuration, macros, and other end-user customizations. Being able to call C from Lua makes Lua an extensible language, because we can use C functions to extend Lua with new facilities. (That is why we say that Lua is an extensible extension language [30].) Common to both these aspects are two mismatches between C and Lua to which the API must adjust: static typing in C versus dynamic typing in Lua and manual memory management in C versus automatic garbage collection in Lua.

Currently, the C API solves both difficulties by using an abstract stack15 to exchange data between Lua and C. Every C function called by Lua gets a new stack frame that initially contains the function arguments. If the C function wants to return values to Lua, it pushes those values onto the stack just before returning.

15 Throughout this section, ‘stack’ always means this abstract stack. Lua never accesses the C stack.

Each stack slot can hold a Lua value of any type. For each Lua type that has a corresponding representation in C (e.g., strings and numbers), there are two API functions: an injection function, which pushes onto the stack a Lua value corresponding to the given C value; and a projection function, which returns a C value corresponding to the Lua value at a given stack position. Lua values that have no corresponding representation in C (e.g., tables and functions) can be manipulated via the API by using their stack positions.
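The pairing of injection and projection functions over a stack of dynamically typed slots can be illustrated with a toy model in C. This is only a minimal sketch under stated assumptions, not the Lua API: the names ToyStack, ToyValue, toy_pushnumber, toy_pushstring, toy_type, toy_tonumber, toy_tostring, and toy_demo are hypothetical, the stack has a fixed size, and tables, functions, and garbage collection are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy model, NOT the real Lua API: each slot carries a
   type tag, so a single stack can hold values of any type. */
#define TOY_STACK_SIZE 16

typedef enum { TOY_NIL, TOY_NUMBER, TOY_STRING } ToyType;

typedef struct {
    ToyType type;
    union { double n; const char *s; } u;
} ToyValue;

typedef struct {
    ToyValue slot[TOY_STACK_SIZE];
    int top;                       /* number of slots in use */
} ToyStack;

/* Injection: wrap a C double in a tagged value and push it. */
static void toy_pushnumber(ToyStack *st, double n) {
    assert(st->top < TOY_STACK_SIZE);
    st->slot[st->top].type = TOY_NUMBER;
    st->slot[st->top].u.n = n;
    st->top++;
}

/* Injection: push a C string (the pointer is stored, not copied). */
static void toy_pushstring(ToyStack *st, const char *s) {
    assert(st->top < TOY_STACK_SIZE);
    st->slot[st->top].type = TOY_STRING;
    st->slot[st->top].u.s = s;
    st->top++;
}

/* Type of the value at a 1-based position; TOY_NIL if out of range. */
static ToyType toy_type(const ToyStack *st, int idx) {
    if (idx < 1 || idx > st->top) return TOY_NIL;
    return st->slot[idx - 1].type;
}

/* Projection: C double for the value at idx, or 0 if not a number. */
static double toy_tonumber(const ToyStack *st, int idx) {
    return toy_type(st, idx) == TOY_NUMBER ? st->slot[idx - 1].u.n : 0.0;
}

/* Projection: C string for the value at idx, or NULL if not a string. */
static const char *toy_tostring(const ToyStack *st, int idx) {
    return toy_type(st, idx) == TOY_STRING ? st->slot[idx - 1].u.s : NULL;
}

/* Demo: the same stack holds a number and a string side by side. */
static void toy_demo(void) {
    ToyStack st = { .top = 0 };
    toy_pushnumber(&st, 8.5);
    toy_pushstring(&st, "John");
    printf("%g %s\n", toy_tonumber(&st, 1), toy_tostring(&st, 2));
}
```

Calling toy_demo() from a main function prints "8.5 John". The C code on either side of the stack stays statically typed, while the stack itself carries the dynamic type of each value, which is the sense in which a stack of this kind absorbs the typing mismatch.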
Practically all API functions get their operands from the stack and push their results onto the stack. Since the stack can hold values of any Lua type, these API functions operate with any Lua type, thus solving the typing mismatch. To prevent the collection of Lua values in use by C code, the values in the stack are never collected. When a C function returns, its Lua stack frame vanishes, automatically releasing all Lua values that the C function was using. These values will eventually be collected if no further references to them exist. This solves the memory management mismatch.

It took us a long time to arrive at the current API. To discuss how the API evolved, we use as illustration the C equivalent of the following Lua function:

    function foo(t)
      return t.x
    end

In words, this function receives a single parameter, which should be a table, and returns the value stored at the ‘x’ field in that table. Despite its simplicity, this example illustrates three important issues in the API: how to get parameters, how to index tables, and how to return results.

In Lua 1.0, we would write foo in C as follows:

    void foo_l (void) {
      lua_Object t = lua_getparam(1);
      lua_Object r = lua_getfield(t, "x");
      lua_pushobject(r);
    }

Note that the required value is stored at the string index "x" because ‘t.x’ is syntactic sugar for ‘t["x"]’. Note also that all components of the API start with ‘lua_’ (or ‘LUA_’) to avoid name clashes with other C libraries.

To export this C function to Lua with the name ‘foo’ we would do

    lua_register("foo", foo_l);

After that, foo could be called from Lua code just like any other Lua function:

    t = {x = 200}
    print(foo(t))    --> 200

A key component of the API was the type lua_Object, defined as follows:

    typedef struct Object *lua_Object;

In words, lua_Object was an abstract type that represented Lua values in C opaquely. Arguments given to C functions were accessed by calling lua_getparam, which returned a lua_Object. In the example, we call lua_getparam once to get the table, which is supposed to be the first argument to foo. (Extra arguments are silently ignored.) Once the table is available in C (as a lua_Object), we get the value of its "x" field by calling lua_getfield. This value is also represented in C as a lua_Object, which is finally sent back to Lua by pushing it onto the stack with lua_pushobject.

The stack was another key component of the API. It was used to pass values from C to Lua. There was one push function for each Lua type with a direct representation in C: lua_pushnumber for numbers, lua_pushstring for strings, and lua_pushnil, for the special value nil. There was also lua_pushobject, which allowed C to pass back to Lua an arbitrary Lua value. When a C function returned, all values in the stack were returned to Lua as the results of the C function (functions in Lua can return multiple values).

Conceptually, a lua_Object was a union type, since it could refer to any Lua value. Several scripting languages, including Perl, Python, and Ruby, still use a union type to represent their values in C. The main drawback of this representation is that it is hard to design a garbage collector for the language. Without extra information, the garbage collector cannot know whether a value has a reference to it stored as a union in the C code. Without this knowledge, the collector may collect the value, making the union a dangling pointer. Even when this union is a local variable in a C function, this C function can call Lua again and trigger garbage collection.

Ruby solves this problem by inspecting the C stack, a task that cannot be done in a portable way. Perl and Python solve this problem by providing explicit reference-count functions for these union values. Once you increment the reference count of a value, the garbage collector will not collect that value until you decrement the count to zero. However, it is not easy for the programmer to keep these reference counts right. Not only is it easy to make a mistake, but it is difficult to find the error later (as anyone who has ever debugged memory leaks and dangling pointers can attest). Furthermore, reference counting cannot deal with cyclic data structures that become garbage.

Lua never provided such reference-count functions. Before Lua 2.1, the best you could do to ensure that an unanchored lua_Object was not collected was to avoid calling Lua whenever you had a reference to such a lua_Object. (As long as you could ensure that the value referred to by the union was also stored in a Lua variable, you were safe.)

Lua 2.1 brought an important change: it kept track of all lua_Object values passed to C, ensuring that they were not collected while the C function was active. When the C function returned to Lua, then (and only then) all references to these lua_Object values were released, so that they could be collected.16

More specifically, in Lua 2.1 a lua_Object ceased to be a pointer to Lua’s internal data structures and became an index into an internal array that stored all values that had to be given to C:

    typedef unsigned int lua_Object;

This change made the use of lua_Object reliable: while a value was in that array, it would not be collected by Lua.

16 A similar method is used by JNI to handle “local references”.
When the C function returned, its whole array was erased, and the values used by the function could be collected if possible. (This change also gave more freedom for implementing the garbage collector, because it could move objects if necessary; however, we did not follow this path.)

For simple uses, the Lua 2.1 behavior was very practical: it was safe and the C programmer did not have to worry about reference counts. Each lua_Object behaved like a local variable in C: the corresponding Lua value was guaranteed to be alive during the lifetime of the C function that produced it. For more complex uses, however, this simple scheme had two shortcomings that demanded extra mechanisms: sometimes a lua_Object value had to be locked for longer than the lifetime of the C function that produced it; sometimes it had to be locked for a shorter time.

The first of those shortcomings had a simple solution: Lua 2.1 introduced a system of references. The function lua_lock got a Lua value from the stack and returned a reference to it. This reference was an integer that could be used any time later to retrieve that value, using the lua_getlocked function. (There was also a lua_unlock function, which destroyed a reference.) With such references, it was easy to keep Lua values in non-local C variables.

The second shortcoming was more subtle. Objects stored in the internal array were released only when the function returned. If a function used too many values, it could overflow the array or cause an out-of-memory error. For instance, consider the following higher-order iterator function, which repeatedly calls a function and prints the result until the call returns nil:

    void l_loop (void) {
      lua_Object f = lua_getparam(1);
      for (;;) {
        lua_Object res;
        lua_callfunction(f);
        res = lua_getresult(1);
        if (lua_isnil(res)) break;
        printf("%s\n", lua_getstring(res));
      }
    }

The problem with this code was that the string returned by each call could not be collected until the end of the loop (that is, of the whole C function), thus opening the possibility of array overflow or memory exhaustion. This kind of error can be very difficult to track, and so the implementation of Lua 2.1 set a hard limit on the size of the internal array that kept lua_Object values alive. That made the error easier to track because Lua could say “too many objects in a C function” instead of a generic out-of-memory error, but it did not avoid the problem.

To address the problem, the API in Lua 2.1 offered two functions, lua_beginblock and lua_endblock, that created dynamic scopes (“blocks”) for lua_Object values; all values created after a lua_beginblock were removed from the internal array at the corresponding lua_endblock. However, since a block discipline could not be forced onto C programmers, it was all too common to forget to use these blocks. Moreover, such explicit scope control was a little tricky to use. For instance, a naive attempt to correct our previous example by enclosing the for body within a block would fail: we had to call lua_endblock just before the break, too. This difficulty with the scope of Lua objects persisted through several versions and was solved only in Lua 4.0, when we redesigned the whole API. Nevertheless, as we said before, for typical uses the API was very easy to use, and most programmers never faced the kind of situation described here. More important, the API was safe. Erroneous use could produce well-defined errors, but not dangling references or memory leaks.

Lua 2.1 brought other changes to the API. One was the introduction of lua_getsubscript, which allowed the use of any value to index a table. This function had no explicit arguments: it got both the table and the key from the stack. The old lua_getfield was redefined as a macro, for compatibility:

    #define lua_getfield(o,f) \
      (lua_pushobject(o), lua_pushstring(f), \
       lua_getsubscript())

(Backward compatibility of the C API is usually implemented using macros, whenever feasible.)

Despite all those changes, syntactically the API changed little from Lua 1 to Lua 2. For instance, our illustrative function foo could be written in Lua 2 exactly as we wrote it for Lua 1.0. The meaning of lua_Object was quite different, and lua_getfield was implemented on top of new primitive operations, but for the average user it was as if nothing had changed. Thereafter, the API remained fairly stable until Lua 4.0.

Lua 2.4 expanded the reference mechanism to support weak references. A common design in Lua programs is to have a Lua object (typically a table) acting as a proxy for a C object. Frequently the C object must know who its proxy is and so keeps a reference to the proxy. However, that reference prevents the collection of the proxy object, even when the object becomes inaccessible from Lua. In Lua 2.4, the program could create a weak reference to the proxy; that reference did not prevent the collection of the proxy object. Any attempt to retrieve a collected reference resulted in a special value LUA_NOOBJECT.

Lua 4.0 brought two main novelties in the C API: support for multiple Lua states and a virtual stack for exchanging values between C and Lua. Support for multiple, independent Lua states was achieved by eliminating all global state. Until Lua 3.0, only one Lua state existed and it was implemented using many static variables scattered throughout the code. Lua 3.1 introduced multiple independent Lua states; all static variables were collected into a single C struct. An
API function was added to allow switching states, but only one Lua state could be active at any moment. All other API functions operated over the active Lua state, which remained implicit and did not appear in the calls. Lua 4.0 introduced explicit Lua states in the API. This created a big incompatibility with previous versions.17 All C code that communicated with Lua (in particular, all C functions registered to Lua) had to be changed to include an explicit state argument in calls to the C API. Since all C functions had to be rewritten anyway, we took this opportunity and made another major change in the C–Lua communication in Lua 4.0: we replaced the concept of lua_Object by an explicit virtual stack used for all communication between Lua and C in both directions. The stack could also be used to store temporary values.

In Lua 4.0, our foo example could be written as follows:

    int foo_l (lua_State *L) {
      lua_pushstring(L, "x");
      lua_gettable(L, 1);
      return 1;
    }

The first difference is the function signature: foo_l now receives a Lua state on which to operate and returns the number of values returned by the function in the stack. In previous versions, all values left in the stack when the function ended were returned to Lua. Now, because the stack is used for all operations, it can contain intermediate values that are not to be returned, and so the function needs to tell Lua how many values in the stack to consider as return values. Another difference is that lua_getparam is no longer needed, because function arguments come in the stack when the function starts and can be directly accessed by their index, like any other stack value.

The last difference is the use of lua_gettable, which replaced lua_getsubscript as the means to access table fields. lua_gettable receives the table to be indexed as a stack position (instead of as a Lua object), pops the key from the top of the stack, and pushes the result. Moreover, it leaves the table in the same stack position, because tables are frequently indexed repeatedly. In foo_l, the table used by lua_gettable is at stack position 1, because it is the first argument to that function, and the key is the string "x", which needs to be pushed onto the stack before calling lua_gettable. That call replaces the key in the stack with the corresponding table value. So, after lua_gettable, there are two values in the stack: the table at position 1 and the result of the indexing at position 2, which is the top of the stack. The C function returns 1 to tell Lua to use that top value as the single result returned by the function.

To further illustrate the new API, here is an implementation of our loop example in Lua 4.0:

    int l_loop (lua_State *L) {
      for (;;) {
        lua_pushvalue(L, 1);
        lua_call(L, 0, 1);
        if (lua_isnil(L, -1)) break;
        printf("%s\n", lua_tostring(L, -1));
        lua_pop(L, 1);
      }
      return 0;
    }

To call a Lua function, we push it onto the stack and then push its arguments, if any (none in the example). Then we call lua_call, telling how many arguments to get from the stack (and therefore implicitly also telling where the function is in the stack) and how many results we want from the call. In the example, we have no arguments and expect one result. The lua_call function removes the function and its arguments from the stack and pushes back exactly the requested number of results. The call to lua_pop removes the single result from the stack, leaving the stack at the same level as at the beginning of the loop. For convenience, we can index the stack from the bottom, with positive indices, or from the top, with negative indices. In the example, we use index -1 in lua_isnil and lua_tostring to refer to the top of the stack, which contains the function result.

With hindsight, the use of a single stack in the API seems an obvious simplification, but when Lua 4.0 was released many users complained about the complexity of the new API. Although Lua 4.0 had a much cleaner conceptual model for its API, the direct manipulation of the stack requires some thought to get right. Many users were content to use the previous API without any clear conceptual model of what was going on behind the scenes. Simple tasks did not require a conceptual model at all and the previous API worked quite well for them. More complex tasks often broke whatever private models users had, but most users never programmed complex tasks in C. So, the new API was seen as too complex at first. However, such skepticism gradually vanished, as users came to understand and value the new model, which proved to be simpler and much less error-prone.

The possibility of multiple states in Lua 4.0 created an unexpected problem for the reference mechanism. Previously, a C library that needed to keep some object fixed could create a reference to the object and store that reference in a global C variable. In Lua 4.0, if a C library was to work with several states, it had to keep an individual reference for each state and so could not keep the reference in a global C variable. To solve this difficulty, Lua 4.0 introduced the registry, which is simply a regular Lua table available to C only. With the registry, a C library that wants to keep a Lua object can choose a unique key and associate the object with this key in the registry. Because each independent Lua state has its own registry, the C library can use the same key in each state to manipulate the corresponding object.

17 We provided a module that emulated the 3.2 API on top of the 4.0 API, but we do not think it was used much.
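The per-state registry idea can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyState`, `toy_registry_set`, `toy_registry_get`, and `mylib_key` are hypothetical names, an `int` stands in for an arbitrary Lua value, and using the address of a static variable as the library's unique key is one possible choice rather than something the text prescribes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_REG_MAX 8

/* Hypothetical stand-in for a Lua state: each state owns its own registry. */
typedef struct {
    const void *key[TOY_REG_MAX];
    int value[TOY_REG_MAX];  /* an int stands in for any Lua value */
    int n;                   /* number of entries in use */
} ToyState;

/* Associate value with key in this state's registry.
   Returns 0 on success, -1 if the registry is full. */
static int toy_registry_set(ToyState *L, const void *key, int value) {
    int i;
    for (i = 0; i < L->n; i++) {
        if (L->key[i] == key) {      /* key already present: overwrite */
            L->value[i] = value;
            return 0;
        }
    }
    if (L->n == TOY_REG_MAX) return -1;
    L->key[L->n] = key;
    L->value[L->n] = value;
    L->n++;
    return 0;
}

/* Look up key in this state's registry.
   Returns 1 and stores the value in *out if found, 0 otherwise. */
static int toy_registry_get(const ToyState *L, const void *key, int *out) {
    int i;
    for (i = 0; i < L->n; i++) {
        if (L->key[i] == key) {
            *out = L->value[i];
            return 1;
        }
    }
    return 0;
}

/* Assumed convention: the address of a static variable is unique in the
   program, so a library can use it as the same key in every state. */
static const char mylib_key = 0;
```

The library holds no per-state data in C globals: the one global it has is the key, and the value it maps to differs from state to state.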
We could quite easily implement the original reference mechanism on top of the registry by using integer keys to represent references. To create a new reference, we just find an unused integer key and store the value at that key. Retrieving a reference becomes a simple table access. However, we could not implement weak references using the registry. So, Lua 4.0 kept the previous reference mechanism. In Lua 5.0, with the introduction of weak tables in the language, we were finally able to eliminate the reference mechanism from the core and move it to a library.

The C API has slowly evolved toward completeness. Since Lua 4.0, all standard library functions can be written using only the C API. Until then, Lua had a number of built-in functions (from 7 in Lua 1.1 to 35 in Lua 3.2), most of which could have been written using the C API but were not because of a perceived need for speed. A few built-in functions could not have been written using the C API because the C API was not complete. For instance, until Lua 3.2 it was not possible to iterate over the contents of a table using the C API, although it was possible to do it in Lua using the built-in function next. The C API is not yet complete and not everything that can be done in Lua can be done in C; for instance, the C API lacks functions for performing arithmetic operations on Lua values. We plan to address this issue in the next version.

6.10 Userdata

Since its first version, an important feature of Lua has been its ability to manipulate C data, which is provided by a special Lua data type called userdata. This ability is an essential component in the extensibility of Lua.

For Lua programs, the userdata type has undergone no changes at all throughout Lua’s evolution: although userdata are first-class values, userdata is an opaque type and its only valid operation in Lua is equality test. Any other operation over userdata (creation, inspection, modification) must be provided by C functions.

For C functions, the userdata type has undergone several changes in Lua’s evolution. In Lua 1.0, a userdata value was a simple void* pointer. The main drawback of this simplicity was that a C library had no way to check whether a userdata was valid. Although Lua code cannot create userdata values, it can pass userdata created by one library to another library that expects pointers to a different structure. Because C functions had no mechanisms to check this mismatch, the result of this pointer mismatch was usually fatal to the application. We have always considered it unacceptable for a Lua program to be able to crash the host application. Lua should be a safe language.

To overcome the pointer mismatch problem, Lua 2.1 introduced the concept of tags (which would become the seed for tag methods in Lua 3.0). A tag was simply an arbitrary integer value associated with a userdata. A userdata’s tag could only be set once, when the userdata was created. Provided that each C library used its own exclusive tag, C code could easily ensure that a userdata had the expected type by checking its tag. (The problem of how a library writer chose a tag that did not clash with tags from other libraries remained open. It was only solved in Lua 3.0, which provided tag management via lua_newtag.)

A bigger problem with Lua 2.1 was the management of C resources. More often than not, a userdata pointed to a dynamically allocated structure in C, which had to be freed when its corresponding userdata was collected in Lua. However, userdata were values, not objects. As such, they were not collected (in the same way that numbers are not collected). To overcome this restriction, a typical design was to use a table as a proxy for the C structure in Lua, storing the actual userdata in a predefined field of the proxy table. When the table was collected, its finalizer would free the corresponding C structure.

This simple solution created a subtle problem. Because the userdata was stored in a regular field of the proxy table, a malicious user could tamper with it from within Lua. Specifically, a user could make a copy of the userdata and use the copy after the table was collected. By that time, the corresponding C structure had been destroyed, making the userdata a dangling pointer, with disastrous results. To improve the control of the life cycle of userdata, Lua 3.0 changed userdata from values to objects, subject to garbage collection. Users could use the userdata finalizer (the garbage-collection tag method) to free the corresponding C structure. The correctness of Lua’s garbage collector ensured that a userdata could not be used after being collected.

However, userdata as objects created an identity problem. Given a userdata, it is trivial to get its corresponding pointer, but frequently we need to do the reverse: given a C pointer, we need to get its corresponding userdata.18 In Lua 2, two userdata with the same pointer and the same tag would be equal; equality was based on their values. So, given the pointer and the tag, we had the userdata. In Lua 3, with userdata being objects, equality was based on identity: two userdata were equal only when they were the same userdata (that is, the same object). Each userdata created was different from all others. Therefore, a pointer and a tag would not be enough to get the corresponding userdata.

To solve this difficulty, and also to reduce incompatibilities with Lua 2, Lua 3 adopted the following semantics for the operation of pushing a userdata onto the stack: if Lua already had a userdata with the given pointer and tag, then that userdata was pushed on the stack; otherwise, a new userdata was created and pushed on the stack. So, it was easy for C code to translate a C pointer to its corresponding userdata in Lua. (Actually, the C code could be the same as it was in Lua 2.)

18 A typical scenario for this need is the handling of callbacks in a GUI toolkit. The C callback associated with a widget gets only a pointer to the widget, but to pass this callback to Lua we need the userdata that represents that widget in Lua.
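The Lua 3 rule for pushing a userdata (reuse the existing object with the same pointer and tag, otherwise create one) can be sketched as a toy search-or-create routine. This is an illustration under assumptions, not Lua's code: `ToyUserdata`, `ToyHeap`, and `toy_pushuserdata` are hypothetical names, and a small fixed pool with a linear scan stands in for Lua's real object storage and lookup.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_UD_MAX 8

/* Hypothetical userdata object: its identity is its address in the pool. */
typedef struct {
    void *ptr;  /* the C pointer the userdata wraps */
    int tag;    /* library-chosen tag, fixed at creation */
} ToyUserdata;

typedef struct {
    ToyUserdata pool[TOY_UD_MAX];
    int n;      /* number of userdata created so far */
} ToyHeap;

/* Models the single Lua 3 primitive: search for a userdata with the
   same (ptr, tag); if none exists, create one. Returns NULL only when
   the toy pool is full. */
static ToyUserdata *toy_pushuserdata(ToyHeap *h, void *ptr, int tag) {
    int i;
    for (i = 0; i < h->n; i++) {
        if (h->pool[i].ptr == ptr && h->pool[i].tag == tag)
            return &h->pool[i];          /* found: same object again */
    }
    if (h->n == TOY_UD_MAX) return NULL;
    h->pool[h->n].ptr = ptr;             /* not found: create */
    h->pool[h->n].tag = tag;
    return &h->pool[h->n++];
}
```

Calling `toy_pushuserdata` twice with the same pointer and tag yields the same object, so identity-based equality still behaves like the value-based equality of Lua 2 from the C programmer's point of view.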
However, Lua 3 behavior had a major drawback: it combined into a single primitive (lua_pushuserdata) two basic operations: userdata searching and userdata creation. For instance, it was impossible to check whether a given C pointer had a corresponding userdata without creating that userdata. Also, it was impossible to create a new userdata regardless of its C pointer. If Lua already had a userdata with that value, no new userdata would be created.

Lua 4 mitigated that drawback by introducing a new function, lua_newuserdata. Unlike lua_pushuserdata, this function always created a new userdata. Moreover, what was more important at that time, those userdata were able to store arbitrary C data, instead of pointers only. The user would tell lua_newuserdata the amount of memory to be allocated and lua_newuserdata returned a pointer to the allocated area.

By having Lua allocate memory for the user, several common tasks related to userdata were simplified. For instance, C code did not need to handle memory-allocation errors, because they were handled by Lua. More important, C code did not need to handle memory deallocation: memory used by such userdata was released by Lua automatically, when the userdata was collected.

However, Lua 4 still did not offer a nice solution to the search problem (i.e., finding a userdata given its C pointer). So, it kept the lua_pushuserdata operation with its old behavior, resulting in a hybrid system. It was only in Lua 5 that we removed lua_pushuserdata and dissociated userdata creation and searching. Actually, Lua 5 removed the searching facility altogether. Lua 5 also introduced light userdata, which store plain C pointer values, exactly like regular userdata in Lua 1. A program can use a weak table to associate C pointers (represented as light userdata) with their corresponding “heavy” userdata in Lua.

As is usual in the evolution of Lua, userdata in Lua 5 is more flexible than it was in Lua 4; it is also simpler to explain and simpler to implement. For simple uses, which only require storing a C structure, userdata in Lua 5 is trivial to use. For more complex needs, such as those that require mapping a C pointer back to a Lua userdata, Lua 5 offers the mechanisms (light userdata and weak tables) for users to implement strategies suited to their applications.

6.11 Reflectivity

Since its very first version Lua has supported some reflective facilities. A major reason for this support was the proposed use of Lua as a configuration language to replace SOL. As described in §4, our idea was that the programmer could use the language itself to write type-checking routines, if needed.

For instance, if a user wrote something like

    T = @track{ y=9, x=10, id="1992-34" }

we wanted to be able to check that the track did have a y field and that this field was a number. We also wanted to be able to check that the track did not have extraneous fields (possibly to catch typing mistakes). For these two tasks, we needed access to the type of a Lua value and a mechanism to traverse a table and visit all its pairs.

Lua 1.0 provided the needed functionality with only two functions, which still exist: type and next. The type function returns a string describing the type of any given value ("number", "nil", "table", etc.). The next function receives a table and a key and returns a “next” key in the table (in an arbitrary order). The call next(t,nil) returns a “first” key. With next we can traverse a table and process all its pairs. For instance, the following code prints all pairs in a table t:19

    k = next(t,nil)
    while k do
      print(k,t[k])
      k = next(t,k)
    end

Both these functions have a simple implementation: type checks the internal tag of the given value and returns the corresponding string; next finds the given key in the table and then goes to the next key, following the internal table representation.

In languages like Java and Smalltalk, reflection must reify concepts like classes, methods, and instance variables. Moreover, that reification demands new concepts like metaclasses (the class of a reified class). Lua needs nothing like that. In Lua, most facilities provided by the Java reflective package come for free: classes and modules are tables, methods are functions. So, Lua does not need any special mechanism to reify them; they are plain program values. Similarly, Lua does not need special mechanisms to build method calls at run time (because functions are first-class values and Lua’s parameter-passing mechanism naturally supports calling a function with a variable number of arguments), and it does not need special mechanisms to access a global variable or an instance variable given its name (because they are regular table fields).20

19 Although this code still works, the current idiom is ‘for k,v in pairs(t) do print(k,v) end’.

20 Before Lua 4.0, global variables were stored in a special data structure inside the core, and we provided a nextvar function to traverse it. Since Lua 4.0, global variables are stored in a regular Lua table and nextvar is no longer needed.

7. Retrospect

In this section we give a brief critique of Lua’s evolutionary process, discussing what has worked well, what we regret, and what we do not really regret but could have done differently.

One thing that has worked really well was the early decision (made in Lua 1.0) to have tables as the sole data-structuring mechanism in Lua. Tables have proved to be powerful and efficient. The central role of tables in the language and in its implementation is one of the main characteristics of Lua. We have resisted user pressure to include other data structures, mainly “real” arrays and tuples, first by being stubborn, but also by providing tables with an efficient implementation and a flexible design. For instance, we can represent a set in Lua by storing its elements as indices of a table. This is possible only because Lua tables accept any value as index.

Another thing that has worked well was our insistence on portability, which was initially motivated by the diverse platforms of Tecgraf’s clients. This allowed Lua to be compiled for platforms we had never dreamed of supporting. In particular, Lua’s portability is one of the reasons that Lua has been widely adopted for developing games. Restricted environments, such as game consoles, tend not to support the complete semantics of the full standard C library. By gradually reducing the dependency of Lua’s core on the standard C library, we are moving towards a Lua core that requires only a free-standing ANSI C implementation. This move aims mainly at embedding flexibility, but it also increases portability. For instance, since Lua 3.1 it is easy to change a few macros in the code to make Lua use an application-specific memory allocator, instead of relying on malloc and friends. Starting with Lua 5.1, the memory allocator can be provided dynamically when creating a Lua state.

With hindsight, we consider that being raised by a small committee has been very positive for the evolution of Lua. Languages designed by large committees tend to be too complicated and never quite fulfill the expectations of their sponsors. Most successful languages are raised rather than designed. They follow a slow bottom-up process, starting as a small language with modest goals. The language evolves as a consequence of actual feedback from real users, from which design flaws surface and new features that are actually useful are identified. This describes the evolution of Lua quite well. We listen to users and their suggestions, but we include a new feature in Lua only when all three of us agree; otherwise, it is left for the future. It is much easier to add features later than to remove them. This development process has been essential to keep the language simple, and simplicity is our most important asset. Most other qualities of Lua (speed, small size, and portability) derive from its simplicity.

Since its first version Lua has had real users, that is, users other than ourselves, who care not about the language itself but only about how to use it productively. Users have always given important contributions to the language, through suggestions, complaints, use reports, and questions. Again, our small committee plays an important role in managing this feedback: its structure gives us enough inertia to listen closely to users without having to follow all their suggestions.

Lua is best described as a closed-development, open-source project. This means that, even though the source code is freely available for scrutiny and adaption, Lua is not developed in a collaborative way. We do accept user suggestions, but never their code verbatim. We always try to do our own implementation.

Another unusual aspect of Lua’s evolution has been our handling of incompatible changes. For a long time we considered simplicity and elegance more important than compatibility with previous versions. Whenever an old feature was superseded by a new one, we simply removed the old feature. Frequently (but not always), we provided some sort of compatibility aid, such as a compatibility library, a conversion script, or (more recently) compile-time options to preserve the old feature. In any case, the user had to take some measures when moving to a new version.

Some upgrades were a little traumatic. For instance, Tecgraf, Lua’s birthplace, never upgraded from Lua 3.2 to Lua 4.0 because of the big changes in the API. Currently, a few Tecgraf programs have been updated to Lua 5.0, and new programs are written in this version, too. But Tecgraf still has a large body of code in Lua 3.2. The small size and simplicity of Lua alleviates this problem: it is easy for a project to keep to an old version of Lua, because the project group can do its own maintenance of the code, when necessary.

We do not really regret this evolution style. Gradually, however, we have become more conservative. Not only is our user and code base much larger than it once was, but also we feel that Lua as a language is much more mature.

We should have introduced booleans from the start, but we wanted to start with the simplest possible language. Not introducing booleans from the start had a few unfortunate side-effects. One is that we now have two false values: nil and false. Another is that a common protocol used by Lua functions to signal errors to their callers is to return nil followed by an error message. It would have been better if false had been used instead of nil in that case, with nil being reserved for its primary role of signaling the absence of any useful value.

Automatic coercion of strings to numbers in arithmetic operations, which we took from Awk, could have been omitted. (Coercion of numbers to strings in string operations is convenient and less troublesome.)

Despite our “mechanisms, not policy” rule, which we have found valuable in guiding the evolution of Lua, we should have provided a precise set of policies for modules and packages earlier. The lack of a common policy for building modules and installing packages prevents different groups from sharing code and discourages the development of a community code base. Lua 5.1 provides a set of policies for modules and packages that we hope will remedy this situation.

As mentioned in §6.4, Lua 3.0 introduced support for conditional compilation, mainly motivated to provide a means to disable code. We received many requests for enhancing conditional compilation in Lua, even by people who did not
use it! By far the most popular request was for a full macro
processor like the C preprocessor. Providing such a macro
processor in Lua would be consistent with our general
philosophy of providing extensible mechanisms. However, we
would like it to be programmable in Lua, not in some other
specialized language. We did not want to add a macro facility
directly into the lexer, to avoid bloating it and slowing
compilation. Moreover, at that time the Lua parser was not
fully reentrant, and so there was no way to call Lua from
within the lexer. (This restriction was removed in Lua 5.1.)
So endless discussions ensued in the mailing list and within
the Lua team. But no consensus was ever reached and no
solution emerged. We still have not completely dismissed the
idea of providing Lua with a macro system: it would give
Lua extensible syntax to go with extensible semantics.

8. Conclusion

Lua has been used successfully in many large companies,
such as Adobe, Bombardier, Disney, Electronic Arts, Intel,
LucasArts, Microsoft, NASA, Olivetti, and Philips. Many of
these companies have shipped Lua embedded into commercial
products, often exposing Lua scripts to end users.

Lua has been especially successful in games. It was said
recently that “Lua is rapidly becoming the de facto standard
for game scripting” [37]. Two informal polls [5, 6] conducted
by gamedev.net (an important site for game programmers)
in September 2003 and in June 2006 showed Lua as
the most popular scripting language for game development.
Roundtables dedicated to Lua in game development were
held at GDC in 2004 and 2006. Many famous games use
Lua: Baldur’s Gate, Escape from Monkey Island, FarCry,
Grim Fandango, Homeworld 2, Illarion, Impossible Creatures,
Psychonauts, The Sims, World of Warcraft. There are
two books on game development with Lua [42, 25], and several
other books on game development devote chapters to
Lua [23, 44, 41, 24].

The wide adoption of Lua in games came as a surprise to
us. We did not have game development as a target for Lua.
(Tecgraf is mostly concerned with scientific software.) With
hindsight, however, that success is understandable because
all the features that make Lua special are important in game
development:

Portability: Many games run on non-conventional platforms,
such as game consoles, that need special development
tools. An ANSI C compiler is all that is needed
to build Lua.

Ease of embedding: Games are demanding applications.
They need both performance, for their graphics and simulations,
and flexibility, for the creative staff. Not by chance,
many games are coded in (at least) two languages, one
for scripting and the other for coding the engine. Within
that framework, the ease of integrating Lua with another
language (mainly C++, in the case of games) is a big
advantage.

Simplicity: Most game designers, scripters and level writers
are not professional programmers. For them, a language
with simple syntax and simple semantics is particularly
important.

Efficiency and small size: Games are demanding applications;
the time allotted to running scripts is usually quite
small. Lua is one of the fastest scripting languages [1].
Game consoles are restricted environments. The script
interpreter should be parsimonious with resources. The Lua
core takes about 100K.

Control over code: Unlike most other software enterprises,
game production involves little evolution. In many cases,
once a game has been released, there are no updates or
new versions, only new games. So, it is easier to risk
using a new scripting language in a game. Whether the
scripting language will evolve or how it will evolve is not
a crucial point for game developers. All they need is the
version they used in the game. Since they have complete
access to the source code of Lua, they can simply keep
the same Lua version forever, if they so choose.

Liberal license: Most commercial games are not open
source. Some game companies even refuse to use any
kind of open-source code. The competition is hard, and
game companies tend to be secretive about their technologies.
For them, a liberal license like the Lua license
is quite convenient.

Coroutines: It is easier to script games if the scripting
language supports multitasking because a character or
activity can be suspended and resumed later. Lua supports
cooperative multitasking in the form of coroutines [14].

Procedural data files: Lua’s original design goal of providing
powerful data-description facilities allows games to
use Lua for data files, replacing special-format textual
data files with many benefits, especially homogeneity and
expressiveness.

Acknowledgments

Lua would never be what it is without the help of many
people. Everyone at Tecgraf has contributed in different
forms: using the language, discussing it, disseminating it
outside Tecgraf. Special thanks go to Marcelo Gattass, head
of Tecgraf, who always encouraged us and gave us complete
freedom over the language and its implementation. Lua is
no longer a Tecgraf product but it is still developed inside
PUC-Rio, in the LabLua laboratory created in May 2004.

Without users Lua would be just yet another language,
destined to oblivion. Users and their uses are the ultimate
test for a language. Special thanks go to the members of our
mailing list, for their suggestions, complaints, and patience.
The mailing list is relatively small, but it is very friendly and
contains some very strong technical people who are not part
of the Lua team but who generously share their expertise
with the whole community.

We thank Norman Ramsey for suggesting that we write
a paper on Lua for HOPL III and for making the initial
contacts with the conference chairs. We thank Julia Lawall
for thoroughly reading several drafts of this paper and for
carefully handling this paper on behalf of the HOPL III
committee. We thank Norman Ramsey, Julia Lawall, Brent
Hailpern, Barbara Ryder, and the anonymous referees for
their detailed comments and helpful suggestions.

We also thank André Carregal, Anna Hester, Bret
Mogilefsky, Bret Victor, Daniel Collins, David Burgess,
Diego Nehab, Eric Raible, Erik Hougaard, Gavin Wraith,
John Belmonte, Mark Hamburg, Peter Sommerfeld, Reuben
Thomas, Stephan Herrmann, Steve Dekorte, Taj Khattra, and
Thatcher Ulrich for complementing our recollection of the
historical facts and for suggesting several improvements to
the text. Katrina Avery did a fine copy-editing job.

Finally, we thank PUC-Rio, IMPA, and CNPq for their
continued support of our work on Lua, and FINEP and
Microsoft Research for supporting several projects related
to Lua.

References
[1] The computer language shootout benchmarks.
http://shootout.alioth.debian.org/.
[2] Lua projects. http://www.lua.org/uses.html.
[3] The MIT license.
http://www.opensource.org/licenses/mit-license.html.
[4] Timeline of programming languages.
http://en.wikipedia.org/wiki/Timeline_of_programming_languages.
[5] Which language do you use for scripting in your game
engine? http://www.gamedev.net/gdpolls/viewpoll.asp?ID=163,
Sept. 2003.
[6] Which is your favorite embeddable scripting language?
http://www.gamedev.net/gdpolls/viewpoll.asp?ID=788, June 2006.
[7] K. Beck. Extreme Programming Explained: Embrace
Change. Addison-Wesley, 2000.
[8] G. Bell, R. Carey, and C. Marrin. The Virtual Reality
Modeling Language Specification, Version 2.0.
http://www.vrml.org/VRML2.0/FINAL/, Aug. 1996.
(ISO/IEC CD 14772).
[9] J. Bentley. Programming pearls: associative arrays.
Communications of the ACM, 28(6):570–576, 1985.
[10] J. Bentley. Programming pearls: little languages.
Communications of the ACM, 29(8):711–721, 1986.
[11] C. Bruggeman, O. Waddell, and R. K. Dybvig. Representing
control in the presence of one-shot continuations. In
SIGPLAN Conference on Programming Language Design
and Implementation, pages 99–107, 1996.
[12] W. Celes, L. H. de Figueiredo, and M. Gattass. EDG: uma
ferramenta para criação de interfaces gráficas interativas.
In Proceedings of SIBGRAPI ’95 (Brazilian Symposium on
Computer Graphics and Image Processing), pages 241–248,
1995.
[13] B. Davis, A. Beatty, K. Casey, D. Gregg, and J. Waldron. The
case for virtual register machines. In Proceedings of the 2003
Workshop on Interpreters, Virtual Machines and Emulators,
pages 41–49. ACM Press, 2003.
[14] L. H. de Figueiredo, W. Celes, and R. Ierusalimschy.
Programming advanced control mechanisms with Lua
coroutines. In Game Programming Gems 6, pages 357–369.
Charles River Media, 2006.
[15] L. H. de Figueiredo, R. Ierusalimschy, and W. Celes. The
design and implementation of a language for extending
applications. In Proceedings of XXI SEMISH (Brazilian
Seminar on Software and Hardware), pages 273–284, 1994.
[16] L. H. de Figueiredo, R. Ierusalimschy, and W. Celes. Lua:
an extensible embedded language. Dr. Dobb’s Journal,
21(12):26–33, Dec. 1996.
[17] L. H. de Figueiredo, C. S. Souza, M. Gattass, and L. C. G.
Coelho. Geração de interfaces para captura de dados sobre
desenhos. In Proceedings of SIBGRAPI ’92 (Brazilian
Symposium on Computer Graphics and Image Processing),
pages 169–175, 1992.
[18] A. de Moura, N. Rodriguez, and R. Ierusalimschy. Coroutines
in Lua. Journal of Universal Computer Science, 10(7):910–925,
2004.
[19] A. L. de Moura and R. Ierusalimschy. Revisiting coroutines.
MCC 15/04, PUC-Rio, 2004.
[20] R. K. Dybvig. Three Implementation Models for Scheme.
PhD thesis, Department of Computer Science, University
of North Carolina at Chapel Hill, 1987. Technical Report
#87-011.
[21] M. Feeley and G. Lapalme. Closure generation based on
viewing LAMBDA as EPSILON plus COMPILE. Journal of
Computer Languages, 17(4):251–267, 1992.
[22] T. G. Gorham and R. Ierusalimschy. Um sistema de
depuração reflexivo para uma linguagem de extensão.
In Anais do I Simpósio Brasileiro de Linguagens de
Programação, pages 103–114, 1996.
[23] T. Gutschmidt. Game Programming with Python, Lua, and
Ruby. Premier Press, 2003.
[24] M. Harmon. Building Lua into games. In Game Programming
Gems 5, pages 115–128. Charles River Media, 2005.
[25] J. Heiss. Lua Scripting für Spieleprogrammierer. Hit the
Ground with Lua. Stefan Zerbst, Dec. 2005.
[26] A. Hester, R. Borges, and R. Ierusalimschy. Building flexible
and extensible web applications with Lua. Journal of
Universal Computer Science, 4(9):748–762, 1998.
[27] R. Ierusalimschy. Programming in Lua. Lua.org, 2003.
[28] R. Ierusalimschy. Programming in Lua. Lua.org, 2nd edition,
2006.
[29] R. Ierusalimschy, W. Celes, L. H. de Figueiredo, and
R. de Souza. Lua: uma linguagem para customização de
aplicações. In VII Simpósio Brasileiro de Engenharia de
Software — Caderno de Ferramentas, page 55, 1993.
[30] R. Ierusalimschy, L. H. de Figueiredo, and W. Celes. Lua:
an extensible extension language. Software: Practice &
Experience, 26(6):635–652, 1996.
[31] R. Ierusalimschy, L. H. de Figueiredo, and W. Celes. The
implementation of Lua 5.0. Journal of Universal Computer
Science, 11(7):1159–1176, 2005.
[32] R. Ierusalimschy, L. H. de Figueiredo, and W. Celes. Lua 5.1
Reference Manual. Lua.org, 2006.
[33] K. Jung and A. Brown. Beginning Lua Programming. Wrox,
2007.
[34] L. Lamport. LaTeX: A Document Preparation System.
Addison-Wesley, 1986.
[35] M. J. Lima and R. Ierusalimschy. Continuações em Lua.
In VI Simpósio Brasileiro de Linguagens de Programação,
pages 218–232, June 2002.
[36] D. McDermott. An efficient environment allocation scheme
in an interpreter for a lexically-scoped LISP. In ACM
conference on LISP and functional programming, pages 154–
162, 1980.
[37] I. Millington. Artificial Intelligence for Games. Morgan
Kaufmann, 2006.
[38] B. Mogilefsky. Lua in Grim Fandango.
http://www.grimfandango.net/?page=articles&pagenumber=2,
May 1999.
[39] Open Software Foundation. OSF/Motif Programmer’s Guide.
Prentice-Hall, Inc., 1991.
[40] J. Ousterhout. Tcl: an embeddable command language. In
Proc. of the Winter 1990 USENIX Technical Conference.
USENIX Association, 1990.
[41] D. Sanchez-Crespo. Core Techniques and Algorithms in
Game Programming. New Riders Games, 2003.
[42] P. Schuytema and M. Manyen. Game Development with Lua.
Delmar Thomson Learning, 2005.
[43] A. van Deursen, P. Klint, and J. Visser. Domain-specific
languages: an annotated bibliography. SIGPLAN Notices,
35(6):26–36, 2000.
[44] A. Varanese. Game Scripting Mastery. Premier Press, 2002.
