A Cursory Overview of The Rust Programming Language


Department of Computing Science

Umeå University 1st March 2016

A cursory overview of the Rust programming language

Filip Allberg ([email protected])


Adam Dahlgren Lindström ([email protected])
Patrik Hörnquist ([email protected])
Jakob Lindqvist ([email protected])
Emil Marklund ([email protected])

http://rustbyexample.github.io/examples/hello/README.html

// hello.rs
// the main function
fn main() {
    // print to the console
    println!("Hello World!");
}

$ rustc hello.rs

$ ./hello
Hello World!

Programming Languages (Programspråk VT16), 7.5 hp
5DV086 HT15, 7.5 hp


Supervisor(s): Petter Ericson ([email protected]), Jan-Erik Moström ([email protected])

Contents

Introduction
1 Background
2 Design
    2.1 Ownership
    2.2 Manual memory management
    2.3 Type-system
    2.4 Concurrency
3 Influences
    3.1 Functional elements
    3.2 Memory management
    3.3 Concurrency
4 Use cases
    4.1 Servo layout engine
    4.2 Webrender
    4.3 Redox
5 Benchmarks
6 Discussion

Introduction

This report is written by a group of students at Umeå University for the course Programming Languages (Programspråk). The goal is to research the programming language Rust and present its main features, what kind of software engineering it is targeted at, and how well it performs.

1 Background

Rust is a general-purpose, multi-paradigm, compiled programming language developed by Mozilla Research. It was initially published in 2010, with a first stable version appearing in May 2015[1]. It specifically targets the domain currently dominated by C and C++, namely systems programming, but differentiates itself by having been developed to prevent some of the problems related to invalid memory accesses (which generate segmentation faults)[2].

    "Rust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety."

The Rust website primarily mentions the following features:[2]

• Move semantics.
• Guaranteed memory safety.
• Threads without data races.
• Pattern matching.
• Type inference.
• Minimal runtime.
• Efficient C bindings.
• Trait-based generics, which allow for Venn-diagram style implementations.

Rust is developed openly on GitHub, https://github.com/rust-lang, and uses GitHub to maintain its requests for changes, i.e. feature requests: https://github.com/rust-lang/rfcs.

The initial compiler (written in OCaml) has been replaced by a self-hosting compiler written in Rust. Known as rustc, it successfully compiled itself in 2011. At the time of writing, Rust is primarily used by Mozilla Corporation in an attempt to parallelize Firefox. There are currently two projects, a new layout engine and a new renderer.[3][4]

Historically, programming languages have had a trade-off: you can have a language which is safe, but you give up control, or you can have a language with control, but it is unsafe. C++ falls into the latter category. While modern C++ is significantly safer than it used to be, there are fundamental aspects of C++ which make it impossible to ever be truly safe. Rust attempts to give you a language with 100% control that is nevertheless safe. It does this through extensive compile-time checking, so you pay little and often no cost at runtime for its safety features. This safety issue is not just about programmer convenience, though. C++ is unsafe in ways which cause serious security vulnerabilities.

For example, in the last Pwn2Own competition, major browsers had some serious vulnerabilities due to C++ being unsafe. Many of these errors would have been compile-time errors if those browsers were written in Rust, and the runtime ones would not have produced a security vulnerability.

Rust is vaguely functional, but it is not full-blown. OCaml is better as a functional language than Rust. That said, this might make it easy to start using some functional concepts, because you do not have to go whole-hog. This is why OCaml is often easier than Haskell, for example.

You can use Rust anywhere you could use C. There are practical reasons why you may choose not to; for example, there are not nearly as many libraries. But Rust can work well with C libraries as it has efficient C bindings, which is not the case for how other languages, such as Java, call C code.
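Several of the features listed in the Background section (pattern matching, type inference and trait-based generics) can be seen together in a few lines. The following is only an illustrative sketch: the names `Shape`, `Area` and `total_area` are hypothetical and do not come from the report or from any library.

```rust
// An algebraic data type, taken apart below with pattern matching.
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
}

// Hypothetical trait used to illustrate trait-based generics.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // Pattern matching: the compiler checks that every variant is handled.
        match *self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rectangle { width, height } => width * height,
        }
    }
}

// Generic over any type that implements `Area`.
fn total_area<T: Area>(items: &[T]) -> f64 {
    items.iter().map(|item| item.area()).sum()
}

fn main() {
    // Type inference: no type annotation is needed for `shapes`.
    let shapes = vec![
        Shape::Rectangle { width: 2.0, height: 3.0 },
        Shape::Circle { radius: 1.0 },
    ];
    println!("total area: {}", total_area(&shapes));
}
```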



2 Design

This section is an overview of Rust's design, in particular the specifics that are unique to the Rust language. We have omitted details of the primitive aspects of the language which are borrowed from languages in the same family.

2.1 Ownership

One of Rust's most distinctive features is called ownership. This means that variables own the allocated data resources (such as vectors, stacks and other objects) they are bound to. As a result, every data resource is owned by exactly one variable binding.

For instance, suppose we have the code in Code snippet 1. Variable a allocates memory on the heap for an integer (boxing means wrapping some data into an object), and then the ownership moves from variable a to variable b. This is called move semantics in general programming terms. After this, trying to use variable a would result in a compile-error.

    let a = Box::new(23); // a now owns a boxed int
    let b = a;            // ownership moves to b
    println!("{}", a);    // error

Code snippet 1: Ownership example, move semantics

The same behaviour applies to functions. If we pass a variable as an argument to a function, the function will consume its ownership. This means that the code in Code snippet 2 also gives a compile-error.

    fn main() {
        let a = Box::new(23);
        do_something(a); // give away ownership
        println!("The number is {}", a); // error
    }

    // i takes over ownership
    fn do_something(i: Box<i32>) {
        // something is done here
    }

Code snippet 2: Functions take ownership of their passed arguments

Because of this, the ownership system alone would make programming difficult. Luckily, you can also pass references to functions, thus keeping the original ownership. This is known as borrowing in Rust and will be explained more in subsection 2.3.

It is also possible to copy data as opposed to moving it. For instance, this is the case for all primitive data types, such as integers and floating point numbers. They implement a trait called Copy, which can be implemented for other data types as well. If we had not boxed the integer in Code snippet 1, then the Copy trait would kick in and have variable b own a copy of the value, instead of the actual value. Thus, printing variable a would be a valid operation.[5][6]

2.2 Manual memory management

Rust strives for zero-cost abstractions, of which its memory management is an excellent example. Rust uses manual memory management in the sense that the programmer is in complete control of which memory is being allocated, just like in C.[7] But unlike in C, programmers might not be aware of how much control they have, because Rust automatically knows when to allocate and free memory. Memory management has been abstracted away from the programmer's view, so they do not even notice it. In other words, Rust feels like a high-level language, but can be used as a low-level language as well.

Manual memory management is possible thanks to the ownership feature. Whenever a variable goes out of scope, Rust frees the variable and all data it is bound to. This is the reason any data must be owned by exactly one variable. It is quite easy to see why. Suppose we have two variables owning the same data. If the first variable goes out of scope, both the variable itself and the data it owned are automatically freed. Now, trying to use the second variable would be very dangerous because it owns data that does not exist anymore! This is why we got a compile-error earlier in Code snippet 1. As a consequence of ownership, it is always safe to free the data related to a variable when that variable goes out of scope.
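The rules from sections 2.1 and 2.2 (copying, borrowing, moving, and freeing at the end of a scope) can be tried out in one small program. This is a minimal sketch that is not taken from the report: the names `Resource`, `describe` and `demo` are hypothetical, and the shared log exists only to make the order in which values are freed observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records when it is freed.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    // Runs automatically when the owning variable goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("freed {}", self.name));
    }
}

// Borrows the resource; the caller keeps ownership.
fn describe(r: &Resource) -> String {
    format!("using {}", r.name)
}

fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));

    // Copy: integers are copied, so `a` stays usable after the assignment.
    let a = 23;
    let b = a;
    log.borrow_mut().push(format!("a = {}, b = {}", a, b));

    let outer = Resource { name: "outer", log: Rc::clone(&log) };
    {
        let inner = Resource { name: "inner", log: Rc::clone(&log) };
        let msg = describe(&inner); // borrow, no move
        log.borrow_mut().push(msg);
    } // `inner` goes out of scope here and is freed

    let moved = outer; // ownership moves; `outer` can no longer be used
    let msg = describe(&moved);
    log.borrow_mut().push(msg);
    drop(moved); // explicit early free; otherwise freed at the end of the scope

    let entries = log.borrow().clone();
    entries
}

fn main() {
    for line in demo() {
        println!("{}", line);
    }
}
```

Uncommenting a use of `outer` after the line `let moved = outer;` would reproduce the compile-error of Code snippet 1.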

2.3 Type-system

Rust uses affine types[8] and regions (a specific kind of affine type), which are not found in many other real-world languages, to ensure that there are no data races, no accesses to memory after it has been freed, and so on.

Rust is a language that focuses on the lifetimes of data rather than on memory management or garbage collection, and it has a lot of (built-in) support for concurrency. The type-system underpins this focus.

The type system is complex (but not complicated), but does not affect the end-user much beyond compile-time guarantees. For other languages with complex type-systems, the general consensus seems to be that those systems somewhat get in the way of a quick development cycle.

Early versions of Rust had three "realms" in which objects could be allocated: the stack, the local heap, and the exchange heap. These realms had corresponding pointer types: the borrowed pointer &T, the shared box @T, and the unique box ~T. In stable Rust the @T and ~T notations no longer exist; the unique box is written Box<T>, and borrowed pointers are called references.

In February 2015 Eric Reed at the University of Washington proved the soundness and correctness of the combined type-system in Rust. Previously it was only known that its constituent parts were sound, and it remained unknown whether they stayed sound together. He also proved that the static compile-time analyzer, the Borrow Checker, is correct.[8]

Boxes are what is called an affine type. This means that the Rust compiler, at compile time, determines when the box comes into and goes out of scope, and inserts the appropriate calls there. Furthermore, boxes are a specific kind of affine type, known as a region.

You don't need to fully grok the theory of affine types or regions to grok boxes, though. As a rough approximation, you can treat this Rust code, which demonstrates a box:

    {
        let x = Box::new(5);

        // stuff happens
    }

as being similar to this C code:

    {
        int *x;
        x = (int *)malloc(sizeof(int));
        *x = 5;

        // stuff happens

        free(x);
    }

It is through regions and boxes that Rust can provide its compile-time guarantees. Regions and boxes are where ownership and borrowing come from. They make it impossible to allocate the incorrect amount of memory, because Rust figures it out from the types.

You cannot forget to free memory you've allocated, because Rust does it for you.

Rust ensures that this free happens at the right time, when the memory is truly no longer used. Use-after-free is not possible. Rust enforces that no other writeable pointers alias this heap memory, which means writing to an invalid pointer is not possible.

Using boxes and references together is very common. For example:

    fn add_one(x: &i32) -> i32 {
        *x + 1
    }

    fn main() {
        let x = Box::new(5);

        println!("{}", add_one(&*x));
    }

In this case, Rust knows that x is being 'borrowed' by the add_one() function, and since it is only reading the value, allows it.
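The borrowing rules behind this example can be exercised in a short program. The sketch below is written against current stable Rust and is not part of the original report; `increment` and `demo` are hypothetical helper names introduced only for illustration.

```rust
// Reads through a shared (immutable) borrow.
fn add_one(x: &i32) -> i32 {
    *x + 1
}

// Writes through a mutable borrow, which requires exclusive access.
fn increment(x: &mut i32) {
    *x += 1;
}

fn demo() -> (i32, i32) {
    // `mut` is needed because `increment` will modify the boxed value.
    let mut x = Box::new(5);

    // Any number of shared borrows may coexist.
    let first = &*x;
    let second = &*x;
    let sum = add_one(first) + add_one(second);

    // The shared borrows are no longer used, so a mutable borrow is allowed.
    increment(&mut *x);

    (sum, *x)
} // `x` goes out of scope here and the heap allocation is freed

fn main() {
    let (sum, value) = demo();
    println!("sum = {}, value = {}", sum, value);
}
```

Moving the call to `increment` between the creation of `first` and its use in `add_one` would be rejected by the borrow checker, because a mutable borrow may not overlap with a shared borrow that is still in use.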

Meanwhile, regions pertain to the lifetime of objects with respect to their scope.

We can borrow x immutably any number of times:

    fn add_one(x: &i32) -> i32 {
        *x + 1
    }

    fn main() {
        let x = Box::new(5);

        println!("{}", add_one(&*x));
        println!("{}", add_one(&*x));
        println!("{}", add_one(&*x));
    }

A mutable borrow is more restricted. This will fail at compile time, because x was not declared mutable:

    fn add_one(x: &mut i32) -> i32 {
        *x + 1
    }

    fn main() {
        let x = Box::new(5);

        // error: cannot borrow `*x` as mutable,
        // as `x` is not declared as mutable
        println!("{}", add_one(&mut *x));
    }

Notice we changed the signature of add_one() to request a mutable reference.

2.4 Concurrency

Rust was created to make it simpler to write code that is both safe and efficient compared with C/C++. Concepts such as ownership and borrowing are not only a way to avoid memory errors, they are also what makes Rust suited for parallel programming. A thread takes ownership of a resource in a way that makes it inaccessible to others. This section gives an overview of the way concurrency and threads are used in Rust.

Channels are an asynchronous way for threads to communicate. By declaring a sender and a receiver, an expensive and time-consuming task can be done on a separate thread. When the task is completed, the result can be fetched from the receiver.

What if you want to use many threads to work together on the same task? Rust has familiar synchronization tools. A lock/mutex can contain a resource in order to allow several threads to use it safely. Even if the idea of locks is similar to C, they are not exactly the same. In C there is no explicit relation between the lock and its contents; the lock only controls one or several blocks of code. In Rust, a mutex is a generic type (i.e. Mutex<Type>) which directly contains the data.

    "Lock data, not code" is enforced in Rust.[9]

Rust still does not allow more than one owner of an object. How can the lock be shared between threads? Arc (atomically reference counted) is another generic type that exists to solve this problem. An Arc lets references be cloned and shared with new owners. In general, reference counters are a way to handle multiple references to the same data while avoiding "use after free". No owner may directly deallocate the resource. Instead, the deallocation takes place when the Arc determines that no references still exist (i.e. all owners have gone out of scope). Figure 1 visualizes how a reference counter works. In order to share a mutable resource, an Arc can wrap a mutex which in turn wraps the resource.

Figure 1: Reference counter
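The two mechanisms described in section 2.4, a channel carrying a result back from a worker thread and an Arc wrapping a Mutex that wraps shared data, might be used as in the following sketch. It relies only on the standard library (`std::sync::mpsc`, `std::sync::Arc`, `std::sync::Mutex`, `std::thread`); the function names `sum_on_worker` and `count_with_threads` are hypothetical and chosen for illustration.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Runs a (pretend) expensive task on another thread and fetches the
// result from the receiving end of a channel.
fn sum_on_worker(numbers: Vec<u64>) -> u64 {
    let (sender, receiver) = mpsc::channel();
    let worker = thread::spawn(move || {
        // The closure takes ownership of `numbers` and `sender`.
        let total: u64 = numbers.iter().sum();
        sender.send(total).expect("receiver was dropped");
    });
    let result = receiver.recv().expect("worker sent no result");
    worker.join().expect("worker panicked");
    result
}

// Lets several threads update one counter: the Mutex contains the data,
// and the Arc lets every thread own a handle to the same Mutex.
fn count_with_threads(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();
    for _ in 0..threads {
        let counter = Arc::clone(&counter); // bumps the reference count
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // The data is only reachable through the lock guard.
                *counter.lock().unwrap() += 1;
            }
        }));
    }
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", sum_on_worker(vec![1, 2, 3, 4]));
    println!("{}", count_with_threads(4, 1000));
}
```

Because the counter can only be reached through `lock()`, forgetting to take the lock is a compile-time error rather than a data race, which is the "lock data, not code" idea in practice.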
Channels are an asynchronous way for threads to

These features are not embedded in the language itself; instead they are implemented in libraries. This makes them easier to replace, since updates or changes to them do not require a language update.

Reference counters (as well as mutexes etc.) are not unique to Rust. However, the programmer is pushed to write safe programs, as Rust will not allow faulty code to compile. In other words: what is considered good practice in other languages is required in Rust.

3 Influences

As expected, Rust draws its inspiration from many other programming languages[10]. As Rust is truly a multi-paradigm language, supporting both functional and object-oriented approaches, the sources are widely spread over the spectrum of paradigms.

It will be apparent from the following sections that Rust looks to other languages that focus on, for example, concurrency when designing its own version of concurrency. In other words, Rust avoids reinventing the wheel as much as it can.

3.1 Functional elements

Rust presents many of the common traits inherent to the functional paradigm. First off, the ML dialects OCaml and SML are two big influences, whose take on algebraic data types, pattern matching and type inference can be directly linked to Rust.

Haskell provides the base for how typeclasses and type families are implemented, whereas Swift contributes optional bindings.

3.2 Memory management

One of the key selling points behind Rust is, as previously stated, safe memory management. In the attempt to be an alternative to C++, the developers have used the same memory model, with similar uses of references and smart pointers.

Rust also implements Resource Acquisition Is Initialization (RAII), a powerful concept that binds the lifetime of a resource to the scope of the variable holding it[11]. Taken from C++, this gives safer memory management, as destructors take care of deallocation implicitly when leaving a scope.

In order to compete with C++ in terms of speed, monomorphization is implemented in Rust[12]. It allows for generic functions but compiles these to specialized versions depending on how the function is used in the code (e.g. one version handling ints, one handling floats). In C++ this is seen as templates.

While C++ is the rough model for memory management, ML Kit and Cyclone have provided the region-based memory management aspect of Rust. Region-based memory management is used to be able to do static verification of the lifetimes of references, removing any runtime overhead and thus improving speed.

However, what is interesting is that this is not traditional region-based memory management: it is combined with scope-based resource management and is used to prevent dangling pointers.

3.3 Concurrency

Many of the concurrency issues that need to be solved in a programming language are concerned with memory management. However, thread communication is another issue to be addressed.

As with memory management and C++, Rust draws from the highly concurrent language Erlang to solve these communication problems. For instance, Rust uses message passing similar to that of Erlang, meaning that threads do not share memory. Rust combines the message passing of Erlang with the channel concept used in Newsqueak, which has been a big influence alongside Erlang in terms of concurrency. Newsqueak is a predecessor to Go, a language competing with Rust as a replacement for C++.
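The monomorphization mentioned in section 3.2 can be illustrated with a single generic function. This is a minimal sketch: the function `double` is hypothetical, and the comments describe what the compiler is expected to generate rather than something the program can observe at runtime.

```rust
use std::ops::Add;

// A generic function: for every concrete type it is called with, the
// compiler emits a separate, specialized copy (much like a C++ template).
fn double<T: Add<Output = T> + Copy>(value: T) -> T {
    value + value
}

fn main() {
    let a = double(21);      // instantiates double::<i32>
    let b = double(1.5_f64); // instantiates double::<f64>
    println!("{} {}", a, b);
}
```

Since each call site is bound to its specialized copy at compile time, no dynamic dispatch is involved, which is how generics stay a zero-cost abstraction.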

4 Use cases

4.1 Servo layout engine

Rust began its development as a project at Mozilla Corporation in parallel with the development of the Servo layout engine. Mozilla has been using Gecko as a layout engine since the late 1990s. From earlier experience with the Gecko engine, Mozilla had come to a series of conclusions:

• A large proportion (roughly 50%) of all security-critical bugs are memory errors in the form of use after free, use of uninitialized variables etc.[3]

• Modern hardware usually has many processor cores. Earlier browser engines have mostly been built and optimized for single-processor execution.

As Rust comes with enforced memory safety, Servo will hopefully be free from such memory errors.

A key feature in Rust that has proven useful when developing Servo is language interoperability with C code, enabling the Servo project to make use of old libraries that provide useful functionality at an early stage of development.

4.2 Webrender

In parallel with the Servo layout engine, Mozilla is researching a new renderer for the web, called Webrender. The idea is to move processing load from the CPU to dedicated graphics processing units (GPUs), providing a more responsive web experience. The renderer is also supposed to optimize rendering for ordinary web use-cases.

The renderer is still in quite early development, but it is capable of rendering some ordinary web pages, such as Wikipedia, at several hundred frames per second on modern hardware.[13]

4.3 Redox

Redox is an operating system written in Rust.[14] Rust has been chosen as the programming language to enforce memory-safe usage. While still in early development, Redox has already implemented lots of the core functionality that Linux has.

5 Benchmarks

There are not many references that pit languages against each other in a rigorous fashion, although there are some that come close. They fail to address the ways in which they may be misrepresentative. Primarily, the reader should note that when benchmarking languages, the theoretical speed limit of a language is not what is being measured. Instead, for each of a number of known CPU-intensive computations, a single implementation that solves the problem is written and compiled in each language, and the run-times of these implementations are compared with each other.

One good example of the underlying implementation misrepresenting the relative comparison of languages is regex_dna. There, the C implementation calls outside of the C language to a more powerful regex engine, while the Rust implementation relies on the regex engine in its core library, which is not as optimized for the task, and subsequently its runtime takes a hit (curiously, the Rust implementation still beats the C implementation).

Another is fasta_redux, where the Rust implementation is single-threaded and the C implementation is multi-threaded.

Our benchmark source is http://benchmarksgame.alioth.debian.org/u64q/rust.html (Rust 1.5).

[Stacked bar chart. Legend: Rust: rustc 1.6.0 (c30b771ad 2016-01-19); C gcc: gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010; C++ g++: g++ (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010. Benchmark programs on the horizontal axis include fasta, k-nucleotide, mandelbrot, regex-dna, pidigits, binary-trees, reverse-complement, fannkuch-redux, n-body, spectral-norm and fasta-redux.]

Figure 2: Runtimes of each language as stacked bars to compare relative run-time

[Bar chart with the same compilers and benchmark programs as Figure 2, with additional lines for the mean Rust runtime, the mean C runtime and the mean C++ runtime.]

Figure 3: Runtime of each language for each test and the respective mean runtimes

Sadly, we found no reliable benchmarks comparing Rust to D. Since D claims to be the successor of C, this would have been an interesting comparison.

6 Discussion

Because of some special features in Rust, such as ownership, the learning curve is steeper than for many other languages. A beginner will constantly get compile-errors due to the strict rules the code has to follow. This can cause some confusion, because the programmer initially believes the code to be correct. On the other hand, this is positive since it forces the programmer to think more logically than they would have to in many other languages. Also, Rust has a very friendly community that is willing to help out if somebody gets stuck.[7] Having many people engaged in a fairly hard-to-learn programming language and helping others out is important for making it grow. However, Rust is still a very new and small language. Over time, or as it grows, it is possible that the community loses this special advantage simply because more Rust-related questions mean more work answering them.

A bigger question mark, however, is how well Rust performs against other languages that also aim at being both fast and safe. Having the best performance would be a big advantage in gaining popularity among programmers, and this could very well be the case considering how close Rust already is to C/C++ in this respect.

References

[1] B. Eich, Future tense, 2011. [Online]. Available: http://www.slideshare.net/BrendanEich/future-tense-7782010 (visited on Feb. 16, 2016).
[2] A. Avram, Interview on Rust, a systems programming language developed by Mozilla. [Online]. Available: http://www.infoq.com/news/2012/08/Interview-Rust (visited on Jan. 26, 2016).
[3] L. Bergstrom, Larsbergstrom/papers, not yet published paper, 2016. [Online]. Available: https://github.com/larsbergstrom/papers/tree/master/icse16-servo (visited on Feb. 16, 2016).
[4] GitHub, Servo/webrender, 2015. [Online]. Available: https://github.com/servo/webrender/wiki (visited on Feb. 26, 2016).
[5] The Rust programming language, ownership section. [Online]. Available: https://doc.rust-lang.org/book/ownership.html.
[6] A. Crichton, Intro to the Rust programming language, 2014. [Online]. Available: https://www.youtube.com/watch?v=agzf6ftEsLU (visited on Dec. 11, 2014).
[7] M. Asay, Two reasons the Rust language will succeed, 2015. [Online]. Available: http://www.infoworld.com/article/2947214/open-source-tools/two-reasons-the-rust-language-will-succeed.html (visited on Jun. 13, 2015).
[8] E. Reed, "Patina: a formalization of the Rust programming language". [Online]. Available: ftp://ftp.cs.washington.edu/tr/2015/03/UW-CSE-15-03-02.pdf.
[9] A. Turon, Fearless concurrency with Rust, 2015. [Online]. Available: http://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html.
[10] The Rust reference, influences appendix. [Online]. Available: https://doc.rust-lang.org/reference.html#appendix-influences.
[11] RAII. [Online]. Available: http://en.cppreference.com/w/cpp/language/raii.
[12] D. Herman, Rust, InfoQ, 2015. [Online]. Available: http://www.infoq.com/presentations/Rust.
[13] Air.mozilla.org, Bay Area Rust meetup February 2016, 2016. [Online]. Available: https://air.mozilla.org/bay-area-rust-meetup-february-2016 (visited on Feb. 29, 2016).
[14] R. Developers, Redox - your next(gen) OS, 2016. [Online]. Available: http://www.redox-os.org/ (visited on Mar. 1, 2016).
