TASTY-Tool For Automating Secure Two-partY
TASTY-Tool For Automating Secure Two-partY
TASTY-Tool For Automating Secure Two-partY
computations
(Full Version)
Wilko Henecka, Stefan Kögl, Ahmad-Reza Sadeghi,
Thomas Schneider, Immo Wehrenberg
System Security Lab
Ruhr-University Bochum
Germany
{ahmad.sadeghi,thomas.schneider}@trust.rub.de,
{wilko.henecka,stefan.koegl,immo.wehrenberg}@rub.de
Data in MBytes
setup phase grow linearly in the DB size (corresponding to 10
40
ment checks the syntactical correctness of the pro-
tocol description, exchanges a hash of it to en-
30
sure that both parties run the same protocol, and
20 analyzes the protocol to automatically determine
10 which parts of the protocol can be pre-computed.
(b) In the Setup Phase, the parties pre-compute those
0
0 100 200 300 400 500 600 700 800 900 1000 parts of the protocol which are independent of
DB Size
their inputs, e.g., create/send garbled circuits and
oblivious transfers (OT), see §4.3 for details.
Figure 3: Hybrid Face Recognition: Times (c) Finally, in the Online Phase, both parties provide
their inputs to the computation, and the online
part of the SFE protocol is executed (e.g., ho-
SCiFI - a system for secure face identification. We note momorphic encryptions and decryptions, online
that the recent face recognition system of [44], consisting of a OTs, and evaluation of GCs) to jointly compute
novel recognition algorithm which was co-designed together the respective outputs for both parties.
with a highly efficient SFE protocol, is more accurate and 3. TASTY provides a tool to compare the performance
efficient than Eigenface-based protocols. costs of multiple SFE protocols as described in §4.2.
Protocol Description Value Garbled
Client C Server S
in TASTYL Value
bitlength
mux, <, =, ...
Runtime Environment +, -, *
Plain Value
Analyzation Phase N rand, input, output
/, <, =, ... Homomorphic
Setup Phase Value
Input Input
Online Phase Unsigned Signed Modular
Output Output
Unsigned Signed Modular
Costs Vector Vector Vector
Homomorphic
Figure 5: Architecture and Workflow of TASTY Vector
Plain Vector
rand, input, output
/, =, ... Garbled
Vector Vector
Implementation. We selected Python as implementa- +, -, *, dot min, max, ...
tion language for TASTY as it combines elements from both,
object oriented and functional programming paradigms. In
particular the built-in support for generators, a function Figure 6: TASTYL Types and Operators
which yields a value and can be resumed afterwards, was
useful for intuitive programming of streamlined large data
structures, e.g., for dynamic generation of circuits which al-
lows TASTY to evaluate very large circuits. We successfully Homomorphic Value/Vector. Unsigned, Signed and
created and evaluated garbled circuits with 221 non-XOR Modular Values/Vectors can be converted into and from ho-
gates in less than 14 minutes on the PCs used for the exper- momorphically encrypted Homomorphic Values/Vectors of
iments in §5. server S. While Unsigned and Modular values are mapped
directly, for Signed values, the positive values are mapped to
4.1 TASTY input Language (TASTYL) the elements 0, 1, . . . of the plaintext space of the underly-
TASTYL, the input language for TASTY, allows to for- ing homomorphic cryptosystem, and the negative values to
mulate secure computations as sequence of operations on en- n − 1, n − 2, . . . as described in [31]. Addition of two Homo-
crypted data, allowing to abstract away all details of the un- morphic, and (dot) multiplication of a Homomorphic with a
derlying cryptographic protocols. We start with an overview Plain Value/Vector provided by S is done non-interactively.
of the types and operators provided by TASTYL in §4.1.1 (Dot) multiplication of two Homomorphic Values/Vectors
and explain the concrete syntax afterwards in §4.1.2. requires one round of interaction.
Garbled Value/Vector. Unsigned/Signed Plain and
4.1.1 Types and Operators Homomorphic Values/Vectors can be converted into and
from Garbled Values/Vectors of client C. A Garbled Value
The type system of TASTYL and the operators supported
can be compared with another one resulting in a Garbled
by each type are shown in Fig. 6. Each variable in TASTYL
Value of length one bit. This can be used to multiplex (mux)
is either a scalar Value (cf. top half of Fig. 6) or a Vector (cf.
between two Garbled Values. Similarly, the minimum or
bottom half of Fig. 6) which consists of N Values. They can
maximum value and/or index of the components of a Gar-
be either unencrypted Plain Values/Vectors or encrypted
bled Vector can be determined as Garbled Value(s), e.g.,
Garbled or Homomorphic Values/Vectors.
min_value computes the minimum value. For each opera-
All Values and Vectors provide the basic operators for
tion on Garbled Values/Vectors, TASTY automatically in-
(component-wise) addition, subtraction, and multiplication;
fers the underlying garbled circuit.
Vectors also provide dot multiplication: v · w = N
P
i=1 vi wi .
Number Representation. Each Value has a bitlength ` 4.1.2 Syntax and Example
that represents the number of bits needed for its represen-
tation. Unsigned are unsigned integer values in the range TASTYL is a subset of the Python language; we use the
[0, 2` [, Signed are signed integers in the range ]−2`−1 , 2`−1 [5 , following example to explain its syntax and semantics.
and Modular are elements in the plaintext space of the ho- Example (cf. Fig. 7). Client C and server S have
momorphic cryptosystem, i.e., Zn for Paillier. vectors v and w of N = 4 unsigned 32-bit values as inputs.
In addition to the operations of Value/Vector, the plain/en- As output, C obtains mini=1,..,N (vi ·wi ). The products vi ·wi
crypted types support further operations and conversions: are computed with homomorphic encryption (HE) and the
Plain Value/Vector. Inputs and outputs of the two par- minimum with garbled circuits (GC).
ties are Plain Values/Vectors. They can be chosen uniformly This protocol can be directly formulated in TASTYL as
at random and provide additional operations (integer) divi- shown in Fig. 7 and described in the following: The protocol
sion6 and comparison. gets two parties client and server as inputs to whom the
variables used throughout the protocol are bound (details
5
Note, we exclude the value −2`−1 for signed integers to also below). At the beginning, two constants N = 4 and L = 32
allow sign-magnitude representation. are defined. Then, the input of C, client.v, is defined as an
6 unsigned vector of bitlength L and dimension N , and read
Division raises an exception for division by zero or (the
unlikely event of) a non-invertible Modular value. from standard input. Similarly, the input of S, server.w, is
# −∗− c o d i n g : u t f −8 −∗− ception if the result does not fit into the plaintext space of
def p r o t o c o l ( c l i e n t , s e r v e r ) : the homomorphic cryptosystem. For example, in Fig. 7 the
N = 4
L = 32
component-wise product of two vectors with N components
of unsigned L-bit values results in the homomorphic vector
# input of c l i e n t server.hx with N components of unsigned 2L-bit values.
c l i e n t . v = UnsignedVec ( b i t l e n=L , dim=N) Multiple Outputs. Garbled circuits can also have mul-
c l i e n t . v . i n p u t ( d e s c=” e n t e r v a l u e s f o r v ” )
tiple garbled output values written as comma separated list
# input of server on the left side of the assignment operator, e.g., the garbled
s e r v e r . w = UnsignedVec ( b i t l e n=L , dim=N) minimum value gv and its index gi can be computed as
s e r v e r . w . i n p u t ( d e s c=” e n t e r v a l u e s f o r w” )
(client.gv, client.gi)=client.gx.min_value_index().
# c o n v e r t u n s i g n e d t o homomorphic v e c t o r Circuits from File. TASTY allows secure evaluation of
c l i e n t . hv = HomomorphicVec ( v a l=c l i e n t . v ) boolean circuits read from an external file, e.g., circuits gen-
s e r v e r . hv <<= c l i e n t . hv
erated by the FairplayMP compiler [3]. For this, the labels
# m u l t i p l y v e c t o r s ( component−w i s e ) of the input- and output wires of the circuit are mapped
s e r v e r . hx = s e r v e r . hv ∗ s e r v e r . w to Garbled Values of corresponding bitlength. An exam-
# c o n v e r t homomorphic t o g a r b l e d v e c t o r
ple TASTYL file with the concrete syntax for evaluating a
c l i e n t . gx <<= GarbledVec ( v a l=s e r v e r . hx ) garbled file circuit is available at [56].
# compute minimum v a l u e
c l i e n t . gmin = c l i e n t . gx . m i n v a l u e ( )
4.2 Tools
The TASTY framework provides the following tools to
# c o n v e r t g a r b l e d t o u n s i g n e d v a l u e and o u t p u t
c l i e n t . min = Unsigned ( v a l=c l i e n t . gmin )
initialize, execute, and post-process TASTYL programs:
c l i e n t . min . o u t p u t ( d e s c=”minimum v a l u e ” ) tasty_init <path> creates a new directory which con-
tains a file protocol.py with a template for the TASTYL
program (the example program shown in Fig. 7) and a file
Figure 7: Example TASTYL Program protocol.ini which contains default configuration param-
eters such as the intended security level (cf. Table 4), or the
IP address and port of the server.
tasty <options> <path> is the runtime environment of
defined and read. Then, C’s input vector client.v is con- TASTY as explained in §4 (cf. Fig. 5): it analyzes the
verted into a homomorphic vector server.hv for S who mul- TASTYL program in path, establishes a TCP/IP socket be-
tiplies this component-wise with his input vector server.w tween server S and client C, and runs the setup phase and
resulting in the homomorphic vector server.hx. This homo- online phase of the SFE protocol. The option flags allow
morphic vector is converted into a garbled vector client.gx to overwrite the default parameters and to specify if run as
and its minimum value client.gmin is computed. Finally, server (-s) or as client (-c).
C obtains the intended output by decrypting (converting) Testing and Benchmarking. When invoked with the
client.gmin into the unsigned value client.min. -d option, tasty runs in driver mode. Here, the TASTYL
Type Conversions. Types can be naturally converted program is instrumented by a driver, an additional class
into each other by providing them as input to the con- written in protocol.py. The driver can invoke the protocol
structor of the target type, e.g., in Fig. 7, the unsigned multiple times with varying static parameters (e.g., different
vector client.v is converted into the homomorphic vector bitlengths) and inputs to the TASTYL program; the outputs
client.hv via client.hv=HomomorphicVec(val=client.v). of the TASTYL program are sent back to the driver which
The underlying conversion protocols are described in §2. allows to write functional test cases. The costs of each pro-
Garbled Bit Manipulations. To allow manipulation of tocol run, i.e., detailed information on the transferred data
single bits, a Garbled Value gv can be converted back and and timings of the sub-tasks of the protocol phases, are writ-
forth into a list of Garbled Bits (= Garbled 1-bit Values): ten into a file which can be post-processed as described next.
gv[i] yields the i-th garbled bit of gv (i = 0 is the least tasty_post <analyze_script> <cost_files> can post-
significant bit). Vice versa, a (unsigned) garbled m-bit value process the costs measured in one or more driver runs with
gv can be constructed from a list of m garbled bits, e.g., an analyze script, e.g., average, print, or plot graphs [22].
gv = Garbled(val=[gb0,gb1]). All graphs in this paper were plotted with tasty_post.
Send Operator. The send operator <<= transfers vari- A concrete example for how to use TASTY’s benchmark-
ables between the parties, e.g., in Fig. 7, hv is sent from C to ing capability is given in Appendix §C.
S with server.hv <<= client.hv. When combined with a
type conversion, the send operator invokes the correspond- 4.3 Primitives and Optimizations
ing conversion protocol, e.g., in Fig. 7, homomorphic vector In TASTY we implemented the following efficient primi-
hx held by S is converted into garbled vector gx held by C tives and automatic optimizations that allow to move expen-
with client.gx <<= GarbledVec(val=server.hx). sive operations as pre-computations into the setup phase (cf.
Binding of Variables. While constants can be declared Fig. 5) in order to achieve an online phase with low latency.
globally (e.g., N and L in Fig. 7), each variable has to be The modular architecture of TASTY allows easy extension
assigned to one of the parties as an attribute. with other primitives as well. Due to the lack of space we
Inferring Type and Length Automatically. For each mention the key-features of the used primitives and refer to
operator, TASTY automatically infers the bitlength and the description in [31] and the original papers for details.
type of the output variables from those of the input variables Pre-Defined Security Levels. TASTY has pre-defined
s.t. no overflow occurs. Homomorphic variables raise an ex- security levels following standard recommendations of NIST
and ECRYPT II [19] as shown in Table 4. By using matching as random oracle. The elliptic curve implementation pro-
basic primitives both security and efficiency are optimized vides (optional) point compression to reduce communication
simultaneously. We use elliptic curves from the SECG stan- at the cost of a negligibly larger computation overhead.
dard [53] and SHA-256 as cryptographic hash function. Compiler Optimizations. TASTY parses the TASTYL
program and performs several optimizations on the result-
ing abstract syntax tree (AST): constant propagation, dead
Table 4: Pre-Defined Security Levels in TASTY. code elimination, partial code evaluation, and loop unrolling.
Symmetric/
Security Level Asymmetric Curve [53]
Statistical 5. PERFORMANCE MEASUREMENTS
ultra-short 80 bit 1,248 bit secp160r1 We measure the performance of primitives implemented in
short 96 bit 1,776 bit secp192r1 TASTY and compare different protocols against each other
medium 112 bit 2,432 bit secp224r1 and with existing SFE implementations: multiplication cir-
long 128 bit 3,248 bit secp256r1 cuits and protocols based on GC or HE (§5.1), SFE of an
AES circuit generated by the Fairplay compiler (§5.2), and
SFE of large GCs (§5.3).
Homomorphic Encryption (HE). We use the addi- System Setup. All performance measurements are per-
tively homomorphic cryptosystem of Paillier [45]. As key formed on two desktop PCs with Intel Core 2 Duo CPU
generation for Paillier (an RSA modulus n) is computation- (E6850) running at 3.00GHz and 4GB RAM connected via
ally expensive and can be used over multiple protocol runs, Gigabit Ethernet. The system runs on 64 bit Gentoo Linux
the public key is generated and exchanged in the analyza- with Python version 2.6.5, gmpy version 1.11 and GMP ver-
tion phase. For efficient encryption we use the extensions sion 4.3.2. Unless stated otherwise, all measurements were
of [12, Sect. 6] for pre-computing expensive modular expo- performed for short-term security (cf. Table 4) and using
nentiations of the form rn mod n2 in the setup phase and point compression for elliptic curves (cf. §4.3).
only two modular multiplications per encryption in the on-
line phase. As C knows the factorization p, q of n, he uses 5.1 Multiplication Circuits and Protocols
Chinese remaindering modulo p and q for pre-computing rn As arithmetic circuits can express arbitrary computations
mod n2 and efficient decryption. Paillier ciphertexts have as sequence of additions and multiplications, multiplication
twice the length of the asymmetric security parameter as is a fundamental basic operation. Indeed, the main differ-
the ciphertext space is Z∗n2 . For modular arithmetics we use ence between SFE protocols based on arithmetic and boolean
gmpy [21], a Python wrapper for the GMP library [20]. circuits is the cost for multiplications. We present efficient
Garbled Circuits (GC). We use the GC construction multiplication circuits in §5.1.1 and compare the perfor-
with free XORs and garbled row reduction of [48] secure in mance of secure multiplication protocols in §5.1.2.
the random-oracle model. This GC construction provides
free XOR gates (no garbled table and negligible computa- 5.1.1 Multiplication Circuits
tion). For non-XOR d-input gates, the garbled table consists Textbook Multiplication. The usual way of multi-
of 2d − 1 entries (of size t + 1 bit each with symmetric se- plying two unsigned `-bit integers x and y, called “Text-
curity parameter t), creation requires 2d and evaluation 1 book Method”, multiplies x with each bit of y and adds
invocation of SHA-256 modeled as random oracle. up all the
Pproperly shifted results according to the formula
`−1 i 2
Circuits. For computations on Garbled Values/Vectors, x·y = i=0 xyi 2 . This results in a circuit with 2` − `
TASTY dynamically generates circuits using the efficient non-XOR 2-input gates [30].
circuit constructions of [30] which are optimized for a low Karatsuba Multiplication. As observed by Karatsuba
number of non-XOR gates (cf. §5.1.1 for multiplication cir- [28], multiplication can be performed more efficiently using
cuits). Alternatively, circuits can be generated externally, the following recursive method (details in Algorithm 1): x
e.g., using the Fairplay compiler [37], and read from a file (cf. and y are split into two halves as x = xh 2d`/2e + xl and
§4.1.2). TASTY optimizes the circuits to a low number of y = yh 2d`/2e + yl . Then, the product can be computed as
non-XOR gates using the optimization of [48] which replaces xy = (xh 2d`/2e +xl )(yh 2d`/2e +yl ) = zh 22d`/2e +zd 2d`/2e +zl .
3-input gates with a low number of 2-input non-XOR gates. After computing zh = xh yh and zl = xl yl , zd can be com-
XNOR gates are replaced by an XOR gate and an inversion puted with only one multiplication as zd = (xh + xl )(yh +
gate which is propagated into successor gates [46]. Generat- yl ) − zh − zl . This process is continued recursively until
ing, reading, and optimizing circuits is mostly pipelined to the numbers are sufficiently small (` = 19 in our case as
allow processing of large circuits with low memory footprint. described below) and multiplied with the classical school
Oblivious Transfer (OT). All OTs are pre-computed in method. Overall, multiplying two ` bit numbers with Karat-
the setup phase (cf. Fig. 5) using the construction of [2]; the suba’s method requires three multiplications of `/2 bit num-
resulting online phase for OT is highly efficient (transfer and bers and some additions and subtractions with linear bit
XOR of bitstrings) and depends mostly on the network la- complexity resulting in costs
tency for two messages. To minimize the computation com-
TKara (`) = 3TKara (`/2) + c` + d
plexity of the setup phase, we use the efficient OT extension
of [24] to reduce the usually large number of OTs needed for constants c and d. The master theorem [8, §4.3f] yields
in the protocol down to at most t real OTs and some in- asymptotic complexity TKara (`) ∈ O(`log2 3 ) ≈ O(`1.585 ).
vocations of SHA-256 modeled as random oracle, where t is Circuit Complexity. In TASTY we have implemented
the symmetric (computational) security parameter. The re- both methods for multiplication based on efficient addition
maining real OTs (at most t) are implemented with the OT and subtraction circuits of [30]. As shown in Fig. 8 and Ta-
protocol of [39, Sect. 3.1] using elliptic curves and SHA-256 ble 5, Karatsuba multiplication is more efficient, i.e., results
Algorithm 1 Karatsuba multiplication consider the case where both inputs are provided by one
1: function karatsuba(x, y) . x, y are `-bit integers party (S for GC1 and C for HE1), or one by each of the
2: if ` ≤ 19 then parties (GC2 and HE2). The inputs are Unsigned `-bit val-
3: return Textbook(x, y) ues and the output, a 2`-bit Unsigned value is converted
4: end if into a Plain output for C. In the following, we compare
5: xh ||xl ← x . x = xh 2d`/2e + xl the communication- and the computation complexity of the
6: yh ||yl ← y . y = yh 2d`/2e + yl setup- and online phase of the protocols.
7: Ph ← KARATSUBA(xh , yh ) 1e+07
8: Pl ← KARATSUBA(yl , yl ) HE1: Online
HE2: Online
9: xs ← xh + xl GC1: Setup
GC1: Online
GC2: Setup
10: ys ← yh + yl 1e+06
GC2: Online
11: Ps ← KARATSUBA(xs , ys )
12: Pd ← Ps − Ph − Pl 100000
Data in Bytes
13: return (Ph 22d`/2 e) + Pd 2d`/2e + Pl
14: end function
10000
100
10
8
Time in ms
0
0 10 20 30 40 50 60 70 80
bitlength