
Minimizing latency for secure distributed computing

2017 IEEE International Symposium on Information Theory (ISIT)

Rawad Bitar, Parimal Parag, and Salim El Rouayheb

arXiv:1703.01504v1 [cs.IT] 4 Mar 2017

R. Bitar and S. El Rouayheb are with the ECE department of the Illinois Institute of Technology. P. Parag is with the ECE department of the Indian Institute of Science. Emails: [email protected], [email protected], [email protected].

Abstract—We consider the setting of a master server who possesses confidential data (genomic, medical data, etc.) and wants to run intensive computations on it, as part of a machine learning algorithm for example. The master wants to distribute these computations to untrusted workers who have volunteered or are incentivized to help with this task. However, the data must be kept private (in an information theoretic sense) and not revealed to the individual workers. The workers may be busy, or even unresponsive, and will take a random time to finish the task assigned to them. We are interested in reducing the aggregate delay experienced by the master. We focus on linear computations as an essential operation in many iterative algorithms. A known solution is to use a linear secret sharing scheme to divide the data into secret shares on which the workers can compute. We propose to use instead new secure codes, called Staircase codes, introduced previously by two of the authors. We study the delay induced by Staircase codes, which is always less than that of secret sharing. The reason is that secret sharing schemes need to wait for the responses of a fixed fraction of the workers, whereas Staircase codes offer more flexibility in this respect. For instance, for codes with rate R = 1/2, Staircase codes can lead to up to a 40% reduction in delay compared to secret sharing.

I. INTRODUCTION

We consider the setting of distributed computing in which a server M, referred to as the Master, possesses confidential data, such as personal information of online users or genomic and medical data, and wants to perform intensive computations on it. M wants to divide these computations into smaller computational tasks and distribute them to n worker machines that can perform these smaller tasks in parallel. The workers then return their results to the master, who can process them to obtain the result of its original task. The celebrated MapReduce framework [1] falls under this model and is implemented in many computing clusters.

In this paper, we are interested in applications in which the worker machines do not belong to the same system or cluster as the master. Rather, the workers are online computing machines that can be hired or can volunteer to help the master in its computations. Existing applications that fall under this model include the SETI@home project for the search for extraterrestrial intelligence [2], the folding@home project for disease research that simulates protein folding [3], and Amazon Mechanical Turk [4]. (Amazon Mechanical Turk hires humans to perform tasks, but one can imagine a similar application where computing machines are hired.) The additional constraint that we worry about here, and which does not exist in the previous applications, is that the workers cannot be trusted with the sensitive data, which must remain hidden from them. Our privacy constraint is information theoretic, meaning that each worker must obtain zero information about the data, irrespective of its computational power.
We choose information theoretic privacy instead of homomorphic encryption due to the high computation and memory overheads of the latter [5]. We focus on linear computations (matrix multiplication) since they form a basic building block of many iterative algorithms. The workers introduce random delays due to differences in their workloads or network congestion. This causes the Master to wait for the slowest workers, referred to as stragglers in the distributed computing community [6]-[8]. In addition, some workers may never respond. Our goal is to reduce the delay experienced by the Master. Privacy can be achieved by encoding the data using a linear secret sharing code [9], as illustrated in Example 1. However, these codes are not specifically designed to minimize latency, as we will highlight later.

Example 1. Let the matrix A denote the data set owned by M and let x be a given vector. M wants to compute Ax. Suppose that M gets the help of n = 3 workers, out of which at most n - k = 1 may be unresponsive. M generates a random matrix R of the same dimensions as A and over the same field, and encodes A and R into 3 shares S1 = R, S2 = R + A and S3 = R + 2A using a secret sharing scheme [10], [11]. First, M sends share Si to worker Wi (Figure 1a) and then sends x to all the workers. Each worker computes Si x and sends it back to M (Figure 1b). M can decode Ax after receiving any k = 2 responses. For instance, if the first two workers respond, M can obtain Ax = S2 x - S1 x. No information about A is revealed to the workers, because A is one-time padded by R.

[Figure 1: Secure distributed matrix multiplication with 3 workers. (a) M encodes A into 3 secret shares S1, S2, S3 and sends them to the workers. (b) M sends x to the workers; each worker Wi computes Si x and sends the result to M.]

The delay experienced by M in the previous example results from the fact that it has to wait until k = 2 workers finish their whole tasks in order to decode Ax, even when all 3 workers are responsive. This is because classical secret sharing codes are designed for the worst-case scenario of one worker being unresponsive. We overcome this limitation by using Staircase codes, which were introduced in [12] and are explained in the next example.

Example 2 (Staircase code). Consider the same setting as Example 1. Instead of using a classical secret sharing code, M now encodes A and R using the Staircase code given in Table I. The Staircase code requires M to divide the matrices A and R into A = [A1 A2]^T and R = [R1 R2]^T.

            First subshare      Second subshare
Worker 1    A1 + A2 + R1        R1 + R2
Worker 2    A1 + 2A2 + 4R1      R1 + 2R2
Worker 3    A1 + 3A2 + 4R1      R1 + 3R2

Table I: The shares sent by M to each worker. All operations are in GF(5).

In this setting, M sends two subshares to each worker, hence each task consists of 2 subtasks. The master sends x to all the workers. Each worker multiplies its subshares by x (going from top to bottom) and sends each result back to M independently. Now, M has two possibilities for decoding:
1) M receives the first subtask from all the workers, i.e., receives (A1 + A2 + R1)x, (A1 + 2A2 + 4R1)x and (A1 + 3A2 + 4R1)x, and decodes Ax, which is the concatenation of A1 x and A2 x. Note that M decodes only R1 x and does not need to decode R2 x.
2) M receives all the subtasks from any 2 workers and decodes Ax. Here M has to decode R1 x and R2 x.
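To make Table I concrete, the following sketch (ours, not part of the paper; the helper names are illustrative) verifies over GF(5) that both decoding options in Example 2 correspond to invertible linear systems in the unknowns A1x, A2x, R1x, R2x.

```python
# A minimal sketch verifying the Staircase code of Example 2 / Table I over
# GF(5). Each row lists the coefficients of a subshare in the unknowns
# (A1, A2, R1, R2); all names here are illustrative, not from the paper.
p = 5

def inv_mod(a, p=p):
    return pow(a, p - 2, p)  # Fermat inverse, valid since p is prime

def invertible_mod(M, p=p):
    """Gauss-Jordan elimination mod p; returns True iff M is invertible."""
    M = [row[:] for row in M]
    n = len(M)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] % p), None)
        if piv is None:
            return False
        M[c], M[piv] = M[piv], M[c]
        inv = inv_mod(M[c][c])
        M[c] = [v * inv % p for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] % p:
                f = M[r][c]
                M[r] = [(M[r][j] - f * M[c][j]) % p for j in range(n)]
    return True

# Coefficients of each worker's subshares, per Table I.
sub1 = {1: [1, 1, 1, 0], 2: [1, 2, 4, 0], 3: [1, 3, 4, 0]}  # first subshares
sub2 = {1: [0, 0, 1, 1], 2: [0, 0, 1, 2], 3: [0, 0, 1, 3]}  # second subshares

# Option 1: first subshare from all 3 workers -> solve for (A1x, A2x, R1x).
opt1 = [sub1[w][:3] for w in (1, 2, 3)]
print("option 1 decodable:", invertible_mod(opt1))  # True

# Option 2: both subshares from any 2 workers -> solve for all 4 unknowns.
for pair in [(1, 2), (1, 3), (2, 3)]:
    opt2 = [sub1[w] for w in pair] + [sub2[w] for w in pair]
    print(f"option 2, workers {pair} decodable:", invertible_mod(opt2))  # True
```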
One can check that no information about A is revealed to the workers. Under an exponential delay model for each worker, we show that the Staircase code given in Example 2 can lead to a 25% improvement in delay over the secret sharing code given in Example 1. Our goal is to give a general systematic study of the delay incurred by Staircase codes and compare it to classical secret sharing codes.

Related work: Straggler mitigation and privacy concerns are studied separately in the literature. In [13], Liang et al. adaptively encoded the tasks depending on the workload at the workers' end. Lee et al. [8] used MDS codes to mitigate stragglers in linear distributed machine learning algorithms. Tandon et al. [14] introduced new codes for straggler mitigation in distributed gradient descent algorithms. Li et al. [15] studied the effect of the workers' computation load on the communication complexity. On the other hand, privacy concerns have been studied in the machine learning literature, see e.g., [16]-[18]. The main model there assumes that several parties owning private data sets want to train a model based on all the data sets without revealing them, e.g., [19], [20]. However, the techniques rely extensively on cryptographic assumptions and secure multi-party computation. Atallah and Frikken [9] studied the problem of distributively multiplying two private matrices assuming that k - 1 workers can collude (with k/2 < n/2). The provided solution ensures information theoretic privacy, but does not account for straggler mitigation. Another related problem is federated learning [21], in which a large number of users own different amounts of data and a central server aims to train a high-quality model based on all the data with the smallest communication complexity. There, however, privacy is ensured by keeping the data local to the users.

Contributions: In this paper, we consider the model in which M owns the whole data set on which it wants to perform a distributed linear computation. We introduce a new approach for securely outsourcing the linear computations to n workers that do not own any parts of the data. The data set is to be kept private in an information theoretic sense. We assume that at most n - k, k < n, workers may be unresponsive; the remaining ones respond at random times. This is similar to the straggler problem. We study the master's waiting time, i.e., the aggregate delay caused by the workers, under the exponential model when using Staircase codes. More specifically, we make the following contributions: (i) we derive an upper bound and a lower bound on the mean waiting time; (ii) we derive an integral expression leading to the CDF of the waiting time and use this expression to find the exact mean waiting time for the cases k = n - 1 and k = n - 2; and (iii) we compare our approach to the one using secret sharing and show that for high rates k/n and a small number of workers our approach saves about 40% of the waiting time. Moreover, we ran simulations to check the tightness of the bounds and show that for low rates our approach saves at least 10% of the waiting time for all values of n.

II. SYSTEM MODEL

We consider a server M which wants to perform intensive computations on confidential data represented by an m x l matrix A (typically m >> l). M divides these computations into smaller computational tasks and assigns them to n workers Wi, i = 1, . . . , n, that can perform these tasks in parallel.

Computations model: We focus on linear computations.
The motivation is that a building block in several iterative machine learning algorithms, such as gradient descent, is the multiplication of A by a sequence of l x 1 attribute vectors x1, x2, . . . . In the sequel, we focus on the multiplication Ax with one attribute vector x.

Workers model: The workers have the following properties:
1) At most n - k workers may be unresponsive. The actual number of unresponsive workers is unknown a priori.
2) The responsive workers incur random delays while executing the task assigned to them by M, resulting in what is known as the straggler problem [6]-[8]. We model the delays incurred by the workers by independent and identically distributed exponential random variables.
3) The workers do not collude, i.e., they do not share with each other the data they receive from M. This has implications on the privacy constraint described later.

General scheme: M encodes A, using randomness, into n shares Si sent to worker Wi, i = 1, . . . , n. Any k or more shares can decode A. The workers obtain zero information about A, i.e., H(A|Si) = H(A) for all i in {1, . . . , n}. At each iteration, the master sends x to all the workers. Then, each worker computes Si x and sends it back to the master. Since the scheme and the computations are linear, the master can decode Ax after receiving enough responses. We refer to such a scheme as an (n, k) system. (In some cases the attribute vectors xj contain information about A, and therefore need to be hidden from the workers. We describe in [22] how our scheme can be generalized to such cases.)

Encoding: We consider classical secret sharing codes [10], [11] and universal Staircase codes [12]. Due to lack of space, we only describe the properties that are necessary for the delay analysis. Secret sharing codes require the division of A into k - 1 row blocks, which are encoded into n shares of dimension m/(k - 1) x l each. Any k shares can decode A. Staircase codes, on the other hand, require the division of A into (k - 1)alpha row blocks, alpha = LCM{k, . . . , n - 1}, which are encoded into n shares. Each share consists of alpha subshares and is of dimension m/(k - 1) x l. Any (k - 1)/(d - 1) fraction of any d shares can decode A, where d in {k, . . . , n}. We show that Staircase codes outperform classical codes in terms of incurred delays.

Delay model: Let TA be the random variable representing the time spent to compute Ax at one worker. We assume a mother runtime distribution F_{T_A}(t) that is exponential with rate lambda. (Our analysis remains true for the shifted exponential model [8], [13].) Due to the encoding, each task given to a worker is k - 1 times smaller than A. Let Ti, i in {1, . . . , n}, denote the time spent by worker Wi to execute its task; we assume that F_{T_i} is a scaled version of F_{T_A}, i.e.,

    F_{T_i}(t) \triangleq F_{T_A}((k-1)t) = 1 - e^{-(k-1)\lambda t}.

For an (n, k) system using Staircase codes, we assume that Ti is evenly distributed among the subshares, i.e., the time spent by worker Wi on one subshare is equal to Ti/alpha. Let T_(i) be the ith order statistic of the Ti's and let T_SC be the time the master waits until it can decode Ax. We can write

    T_{SC} = \min_{d \in \{k,\dots,n\}} \frac{k-1}{d-1} T_{(d)} = \min_{d \in \{k,\dots,n\}} \alpha_d T_{(d)},

where \alpha_d \triangleq (k-1)/(d-1). For an (n, k) system using classical secret sharing codes, we can write T_{SS} = T_{(k)}.
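The waiting times defined above are easy to simulate. The following Monte Carlo sketch (ours, not from the paper; function names are illustrative) estimates E[T_SC] and E[T_SS] under the exponential delay model. For the (3, 2) system of Examples 1 and 2, under our reading of the model, the means are 2/3 and 5/6; the paper's quoted 25% improvement for this example corresponds to measuring the gap relative to E[T_SC].

```python
# A minimal Monte Carlo sketch (illustrative, not from the paper) of the
# waiting times T_SC = min_d alpha_d * T_(d) and T_SS = T_(k), with the task
# times T_i iid exponential of rate (k-1)*lambda, as in the delay model above.
import random

def sample_waiting_times(n, k, lam=1.0):
    """Return one sample of (T_SC, T_SS) for an (n, k) system."""
    rate = (k - 1) * lam                      # each task is (k-1)x smaller than A
    T = sorted(random.expovariate(rate) for _ in range(n))
    t_sc = min((k - 1) / (d - 1) * T[d - 1]   # alpha_d * T_(d) for d = k..n
               for d in range(k, n + 1))
    t_ss = T[k - 1]                           # T_(k): k-th fastest worker
    return t_sc, t_ss

def estimate(n, k, trials=200_000, lam=1.0):
    s_sc = s_ss = 0.0
    for _ in range(trials):
        t_sc, t_ss = sample_waiting_times(n, k, lam)
        s_sc += t_sc
        s_ss += t_ss
    return s_sc / trials, s_ss / trials

# (3, 2) system: expected values are 2/3 and 5/6 under this model reading.
e_sc, e_ss = estimate(3, 2)
print(f"E[T_SC] ~ {e_sc:.3f}, E[T_SS] ~ {e_ss:.3f}")
```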
III. MAIN RESULTS

Our main results are summarized as follows. We provide an upper bound and a lower bound on the mean waiting time of M in Theorem 1.

Theorem 1. The mean waiting time E[T_SC] of an (n, k) system using Staircase codes is upper bounded by

    E[T_{SC}] \le \min_{d \in \{k,\dots,n\}} \frac{H_n - H_{n-d}}{\lambda(d-1)},    (1)

where H_n is the nth harmonic sum, H_n \triangleq \sum_{i=1}^{n} \frac{1}{i}, and H_0 \triangleq 0. The mean waiting time is lower bounded by

    E[T_{SC}] \ge \max_{d \in \{k,\dots,n\}} \sum_{i=0}^{k-1} \binom{n}{i} \sum_{j=0}^{i} \binom{i}{j} (-1)^j \frac{L(d,i,j)}{\lambda},    (2)

where

    L(d,i,j) = \frac{2}{n(n-1) + d(d-1) - 2(i-j)(d-1)}.

Discussion: Our extensive simulations show that (1) is a good approximation of the mean waiting time. Moreover, by taking d = k in (1), the upper bound on the mean waiting time of Staircase codes becomes that of classical secret sharing, i.e.,

    E[T_{SC}] \le E[T_{SS}] = \frac{H_n - H_{n-k}}{\lambda(k-1)}.    (3)

While finding the exact expression of the mean waiting time for any (n, k) system remains open, we derive in Corollary 1 an expression for systems with 1 and 2 parities, i.e., (k + 1, k) and (k + 2, k) systems, using the result of Theorem 2. Using Corollary 1, one can compare the performance of Staircase codes and secret sharing codes. For instance, in a (4, 2) system Staircase codes reduce the mean waiting time by 40%.

Theorem 2. Let t_i \triangleq t(i-1)/(k-1). The CDF of the waiting time T_SC of an (n, k) system using Staircase codes is given by

    F_{T_{SC}}(t) = 1 - n! \int_{y \in A(t)} \frac{F(y_k)^{k-1}}{(k-1)!} \, dF(y_n) \cdots dF(y_k),    (4)

where A(t) = \cap_{i \ge k} \{y_i \in (t_i, y_{i+1}]\} and F(y_i) = F_{T_i}(y_i).

To check the tightness of the bounds, we plot in Figure 2 the upper bound in (1), the lower bound in (2), and the exact mean waiting time in (17) for (k + 2, k) systems.

[Figure 2: Bounds on the mean waiting time E[T_SC] for (k + 2, k) systems with lambda = 1: the lower bound in (2), the upper bound in (1), and the exact mean waiting time in (17), plotted against the number of workers n.]

Asymptotics: To better understand the above results, we look at the asymptotic behavior of the lower and upper bounds when n goes to infinity, in two regimes:
1) For a constant number of parities r = n - k, the mean waiting time of the system satisfies lim_{n -> infinity} E[T_SC] = E[T_SS]. Meaning, in this regime there is no advantage in using Staircase codes (Figure 2).
2) For a fixed rate R = k/n, the mean waiting time can be bounded by E[T_{SC}] \le \log(1/(1-c)) / (\lambda(nc - 1)), where c is a constant satisfying R <= c < 1. In this regime, the mean waiting time of systems using Staircase codes is smaller by a constant factor s, s < 1/R, than that of systems using classical secret sharing codes (Figure 3b).
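For readers who want to reproduce the bounds in Figure 2, here is a small sketch (ours, not from the paper) that evaluates the upper bound (1) and the lower bound (2) as reconstructed above; it assumes k >= 2.

```python
# A sketch evaluating Theorem 1's bounds on E[T_SC] for an (n, k) system:
# the upper bound in eq. (1) and the lower bound in eq. (2).
from math import comb

def harmonic(n):
    return sum(1.0 / i for i in range(1, n + 1))  # H_0 = 0 by convention

def upper_bound(n, k, lam=1.0):
    # eq. (1): min over d of (H_n - H_{n-d}) / (lambda * (d - 1))
    return min((harmonic(n) - harmonic(n - d)) / (lam * (d - 1))
               for d in range(k, n + 1))

def L(n, d, i, j):
    return 2.0 / (n * (n - 1) + d * (d - 1) - 2 * (i - j) * (d - 1))

def lower_bound(n, k, lam=1.0):
    # eq. (2): max over d of sum_i C(n,i) sum_j C(i,j) (-1)^j L(d,i,j)/lambda
    return max(sum(comb(n, i) * comb(i, j) * (-1) ** j * L(n, d, i, j) / lam
                   for i in range(k) for j in range(i + 1))
               for d in range(k, n + 1))

# Example: a (4, 2) system gives roughly 0.238 <= E[T_SC] <= 0.542,
# consistent with the exact value 26/63 ~ 0.413 from eq. (17) below.
print(lower_bound(4, 2), upper_bound(4, 2))
```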
IV. PROOF OF THEOREM 1

We will need the following characterization of the order statistics of iid exponential random variables.

Theorem 3 (Renyi [23]). The dth order statistic T_(d) of n iid exponential random variables Ti with distribution function F(t) = 1 - e^{-lambda t} is equal in distribution to

    T_{(d)} \stackrel{d}{=} \sum_{j=0}^{d-1} \frac{Z_j}{n-j},

where the Z_j are iid random variables with distribution F(t).

A. Upper bound on the mean waiting time

We use Jensen's inequality to upper bound the mean waiting time E[T_SC]. The exact mean waiting time is given by

    E[T_{SC}] = E\left[\min_{d \in \{k,\dots,n\}} \frac{k-1}{d-1} T_{(d)}\right].

Since min is a concave function, we can use Jensen's inequality to write

    E\left[\min_{d \in \{k,\dots,n\}} \frac{k-1}{d-1} T_{(d)}\right] \le \min_{d \in \{k,\dots,n\}} \frac{k-1}{d-1} E[T_{(d)}].    (5)

By Theorem 3, the average of the dth order statistic E[T_(d)] can be written as

    E[T_{(d)}] = E[Z_j] \sum_{j=0}^{d-1} \frac{1}{n-j} = \frac{H_n - H_{n-d}}{\lambda(k-1)}.    (6)

Equations (5) and (6) conclude the proof.

We give an intuitive behavior of the upper bound. The harmonic number can be approximated by H_n ~ log(n) + gamma, where gamma ~ 0.5772 is the Euler-Mascheroni constant. Therefore, log(n) < H_n < log(n + 1). Hence, we can write

    E[T_{SC}] < \min\left\{ \min_{d \in \{k,\dots,n-1\}} \frac{1}{\lambda(d-1)} \log\left(\frac{n+1}{n-d}\right),\ \frac{1}{\lambda(n-1)} \log(n+1) \right\}.    (7)

B. Lower bound on the mean waiting time

To lower bound the mean waiting time E[T_SC], we find the probability of a small (sufficient) set of conditions that result in T_SC > t. This probability serves as a lower bound on the exact tail probability of T_SC. For a given d in {k, . . . , n}, consider the following set of conditions

    C \triangleq \left\{ T_{(k)} > \frac{t}{\alpha_d} \right\} \cap \bigcap_{j=d+1}^{n} \left\{ T_{(j)} - T_{(j-1)} > \frac{t}{\alpha_j} - \frac{t}{\alpha_{j-1}} \right\},

where \alpha_j \triangleq (k-1)/(j-1). For T_SC to be greater than t, all the jth order statistics T_(j) must be greater than t/alpha_j for j in {k, . . . , n}. We show that if C is satisfied, then this condition is satisfied. If T_(k) > t/alpha_d, then T_(i) > t/alpha_i for all i in {k, . . . , d}, because T_(i) >= T_(k) > t/alpha_d >= t/alpha_i. It follows that if T_(j) - T_(j-1) > t/alpha_j - t/alpha_{j-1} for all j in {d + 1, . . . , n}, then T_(j) > t/alpha_j. Therefore, Pr(T_SC > t) >= Pr(C is satisfied) \triangleq Pr(C). Furthermore,

    E[T_{SC}] = \int_0^{\infty} \Pr(T_{SC} > t)\, dt \ge \int_0^{\infty} \Pr(C)\, dt.    (8)

Next we derive an expression for \int_0^{\infty} \Pr(C)\, dt. Note that 1/alpha_j - 1/alpha_{j-1} = 1/(k - 1); using Theorem 3 we can write

    \Pr\left( T_{(j)} - T_{(j-1)} > \frac{t}{k-1} \right) = \bar{F}_{Z_j}\left( \frac{(n-j+1)t}{k-1} \right),    (9)

where \bar{F}_{Z_j}(t) \triangleq \Pr(Z_j > t). From (9) we get

    \Pr(C) = \bar{F}_{T_{(k)}}\left( \frac{t}{\alpha_d} \right) \prod_{j=d+1}^{n} \bar{F}_{Z_j}\left( \frac{(n-j+1)t}{k-1} \right).    (10)

Since \bar{F}_{Z_j}(t) = e^{-(k-1)\lambda t}, we can write

    \prod_{j=d+1}^{n} \bar{F}_{Z_j}\left( \frac{(n-j+1)t}{k-1} \right) = \bar{F}_{Z_j}\left( \sum_{j=d+1}^{n} \frac{(n-j+1)t}{k-1} \right) = \bar{F}_{Z_j}\left( \frac{(n-d)(n-d+1)t}{2(k-1)} \right).    (11)

On the other hand, \bar{F}_{T_{(k)}}(t/\alpha_d) is the probability that at most k - 1 of the Ti's are less than t/alpha_d, therefore

    \bar{F}_{T_{(k)}}\left( \frac{t}{\alpha_d} \right) = \sum_{i=0}^{k-1} \binom{n}{i} F_{T_i}\left( \frac{t}{\alpha_d} \right)^i \bar{F}_{T_i}\left( \frac{t}{\alpha_d} \right)^{n-i}.    (12)

Recall that F_{T_i}(t) = 1 - e^{-(k-1)\lambda t} = 1 - \bar{F}_{T_i}(t); therefore, by using the binomial expansion, we can write

    F_{T_i}\left( \frac{t}{\alpha_d} \right)^i = \sum_{j=0}^{i} \binom{i}{j} (-1)^j \bar{F}_{T_i}\left( \frac{t}{\alpha_d} \right)^j.    (13)

Using (13) and the fact that \bar{F}_{T_i}(t) = e^{-(k-1)\lambda t}, (12) becomes

    \bar{F}_{T_{(k)}}\left( \frac{t}{\alpha_d} \right) = \sum_{i=0}^{k-1} \binom{n}{i} \sum_{j=0}^{i} \binom{i}{j} (-1)^j \bar{F}_{T_i}\left( \frac{(n-i+j)(d-1)t}{k-1} \right).    (14)

Combining (11) and (14) and noting that \bar{F}_{T_i}(t) = \bar{F}_{Z_j}(t) = e^{-\lambda(k-1)t}, (10) becomes

    \Pr(C) = \sum_{i=0}^{k-1} \binom{n}{i} \sum_{j=0}^{i} \binom{i}{j} (-1)^j \exp\left( -\lambda t (n-i+j)(d-1) - \lambda t (n-d)(n-d+1)/2 \right).    (15)

Note that \int_0^{\infty} e^{-xt}\, dt = 1/x and that the integral of a sum is equal to the sum of the integrals. Therefore, integrating (15) from 0 to infinity and maximizing over all values of d in {k, . . . , n} concludes the proof.

V. PROOF OF THEOREM 2

We derive an integral expression leading to the probability distribution of the waiting time T_SC. Since the delays Ti at the workers are independent and absolutely continuous with respect to the Lebesgue measure (i.e., their probability densities exist), the joint density of the order statistics is

    f_{T_{(1)},\dots,T_{(n)}}(t_1, \dots, t_n) = n! \prod_{i=1}^{n} f_{T_i}(t_i) = n!\, ((k-1)\lambda)^n \exp\left( -(k-1)\lambda \sum_{i=1}^{n} t_i \right),

for 0 <= t_1 <= . . . <= t_n. Writing t_i for t/alpha_i, we can express the distribution of T_SC as

    \Pr\{T_{SC} > t\} = \Pr\left( \bigcap_{d=k}^{n} \{ T_{(d)} > t_d \} \right) = \int_{A(t)} f_{T_{(1)},\dots,T_{(n)}}(y)\, dy,

where y_{n+1} = infinity and

    A(t) = \{ 0 \le y_1 \le \dots \le y_n : y_d > t_d \text{ for } k \le d \le n \} = \cap_{i \ge k} \{ y_i \in (t_i, y_{i+1}] \} \cap_{i < k} \{ y_i \in [0, y_{i+1}] \}.

That is, we can re-write \Pr\{T_{SC} > t\} as

    n! \int_{t_n}^{\infty} \cdots \int_{t_k}^{y_{k+1}} \left( \int_0^{y_k} \cdots \int_0^{y_2} \prod_{i=1}^{k-1} dF_{T_i}(y_i) \right) \prod_{i=k}^{n} dF_{T_i}(y_i).

Claim 1.

    \int_0^{y_k} \cdots \int_0^{y_2} \prod_{i=1}^{k-1} dF_{T_i}(y_i) = \frac{F(y_k)^{k-1}}{(k-1)!}.

The result of Claim 1 is straightforward; it follows from integrating, k - 1 times, the CDF of an exponential random variable with respect to its derivative. This completes the proof. A more detailed proof of Claim 1 can be found in [22].

We state the mean waiting time for the (k + 1, k) and (k + 2, k) systems in Corollary 1.

Corollary 1. The mean waiting times E[T_SC] for (k + 1, k) and (k + 2, k) systems are given by (16) and (17), respectively:

    E[T_{SC}] = \frac{1}{\lambda} \sum_{i=2}^{k+1} (-1)^i \binom{k+1}{i} \left[ \frac{i}{k + (k-1)(i-1)} - \frac{1}{ki} \right],    (16)

    E[T_{SC}] = \frac{1}{\lambda} \sum_{i=2}^{k+2} (-1)^i \binom{k+2}{i} \left[ \frac{i}{(k+1) + k(i-1)} - \frac{1}{(k+1)i} + \frac{i(i-1)}{4(k+1) + 2(k-1)(i-2)} - \frac{i(i-1)}{(2k+1) + (k-1)(i-2)} \right].    (17)
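As a sanity check on (16) and (17) as reconstructed above from the garbled source, the following sketch (ours, not from the paper) evaluates them. For the (3, 2) and (4, 2) systems it returns 2/3 and 26/63 ~ 0.413, matching the Monte Carlo sketch in Section II and, for (4, 2), the roughly 40% reduction quoted in Section III when the gap to E[T_SS] = 7/12 is measured relative to E[T_SC].

```python
# A sketch evaluating Corollary 1's closed forms for the mean waiting time,
# eqs. (16) and (17), as reconstructed above; helper names are illustrative.
from math import comb

def mean_tsc_one_parity(k, lam=1.0):    # (k+1, k) system, eq. (16)
    return sum((-1) ** i * comb(k + 1, i) *
               (i / (k + (k - 1) * (i - 1)) - 1 / (k * i))
               for i in range(2, k + 2)) / lam

def mean_tsc_two_parities(k, lam=1.0):  # (k+2, k) system, eq. (17)
    return sum((-1) ** i * comb(k + 2, i) *
               (i / ((k + 1) + k * (i - 1))
                - 1 / ((k + 1) * i)
                + i * (i - 1) / (4 * (k + 1) + 2 * (k - 1) * (i - 2))
                - i * (i - 1) / ((2 * k + 1) + (k - 1) * (i - 2)))
               for i in range(2, k + 3)) / lam

# (3, 2): 2/3 ~ 0.667; (4, 2): 26/63 ~ 0.413, versus E[T_SS] = 7/12 ~ 0.583.
print(mean_tsc_one_parity(2), mean_tsc_two_parities(2))
```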
VI. SIMULATIONS

We check the tightness of the bounds of Theorem 1 and measure the improvement, in terms of delay, of Staircase codes over classical secret sharing codes for systems with fixed rate R \triangleq k/n. In Figure 3(a) we plot the upper bound (1), the lower bound (2), and the simulated mean waiting time for R = 1/4. Our extensive simulations show that the upper bound is a good approximation of the exact mean waiting time, whereas the lower bound might be loose.

[Figure 3: Simulations for (n, k) systems with fixed rate. (a) Waiting time for systems with rate R = 1/4: lower bound in (2), upper bound in (1), and simulated mean waiting times of Staircase codes and secret sharing, versus the number of workers n. (b) Delay reduction using Staircase codes.]

Figure 3(b) aims to better understand the comparison between Staircase codes and classical codes. We plot the normalized difference between the mean waiting times, i.e., (E[T_SS] - E[T_SC]) / E[T_SS], for different rates. For high rates, Staircase codes offer high savings for small values of n, whereas for low rates Staircase codes offer high savings for all values of n.

REFERENCES

[1] J. Dean and S. Ghemawat, "MapReduce: simplified data processing on large clusters," Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[2] https://setiathome.berkeley.edu.
[3] https://foldingathome.stanford.edu.
[4] https://www.mturk.com/mturk/welcome.
[5] Z. Brakerski and V. Vaikuntanathan, "Efficient fully homomorphic encryption from (standard) LWE," SIAM Journal on Computing, vol. 43, no. 2, pp. 831-871, 2014.
[6] J. Dean and L. A. Barroso, "The tail at scale," Communications of the ACM, vol. 56, no. 2, pp. 74-80, 2013.
[7] G. Ananthanarayanan, S. Kandula, A. G. Greenberg, I. Stoica, Y. Lu, B. Saha, and E. Harris, "Reining in the outliers in map-reduce clusters using Mantri," in OSDI, vol. 10, p. 24, 2010.
[8] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran, "Speeding up distributed machine learning using codes," arXiv preprint arXiv:1512.02673, 2015.
[9] M. J. Atallah and K. B. Frikken, "Securely outsourcing linear algebra computations," in Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security (ASIACCS '10), New York, NY, USA, pp. 48-59, ACM, 2010.
[10] A. Shamir, "How to share a secret," Communications of the ACM, vol. 22, no. 11, pp. 612-613, 1979.
[11] R. J. McEliece and D. V. Sarwate, "On sharing secrets and Reed-Solomon codes," Communications of the ACM, vol. 24, no. 9, pp. 583-584, 1981.
[12] R. Bitar and S. El Rouayheb, "Staircase codes for secret sharing with optimal communication and read overheads," in IEEE International Symposium on Information Theory (ISIT), 2016.
[13] G. Liang and U. C. Kozat, "TOFEC: Achieving optimal throughput-delay trade-off of cloud storage using erasure codes," in IEEE International Conference on Computer Communications, 2014.
[14] R. Tandon, Q. Lei, A. G. Dimakis, and N. Karampatziakis, "Gradient coding," in 29th Conference on Neural Information Processing Systems (NIPS), 2016.
[15] S. Li, M. A. Maddah-Ali, and A. S. Avestimehr, "Fundamental tradeoff between computation and communication in distributed computing," in IEEE International Symposium on Information Theory (ISIT), 2016.
[16] H. Takabi, E. Hesamifard, and M. Ghasemi, "Privacy preserving multiparty machine learning with homomorphic encryption," in 29th Annual Conference on Neural Information Processing Systems (NIPS), 2016.
[17] R. Hall, S. E. Fienberg, and Y. Nardi, "Secure multiple linear regression based on homomorphic encryption," Journal of Official Statistics, vol. 27, no. 4, p. 669, 2011.
[18] L. Kamm, D. Bogdanov, S. Laur, and J. Vilo, "A new way to protect privacy in large-scale genome-wide association studies," Bioinformatics, vol. 29, no. 7, pp. 886-893, 2013.
[19] V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft, "Privacy-preserving ridge regression on hundreds of millions of records," in IEEE Symposium on Security and Privacy (SP), 2013.
[20] A. Gascon, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans, "Secure linear regression on vertically partitioned datasets," tech. rep., Cryptology ePrint Archive: Report 2016/892, 2016.
[21] J. Konečný, H. B. McMahan, D. Ramage, and P. Richtárik, "Federated optimization: distributed machine learning for on-device intelligence," arXiv preprint arXiv:1610.02527, 2016.
[22] R. Bitar, P. Parag, and S. El Rouayheb, "Minimizing latency for secure distributed computations (extended version)." http://www.ece.iit.edu/~salim/minimizinglatency.pdf.
[23] A. Rényi, "On the theory of order statistics," Acta Mathematica Academiae Scientiarum Hungarica, vol. 4, no. 3-4, pp. 191-231, 1953.