Loss of Availability: Depsky-A
Protocols
Below is a brief explanation of the DepSky protocols used to store data in a cloud-of-
clouds. All of them replicate the data to every cloud used, but the data is only
guaranteed to be properly stored in three of them (due to the Byzantine quorums).
DepSky-A
This protocol replicates all the data in clear text to each cloud.
DepSky-CA
This protocol uses secret sharing and erasure coding techniques to replicate the
data in a cloud-of-clouds. The image below shows how this is done. First, an
encryption key is generated, and the original data block is encrypted with it.
Then the encrypted data block is erasure coded and key shares of the encryption
key are computed. In this case we get four erasure-coded blocks and four key
shares because we use four clouds. Lastly, each cloud stores a different coded
block together with a different key share.
<FIGURE>
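The pipeline in the figure can be sketched in Java. Note this is a simplified illustration, not DepSky's implementation: real DepSky uses Reed-Solomon erasure codes and threshold secret sharing (any f+1 of n shares suffice to rebuild the key), while here the ciphertext is merely split into n fragments and the key into n XOR shares that must all be combined.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.SecureRandom;
import java.util.Arrays;

public class DepSkyCASketch {

    // Split a secret into n XOR shares; all n are needed to rebuild it.
    // (DepSky proper uses threshold secret sharing instead.)
    static byte[][] xorShares(byte[] secret, int n) {
        SecureRandom rnd = new SecureRandom();
        byte[][] shares = new byte[n][secret.length];
        byte[] acc = secret.clone();
        for (int i = 0; i < n - 1; i++) {
            rnd.nextBytes(shares[i]);
            for (int j = 0; j < secret.length; j++) acc[j] ^= shares[i][j];
        }
        shares[n - 1] = acc; // XOR of all shares yields the secret again
        return shares;
    }

    static byte[] joinShares(byte[][] shares) {
        byte[] secret = new byte[shares[0].length];
        for (byte[] s : shares)
            for (int j = 0; j < secret.length; j++) secret[j] ^= s[j];
        return secret;
    }

    // Stand-in for erasure coding: split the ciphertext into n fragments.
    // (DepSky uses Reed-Solomon codes, so any n-f fragments rebuild the data.)
    static byte[][] fragment(byte[] data, int n) {
        int len = (data.length + n - 1) / n;
        byte[][] frags = new byte[n][];
        for (int i = 0; i < n; i++)
            frags[i] = Arrays.copyOfRange(data,
                    Math.min(i * len, data.length),
                    Math.min((i + 1) * len, data.length));
        return frags;
    }

    public static void main(String[] args) throws Exception {
        int n = 4; // four clouds
        byte[] block = "original data block".getBytes("UTF-8");

        // 1. generate an encryption key and 2. encrypt the data block
        SecretKey key = KeyGenerator.getInstance("AES").generateKey();
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal(block);

        // 3. "erasure code" the ciphertext and 4. compute key shares
        byte[][] codedBlocks = fragment(encrypted, n);
        byte[][] keyShares = xorShares(key.getEncoded(), n);

        // 5. cloud i would store codedBlocks[i] together with keyShares[i]
        System.out.println("fragments: " + codedBlocks.length
                + ", key shares: " + keyShares.length);
    }
}
```
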
DepSky-only-JSS
This protocol uses only secret sharing. Basically, an encryption key is generated
and the data is encrypted with it. Then four key shares of the key are generated.
Finally, the encrypted data is spread to each cloud together with a different key share.
DepSky-only-JEC
On the other hand, this protocol uses only erasure codes to replicate the data.
The data is erasure coded into four different blocks and each of them is stored in
a different provider.
This protocol may be useful when your application already encrypts the data.
Costs
As might be expected, a DepSky client would have to pay four times (using a
cloud-of-clouds of four cloud providers) what he would pay using a single cloud.
That does not happen (when using the DepSky-CA protocol) thanks to the erasure
coding techniques. The erasure code used (see JEC) allows us to store in each of
the four cloud providers only half of the original block size. So, using DepSky,
the client will only pay twice what he would pay using a single cloud.
For more information see the DepSky paper. You can find it here: EuroSys'11
paper.
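The arithmetic above can be stated as a small formula. This is a sketch under the assumption (consistent with the text) that with n clouds and f tolerated faults each cloud stores a fragment of size |S| / (f+1), so the total stored is n/(f+1) times the original:

```java
public class CostSketch {
    // Total bytes stored across all clouds for one block of blockSize bytes,
    // assuming each cloud keeps a fragment of blockSize / (f + 1).
    static long totalStored(long blockSize, int n, int f) {
        return n * (blockSize / (f + 1));
    }

    public static void main(String[] args) {
        // n = 4 clouds, f = 1 fault: each cloud keeps half the block,
        // so twice the original size is stored in total.
        System.out.println(totalStored(1024, 4, 1)); // prints 2048
    }
}
```
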
First of all, you need to download the latest stable version available and extract it.
Make sure you have Java 1.7 or later installed.
Once this is done, you need to fill in the accounts.properties file (you can find it
inside the config folder). To fill in this file you first need to create accounts with
the cloud providers we support. To do that, follow the links below:
Amazon S3
Google Storage
RackSpace Files
After creating the accounts you have access to your API keys, so you can fill in
the accounts.properties file. To help you find your keys, follow the steps below.
To find your Google Storage keys, go to the Google API Console and then to
the Google Cloud Storage tab. Choose Interoperable Access and you will
find your keys there. Don't forget to first enable Google Cloud Storage in
the services tab.
To find your Windows Azure keys, go to the Windows Azure portal. First you
need to create a new storage project. After selecting this new project, you
can find the key management at the bottom of the page. In this case your
access key is your storage project name and your secret key is the primary
key in the key management.
If you only want to use Amazon S3 as your cloud storage provider, you can create
just one account at Amazon S3 and use the example file provided
(config/accounts_amazon.properties). To do that, copy the content of the
'accounts_amazon.properties' file to the one mentioned before
(config/accounts.properties). In this case four different Amazon S3 locations
will be used to store the data (US_Standard, EU_Ireland, US_West and AP_Tokyo).
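For reference, one account entry in accounts.properties might look like the sketch below. The property names are inferred from the client's credential parser (driver.type, driver.id, accessKey, secretKey); the exact layout may differ, so check the shipped example file (config/accounts_amazon.properties) for the authoritative format. The values here are placeholders, not real credentials:

```properties
# One cloud account (illustrative; names taken from the credential parser)
driver.type=AMAZON-S3
driver.id=cloud1
accessKey=YOUR_ACCESS_KEY
secretKey=YOUR_SECRET_KEY
```
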
The first one is the client id (for now use ids below 6 because we only have
keys generated for ids up to 6).
The second argument indicates which protocol will be used to replicate the
data. There are 4 possibilities:
o 1 if you want to store all the data locally (for testing purposes). If you
want to use local storage you first need to run the server that can
be found
in src.depskys.clouds.drivers.localStorageService.ServerThread. To
run this server you can use the Run_LocalStorage.sh script at the
root of the project. This server will receive all requests at
ip 127.0.0.1 and port 5555.
Let us give you an example. If we run DepSky with the command below, we
will start a session with client id 0, and all the data will be replicated using
erasure codes and secret sharing and stored on the cloud providers.
$ ./DepSky_Run 0 1 0
This main class allows you to read, write and delete data. You have five
commands available:
pick_du 'name' - will change the container that you are using to read and
write.
write 'data' - will write a new version with the content 'data' to the
selected container.
read - will read the last version written to the selected container.
delete - will delete all the data (data and metadata files) associated with
the selected container.
read_m 'num' - will read old versions of the selected container. If 'num' =
0 it will read the last version written; if 'num' = 1, the penultimate
version written; etc. Note that it is only possible to read old versions
written in this session because this main class keeps all the information
in memory. To read all old versions this main class would have to be changed.
This main class is not enough to take advantage of all the functionality provided
by DepSky. To learn more about all you can do with DepSky, read the next section.
    this.clientId = clientId;
    DepSkySKeyLoader keyLoader = new DepSkySKeyLoader(null);
    if (!useModel) {
        this.cloud1 = new LocalDiskDriver("cloud1");
        this.cloud2 = new LocalDiskDriver("cloud2");
        this.cloud3 = new LocalDiskDriver("cloud3");
        this.cloud4 = new LocalDiskDriver("cloud4");
        this.drivers = new IDepSkySDriver[]{cloud1, cloud2, cloud3, cloud4};
    } else {
        List<String[][]> credentials = null;
        try {
            credentials = readCredentials();
        } catch (FileNotFoundException e) {
            System.out.println("accounts.properties file doesn't exist!");
            e.printStackTrace();
        } catch (ParseException e) {
            System.out.println("accounts.properties misconfigured!");
            e.printStackTrace();
        }
        this.drivers = new IDepSkySDriver[4];
        String type = null, driverId = null, accessKey = null, secretKey = null;
        for (int i = 0; i < credentials.size(); i++) {
            // each credentials entry holds key/value pairs for one cloud account
            for (String[] pair : credentials.get(i)) {
                if (pair[0].equalsIgnoreCase("driver.type")) {
                    type = pair[1];
                } else if (pair[0].equalsIgnoreCase("driver.id")) {
                    driverId = pair[1];
                } else if (pair[0].equalsIgnoreCase("accessKey")) {
                    accessKey = pair[1];
                } else if (pair[0].equalsIgnoreCase("secretKey")) {
                    secretKey = pair[1];
                }
            }
            drivers[i] = DriversFactory.getDriver(type, driverId, accessKey, secretKey);
        }
    }
    this.manager = new DepSkySManager(drivers, this, keyLoader);
    this.replies = new HashMap<Integer, CloudRepliesControlSet>();
    this.N = drivers.length;
    this.F = 1;
    this.encoder = new ReedSolEncoder(2, 2, 8);
    this.decoder = new ReedSolDecoder(2, 2, 8);
    if (!startDrivers()) {
        System.out.println("Connection Error!");
    }
}
public DepSkySDataUnit(String regId, String bucketName) {
...
Write
When you want to use the write operation, you have to pass
the DepSkySDataUnit object you want to write to and the data to be written.
As we can see below, this operation returns a byte[], which is the SHA-1
hash of the written data. This hash must be saved by the client if he wants to
use the read matching operation (see below).
public synchronized byte[] write(DepSkySDataUnit reg, byte[] value) throws Exception {
...
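As a minimal sketch of what the client keeps, the digest can be reproduced with standard Java, assuming (as the text above states) it is a plain SHA-1 hash over the written bytes. The class and method names here are illustrative, not part of the DepSky API:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class WriteHashSketch {
    // Compute the SHA-1 digest a client would keep after a write(),
    // e.g. to later verify the data returned by the read matching operation.
    static byte[] sha1(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-1").digest(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] hash = sha1("some data".getBytes("UTF-8"));
        System.out.println(hash.length); // SHA-1 digests are always 20 bytes
    }
}
```
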
Read
To use this operation, you only pass the DepSkySDataUnit object as an argument.
This operation will read the last version written to this DepSkySDataUnit.
public synchronized byte[] read(DepSkySDataUnit reg) throws Exception {
...
Read Matching
Delete
The delete operation will delete all the files associated with the
given DepSkySDataUnit, which includes all the versions written and the metadata
file.
public synchronized void deleteContainer(DepSkySDataUnit reg) throws Exception{
...
SetAcl
For Amazon S3, the grantee user can find his canonicalId on the same page as
the access credentials (see the beginning of this page). For the other clouds the
information is quite intuitive: for Google Storage, only the email of the grantee
is needed (it must be a Gmail account); for RackSpace, the name of the grantee.
Finally, for Windows Azure nothing is needed (see this paper).
This operation returns a LinkedList with the same organization as the one given
as argument. This list must be given to the grantee user, together with the name
of the DepSkySDataUnit, so that he can access the shared resource. But first the
user who is sharing must add some information to it. More specifically, he must
add his own canonicalId to the AMAZON-S3 pair, and his email to the GOOGLE-
STORAGE pair.
Once the grantee user has this list, he can use it in the other operations
(read, write, delete) to operate on the shared bucket.