Friday, February 25, 2011

Step1:1 - ".emu" file format without policy

I want the ".emu" file to be simple and easy to process, so I decided to use the JSON format (as Amazon does) and came up with the following formats for buckets and objects. Right now this format is mostly theoretical; it will be updated iteratively.

Once this format is applied to buckets and objects, I plan to include policy as well.


Bucket Format


|- myBucket
|- .emu -> JSON file representing the bucket properties

{ "bucketname": {
"accessControlList": [
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"}
],
"AWSAccessKeyId": "AWS87289288929829kkjkhS88",
"timestamp": "2010-10-02 11:00:00",
"signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA",
"objectCount": "0"
}
}
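To show how easy this format is to process, here is a minimal Python sketch that reads a bucket ".emu" document and checks a grant in its access control list. The function names read_bucket_properties and has_permission are hypothetical, chosen only for illustration, and the sample keeps a single ACL entry for brevity.

```python
import json

# Sample bucket ".emu" content, following the bucket format above
# (one ACL entry kept for brevity).
BUCKET_EMU = """
{ "bucketname": {
    "accessControlList": [
      { "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9",
        "DisplayName": "chriscustomer",
        "Permission": "FULL_CONTROL" }
    ],
    "AWSAccessKeyId": "AWS87289288929829kkjkhS88",
    "timestamp": "2010-10-02 11:00:00",
    "signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA",
    "objectCount": "0"
  }
}
"""


def read_bucket_properties(text):
    """Return (bucket_name, properties) from a bucket ".emu" document.

    Hypothetical helper: assumes the document has exactly one
    top-level key, which is the bucket name.
    """
    document = json.loads(text)
    bucket_name, properties = next(iter(document.items()))
    return bucket_name, properties


def has_permission(properties, display_name, permission):
    """Check whether the ACL grants `permission` to `display_name`."""
    return any(
        grant["DisplayName"] == display_name and grant["Permission"] == permission
        for grant in properties.get("accessControlList", [])
    )


name, props = read_bucket_properties(BUCKET_EMU)
print(name, has_permission(props, "chriscustomer", "FULL_CONTROL"))
```

Note that objectCount is stored as a string in this format, so a reader has to convert it with int() before doing arithmetic on it.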

Object Format


|- Object1
|- Object2
|- Key1/Object3
|- Key2/Object4
|- ".emu"

{ "bucketname": {
"objects": [
{"object1": {
"key": "Key1",
"metadata": [
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }}
],
"data": "1234567",
"contentlength": "5",
"accessControlList": [
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"}
],
"AWSAccessKeyId": "AWS87289288929829kkjkhS88",
"timestamp": "2010-10-02 11:00:00",
"signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA" }},
{"object2": {
"key": "Key2",
"metadata": [
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }}
],
"data": "1234567",
"contentlength": "5",
"accessControlList": [
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"}
],
"AWSAccessKeyId": "AWS87289288929829kkjkhS88",
"timestamp": "2010-10-02 11:00:00",
"signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA" }},
{"object3": {
"key": "Key3",
"metadata": [
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }}
],
"data": "1234567",
"contentlength": "5",
"accessControlList": [
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"}
],
"AWSAccessKeyId": "AWS87289288929829kkjkhS88",
"timestamp": "2010-10-02 11:00:00",
"signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA" }},
{"object4": {
"key": "Key4",
"metadata": [
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }},
{"metadata_obj": { "name": "Content-Type", "value": "text/plain" }}
],
"data": "1234567",
"contentlength": "5",
"accessControlList": [
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"},
{ "ID": "a9a7b886d6fd24a52fe8ca5bef65f89a64e0193f23000e241bf9b1c61be666e9", "DisplayName":"chriscustomer", "Permission": "FULL_CONTROL"}
],
"AWSAccessKeyId": "AWS87289288929829kkjkhS88",
"timestamp": "2010-10-02 11:00:00",
"signature": "AAAAAAAAAAAAAAAAAAAAAAAAAAA" }}
]
}
}
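The object format nests each entry in a one-key wrapper ("object1", "metadata_obj"), so a reader has to unwrap it before use. The Python sketch below shows one possible way to look up an object by its key and flatten its metadata. The helper names find_object, metadata_as_dict and length_matches are hypothetical, and the sample is cut down to one object with one metadata entry.

```python
import json

# Sample object ".emu" content, following the object format above
# (one object, one metadata entry, ACL and signature fields omitted).
OBJECT_EMU = """
{ "bucketname": {
    "objects": [
      { "object1": {
          "key": "Key1",
          "metadata": [
            { "metadata_obj": { "name": "Content-Type", "value": "text/plain" } }
          ],
          "data": "1234567",
          "contentlength": "5"
      } }
    ]
  }
}
"""


def find_object(document, key):
    """Return (object_name, properties) for the entry whose "key" matches, else None."""
    _bucket_name, bucket = next(iter(document.items()))
    for entry in bucket["objects"]:
        # Each array element is a one-key wrapper: {"object1": {...}}.
        for object_name, properties in entry.items():
            if properties.get("key") == key:
                return object_name, properties
    return None


def metadata_as_dict(properties):
    """Flatten the list of {"metadata_obj": {...}} wrappers into a plain dict."""
    return {
        item["metadata_obj"]["name"]: item["metadata_obj"]["value"]
        for item in properties.get("metadata", [])
    }


def length_matches(properties):
    """Consistency check: does "contentlength" equal the length of "data"?"""
    return int(properties["contentlength"]) == len(properties["data"])


document = json.loads(OBJECT_EMU)
name, props = find_object(document, "Key1")
print(name, metadata_as_dict(props), length_matches(props))
```

With the placeholder values from the format above, length_matches returns False, because "1234567" has seven characters while contentlength says "5". An emulator would probably want to compute this field rather than trust a stored value.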

Saturday, February 19, 2011

Step1:0: Study - Amazon S3 Emulator

There are a lot of ways to implement this emulator, and a lot of existing implementations are available as well, but the purpose of this implementation is to understand the Amazon S3 operations (which I am going to use in upcoming steps). My only requirement is that it should be identical to Amazon S3 in behaviour: the way we access the service and all other functionality. In turn, I want to test it with the Amazon Java/PHP APIs (sorry, I am not interested in .NET).

I downloaded AmazonS3.wsdl and AmazonS3.xsd and generated the Axis implementation from them. Initially I thought of using a database with Hibernate to store the buckets and objects, but with that option the uploaded objects cannot be accessed the way Amazon S3 provides them. So I decided to store the objects on disk as files, which the web server can access. I plan to use the Apache Commons IO and Apache MINA libraries for this.

Bucket

- Will be a directory, like a web application under webapps

- The access control list, AWSAccessKeyId, timestamp, etc. will be maintained in a ".emu" file (like .svn, but a file rather than a folder)

Example:

<webapps>/myBucket/

Can be accessed from outside at http://localhost:port/myBucket/

Object

- The key will be the name of the object (file)

- The access control list, AWSAccessKeyId, timestamp, etc. will be maintained in a ".emu" file (like .svn, but a file rather than a folder)

Example:

<webapps>/myBucket/profile.jpg

Can be accessed from outside at http://localhost:port/myBucket/profile.jpg
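The bucket and object layout described here can be sketched in a few lines of Python. This is only an illustration of the directory mapping, not the planned Axis implementation: the functions create_bucket, put_object and object_url are hypothetical, port 8080 is an assumed value, and the sketch neither updates the ".emu" file on upload nor guards against keys that escape the bucket directory.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import quote


def create_bucket(webapps, bucket):
    """Create <webapps>/<bucket>/ with an initial ".emu" properties file."""
    bucket_dir = Path(webapps) / bucket
    bucket_dir.mkdir(parents=True, exist_ok=True)
    emu = bucket_dir / ".emu"
    if not emu.exists():
        # Minimal properties only; the real format carries more fields.
        emu.write_text(json.dumps({bucket: {"accessControlList": [], "objectCount": "0"}}))
    return bucket_dir


def put_object(webapps, bucket, key, data):
    """Store the object bytes in a file named after the key."""
    target = Path(webapps) / bucket / key
    # Keys such as "Key1/Object3" become sub-directories.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


def object_url(host, port, bucket, key=""):
    """Build the URL under which the web server would expose the bucket or object."""
    return "http://%s:%d/%s/%s" % (host, port, quote(bucket), quote(key))


with tempfile.TemporaryDirectory() as webapps:
    create_bucket(webapps, "myBucket")
    stored = put_object(webapps, "myBucket", "profile.jpg", b"fake image bytes")
    print(stored.name, object_url("localhost", 8080, "myBucket", "profile.jpg"))
```

Because the key maps directly to a file name, any web server that serves the webapps directory can return the object without going through the emulator code, which is the main reason for preferring files over a database here.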

Similarly, all other operations supported by Amazon S3 will be provided.

Next

Step1:1 Complete: Amazon S3 Emulator – Policy Implementation, Step1:2 Complete: Amazon S3 Emulator, Step1:3 Study - Amazon S3 with Hadoop, ...

Step 2:0 Study - Apache Jackrabbit - Amazon S3..