Python wrapper for Windows Azure storage

The best way to really learn a system is to write code against it. I spent some time over a weekend and started writing a Python wrapper on top of our storage APIs. I've gotten it to the point where I can authenticate against a storage endpoint, be it development storage on your local machine or the *.core.windows.net endpoints in the sky. I spent some time today implementing the basic blob primitives (list containers, get/put blob), but there is a long way to go before it is usable.

Though it is raw, I'm making the code public since I thought it would be instructive for folks trying to figure out how authentication works or trying to implement wrappers in other languages.

I'm hosting the code in a GitHub repository, and you can follow commits as I make them. I'm hoping to get all the blob primitives done by the end of the week, and queues if possible. I need to spend some time thinking about the right way to model table storage in Python. Here is some sample code that lists the containers, creates a container, and then puts a blob into storage and gets it back out:


conn = WAStorageConnection(DEVSTORE_HOST, DEVSTORE_ACCOUNT, DEVSTORE_SECRET_KEY)
for (container_name, etag, last_modified) in conn.list_containers():
    print container_name
    print etag
    print last_modified

conn.create_container("testcontainer", False)
conn.put_blob("testcontainer", "test", "Hello World!")
print conn.get_blob("testcontainer", "test")

Here's the magic signing code. This is based on my reading of the docs, so it shouldn't be considered 'official' in any way (consult the SDK sample for something of production quality).




# Modules needed by the method below (imported at the top of the module)
import base64
import hashlib
import hmac

# NEW_LINE and PREFIX_STORAGE_HEADER are module-level constants defined elsewhere in the wrapper

def _get_auth_header(self, http_method, path, data, headers):
    # As documented at http://msdn.microsoft.com/en-us/library/dd179428.aspx
    string_to_sign = ""

    # First element is the method
    string_to_sign += http_method + NEW_LINE

    # Second is the optional content MD5 (left empty here)
    string_to_sign += NEW_LINE

    # Content type - this should have been initialized at least to a blank value
    if "content-type" in headers:
        string_to_sign += headers["content-type"]
    string_to_sign += NEW_LINE

    # Date - we don't need to add the header here since the special date storage header
    # always exists in our implementation
    string_to_sign += NEW_LINE

    # Construct canonicalized storage headers.
    # TODO: Note that this doesn't implement parts of the spec - combining header fields with same name,
    # unfolding long lines and trimming white spaces around the colon
    ms_headers = [header_key for header_key in headers.keys() if header_key.startswith(PREFIX_STORAGE_HEADER)]
    ms_headers.sort()
    for header_key in ms_headers:
        string_to_sign += "%s:%s%s" % (header_key, headers[header_key], NEW_LINE)

    # Add canonicalized resource
    string_to_sign += "/" + self.account_name + path

    # Sign the UTF-8 encoded string with HMAC-SHA256 and base64-encode the digest
    utf8_string_to_sign = unicode(string_to_sign).encode("utf-8")
    hmac_digest = hmac.new(self.secret_key, utf8_string_to_sign, hashlib.sha256).digest()
    return base64.encodestring(hmac_digest).strip()
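
For anyone on Python 3, here is a rough, self-contained sketch of the same signing scheme. It is not the code from the repository: the function names canonicalize_headers, sign_request and authorization_header are made up for illustration, headers are passed as a list of (name, value) pairs, and the secret key is assumed to be the already base64-decoded account key as bytes. It also takes a stab at the pieces the TODO above skips (merging repeated headers, unfolding long lines, trimming whitespace), again based on my reading of the docs, so treat it as unofficial too.

```python
import base64
import hashlib
import hmac

NEW_LINE = "\n"
PREFIX_STORAGE_HEADER = "x-ms-"  # assumed value of the wrapper's constant


def canonicalize_headers(headers):
    """Build the canonicalized x-ms-* block from a list of (name, value) pairs."""
    merged = {}
    for name, value in headers:
        key = name.strip().lower()
        if not key.startswith(PREFIX_STORAGE_HEADER):
            continue
        # Unfold continuation lines and trim whitespace around the value.
        unfolded = " ".join(part.strip() for part in value.splitlines())
        # Headers with the same name are merged into a comma-separated list.
        merged.setdefault(key, []).append(unfolded.strip())
    return "".join(
        "%s:%s%s" % (key, ",".join(merged[key]), NEW_LINE)
        for key in sorted(merged)
    )


def sign_request(account_name, secret_key, http_method, path, headers):
    """Return the base64-encoded HMAC-SHA256 signature for a request."""
    lookup = {name.strip().lower(): value for name, value in headers}
    string_to_sign = NEW_LINE.join([
        http_method,
        lookup.get("content-md5", ""),
        lookup.get("content-type", ""),
        "",  # Date stays empty; x-ms-date is signed with the x-ms-* headers
    ]) + NEW_LINE
    string_to_sign += canonicalize_headers(headers)
    # Canonicalized resource: "/" + account name + request path
    string_to_sign += "/" + account_name + path
    digest = hmac.new(secret_key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(account_name, signature):
    """Format the value of the Authorization header for the SharedKey scheme."""
    return "SharedKey %s:%s" % (account_name, signature)


if __name__ == "__main__":
    # Placeholder account, key and date, purely for illustration.
    demo_headers = [("x-ms-date", "Sun, 01 Mar 2009 00:00:00 GMT")]
    sig = sign_request("myaccount", b"not-a-real-key", "GET", "/testcontainer", demo_headers)
    print(authorization_header("myaccount", sig))
```

The resulting value goes into the request's Authorization header, and the request has to carry the exact same x-ms-date header that was signed.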

NOTE: This is code written by a program manager with too much free time :-). Use at your own risk and don't blame me if your computer blows up or if demons fly out of your nose.


Thanks to Igor Dvorkin for helping me out with this.


Comments:
Did you notice that the indentation is not proper when you pasted the code? Maybe you should embed a pastebin or just put up a link to it?
 
Proper indentation now :-)
 
just what i was looking for!

thanks!
 