In part 1, we saw why we chose Amazon SimpleDB for our analytics needs. In this part, we take a closer look at SimpleDB itself.

Amazon SimpleDB

For the basic features of SimpleDB, such as its low-touch nature, scalability, high availability, flexibility and pricing, please visit the product page. Here we will discuss how SDB works and how it is designed.

SDB Explorer

We used SDB Explorer on a Mac. SDB Explorer is a graphical client for exploring Amazon SimpleDB, and it works on all the major operating systems. Provide your AWS Access Key and AWS Secret Key the first time you start SDB Explorer; afterwards you will be logged into your SimpleDB account automatically whenever you start it. Another option is SDB Tool, a Firefox plugin, which is also available on all the major operating systems.

SDB Explorer screenshot

SimpleDB Design and Data Model:

SimpleDB is a highly scalable NoSQL data store. Data is stored as key-value pairs, and there is no normalization, no joins and no fixed schema as we see in a relational database. The SimpleDB data model consists of domains, items, attributes and values. It is similar to a spreadsheet, except that each 'cell' can hold multiple values. The image below should make the SDB data model clearer.

Image Courtesy: Simple DB documentation from Amazon

Now let's look at what each term means in the SimpleDB data model:

Domain is the most confusing term in the SDB data model. Many people assume a domain is analogous to a database in the relational model. In fact, a domain is closer to a table in an RDBMS or a worksheet in a spreadsheet. You can have up to 100 domains per AWS account, and you can ask Amazon to raise this limit by submitting a request. SimpleDB currently allows each domain to grow up to 10 GB. If your data set is larger than 10 GB, you have to spread your data over multiple domains.
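One way to spread data over multiple domains is to derive the domain name from the item name, so that a given item always lands in the same domain. The snippet below is only a minimal sketch of that idea, not what SimpleDB or any library does for you. The helper `domain_for`, the base name `analytics` and the shard count of 4 are all assumptions made up for illustration.

```ruby
require 'zlib'

# Hypothetical helper: maps an item name to one of several domains.
# The base name and the number of shards are illustrative assumptions.
def domain_for(item_name, base: 'analytics', shards: 4)
  # CRC32 is stable across runs (unlike String#hash), so the same
  # item name always maps to the same domain.
  index = Zlib.crc32(item_name) % shards
  "#{base}_#{index}"
end

%w[IN US GB].each do |code|
  puts "#{code} -> #{domain_for(code)}"
end
```

The trade-off is that any query that needs the whole data set has to be run against every domain and the results merged in your own code.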

Items are similar to rows in an RDBMS table. Each item is identified by a unique name, which plays the role of a primary key in relational database terminology. We will be using country codes as item names for our purpose.

Attributes are similar to columns in RDBMSs and spreadsheets. Unlike a relational database management system, SimpleDB allows you to have different attributes for each item in a domain. This schema independence lets you add attributes (columns) to your domain on the fly, as and when required, without first going through the equivalent of a schema change.
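To make the schema-free idea concrete, here is a small in-memory sketch that models a domain as a plain Ruby Hash. It does not talk to SimpleDB at all; the attribute names (`visits`, `top_browser`, `referrer`) and their values are invented for the example.

```ruby
# In-memory stand-in for a domain: item name => { attribute name => value }.
# Item names and attributes here are made-up examples.
domain = {
  'IN' => { 'visits' => '120', 'top_browser' => 'Firefox' },
  'US' => { 'visits' => '340' }
}

# Add a new attribute to one item only. No other item changes,
# and there is no schema to alter first.
domain['US']['referrer'] = 'search'

# Collect every attribute name in use across the whole domain.
def attribute_names(domain)
  domain.values.flat_map(&:keys).uniq.sort
end

puts attribute_names(domain).inspect
```

Note that the 'IN' item never gets a `referrer` attribute at all; it is simply absent, rather than being a NULL column as it would be in a relational table.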

Each attribute is associated with a value, which corresponds to a cell in a spreadsheet or the value of a column in a database. The key difference is that an RDBMS or a spreadsheet supports only a single value per cell or column, whereas SimpleDB allows you to have multiple values for a single attribute of an item.
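Multi-valued attributes are easier to picture with a small model. The sketch below keeps everything in memory and stores each attribute as a list of values. The `put_attributes` method here is a hypothetical local helper written for this post, not a call from any library; its `replace` flag is meant to mirror, as far as we understand it, the replace option of SimpleDB's PutAttributes operation.

```ruby
# In-memory model of multi-valued attributes:
# item name => { attribute name => [values] }.
# put_attributes is a hypothetical local helper, not a library call.
def put_attributes(domain, item_name, attributes, replace: false)
  item = (domain[item_name] ||= {})
  attributes.each do |name, values|
    # Values are stored as strings in this model.
    new_values = Array(values).map(&:to_s)
    existing = replace ? [] : item.fetch(name, [])
    # An item holds each name/value pair at most once.
    item[name] = (existing + new_values).uniq
  end
  domain
end

domain = {}
put_attributes(domain, 'IN', { 'language' => 'Hindi' })
put_attributes(domain, 'IN', { 'language' => %w[English Hindi] })
put_attributes(domain, 'IN', { 'visits' => 120 }, replace: true)

puts domain.inspect
```

Without `replace`, a second put adds to the values already stored for that attribute; with `replace: true`, the old values for that attribute are discarded first.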

In the next parts, we will discuss the library we used for coding (the Ruby gem right_aws) and the code itself.

Subscribe - To get an automatic feed of all future posts, subscribe here, or to receive them via email, go here and enter your email address in the box. You can also like us on Facebook and follow me on Twitter @akashag1001.