Note: The Pages o’ Peat have moved to http://peat.org/ — please update your bookmarks and references accordingly. Thank you!

I ran across this error message after making a slight mistake while booting an OpenSolaris image on EC2:

Client.InvalidParameterValue: Invalid value 'solaris.indiana' for kernel profile. Supported values are [default, solaris, freebsd].

I’ve been waiting for this since EC2 was announced. Anyone have more information on the status of FreeBSD on EC2?

Interesting news this week — along with the release of Project Indiana, Sun is also providing limited access to OpenSolaris images running on Amazon’s Elastic Compute Cloud. I’m keen to try it out, but at the same time I’m a little skeptical about the whole thing.

I have high hopes for Project Indiana. After working with Joyent Accelerators, there are a lot of things I like about Solaris (the service manager, ZFS, DTrace, etc.) and a lot of things I don’t like (awkward package management, very DIY for relatively simple things).

Indiana running on EC2 instances is a good way to introduce people to the platform, but it’s a bummer you have to register with Sun and get their permission before jumping in the pool. I hope the waiting list isn’t too long. I’m itching to play.

Hopefully Indiana on EC2 is lean, mean, and easy to get started with … but I have my doubts that it will be a replacement for my current Ubuntu AMIs. I don’t have any super custom configurations; I just don’t think EC2 is the kind of environment where Solaris really shines. EC2 is lots of little servers, not a big box with a bunch of cores and spindles. Regardless, I’m an optimist, and I look forward to being proven wrong.

I’m waiting on access to the Project Indiana AMIs. I’ll report back as soon as I get my feet wet.

Update: I’ve been accepted to the beta program, but I don’t think I can do a test drive until this weekend. More information then!

I received a friendly e-mail this morning from Amazon, announcing persistent storage for EC2 instances. From the looks of it, the storage behaves like NAS — it exists independent of the instances you’re using, and can be mounted whenever you like. Not bad. I’m interested to see what the IO performance is like.

Other features include:

* Snapshots, to back up the storage to S3.
* Multiple volumes per instance.
* Shows up as a block device on the instance, so any filesystem can be used.

Persistent storage is in limited private beta right now, but according to the announcement it should be publicly available “later this year.”

Catching The Next Wave

April 8, 2008

Every now and then a set of technologies gets twisted together by a small group of dedicated people, and a new industry is born — a watershed event that demonstrates a new way of thinking about things, and throws out a lot of old rules.

There are three that are coming together to trigger another watershed.

The first is open, popular, mobile Internet devices.  Think Blackberry, iPhone, or the slew of new MIDs that Intel showed off a few days ago in Shanghai.   These are built around the assumption of ubiquitous access to the Internet, high resolution displays, multimedia capabilities, and a bit of horsepower under the hood.  Any college student can get their hands on the Android or iPhone or Windows Mobile SDKs and build a hot little application in their spare time.

The second is web services.  It doesn’t matter if it’s WS* or REST or XML or JSON — the point is being able to query and manipulate data at a distance, with open protocols across public and private networks.  Pick your web framework of choice … building a web service is almost a drag and drop process today.

The third and final piece is cheap and scalable cloud computing.  The physical infrastructure capable of serving billions of transactions is available to anyone with a credit card and a little spare time on the weekend.  Amazon’s Web Services, Google’s App Engine, and a slew of smaller providers sell scalable computing and bandwidth by the hour and gigabyte.

These three fit together to form a fundamentally different picture of mobile computing: lightweight applications that fit in your pocket and take advantage of the local hardware, but seamlessly tap into “Internet scale” computing power and storage.

I’ve talked with a dozen entrepreneurs in as many months who are exploring these waters.  Streaming media (push and pull), information discovery and analysis, mobile social interactions, and location aware applications all depend on this trinity of capabilities.  I’m just one guy in a groundswell of people who are looking at the landscape and thinking “hot damn!”

What makes this so exciting is how easy it is to do today.  You don’t need a dozen engineers and a multi-million dollar budget.  You don’t need to negotiate with a corporate gatekeeper.  You don’t need to pitch to VCs.  You don’t need to wait.

2009 is going to bring a wave of media rich, location aware, always connected mobile applications to hundreds of millions of people.  I’m confident we’ll see a real forehead slapper by the end of 2008 — a tool or service that is painfully obvious, but fundamentally changes how we think about a day to day task.  It’ll make a millionaire or two, at the very least.

This will be fun.  :)

Amazon announced Elastic IP Addresses for their Elastic Compute Cloud (EC2) service this morning, which removes one of the biggest hurdles for deploying web sites on the service. Previously, customers had no control over the IP addresses assigned to their EC2 instances, a frustrating situation for anyone wanting to reliably point a domain into the cloud.

Elastic IP Addresses solve this issue in a rather elegant way, by assigning a static IP address to your EC2 account, and providing a mechanism for routing that address to any of your EC2 instances. This system provides a reliable address for DNS, and enables failover and takeover features for applications with high availability requirements.
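To make the failover idea concrete, here is a toy model of the remapping. It is only a sketch: the `ElasticIpTable` class, the instance IDs, and the address (taken from a documentation-only range) are all made up for illustration, and none of this is Amazon's actual API.

```python
class ElasticIpTable:
    """Toy model: static addresses owned by an account, each routed to at most one instance."""

    def __init__(self):
        self._routes = {}  # address -> instance id (or None when unassigned)

    def allocate(self, address):
        # The address belongs to the account, not to any particular instance.
        self._routes[address] = None

    def associate(self, address, instance_id):
        if address not in self._routes:
            raise KeyError(f"{address} is not allocated to this account")
        # Re-pointing an address silently replaces the previous target.
        self._routes[address] = instance_id

    def resolve(self, address):
        return self._routes[address]


if __name__ == "__main__":
    table = ElasticIpTable()
    table.allocate("203.0.113.10")                 # hypothetical address
    table.associate("203.0.113.10", "i-primary")   # hypothetical instance id
    # The primary dies: point the same address at the standby. DNS never changes.
    table.associate("203.0.113.10", "i-standby")
    print(table.resolve("203.0.113.10"))
```

The point is that DNS keeps pointing at one address while the instance behind it changes.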

Kudos to Amazon!

Werner Vogels, the CTO at Amazon, has a great post about the contentious idea of “eventual consistency” for the new SimpleDB service. The idea that a database could be inconsistent is a little disconcerting to a lot of people — after all, inconsistent means unpredictable, and that just doesn’t fly for us deterministic computer people. Right?

Well, “eventual consistency” isn’t entirely unpredictable. And it has its benefits, especially when it means avoiding locking on highly concurrent read and write operations. That’s exactly what SimpleDB was designed to do. To quote Vogels:

“Inconsistency can be tolerated for two reasons: for improving read and write performance under highly concurrent conditions and for handling partition cases where a majority model would render part of the system unavailable even though the nodes are up and running.

“Whether or not inconsistencies are acceptable depends on the client application. A specific popular case is a website scenario in which we can have the notion of user-perceived consistency; the inconsistency window needs to be smaller than the time expected for the customer to return for the next page load. This allows for updates to propagate through the system, before the next read is expected.”

(from “Eventually Consistent”)

SimpleDB was intentionally designed to behave this way, which means it certainly wasn’t built to replace traditional ACID relational databases for all scenarios. If you think about how often you require immediate consistency in your web applications, you’ll likely find that a very significant portion of your database interactions don’t.

My biggest concern about SimpleDB isn’t consistency or relationships, it’s latency. SimpleDB queries from outside of the Amazon cloud won’t be fast enough to feed sites that require more than a couple of queries per page — unless those queries can be executed in parallel, which isn’t an easy option in single-threaded web environments (PHP, Rails, etc.).

I’m excited to see how it operates with parallel queries, though. If an application is built to make dozens of queries simultaneously, rather than sequentially, the performance could be excellent.
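Here is a rough sketch of why parallel queries matter when every call pays a network round trip. It is not my Java toolkit and it does not talk to SimpleDB: `fake_query` is a stand-in that just sleeps, and the 50 ms latency is an assumption picked for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_S = 0.05  # assumption: pretend each remote query takes 50 ms


def fake_query(expression):
    """Stand-in for a remote query; it only sleeps, then echoes the expression."""
    time.sleep(SIMULATED_LATENCY_S)
    return f"result for {expression}"


def run_sequential(expressions):
    # Total time is roughly len(expressions) * latency.
    return [fake_query(e) for e in expressions]


def run_parallel(expressions, max_workers=16):
    # Total time is roughly one latency, as long as there are enough workers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_query, expressions))  # map preserves input order


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    queries = [f"query-{i}" for i in range(8)]
    _, sequential_s = timed(run_sequential, queries)
    _, parallel_s = timed(run_parallel, queries)
    print(f"sequential: {sequential_s:.2f}s, parallel: {parallel_s:.2f}s")
```

Threads are enough here because the work is waiting on I/O, not burning CPU. Real numbers will depend on SimpleDB itself, which this sketch says nothing about.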

I have a little Java toolkit for querying web services in parallel, and I’m itching to unleash it on SimpleDB. All this hot air blowing isn’t worth much without real numbers, right? :)

Amazon SimpleDB

December 14, 2007

Amazon will soon be releasing their SimpleDB service under a limited beta program.

I’m very excited about this. Persistent, high performance databases are a big missing piece in Amazon’s cloud computing initiative — EC2 doesn’t offer storage that persists across reboots, and S3 isn’t structured to provide the IO required by a database.

Conceptually, SimpleDB is very compelling. It’s designed for real time querying, has no hard limits on storage, and is metered based on storage and CPU time. It looks a lot like Amazon’s Dynamo technology … and it wouldn’t surprise me if they released the Dynamo paper to gauge interest in exposing such a service.

But, there are three big caveats.

It’s not SQL. This isn’t actually as big a deal as it seems, but I know there are going to be a lot of people who are bent out of shape on this one. Why isn’t it a big deal? Because …

It’s not relational. SimpleDB provides a big flat table, with arbitrary attributes per row. So, queries are all about filtering through data, and while they can have very complex rules, it doesn’t behave like the “normal” relational databases we’re accustomed to using.

Updates are “eventually consistent.” This means that if you immediately query for data you just pushed into SimpleDB, it may not show up. You have a guarantee that it will show up within a few seconds, but not immediately. Amazon calls this “eventual consistency.”
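A toy illustration of that last caveat, under loud assumptions: the `EventuallyConsistentStore` class is invented for this sketch, the propagation delay is a made-up parameter, and time is passed in by hand so the behavior is deterministic. It says nothing about how SimpleDB is actually implemented.

```python
class EventuallyConsistentStore:
    """Toy store: a write only becomes readable after a fixed propagation delay."""

    def __init__(self, propagation_delay):
        self.propagation_delay = propagation_delay
        self._visible = {}   # data that readers can already see
        self._pending = []   # (visible_at, key, value) writes still propagating

    def put(self, key, value, now):
        self._pending.append((now + self.propagation_delay, key, value))

    def get(self, key, now):
        # Apply every pending write whose delay has elapsed, oldest first.
        still_pending = []
        for visible_at, k, v in self._pending:
            if visible_at <= now:
                self._visible[k] = v
            else:
                still_pending.append((visible_at, k, v))
        self._pending = still_pending
        return self._visible.get(key)


if __name__ == "__main__":
    store = EventuallyConsistentStore(propagation_delay=2.0)  # seconds, assumed
    store.put("user:42", "peat", now=0.0)
    print(store.get("user:42", now=0.5))  # read too soon: the write is not visible yet
    print(store.get("user:42", now=2.5))  # after the delay: the write shows up
```

An application that needs read-your-own-writes has to either wait out the window or keep the fresh value locally.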

It may be a little scary for some folks who are most comfortable with the traditional model of building apps around a single relational database. On the other hand, it appears to be a great system for people who have built big websites, and are already comfortable dealing with lazy synchronization and custom data sources.

I’m looking forward to playing with it!

(Tip o’ the hat to @grigs)

Update: Here’s a great post that goes into a little more detail about the give and take of SimpleDB. Fun fact: it’s written in Erlang.

Update:  More stats.  Looks like their opening offering lets you create up to 100 “domains” containing up to 10 GB of data each.   That’s a good start.

Behind the Curtain

October 4, 2007

Here’s an interesting excerpt from Werner Vogels’ Dynamo paper about some of the guts behind Amazon’s e-commerce platform:

“For example a page request to one of the e-commerce sites typically requires the rendering engine to construct its response by sending requests to over 150 services. These services often have multiple dependencies, which frequently are other services, and as such it is not uncommon for the call graph of an application to have more than one level.”

There’s no doubt Amazon uses extensive caching to keep performance up, but 150+ service calls to render a page is remarkable, regardless of how you cut it. Even more impressive is how all of these services are built around the assumption that something, somewhere is failing: disks are crashing, networks are flapping, and processes are dying.

Check out the paper for more details.

Update:  It looks like Ars Technica got interested and put together a little write up on Dynamo.

I had lunch yesterday with some of the fine folks at JanRain, and one of our discussions was about Amazon’s S3 … can a business actually save money by using it for file storage and distribution? It turns out there are a few pretty good cases for it, the most impressive being SmugMug saving half a million bucks vs. their DIY approach.

But things are changing in June. Amazon unveiled a new pricing model for S3, which is a little more complex than the previous $0.15 per gigabyte stored per month, with $0.20 per gigabyte transferred (simple, eh?).

The storage cost is the same, but transfers have been lowered and put into a tiered structure, and there’s an additional charge for each request:

  • $0.10 per GB – all data uploaded
  • $0.18 per GB – first 10 TB / month data downloaded
  • $0.16 per GB – next 40 TB / month data downloaded
  • $0.13 per GB – data downloaded / month over 50 TB
  • $0.01 per 1,000 PUT or LIST requests
  • $0.01 per 10,000 GET and all other requests
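The easiest way to see which side of the line you land on is to plug your own numbers in. Here is a small sketch using the prices listed above. Two assumptions of mine: a terabyte is counted as 1024 GB, and the old $0.20 transfer rate applied to uploads and downloads alike. The function names are just for illustration.

```python
TB_IN_GB = 1024  # assumption: 1 TB = 1024 GB; Amazon may count this differently

# (tier size in GB, price per GB) for downloads, from the list above
DOWNLOAD_TIERS = [
    (10 * TB_IN_GB, 0.18),
    (40 * TB_IN_GB, 0.16),
    (float("inf"), 0.13),
]


def old_monthly_cost(stored_gb, transferred_gb):
    """Old model: flat rates. Assumes transfer covers both directions."""
    return 0.15 * stored_gb + 0.20 * transferred_gb


def tiered_download_cost(download_gb):
    """Walk the download tiers, charging each slice at its own rate."""
    remaining = download_gb
    total = 0.0
    for tier_size, price in DOWNLOAD_TIERS:
        in_tier = min(remaining, tier_size)
        total += in_tier * price
        remaining -= in_tier
        if remaining <= 0:
            break
    return total


def new_monthly_cost(stored_gb, upload_gb, download_gb, put_list_requests, get_requests):
    """New model: same storage rate, split transfer pricing, plus per-request fees."""
    return (
        0.15 * stored_gb
        + 0.10 * upload_gb
        + tiered_download_cost(download_gb)
        + 0.01 * put_list_requests / 1000
        + 0.01 * get_requests / 10000
    )


if __name__ == "__main__":
    # Made-up usage: 100 GB stored, 10 GB up, 50 GB down, 10k PUTs, 100k GETs.
    print(f"old: ${old_monthly_cost(100, 10 + 50):.2f}")
    print(f"new: ${new_monthly_cost(100, 10, 50, 10_000, 100_000):.2f}")
```

Sites that serve lots of tiny objects are the ones to watch, since the per-request fees can outweigh the cheaper transfer.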

I’m not a big fan of complexity, but Amazon seems to think it’ll save most of us some money:

If this new pricing had been applied to customers’ March 2007 usage, 75% of customers would have seen their bill decrease, while an additional 11% would have seen an increase of less than 10%. Only 14% of customers would have experienced an increase of greater than 10%.

Fair enough, I suppose.

Firefox EC2 Plugin

April 10, 2007

More Firefox goodies!  A short while ago I noted the S3Fox extension for wrangling files on Amazon’s S3 file storage service.  Today I found a great extension for managing EC2 machines — the aptly named EC2 UI.  It’s about as simple as it gets.  Very handy!
