Saturday, December 29, 2007

Three Reasons Not To Use Amazon SQS

I was seriously considering Amazon's Simple Queue Service (SQS) for a project I'm working on, until I read the fine print.

1. SQS does not guarantee order, which violates the basic first-in, first-out principle of the queue data structure.

2. SQS does not guarantee that a deleted message is actually removed from the queue, so you must handle the case where a message is processed twice.

3. SQS does not guarantee that a single query returns all the messages in the queue.

Very disappointing. What appealed to me most was the fact that the queue service is guaranteed to always be up and running.

Here's Amazon's SQS documentation:

The following information can help you design your application to work with Amazon SQS correctly.

Message order—SQS makes a best effort to preserve order in messages, but due to the distributed nature of the queue, we cannot guarantee you will receive messages in the exact order you sent them. If your system requires that order be preserved, we recommend you place sequencing information in each message so you can reorder the messages upon receipt.
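(An aside of mine, not part of Amazon's text.) The sequencing suggestion above could be sketched roughly as follows. This is only an illustration: the "seq" and "payload" field names, the make_body helper, and the Resequencer class are all hypothetical, and it assumes the sender numbers messages consecutively starting from a known value.

```python
import json


def make_body(seq, payload):
    """Sender side: wrap the payload with a sequence number (hypothetical format)."""
    return json.dumps({"seq": seq, "payload": payload})


class Resequencer:
    """Receiver side: release payloads in sequence order, buffering early arrivals."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number allowed out
        self.pending = {}          # early arrivals, keyed by sequence number

    def accept(self, raw_body):
        """Take one message body; return the payloads that are now deliverable in order."""
        msg = json.loads(raw_body)
        seq = msg["seq"]
        if seq < self.next_seq:
            return []  # already released earlier, so this is a duplicate
        self.pending[seq] = msg["payload"]
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Note the cost: if one message is badly delayed, everything behind it waits in memory, so a real consumer would probably want a timeout or a cap on the buffer.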

At-least-once delivery—SQS stores copies of your messages on multiple servers for redundancy and high availability. On rare occasions, one of the servers storing a copy of a message might be unavailable when you receive or delete the message. If that occurs, the copy of the message will not be deleted on that unavailable server, and you might get that message copy again when you receive messages. Because of this, you must design your application to be idempotent (i.e., it must not be adversely affected if it processes the same message more than once).
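(Again my aside, not Amazon's.) One common way to get the idempotence asked for above is to remember which message IDs have already been handled and skip repeats. A minimal sketch, assuming each message carries a stable unique ID; the IdempotentHandler class is hypothetical, and the in-memory set stands in for whatever durable, shared store a real system would need.

```python
class IdempotentHandler:
    """Wrap a handler so a redelivered message is processed only once."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # assumption: in-memory only; lost on restart

    def process(self, message_id, body):
        """Run the handler unless this ID was already handled. Returns True if it ran."""
        if message_id in self.seen:
            return False
        self.handler(body)
        # Record the ID only after the handler succeeds, so a failed
        # attempt is retried when the message is delivered again.
        self.seen.add(message_id)
        return True
```

The other option is to make the work itself naturally idempotent (for example, "set balance to X" rather than "add X to balance"), in which case no bookkeeping is needed at all.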

Message sampling—When you retrieve messages from the queue, SQS samples a subset of the servers (based on a weighted random distribution) and returns messages from just those servers. This means that a particular receive request might not return all your messages. Or, if you have a small number of messages in your queue (less than 1000), it means a particular request might not return any of your messages, whereas a subsequent request will. If you keep retrieving from your queues, SQS will sample all of the servers, and you will receive all of your messages. The figure below shows messages being returned after one of your system components makes a receive request. SQS samples several of the servers (in blue) and returns the messages from those servers (Message A, C, D, and B). Message E is not returned to this particular request, but it would be returned to a subsequent request.
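(My aside once more.) To make the sampling behavior concrete, here is a toy simulation, not the real service and not any real SQS client API. SampledQueue is a hypothetical stand-in that samples servers uniformly (Amazon describes a weighted distribution) and removes messages on receive, which is a simplification. The drain function shows the practical consequence: one empty response proves nothing, so the consumer has to keep polling.

```python
import random


class SampledQueue:
    """Toy distributed queue: each receive looks at only some of the servers."""

    def __init__(self, servers, sample_size, rng):
        self.servers = servers          # list of per-server message lists
        self.sample_size = sample_size  # how many servers one receive inspects
        self.rng = rng

    def receive(self):
        """Return and remove the messages held by a random subset of servers."""
        got = []
        for server in self.rng.sample(self.servers, self.sample_size):
            got.extend(server)
            server.clear()
        return got


def drain(queue, max_empty_polls=20):
    """Poll until max_empty_polls receives in a row come back empty."""
    collected, empty_streak = [], 0
    while empty_streak < max_empty_polls:
        batch = queue.receive()
        if batch:
            collected.extend(batch)
            empty_streak = 0
        else:
            empty_streak += 1
    return collected
```

Using the same message names as the documentation's example:

```python
servers = [["A"], ["B", "C"], [], ["D"], ["E"]]
queue = SampledQueue(servers, sample_size=2, rng=random.Random(0))
print(sorted(drain(queue, max_empty_polls=50)))
```

The stopping rule is a heuristic, not a guarantee: a run of empty polls only makes it very likely that the queue is empty.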

1 comment:

Anonymous said...

Comment comes late, but it does come. :-)
There's a reason why Amazon calls it Simple (sic!) Queue Service, and the restrictions you name are the tradeoff for having a worldwide messaging infrastructure at all, one that's easy to use and cheap!

I'm a huge fan of SQS and I second what the documentation says: there are ways to cope with its weaknesses inside your software design and architecture.
I'm not saying that SQS will fit every task, but it probably fits more of them than seems obvious at first glance.