Wednesday, 10 June 2009

RETRO: A (hopefully) RESTful Transaction Model

This is a post for discussion around RETRO, a transaction model that aims to enable ACID transactions on the Web, while remaining fully RESTful.

The full model can be seen here.


  1. In my opinion, any communication pattern that leaves state on the server side between requests is a breach of the statelessness constraint. PUT is fine, because it transfers state to the server that the client is then no longer responsible for; that becomes server/service state rather than application state. However, pub/sub and pessimistic locking both require a mirroring of state between client and server between requests, for the duration of the pattern's execution.
    A good sign that you have stepped past the limits of the statelessness constraint is that the server needs to start timing out state if the client is uncommunicative for a particular period. Again, we see this in both the pessimistic-lock scenario and the pub/sub scenario. If the client dies silently, the server/service has some clean-up to do.
    By these measures I would suggest that your proposal does not fall within the REST constraints. However, I do believe there is a class of problems that can be addressed by non-RESTful but Web-friendly mechanisms. In particular, wherever locking and pub/sub are already known to work (e.g. at the enterprise level), the upside of adding optional support to HTTP may outweigh the downside of a more complex standard and the potential for over-reliance on these features. It seems possible that if these are approached as separate standards with their statefulness problems clearly highlighted, they may help businesses that essentially want a RESTful services environment, but with reasonable exceptions.

  2. A few comments on your proposal:
    * It is good to see you addressing the multi-service problem as a two-phase commit. I think this is perhaps the key problem that actually needs solving on the enterprise scale, as single-service problems can generally be worked around with a minimal amount of locking and without too much hassle.
    * I think some of your error return codes are a bit screwy. I would have expected return codes of perhaps 409 Conflict or 503 Service Unavailable instead of 405 and 403: the request is valid, just not at this time. I know that my automated clients will automatically retry after a short delay on a 503, which would presumably be the appropriate behaviour.
    * Did you consider using Mark Nottingham's Link header proposal for advertising locking capabilities, instead of requiring an embedded chunk of XML?
    * I don't really understand why we can reasonably expect different lockable resources to support the same transaction lock collection and transaction collection factory URLs. Can I assume that these transactions will only work within a predefined scope with a common transaction service? I would also assume that the transaction service needs to speak an unspecified protocol back to the services that own the lockable resources in order to be effective?
    * The whole protocol seems a little bit chatty :/ I can appreciate that this might be needed for a 100% solution, but I wonder if a 90% solution might turn out to be simpler and more widely applicable. I think you are pushing too much interim state to the server side simply to have state URLs you can reference. I would have preferred that state stay on the client side, and I think it is more in keeping with REST principles to leave state on the client side if at all possible. It's bad enough that the server has locks to deal with; it shouldn't really have to deal with interim transaction state as well. The only case where it might be necessary to head down this path is where the client might not have enough memory to completely prepare the transaction on its side, or where the transaction, once composed, is difficult to transfer and manage because of its size.
    It really depends on the problem domain as to whether you need to head down this full-blown databasey approach with server-side management of transaction state. I suspect that if you really need this you will be close enough to the database to do it directly. Otherwise I expect that transactions will still be relatively manageable beasts from the client's perspective.
    I really think that all you need to do is issue a bunch of locks (possibly just exploiting existing WebDAV protocol features), perform a bunch of GET requests, then submit a transaction to the desired send-me-a-transaction URL. Each resource you touch must have pointed to the same send-me-a-transaction URL in order for the transaction to be valid.

  3. Hi Benjamin, thank you for looking deeply into the model and providing very insightful comments. Some of them were 'D'oh!' moments; the perils of developing in isolation, I suppose.

    I'll jump right into your most poignant criticism, that of the timestamps on the locks and their effect on statelessness. I think the way to approach this problem is to examine whether it requires any action to be taken by the server(s) outside the request-response cycle. I posit it does not. Let's assume that a lock has expired (i.e. the time implied by adding the lock's timestamp and duration elements has passed). The server does not need to actively do anything about that until the next request arrives concerning this transaction or the resources it has locked. So when the transaction resource is again requested, the server can examine whether all its locks are unexpired and return 'Active' as its status. If, however, a lock has expired, the status returned will be 'Aborted'. Similarly, if the lock collection of a lockable resource is requested, only locks that have not expired will be returned. This pattern can be followed for all transaction-related requests.

    Notice that all the decisions can be taken at request time, by whatever server happens to handle the request. In this sense, the model passes Stefan Tilkov's test: if the server farm burns down and the disks are transplanted to a new set of servers between requests, no problems arise. Now, this load at request time may have some performance implications, and implementers may choose to have a daemon running in the background to 'pre-process/cache' these responses, but this does not equal statefulness in my view.

    To sum up, your criticism is valid in that the most straightforward way to deal with time-outs would indeed be to keep state. However, I hope I have presented a way that this can be dealt with without doing that, keeping the model fully RESTful. The above discussion will certainly be added to the document at the next revision.
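
    The lazy evaluation described here can be sketched as follows. The timestamp/duration fields mirror the lock elements in the model; the class and function names themselves are purely illustrative:

```python
import time
from dataclasses import dataclass

@dataclass
class Lock:
    resource: str
    timestamp: float  # seconds since epoch, recorded when the lock was granted
    duration: float   # lock lifetime in seconds

    def expired(self, now=None):
        # The decision is taken at request time: no timer or daemon is needed.
        now = time.time() if now is None else now
        return now > self.timestamp + self.duration

def transaction_status(locks, now=None):
    # A transaction is 'Active' only while every one of its locks is live;
    # any expired lock means the transaction is reported as 'Aborted'.
    return "Aborted" if any(l.expired(now) for l in locks) else "Active"

locks = [Lock("/orders/42", timestamp=1000.0, duration=30.0),
         Lock("/stock/7",  timestamp=1005.0, duration=30.0)]
print(transaction_status(locks, now=1020.0))  # → Active
print(transaction_status(locks, now=1031.0))  # → Aborted (first lock expired at 1030)
```

    Whichever server handles the request can evaluate this from the persisted lock data alone, which is what makes the burned-down-server-farm test pass.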

    On replacing the 405/403 error codes with 503: truth is, I went for the most semantically correct error codes rather than the ones that would cause the desired behaviour. Looking at the HTTP spec, what bothers me is that the description of 503 implies the server is somehow under stress, which is absolutely inaccurate here. What looks extremely attractive is that 503 enables the Retry-After header. That can be set to the exact moment when the lock will expire, allowing very good synchronization with minimal ping-type requests. Ideally I would like to combine Retry-After with the 403/405s to maintain semantic correctness while causing the desired behaviour. Things being as they are, 503/Retry-After is an extremely attractive proposition that may warrant sacrificing semantic correctness. Thanks for that.
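
    As a small illustration of the Retry-After idea: the delay can be computed at request time from the lock's expiry moment (the function name and the numbers here are hypothetical):

```python
import math

def retry_after_seconds(lock_timestamp, lock_duration, now):
    # Retry-After accepts either an HTTP-date or a delay in seconds; here we
    # compute the delay until the lock expires, rounded up to a whole second.
    remaining = (lock_timestamp + lock_duration) - now
    return max(0, math.ceil(remaining))

# A lock granted at t=1000 for 30 seconds, queried at t=1012.4:
print(retry_after_seconds(1000.0, 30.0, 1012.4))  # → 18
```

    A client that honours Retry-After would then come back exactly when the lock is due to expire, instead of polling.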

    The Link Headers proposal you refer to is another excellent suggestion I was not aware of. I have some mention of 'custom headers', an idea I picked up from a talk by Mark Nottingham; now I realize he was referring to his Link headers proposal. Link headers are actually an excellent way to communicate this information, and would allow a general implementation of the transaction model to be added to a RESTful service. Until now I suspected that the 'model' was more of a 'pattern' to be implemented by each API-maker separately. Again, excellent suggestion.

    Regarding the transaction collection factory: my (admittedly unstated) assumption was that the Tc, T, T-Lc and R-Lc resources would reside within the same service as the lockable resources, making each service self-contained. This would mean that the transaction-service communication would in fact be internal service communication. However, making the transaction service separate from the actual service is a very interesting challenge that would enable a range of scenarios, and is therefore an excellent future direction.

  4. Regarding the 'chattiness' of the protocol, I think this was the biggest 'D'oh!' moment your comments caused for me. You are absolutely right that this could be done in a much more efficient way by essentially submitting all the operations for the transaction in one go. The simplest way to do that, similar to what you suggest, would be to follow a 'Create Transaction -> Lock Resources -> GET Resources -> Submit desired Transaction History (in one go) -> Commit Transaction' route from the client's POV. In that sense this protocol would nearly subsume the scenarios for the BATCH command, not a bad idea if you ask me. This slight modification would allow the protocol to be used in both ways, leaving the decision up to the client. The current design also retains the potential for multiple clients to work on the same transaction (although we have not enabled multiple owners for a transaction yet), so both usage modes could co-exist.
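
    A rough sketch of this batched flow from the client's point of view. The URL layout below is entirely hypothetical, not part of the model; the point is only the shape of the exchange, with the whole history submitted in one request:

```python
def batched_transaction_flow(tc_url, resources):
    # Hypothetical URL scheme for illustration: POSTing to the transaction
    # collection factory creates /tx/1, under which locks, the submitted
    # history, and the commit resource are assumed to live.
    tx = tc_url + "/1"                                   # the created transaction
    steps = [("POST", tc_url)]                           # create the transaction
    steps += [("POST", tx + "/locks?on=" + r) for r in resources]  # lock each resource
    steps += [("GET", r) for r in resources]             # read the locked state
    steps.append(("POST", tx + "/history"))              # submit the history in one go
    steps.append(("POST", tx + "/commit"))               # commit the transaction
    return steps

for method, url in batched_transaction_flow("/tx", ["/orders/42", "/stock/7"]):
    print(method, url)
```

    Compared with the current design, the interim operations never become individual requests, so the chattiness collapses to a fixed number of round trips plus one per lock.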

    Once again, thank you for the excellent comments, we can keep this discussion going.

  5. On statelessness, I have expanded on my comments above as a blog post. Hopefully this entry clarifies the trade-offs somewhat.
    I think a batching approach would work well in this case if we are willing to break a strict interpretation of statelessness. If you want to support cross-service transactions as a two-phase commit, you'll have to head down a path similar to the PostgreSQL commands on the same subject. The methods could roughly be PREPARE, COMMIT, and ROLLBACK. The trick would be that PREPARE would have to guarantee successful application of the transaction should COMMIT be requested. So long as the service itself is backed by a transactional database and doesn't interact with too much else, it should be able to achieve this, although it could be difficult in the general case. Certainly from the SOA perspective of wanting to consistently update a number of different services, it would be important that this approach is supported fairly broadly. If you weren't too worried about cross-service transactions, these methods would not be required and you could get away with a more straightforward BATCH method.
    Another issue to consider down this particular road is whether a failure at any single participant should be considered fatal and thereby roll back the whole transaction.
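
    The two-phase commit behaviour described above, where a successful PREPARE must guarantee that a later COMMIT cannot fail, can be sketched with stub services (the class and method names are illustrative, not part of any proposal here):

```python
def two_phase_commit(services, tx):
    # Phase 1: every participant must promise it can apply the transaction.
    # A service that returns True from prepare() is assumed to guarantee
    # that a subsequent commit() will succeed.
    prepared = []
    for s in services:
        if s.prepare(tx):
            prepared.append(s)
        else:
            for p in prepared:        # any refusal aborts the whole transaction
                p.rollback(tx)
            return "rolled back"
    # Phase 2: with every participant prepared, committing is safe everywhere.
    for s in services:
        s.commit(tx)
    return "committed"

class StubService:
    def __init__(self, ok): self.ok, self.log = ok, []
    def prepare(self, tx):  self.log.append("prepare");  return self.ok
    def commit(self, tx):   self.log.append("commit")
    def rollback(self, tx): self.log.append("rollback")

a, b = StubService(True), StubService(False)
print(two_phase_commit([a, b], "tx1"))  # → rolled back (b refused to prepare)
```

    The hard part, as noted above, is making the prepare() promise hold in a real service; with a transactional database behind it, PREPARE can map onto the database's own prepared-transaction support.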

  6. Very nice model, but as Benjamin points out in his blog post, pessimistic locks can't really be done in conformance to REST principles. An open lock is application state, and application state of any kind doesn't conform to REST principles. The optimistic locking model I suggested in my article does work, however. Your model is much more complete than mine. If you could reorganize it to use optimistic locks, then you'd have something that could scale to the web (if that's your interest). I'd also question your use of POST to commit your transaction. You're effectively tunneling an RPC request through POST, which is exactly what you say you're NOT doing. Why not use PUT with a transaction state of "committed" for the payload? This would be idempotent: if you PUT the same state again, it would have no more effect than the first time through.
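
    A minimal sketch of this idempotent-PUT idea: repeating the same PUT of the "committed" state has no further effect. The Transaction class and its fields are illustrative only:

```python
class Transaction:
    # Commit by PUTting a representation whose status is "committed",
    # rather than POSTing an RPC-style commit command.
    def __init__(self):
        self.status = "active"
        self.applied = 0  # counts how many times the side effects actually ran

    def put_status(self, new_status):
        if new_status == "committed" and self.status != "committed":
            self.applied += 1          # side effects run exactly once
        self.status = new_status
        return self.status

t = Transaction()
t.put_status("committed")
t.put_status("committed")   # repeating the PUT changes nothing further
print(t.status, t.applied)  # → committed 1
```

    The retry-safety is the payoff: a client that never saw the first response can simply PUT the same representation again without risking a double commit.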

  7. You did a great job. I regard your work as highly interesting, since at my university (Hamburg) we designed a RESTful architecture for object-oriented databases. We implemented a vendor-independent REST layer, which we used for an implementation on a real OODB. As you can imagine, transactions play an important role in that scenario. The similarities between your design and ours are quite striking. But I do have a few comments & suggestions to make:

    - You present transactions as if they necessarily had to be pessimistic. They don't have to be, though they can be. Using optimistic concurrency control (i.e. transactions don't lock resources, they just work on them, and when committing they provide their read- and write-sets) matches the capabilities of HTTP very well (in particular caching and conditional requests). Your locking mechanism could be refined to multiple-granularity locking (MGL), taking into account the hierarchy of resources. A hybrid solution is also possible: optimistic concurrency by default, locking for hot-spot objects. Your approach to locks reminds me of multiversion concurrency control: creating a new resource representing the transaction's current state, linking to the previous state. But you miss the resulting possibility of allowing concurrent reads and writes by dropping the locking altogether.

    - In your finishing survey you (probably inadvertently) disregarded one REST constraint of utmost importance: caching. And at that point your approach really has a major problem: by PUTting changes against the lock resource instead of the real resource, intermediate caches miss the chance to invalidate their cached representations (which, as RFC 2616 states, they should). Locking is also in principle contradictory to caching.

    - Committing the transaction by POSTing to a newly created resource seems counterintuitive. Why not PUT a changed representation of the transaction resource, or introduce a transaction/status resource which can be changed?

    - Concerning the connection between resources with unchangeable representations (like images) and transactions: there is a solution to your problem without introducing a new HTTP header (which, for good reason, is commonly regarded as a last resort): the HTTP Link header. It also allows identifying the transaction when PUTting or POSTing to a resource.

    - Your 2PC as a last example is in fact not a 2PC: the prepare phase is missing. Your single-step committing can't guarantee atomicity, nor consistency. But that might be OK for many applications, since compensations and BASE are an interesting option.
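
    The optimistic, conditional-request alternative suggested in the first point above might be sketched like this, with an integer version standing in for an ETag checked via If-Match (the Store class and its methods are purely illustrative):

```python
class Store:
    # Each resource carries a version that doubles as its ETag. A write is
    # accepted only when the client's If-Match value equals the current
    # version, mirroring HTTP's conditional requests (412 on mismatch).
    def __init__(self):
        self.data, self.etag = {}, {}

    def get(self, path):
        return self.data.get(path), self.etag.get(path, 0)

    def put_if_match(self, path, body, if_match):
        if self.etag.get(path, 0) != if_match:
            return 412                     # Precondition Failed: lost the race
        self.data[path] = body
        self.etag[path] = if_match + 1
        return 200

s = Store()
_, v = s.get("/orders/42")
print(s.put_if_match("/orders/42", "paid", v))     # → 200
print(s.put_if_match("/orders/42", "shipped", v))  # → 412, stale version
```

    A transaction's write-set validation at commit is essentially this check applied to every resource it read: if any version moved underneath it, the commit fails and the client retries.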

    Again, you did a great job and I'd love to see your approach evolving over time.

    @jcalcote: application state does in fact conform to REST; it just has to be made explicit in the form of resource state. Only implicit state contradicts statelessness.