Distributed systems and databases
I recently skimmed through the YouTube playlist of the “GOTO Conferences 2015” and noticed a talk named “Don’t Give Up on Serializability Just Yet” by Neha Narula.
In this video she talks about the property of serializability and the benefits transactional databases give software engineers. She also talks a little about the CAP theorem and the FLP theorem and how the two concepts relate to one another. After that she continues with her recent work on parallel processing of serialized operations.
I highly recommend this talk; it is available on YouTube.
CAP and FLP
By coincidence, a few days after I saw the talk, a friend of mine pointed me to a blog post that discusses how the FLP and CAP theorems are not equivalent.
The author points out that the two are not mathematically equivalent. However, I found the blog post not detailed enough, so I searched for a more rigorous source and found this post on Quora: “Distributed Systems: Are the FLP impossibility result and Brewer’s CAP theorem basically equivalent?” This post points to a paper by Gilbert and Lynch and even quotes the part that addresses the CAP vs. FLP question. Here are some quotes that specify the two concepts:
The FLP theorem states that in an asynchronous network where messages may be delayed but not lost, there is no consensus algorithm that is guaranteed to terminate in every execution for all starting conditions, if at least one node may fail-stop.
The CAP theorem states that in an asynchronous network where messages may be lost, it is impossible to implement a sequentially consistent atomic read / write register that responds eventually to every request under every pattern of message loss.
The most important statement in the quoted section is:
achieving agreement is (provably) harder than simply implementing an atomic read/write register.
I highly recommend reading these articles and the paper, or at least the relevant section. I found these new insights interesting, and they certainly improved my understanding of both theorems and their relation to each other.
Parallel execution of conflicting transactions
In the talk about serializability, Mrs. Narula mentioned that she was currently working on improving the performance of serialized operations. I noticed that there already is a paper on that topic and found her thesis, named “Parallel execution of conflicting transactions”. This paper is really interesting and shows that there is still a lot of potential to improve the parallelization of serialized operations, even in the case of conflicting transactions. She also includes an in-memory database that serves as a kind of PoC implementation.
In her paper Mrs. Narula mentions the importance of commutativity. Since this is a general concept, I looked further and found another interesting video on YouTube: “The scalable commutativity rule” at Papers We Love. Mrs. Narula presents a paper that shows the connection between commutativity and scalability. The authors of the paper show that if a set of operations commute, there exists an implementation of them that scales. The video goes into far more detail and presents various examples from operating systems where commutativity can improve scalability and parallelization.
I highly recommend watching this video, and maybe even reading the paper.
I just thought I would share this set of really interesting sources, since I learned a lot by spending some days on this topic. As a software engineer, these insights will help me in practice to use not only idempotency but also commutativity as concepts when implementing concurrent and scalable systems.
Thanks to Neha Narula for presenting her own work and her colleagues' paper.
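To make the idea of commutativity a bit more concrete, here is a minimal Python sketch. It is my own illustration and not code from the thesis or the paper; the operation format and the names `apply_locally` and `merge` are assumptions made up for this example. Because increments commute, each worker can apply its share of the operations to a private counter without coordinating with anyone, and the partial results can be merged afterwards in any order.

```python
import random
from collections import Counter

def apply_locally(ops):
    """Apply (key, amount) increments to a private, per-worker counter."""
    local = Counter()
    for key, amount in ops:
        local[key] += amount
    return local

def merge(partials):
    """Combine per-worker counters; addition makes the merge order irrelevant."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total

# Hypothetical workload: increments on two contended keys.
ops = [("likes", 1), ("views", 3), ("likes", 2), ("views", 1), ("likes", 5)]

# Reference result: all operations applied one after another.
sequential = apply_locally(ops)

# Shuffle the operations and split them across two "workers".
shuffled = ops[:]
random.Random(42).shuffle(shuffled)
workers = [shuffled[0::2], shuffled[1::2]]
merged = merge(apply_locally(w) for w in workers)

# Same final state, no matter how the increments were ordered or split.
assert merged == sequential
```

An operation such as "set key to value" would not pass this check in general, because the final state depends on which write comes last. That is the kind of operation that still needs coordination.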
The saga pattern
I have to admit I only just learned about the so-called saga pattern. I have been working with similar concepts in practice, but it is always great to learn about the formal background.
I saw “Applying the Saga Pattern” by Caitie McCaffrey a few days ago and I want to thank her for a great talk. She managed to explain the pattern and its use in distributed systems in around 30 minutes, which I find awesome.
For those who don’t know it yet, the concept is about breaking long-running and/or distributed transactions into smaller sub-transactions. A work log is kept, and if any error occurs, compensating transactions are applied for all sub-transactions that have already been executed. In addition to that, the forward recovery concept uses save-points after successful sub-transactions, so that a failed operation can be retried, handled by an alternative algorithm, or even fixed manually. That way the complete saga finishes successfully and no rollback (with potential data loss) happens.
This is oversimplified and just a short introduction. I encourage anyone to watch the video. It’s a great presentation with a lot of practical considerations between the lines.
After the talk I took the time to read the original paper, called “Sagas”, by Garcia-Molina and Salem. It’s a short and interesting read, so I recommend taking a few minutes for it.
I really have to say that Mrs. McCaffrey managed to put all the relevant content into her talk, so that watching the video teaches almost the complete concept explained in the paper.
It was great to hear about the constraints that sub-transactions have to satisfy for backward recovery and forward recovery. When building the next distributed transaction I will have a clearer picture of the requirements and constraints.
So thanks to Mr. Garcia-Molina and Mrs. McCaffrey (@caitie on Twitter).
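The backward recovery part can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not code from the talk or the paper: the `Step` type, the `run_saga` function and the trip-booking example are all hypothetical names, and the work log is just an in-memory list. Each sub-transaction comes with a compensating action, and when one fails, the completed ones are compensated in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Step:
    name: str
    action: Callable[[], None]        # the sub-transaction
    compensation: Callable[[], None]  # semantically undoes the action

def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            log.append(f"failed:{step.name}")
            for done in reversed(completed):
                done.compensation()
                log.append(f"compensated:{done.name}")
            return False
        completed.append(step)
        log.append(f"done:{step.name}")
    return True

# Hypothetical trip booking: the car rental fails after flight and hotel succeed.
bookings = set()

def book(item):
    return lambda: bookings.add(item)

def cancel(item):
    return lambda: bookings.discard(item)

def fail_car():
    raise RuntimeError("no cars available")

log: List[str] = []
ok = run_saga(
    [
        Step("flight", book("flight"), cancel("flight")),
        Step("hotel", book("hotel"), cancel("hotel")),
        Step("car", fail_car, cancel("car")),
    ],
    log,
)
# ok is False and bookings is empty again: both bookings were compensated.
```

A real implementation would keep the log in durable storage, so that a crashed coordinator can pick up where it left off. Forward recovery would retry or replace the failed step from the last save-point instead of compensating.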