It always amazes me how simple ideas can transfer from one domain to another.
A while ago I was looking at Azure Durable Functions. In this domain you want to write a function that can call out to another service, but for scalability reasons we’d like to avoid waiting for the result. Instead we’d like to stop the current execution and then reinstate the context when the value is available. Of course the difficulty is that this sounds like special compiler logic (like the type of transform used to implement generators in many languages). Instead, in the case of Azure Durable Functions, the designers demand that the functions are deterministic, and the trick is to store the previous results in an event log. We can then get to the same execution point by replaying the function – instead of making real outgoing calls, the system returns the previously logged return value at each point where the code tries to call out of the context, knowing that this will lead us down the same path that we executed the first time. This lets us wind forward to the same place and then execute the next step.
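The replay idea can be sketched in a few lines of JavaScript. This is a toy model under my own assumptions, not the Durable Functions API: `replay`, `ctx.call`, `Suspend`, `services` and `runToCompletion` are all hypothetical names, and the real runtime persists its event log rather than holding it in an array.

```javascript
// Toy model of replay from an event log (hypothetical names throughout).
class Suspend extends Error {}

// Runs `fn` against the results logged so far. Returns { done: true, value },
// or { done: false, pending } for the first call with no logged result.
function replay(fn, history) {
  let position = 0;
  const ctx = {
    call(name, arg) {
      if (position < history.length) {
        // Already executed on an earlier run: hand back the logged result.
        return history[position++];
      }
      // First time we reach this call: stop here and let the host perform it.
      throw Object.assign(new Suspend(), { pending: { name, arg } });
    },
  };
  try {
    return { done: true, value: fn(ctx) };
  } catch (e) {
    if (e instanceof Suspend) return { done: false, pending: e.pending };
    throw e;
  }
}

// Deterministic: given the same results, it makes the same calls in the same order.
function orchestrator(ctx) {
  const user = ctx.call("getUser", 42);
  const orders = ctx.call("getOrders", user);
  return `${user} has ${orders} orders`;
}

// Stand-ins for the real external services.
const services = { getUser: (id) => `user${id}`, getOrders: () => 3 };

// The host loop: replay, perform the one pending call, log it, replay again.
function runToCompletion(fn) {
  const history = [];
  let runs = 0;
  for (;;) {
    runs++;
    const step = replay(fn, history);
    if (step.done) return { value: step.value, runs, history };
    history.push(services[step.pending.name](step.pending.arg));
  }
}

const result = runToCompletion(orchestrator);
console.log(result.value, result.runs);
```

Each run starts the function from the top, but only the first unlogged call does any real work, so the function is executed three times here to make two outgoing calls.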
React has always had two ways of defining a component. You can use a class and override the various lifecycle methods (like componentDidMount) or you can write more functional (Pure) components that just return the markup that they want to render. These functional components can be a lot easier to read, and it can be easier to share logic as it is easy for functions to call other functions. Adding state into such functional components is harder though, and the React team have introduced a concept called Hooks to do this. This again requires the idea of a function where outgoing calls are made in a deterministic order: the system uses the position of each call in that order to work out which piece of stored state to return.
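To see why the call order matters, here is a toy model of state slots indexed by call position. It is a sketch of the idea only and not how React is implemented: `createRenderer` is a hypothetical helper, and `useState` is passed in as an argument to keep the example self-contained, whereas in React it is imported from the library.

```javascript
// Toy model: state lives in an array, indexed by the order of useState calls.
function createRenderer(component) {
  const slots = [];   // one slot per useState call, in call order
  let cursor = 0;     // position of the next useState call in this render

  function useState(initial) {
    const index = cursor++;
    if (!(index in slots)) slots[index] = initial;   // first render only
    const setState = (value) => { slots[index] = value; };
    return [slots[index], setState];
  }

  return function render() {
    cursor = 0;       // every render starts again at slot 0
    return component(useState);
  };
}

// Hypothetical component with two pieces of state.
function Counter(useState) {
  const [count, setCount] = useState(0);
  const [label] = useState("clicks");
  return { text: `${count} ${label}`, click: () => setCount(count + 1) };
}

const render = createRenderer(Counter);
const first = render();
first.click();            // updates slot 0
const second = render();  // the same call order finds the updated slot
console.log(first.text, "->", second.text);
```

If a component called `useState` inside a conditional, the positions would shift between renders and each call would pick up the wrong slot, which is why the order has to be deterministic.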
There’s a good post that explains Hooks here, but the quickest introduction is a talk at React Conf with an informative excerpt here. In the example that they show in the talk, the functional approach makes the code much easier to understand than the class-based approach – the latter can lead to related logic being spread across the class’s methods whereas the functional approach puts the code together.
While we are on the subject of React, there’s a post here talking about Flutter as an alternative to React Native.
I read this informative post on CORS at the weekend, and realised that the best way to get to grips with it is to try some experiments. I hadn’t realised before how easy this would be in C#. It’s easy to write a mini web server that handles a single call using code like this.
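using System.IO; using System.Net; using System.Net.Sockets;

IPAddress ipAddress = IPAddress.Parse("127.0.0.1");
TcpListener listener = new TcpListener(ipAddress, 8182);
listener.Start();                                   // begin listening before accepting
using (var clt = listener.AcceptTcpClient())        // blocks until the browser connects
using (NetworkStream ns = clt.GetStream())
using (StreamReader sr = new StreamReader(ns))
using (StreamWriter sw = new StreamWriter(ns))
{
    var msg = sr.ReadLine();                        // request line, e.g. "GET / HTTP/1.1"
    string line;
    while (!string.IsNullOrEmpty(line = sr.ReadLine())) { }   // skip headers up to the blank line
    sw.WriteLine("HTTP/1.1 200 OK");
    // sw.WriteLine("Access-Control-Allow-Origin: *");   // the Access Control line; the "*" value is an assumption
    sw.WriteLine();                                 // blank line ends the response headers
    sw.Flush();
}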
With that code running, you can start Chrome in Incognito mode, and then run the following code in the console.
var h = new XMLHttpRequest()
h.open("GET", "http://localhost:8182", true)
h.send()
which gives the error
(index):1 Access to XMLHttpRequest at 'http://localhost:8182/' from origin 'chrome://newtab' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
Uncommenting the Access Control line in the above code allows the request to succeed.
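The decision the browser is making here can be approximated as a small function. This is a simplified sketch of the check for a cross-origin request sent without credentials, not the browser's actual code: `corsCheck` is a hypothetical name, and preflight requests and credentialed requests are left out.

```javascript
// Simplified model of the check for a cross-origin GET without credentials.
// `headers` is a plain object of lower-cased response header names.
function corsCheck(requestOrigin, headers) {
  const allowed = headers["access-control-allow-origin"];
  if (allowed === undefined) return false;   // header missing: the response is blocked
  return allowed === "*" || allowed === requestOrigin;
}

const origin = "chrome://newtab";
console.log(corsCheck(origin, {}));                                       // header missing
console.log(corsCheck(origin, { "access-control-allow-origin": "*" }));   // wildcard
```

Note that the server still receives and answers the request in both cases. The check only controls whether the calling script is allowed to read the response.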
There are three other books that I have recently read without writing up here.
React in Action by Mark Tielens Thomas, which was a good introductory book on React. It explains the various concepts and takes you through a number of examples in the first two parts of the book. The third part of the book talks about higher level architecture, covering Redux and server side rendering, and there is also a chapter on React Native. I found it a useful, informative read.
Redux in Action by Marc Garreau, which talks about the Redux state management approach which integrates well with the React architecture (despite not exactly following the Flux architecture). Most of the book covers the use of Redux with React, though it can be used as a standalone component.
Functional Programming: Application and Implementation by Peter Henderson. I read this book while working in industry for a year before going to university (so 1984), and it blew my mind. It talks about why functional programming is such a powerful model, and defines a small Lisp language (LispKit Lisp) which it uses in subsequent chapters. It walks through writing an interpreter for Lisp in Lisp, and then goes on to describe an abstract machine which we can compile the language down to, a variant of Landin’s SECD machine. The book describes how easy it is to write a compiler in Lisp for the Lisp variant. The book then looks at extensions to the semantics such as delayed evaluation (using lambda and thunks). Towards the end of the book, the author describes how to write a runtime to support LispKit and also gives the abstract machine code for the compiler itself. On GitHub you can find various implementations, like this one in F#.
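As a rough illustration of delayed evaluation with lambdas and thunks, here is a minimal sketch in JavaScript rather than LispKit Lisp. The helper names (`delay`, `force`, `integersFrom`, `take`) are chosen for this example and are not taken from the book.

```javascript
// delay wraps an expression in a zero-argument lambda (a thunk);
// force evaluates it at most once and remembers the result.
function delay(thunk) {
  let evaluated = false;
  let value;
  return {
    force() {
      if (!evaluated) {
        value = thunk();
        evaluated = true;
      }
      return value;
    },
  };
}

// A lazy, conceptually infinite list: the tail is only built on demand.
function integersFrom(n) {
  return { head: n, tail: delay(() => integersFrom(n + 1)) };
}

// Collects the first `count` heads, forcing tails as it goes.
function take(list, count) {
  const out = [];
  for (let cell = list; count > 0; count--, cell = cell.tail.force()) {
    out.push(cell.head);
  }
  return out;
}

let evaluations = 0;
const expensive = delay(() => { evaluations++; return 6 * 7; });
const firstFive = take(integersFrom(1), 5);
console.log(firstFive, expensive.force(), expensive.force(), evaluations);
```

Nothing is computed when `expensive` is created, and forcing it twice runs the body only once.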
It was great re-reading the book, and I owe it a lot. Functional programming and Lisp spanned the first twenty years of my career, and it was this book that got me started. I taught myself C in order to implement a Lisp interpreter so I could play around, and various papers on OS implementation using LispKit got me really interested in the field.
Designing Data-Intensive Applications: The big ideas behind reliable, scalable and maintainable systems by Martin Kleppmann
I was lucky to have a six week sabbatical over the summer, and felt that it would be a good time to read up on the technologies behind some of the large scale distributed systems that are around at the moment. This book is a great read for getting up to speed.
It has three sections. The first is on the foundations of data systems, and starts with a quick discussion of what the words reliability, scalability and maintainability actually mean. The book then moves on to the various data models, where the author discusses the birth of NoSQL, query languages and the various graph databases. The underlying implementations are covered, including B-trees, SSTables and LSM-trees, and various indexing structures. The section finishes with a discussion of data encoding and evolution.
The second section covers distributed data, and there are chapters on replication, partitioning and the rather slippery notion of a transaction. Distributed systems can fail in many interesting ways, all covered in the next chapter, including some discussion of Byzantine faults. The final chapter in the section talks about consistency and consensus. In all of the discussion the author is really happy to go into low level implementation details, and all of the chapters have lists of references of papers that you can consult for more information.
The final section is on derived data – how do we process the mass of data that we have accumulated? The first chapter is on batch processing, which covers map-reduce and later variants. This is followed by a chapter on stream processing. The final chapter of the book is the author’s idea for the future.
This book is a great read. It goes into loads of implementation details which helps the reader really get to grips with the ideas, though it might take more than a single read to understand the many ideas that are covered.
I’ve been doing some reading on designing systems for scalability, and I thought I could quickly post some of the useful YouTube videos that I have found. There are numerous system design problems and solutions that have videos on YouTube, but I haven’t included the ones that I have watched.
Eventually I came across this video on system design, which gives a good list of the various technologies that are used in some of the most scalable applications available today.
This is an introduction to how Twitter is implemented, and mentions ideas like fanning-out to Redis and Memcached. There are videos about Facebook and Instagram.
The choice of database is obviously important, and it is useful to understand the in-memory databases like Redis. Transactions also come up, via myths and surprises, and how the transaction levels relate to the CAP theorem.
Uber deal with some of their reliability concerns by storing data on their drivers’ mobile phones.
GraphQL came up several times as an alternative to REST APIs. It often requires fewer round trips, and makes tool support easy by using a schema. There is an introduction here and the coding of a server (which explains what you can do about the N+1 problem using an online demo system).
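To make the N+1 problem concrete, here is a small sketch with an in-memory stand-in for a database that counts queries. Everything in it is hypothetical (`db`, the sample posts and authors, both resolver functions), and batching the lookups is one common mitigation. Whether the linked talk uses this approach is an assumption on my part.

```javascript
// Hypothetical in-memory "database" that counts how many queries it serves.
const db = {
  queries: 0,
  posts: [{ id: 1, authorId: 10 }, { id: 2, authorId: 11 }, { id: 3, authorId: 10 }],
  authors: new Map([[10, "Ada"], [11, "Grace"]]),
  allPosts() { this.queries++; return this.posts; },
  authorById(id) { this.queries++; return this.authors.get(id); },
  authorsByIds(ids) { this.queries++; return ids.map((id) => this.authors.get(id)); },
};

// Naive resolution: one query for the posts, then one more per post.
function postsWithAuthorsNaive() {
  return db.allPosts().map((p) => ({ id: p.id, author: db.authorById(p.authorId) }));
}

// Batched resolution: collect the distinct author ids and fetch them together.
function postsWithAuthorsBatched() {
  const posts = db.allPosts();
  const ids = [...new Set(posts.map((p) => p.authorId))];
  const names = db.authorsByIds(ids);
  const byId = new Map(ids.map((id, i) => [id, names[i]]));
  return posts.map((p) => ({ id: p.id, author: byId.get(p.authorId) }));
}

db.queries = 0;
const naive = postsWithAuthorsNaive();
const naiveQueries = db.queries;      // 1 query for posts + 3 for authors

db.queries = 0;
const batched = postsWithAuthorsBatched();
const batchedQueries = db.queries;    // 1 query for posts + 1 for all authors

console.log(naiveQueries, batchedQueries);
```

Both functions return the same data. Only the number of round trips to the data store changes, and the gap grows with the number of posts.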
There is a good general talk about lessons learned here.
I had heard about Bloom Filters before, but hadn’t come across the Count-min sketch algorithm.
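For reference, a Count-min sketch keeps a small grid of counters and estimates how often an item has been seen by taking the minimum over one counter per row. The sketch below is a minimal version under assumptions of my own: the class name, the sizes and the seeded FNV-1a style hash are illustrative choices, and a production implementation would pick its hash family more carefully.

```javascript
class CountMinSketch {
  constructor(width, depth) {
    this.width = width;   // counters per row
    this.depth = depth;   // number of rows, one hash function per row
    this.rows = Array.from({ length: depth }, () => new Uint32Array(width));
  }

  // Seeded FNV-1a style string hash (an illustrative choice of hash).
  hash(item, seed) {
    let h = (2166136261 ^ Math.imul(seed + 1, 0x9e3779b1)) >>> 0;
    for (let i = 0; i < item.length; i++) {
      h ^= item.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.width;
  }

  // Increment one counter in every row.
  add(item, count = 1) {
    for (let row = 0; row < this.depth; row++) {
      this.rows[row][this.hash(item, row)] += count;
    }
  }

  // Collisions can only inflate a counter, so the minimum across rows
  // is the tightest estimate and never falls below the true count.
  estimate(item) {
    let min = Infinity;
    for (let row = 0; row < this.depth; row++) {
      min = Math.min(min, this.rows[row][this.hash(item, row)]);
    }
    return min;
  }
}

const sketch = new CountMinSketch(64, 4);
["a", "b", "a", "a"].forEach((item) => sketch.add(item));
console.log(sketch.estimate("a"), sketch.estimate("b"));
```

Like a Bloom filter, the structure uses fixed memory and errs in one direction only: it may overestimate a count when items collide, but it never underestimates.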
Kubernetes: Up & Running by Kelsey Hightower, Brendan Burns and Joe Beda
Everywhere you go these days, it’s all about containers and how they should be orchestrated. Software Engineering Daily had a great series about several container management systems, and so it was time to get the book about Kubernetes, by several of the founders of the project. There is a recent blog post on the history of the project here.
The book itself is really good. It explains the need for an orchestration framework, and demonstrates the various parts of the Kubernetes system. It starts by showing you how to deploy a Kubernetes cluster and works through the use of the kubectl commands. It moves on to explain pods, and the labels and annotations that you can attach to the containers that are being managed. This is very hands on, working against a demonstration container that the authors have made available.
The following chapters cover service discovery, ReplicaSets, DaemonSets, Jobs and ConfigMaps, and then there is a chapter that covers deployments and upgrades. The last two chapters cover how you integrate storage with your applications and how to deploy some real world applications.
The book, as you would expect, covers the material really well. If you want to try the material out on the Azure cloud, the Azure documentation contains some worked tutorials.
If you need to understand Docker a little better, then I found this post useful. Ben Hall also did a recent talk on other container technologies. A competing idea is serverless, and there is a recent paper that looks at the implementation behind this for the three major cloud platforms.
I gave a quick lightning talk at work about micro-tasks in the browser, based on a recent talk by Jake Archibald. The slides are available here.