The Reactive Revolution
It is dawn, before the fortifications. The men are ready. For some time now, things have been shifting through small changes, here and there. The foundations are cracking under the challenge. Some have already made the leap; others hesitate. The question is no longer whether to go or to resist, but when to go. All these developments converge on the same goal: a new revolution in information technology.
In this series of articles, we will look at an increasingly widespread development model. Where does this model come from, and why? What is its impact on testing, development languages, performance, and so on? We will try to answer all these questions.
What is the reactive model?
The Reactive Manifesto defines a reactive application around four interrelated pillars: event-driven, responsive, scalable and resilient.
A reactive application is event-driven, provides an optimal user experience, makes better use of computing power and better tolerates errors and failures.
The event-driven orientation is the strongest concept; everything else is defined through it.
A reactive model is a development model driven by events. Several names are used to describe it, depending on the perspective.
We can call this model:
- « event driven », driven by events
- « reactive », that reacts to events
- « push based application », the data is pushed as it becomes available
- Or, better still, « Hollywood », summarized by the famous « don’t call us, we’ll call you »
For our part, we prefer the term « reactive ».
This architectural model is very relevant for applications interacting in real time with users, including:
- Shared documents (Google Docs, Office 365)
- Social networks (broadcast stream, Like/+1)
- Financial analysis (market flows, auctions)
- Pooled information (road traffic or public transport, pollution, parking spaces, etc.)
- Multiplayer games
- Multi-channel approaches (the same user uses their PC, mobile and tablet)
- Mobile application synchronisation (at the same time on all three of the user's devices)
- Open or private API (impossible to predict usage)
- Indicator management (GPS position sensors, connected objects)
- Massive user influx (sporting events, sales, TV ads, startup launches, opening a new mobile platform, etc.)
- Direct communications (chat, hang-out)
- And, more generally, managing complex algorithms more effectively (ticket booking, graph management, semantic web, etc.)
One of the key points for all these applications is latency management. For an application to be responsive, the user must perceive the lowest possible latency.
Several architectures can address scalability and resilience, but we will dedicate other articles to them. For now, let's focus on decreasing latency.
Decreasing latency
For many years now, concurrent processing has been carried out in separate threads. A program is basically a sequence of instructions that runs linearly in a thread. To perform all the requested tasks, a server spawns several threads. But these threads spend most of their time waiting for the result of a network call, a disk read or a database query.
There are two types of threads: soft-threads and hard-threads. Soft-threads simulate concurrent processing by allotting slices of CPU time to each process in turn. Hard-threads are concurrent processes executed by different processor cores. Fortunately, soft-threads allow machines to run many more threads simultaneously than they have cores.
However, to optimize performance, Intel recommends:
- Creating a thread pool sized according to the number of hyper-threaded cores, and reusing its threads
- Avoiding calls to the kernel
- Avoiding sharing data between threads
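As a minimal sketch of the first recommendation, assuming Python and its standard concurrent.futures module (the article itself prescribes no language), a pool can be sized from the number of logical cores and its threads reused across tasks. The names POOL_SIZE, square and run_batch are hypothetical, chosen for illustration. Note that in CPython the global interpreter lock limits CPU parallelism between threads, so this illustrates sizing and reuse rather than raw speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumption: one worker per logical core reported by the OS.
# os.cpu_count() may return None, hence the fallback to 1.
POOL_SIZE = os.cpu_count() or 1


def square(n):
    """Pure computation: no kernel call, no data shared with other threads."""
    return n * n


def run_batch(values):
    """Run every task of the batch on a single, fixed-size pool."""
    # One pool serves the whole batch: its threads are reused from task
    # to task instead of one thread being spawned per task.
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        return list(pool.map(square, values))


print(run_batch([1, 2, 3, 4]))
```

Each task works only on its own argument, so no lock is needed between threads.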
The reactive model aims to remove as many soft-threads as possible and use only hard-threads. This makes better use of modern processors.
For a long time now, network technologies embedded in routers have taken advantage of this development model, with excellent performance.
The reactive approach aims to generalise this principle of development.
To reduce the number of threads, the CPU must be shared on an event basis rather than on a time basis. Each call triggers the processing of a piece of code. That code must never block, so that the CPU is released as quickly as possible to process the next event.
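This principle can be sketched with Python's standard asyncio event loop, used here only as one possible illustration. The handler name handle_request, the 100 requests and the 0.1 s simulated wait are assumptions, not figures from our measurements.

```python
import asyncio
import threading
import time


async def handle_request(request_id, wait_s, thread_ids):
    # Record which OS thread executes this handler.
    thread_ids.add(threading.get_ident())
    # Non-blocking wait standing in for a network call or a database query:
    # control goes back to the event loop, which processes other events.
    await asyncio.sleep(wait_s)
    return f"response-{request_id}"


async def serve(n_requests, wait_s):
    thread_ids = set()
    started = time.perf_counter()
    # Start every handler; gather returns the results in submission order.
    responses = await asyncio.gather(
        *(handle_request(i, wait_s, thread_ids) for i in range(n_requests))
    )
    elapsed = time.perf_counter() - started
    return responses, thread_ids, elapsed


responses, thread_ids, elapsed = asyncio.run(serve(100, 0.1))
print(len(responses), "responses on", len(thread_ids), "thread(s)")
```

All the handlers run on a single thread. Because none of them blocks, the total time should stay close to one wait rather than the sum of the waits.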
To enable this, we must intervene in all software layers: from operating systems to development languages, by way of frameworks, hardware drivers and databases. All these layers are currently being migrated, which makes widespread adoption of this architectural model possible.
But why should we work like this? For several reasons:
- Reduce latency!
- Improve performance by increasing parallelism
- Manage peak loads by eliminating the arbitrary limit on the number of simultaneous processes
- Make better use of the increasing number of cores in CPUs
- Handle stream processing (in addition to request/response)
- Reduce memory consumption
These improvements increase the number of users each server can handle, and reduce cloud costs in the same proportion.
On a technical level, this results in:
- No synchronization cost between concurrent processes (only when there is a single core)
- A smaller memory footprint, since only state is kept (not stacks)
- Better scheduling based on the applications’ real priorities
The reactive model lifts the limit on the number of simultaneous users imposed by an arbitrary, fixed thread-pool size. This approach copes better with load peaks.
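To make this limit concrete, here is a rough sketch, again assuming Python's standard library. It measures the peak number of requests in flight, first with a fixed-size thread pool and blocking handlers, then with an event loop and non-blocking handlers. The function names and the figures (20 requests, a pool of 4, a 0.05 s wait) are hypothetical.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_with_thread_pool(n_requests, pool_size, wait_s):
    """Peak number of requests in flight with a fixed-size thread pool."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def blocking_handler(_):
        nonlocal in_flight, peak
        with lock:  # counters are shared between threads: a lock is required
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(wait_s)  # blocking wait: the thread is held while idle
        with lock:
            in_flight -= 1

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(blocking_handler, range(n_requests)))
    return peak


async def _peak_with_events(n_requests, wait_s):
    in_flight = 0
    peak = 0

    async def handler():
        nonlocal in_flight, peak
        in_flight += 1  # no lock: a single thread runs all the handlers
        peak = max(peak, in_flight)
        await asyncio.sleep(wait_s)  # non-blocking wait
        in_flight -= 1

    await asyncio.gather(*(handler() for _ in range(n_requests)))
    return peak


def peak_with_event_loop(n_requests, wait_s):
    """Peak number of requests in flight on a single-threaded event loop."""
    return asyncio.run(_peak_with_events(n_requests, wait_s))
```

With peak_with_thread_pool(20, 4, 0.05), the peak cannot exceed the pool size of 4, and the other requests queue. With peak_with_event_loop(20, 0.05), all 20 requests are in flight together, with no pool parameter to tune.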
These potential gains are sometimes challenged in studies. Changes in operating systems, thread implementations, processors and virtual machines (such as the JVM) can affect actual benchmarks. In the end, comparing one architecture with another depends on a subtle combination of these elements. It also generally requires building the same application twice, in very different ways: once reactive, once classic. Not so easy! This explains why reliable benchmarks are hard to come by.
In the research we conducted and presented at PerfUG, gains are noticeable for architectures processing high-frequency flows. Similarly, our work on JEE for classic web use confirms the performance and scalability gains of this architecture.
Data structures that avoid locks are certainly a very important lever for system performance. New functional data models are therefore good companions to reactive models.
Among the new software generating the most buzz, many use the reactive model internally. To name a few: Redis, Node.js, Storm, Play, Vert.x, Axon or Scala.
Similarly, Web giants publish feedback on their experience of migrating to this model: Coursera, Gilt, Groupon, Klout, LinkedIn, Netflix, PayPal, Twitter, Walmart or Yahoo.
Why now?
« Software gets slower faster than hardware gets faster. » Niklaus Wirth - 1995
The reactive model is not new. It has been used in all user interface frameworks since the invention of the mouse. Each click or keyboard input generates an event. Even client-side JavaScript uses this model. There are no threads in this language, yet it is possible to have multiple AJAX requests in progress simultaneously. Everything works using callbacks and events.
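The callback mechanism can be illustrated with a very small event dispatcher. This is a sketch in Python rather than the API of any real user interface framework: the EventBus class, its on and emit methods and the "click" event name are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal dispatcher: handlers register once and are called back later."""

    def __init__(self):
        # Maps an event name to the list of callbacks registered for it.
        self._handlers = defaultdict(list)

    def on(self, event_name, callback):
        """Register a callback to be invoked each time the event occurs."""
        self._handlers[event_name].append(callback)

    def emit(self, event_name, payload):
        """Push the payload to every callback registered for the event."""
        for callback in list(self._handlers[event_name]):
            callback(payload)


received = []
bus = EventBus()
bus.on("click", received.append)       # "don't call us..."
bus.emit("click", {"x": 10, "y": 20})  # "...we'll call you"
print(received)
```

The application code never polls for input. It declares what to do when an event arrives, and the dispatcher calls it back.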
Current development architectures are the result of a succession of steps and evolutions. Some strong concepts have been introduced and used extensively before being replaced by new ideas. The environment is also changing. The way we respond to it has changed.
Have we reached the limits of our systems? Is there still space to conquer? Performance gains to discover?
In our systems, there is a huge untapped reservoir of power. When the number of users doubles, we can simply add a server. Our customers report handling about 20 times more requests since the advent of mobile devices. Is it reasonable to multiply the number of servers in the same proportion? Is that enough to keep your systems running? It is not certain. It seems wiser to review the architecture and finally harness the power available.
There are far more processor cycles available than we use. That much is obvious. Because programs spend most of their time waiting for disks, networks or databases, they do not harness the potential of servers.
A development model based on events is called "reactive". It is now becoming accessible to everyone, and it is increasingly well integrated into modern development languages.
New development patterns are now available. They integrate latency and performance management from the start of a project. These are no longer challenges to be tackled when it is too late to change the application architecture.
At OCTO, we believe that this model of development will dominate in the coming years. It is time to pay attention to it.
Applications based on the request/response model (HTTP / SOAP / REST) can tolerate a thread model. In contrast, applications based on flows, such as JMS or WebSocket, have everything to gain from a model based on events and a few soft-threads.
In future articles, we will see where this evolution comes from and what its impact is on all software layers (database, mainframe, routing, failover, high availability, etc.).