Why Non-Blocking?

I’ve been writing non-blocking, asynchronous code for the past year. Learning how it works and how to write it is not hard. Where the benefits come from is what I don’t understand. Moreover, there is so much hype surrounding some programming models that you have to be pretty good at telling marketing from rumours from facts.

So let’s start by clarifying the terms. Non-blocking applications are written in a way that threads never block – whenever a thread would have to block on I/O (e.g. reading/writing from/to a socket), it instead gets notified when new data is available. How that is implemented is out of the scope of this post. Non-blocking applications are normally implemented with message passing (or events). “Asynchronous” is related to that (in fact, in many cases it’s a synonym for “non-blocking”), as you send your request events and then get responses to them in a different thread, at a different time – asynchronously. And then there’s the “reactive” buzzword, which likely comes from the reactor pattern and reactive programming; on the other hand there’s the reactive manifesto, which defines three requirements for practically every application out there (responsive, elastic, resilient) and one implementation detail (message-driven).

Two examples of frameworks/tools that are used to implement non-blocking (web) applications are Akka (for Scala and Java) and Node.js. I’ve been using the former, but most of the things are relevant to Node as well.

Here’s a rather simplified description of how it works. It uses the reactor pattern (ahaa, maybe that’s where “reactive” comes from?) where one thread serves all requests by multiplexing between tasks and never blocks anywhere – whenever something is ready, it gets processed by that thread (or a couple of threads). So, if two requests are made to a web app that reads from the database and writes the response, the framework reads the input from each socket (by getting notified on incoming data, switching between the two sockets), and when it has read everything, passes a “here’s the request” message to the application code. The application code then sends a message to a database access layer, which in turn sends a message to the database (driver), and gets notified whenever reading the data from the database is complete. In the callback it in turn sends a message to the frontend/controller, which in turn writes the data as response, by sending it as message(s). Everything consists of a lot of message passing and possibly callbacks.
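The flow above can be sketched as a toy single-threaded event loop in plain Java. Everything here is made up for illustration – real frameworks multiplex actual sockets (via epoll/kqueue and the like) rather than an in-memory task queue:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Toy single-threaded event loop: handlers never block; instead each
// "I/O operation" registers a callback that runs when the (fake) data is ready.
public class ToyEventLoop {
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    public void submit(Runnable task) { tasks.add(task); }

    // Simulated non-blocking read: instead of blocking the thread, enqueue
    // a task that delivers the data to the callback later.
    public void readAsync(String source, Consumer<String> callback) {
        submit(() -> callback.accept("data from " + source));
    }

    // One thread multiplexes all pending work, never blocking anywhere.
    public void run() {
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
        }
    }

    public static void main(String[] args) {
        ToyEventLoop loop = new ToyEventLoop();
        // Two concurrent "requests": read the socket, then read the database,
        // then write the response -- all as chained callbacks on one thread.
        loop.readAsync("socket-1", request ->
            loop.readAsync("database", rows ->
                System.out.println("response 1: " + rows)));
        loop.readAsync("socket-2", request ->
            System.out.println("response 2: " + request));
        loop.run(); // request 2 completes first: the loop interleaves the two
    }
}
```

Note how the second request finishes before the first one – the single thread switches between the two instead of serving them one after the other.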

One problem of that setup is that if at any point in the code the thread blocks, then the whole thing goes to hell. But let’s assume all your code and 3rd-party libraries are non-blocking and/or you have some clever way to avoid blocking everything (e.g. an internal thread pool that handles the blocking part).

That brings me to another point – whether only reading and writing the socket is non-blocking, as opposed to the whole application being non-blocking. For example, Tomcat’s NIO connector is non-blocking, but (afaik, via a thread pool) the application code can still be executed in the “good old” synchronous way. Though I admit I don’t fully understand that part, we have to distinguish asynchronous application code from asynchronous I/O provided by the infrastructure.
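That split can be sketched with plain `java.util.concurrent` primitives (this is a hypothetical hand-rolled dispatcher, not Tomcat’s actual internals): a “connector” hands each fully-read request to a worker pool, and the workers run ordinary blocking application code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the Tomcat-NIO-style split: the "connector"
// dispatches requests without blocking, while the application code runs
// synchronously on a worker thread pool.
public class ConnectorSketch {
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    // Plain blocking application code -- the "good old" synchronous style.
    static String handle(String request) {
        try {
            Thread.sleep(10); // stands in for a blocking database call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response to " + request;
    }

    public static void main(String[] args) throws Exception {
        // The "connector": hands off both requests immediately, blocking on neither.
        Future<String> a = workers.submit(() -> handle("req-A"));
        Future<String> b = workers.submit(() -> handle("req-B"));
        System.out.println(a.get());
        System.out.println(b.get());
        workers.shutdown();
    }
}
```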

And another important distinction – the fact that your server code is non-blocking/asynchronous doesn’t mean your application is asynchronous to the client. The two things are related, but not the same – if your client uses a long-lived connection where it expects new data to be pushed from the server (e.g. websockets/comet), then the asynchronicity goes outside your code and becomes a feature of your application, from the perspective of the client. And that can be achieved in multiple ways, including a Java Servlet with async=true (which uses a non-blocking model so that long-lived connections do not each hold a blocked thread).

Okay, now we know roughly how it works, and we can even write code in that paradigm. We can pass messages around, write callbacks, or get notified with a different message (i.e. akka’s “ask” vs “tell” pattern). But again – what’s the point?
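To illustrate the “ask” vs “tell” distinction without pulling in akka itself, here is a toy mini-“actor” in plain Java – this is a sketch of the idea, not akka’s actual API: “tell” is fire-and-forget, while “ask” gives you back a future that completes with the reply.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy mini-"actor" (not akka's API): messages are processed one at a time
// on a single-threaded mailbox, so the actor's code never runs concurrently.
public class MiniActor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // "tell": fire-and-forget -- the caller gets nothing back directly.
    public void tell(String msg) {
        mailbox.execute(() -> System.out.println("processed: " + msg));
    }

    // "ask": returns a future that completes with the reply, asynchronously.
    public CompletableFuture<String> ask(String msg) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        mailbox.execute(() -> reply.complete("echo: " + msg));
        return reply;
    }

    public void shutdown() {
        mailbox.shutdown();
    }

    public static void main(String[] args) {
        MiniActor actor = new MiniActor();
        actor.tell("fire and forget");
        actor.ask("hello").thenAccept(r -> System.out.println("got reply: " + r)).join();
        actor.shutdown();
    }
}
```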

That’s where it gets tricky. You can experiment with googling for stuff like “benefits of non-blocking/NIO”, benchmarks, “what is faster – blocking or non-blocking”, etc. People will say non-blocking is faster, or more scalable, that it requires less memory for threads, has higher throughput, or any combination of these. Is any of that true? Nobody knows. It indeed makes sense that by not blocking your threads, and by not having a thread per socket, you can have fewer threads service more requests. But is that faster or more memory-efficient? Do you reach the maximum number of threads in a big thread pool before you max out the CPU, network I/O or disk I/O? Is the bottleneck in a regular web application really the thread pool? Possibly, but I couldn’t find a definitive answer.

This benchmark shows raw servlets are faster than Node (and when spray (akka) was present in that benchmark, it was also slower). This one shows that the NIO Tomcat connector gives worse throughput. My own benchmark (which I lost) of spray vs spring-mvc showed that spray started returning 500 (Internal Server Error) responses with far fewer concurrent requests than spring-mvc. I would bet there are counter-benchmarks that “prove” otherwise.

The most comprehensive piece on the topic is the “Thousands of Threads and Blocking I/O” presentation from 2008, which says something I myself felt – that everyone “knows” non-blocking is better and faster, but nobody has actually tested it, and that people sometimes confuse “fast” and “scalable”. And that blocking servers actually perform ~20% faster. That presentation, complemented by this “Avoid NIO” post, claims that the non-blocking approach is actually worse in terms of scalability and performance. And this paper (from 2003) claims that “Events Are A Bad Idea (for high-concurrency servers)”. But is all this objective? Does it hold true only for the Java NIO library, or for the non-blocking approach in general? Does it apply to Node.js and akka/spray, and how do applications that are asynchronous from the client perspective fit into the picture? I honestly don’t know.

It feels like the old, thread-pool-based, blocking approach is at least good enough, if not better. Despite the “common knowledge” that it is not.

And to complicate things even further, let’s consider use cases. Maybe you should use a blocking approach for a RESTful API with a traditional request/response paradigm, but make a high-speed trading web application non-blocking, because of its asynchronous nature. Should you have only your “connector” (in Tomcat terms) non-blocking, and the rest of your application blocking… except for the asynchronous (from the client’s perspective) part? It gets really complicated to answer.

And even “it depends” is not a good-enough answer. Some people would say that you should do your own benchmark, for your use case. But for a benchmark you need an actual application, written in all possible ways. Yes, you can use some prototype with basic functionality, but choosing the programming paradigm must happen very early (and it’s hard to refactor later). So, which approach is more performant, scalable, memory-efficient? I don’t know.

What I do know, however, is which is easier to program, easier to test and easier to support. And that’s the blocking paradigm. Where you simply call methods on objects, not caring about callbacks and handling responses. Synchronous, simple, straightforward. This is actually one of the points in both the presentation and the paper I linked above – that it’s harder to write non-blocking code. And given the unclear benefits (if any), I would say that programming, testing and supporting the code is the main distinguishing feature. Whether you are going to be able to serve 10000 or 11000 concurrent users from a single machine doesn’t really matter. Hardware is cheap (unless it’s 1000 vs 10000, of course).

But why is the non-blocking, asynchronous, event/message-driven programming paradigm harder? For me, at least, even after a year of writing in that paradigm, it’s still messier. First, it is way harder to trace the program flow. With synchronous code you would just tell your IDE to fetch the call hierarchy (or find the usages of a given method if your language is not IDE-friendly), and see where everything comes from and goes to. With events it’s not that trivial. Who constructs this message? Where is it sent to / who consumes it? How is the response obtained – via callback, via another message? When is the response message constructed and who actually consumes it? And no, that’s not “loose coupling”, because your code is still pretty logically (and compilation-wise) coupled, it’s just harder to read.
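The readability gap shows up even in a trivial two-step lookup, sketched here with made-up names and an in-memory “database”:

```java
import java.util.Map;
import java.util.function.Consumer;

// The same lookup written twice: once synchronous, once callback-based.
// All names and the in-memory "database" are hypothetical.
public class FlowStyles {
    static final Map<String, String> database = Map.of("42", "Alice");

    // Synchronous: the IDE's call hierarchy shows exactly who calls what,
    // and the result is right there on the next line.
    static String getUserSync(String id) {
        String name = database.get(id);   // "blocking" lookup
        return "user: " + name;
    }

    // Callback style: the result arrives "somewhere else", and the call
    // site no longer shows who consumes it or on which thread.
    static void getUserAsync(String id, Consumer<String> onResult) {
        onResult.accept("user: " + database.get(id));
    }

    public static void main(String[] args) {
        System.out.println(getUserSync("42"));
        getUserAsync("42", result -> System.out.println(result));
    }
}
```

Even in this toy form, answering “who consumes the result?” requires finding every caller of `getUserAsync` and reading its lambda; in the synchronous version the answer is the next line.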

What about thread-safety – event passing allegedly ensures that no contention, deadlocks, or race conditions occur. Well, even that’s not necessarily true. You have to be very careful with callbacks (unless you really have one thread, like in Node) and your “actor” state. Which piece of code is executed by which thread is important (in akka at least), and you can still have shared state even though only a few threads do the work. With the synchronous approach you just have to follow one simple rule – state does not belong in the code, period. No instance variables and you are safe, regardless of how many threads execute the same piece of code. The presentation above also mentions immutable and concurrent data structures that are inherently thread-safe and can be used in either of the paradigms. So in terms of concurrency, it’s pretty easy from the perspective of the developer.
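The “no state in the code” rule can be sketched with a hypothetical handler: no instance fields, everything in parameters and locals, so any number of pool threads can execute it with no synchronization at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// The "no state in the code" rule: the handler has no instance fields,
// so many threads can run it concurrently without any synchronization.
public class StatelessHandler {
    // Everything lives in parameters and locals -- nothing is shared.
    static String handle(int requestId) {
        int doubled = requestId * 2; // local state only
        return "req " + requestId + " -> " + doubled;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int id = i;
            results.add(pool.submit(() -> handle(id)));
        }
        // Deterministic despite 8 threads sharing the same code.
        System.out.println(results.get(5).get()); // req 5 -> 10
        pool.shutdown();
    }
}
```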

Testing complicated message-passing flows is a nightmare, really. And whereas test code is generally less readable than production code, test code for a non-blocking application is, in my experience, much uglier. But that’s subjective again, I agree.

I wouldn’t like to finish this long and unfocused piece with “it depends”. I really think the synchronous/blocking programming model, with a thread pool and no message passing in the business logic, is the simpler and more straightforward way to write code. And if, as pointed out by the presentation and paper linked above, it’s also faster – great. And when you really need to send responses to clients asynchronously – consider the non-blocking approach only for that part of the functionality. Ultimately, given similar performance, throughput and scalability (and ignoring the marketing buzz), I think one should choose the programming paradigm that is easier to write, read and test. Because it takes 30 minutes to start another server, but accidental complexity can burn weeks and months of programming effort. For me, the blocking/synchronous approach is the easier one to write, read and test, but that isn’t necessarily universal. I would just not base my choice of programming paradigm on vague claims about performance and scalability.

15 thoughts on “Why Non-Blocking?”

  1. Hi, Bozho.

    Great post. It somehow reflects my opinion on NIO too. I started using Undertow (http://undertow.io) and I really feel there’s nothing I can rely on (there are only comments, reddit posts, youtube videos, but no use cases, no really detailed benchmarks, etc.).

    But after working with it, I realized that other NIO frameworks/webservers are 10+ years old, that the industry still uses them and is afraid of using other technologies, and that might be the reason why we don’t have real data to populate benchmarks so we can actually make a good decision – “should I really use NIO here, or would a common approach be better?”

    Also, since Node and Undertow are pretty new, there’s certainly a lot to optimize.

    I truly hope Undertow was a good choice, so let’s see.

  2. Just correcting last post, ” I realized that other ***NON-*** NIO frameworks/webservers are 10+ years old”…

  3. The thing is, either way works. Your Undertow, our akka, and someone’s Node applications will work just as well. But so would they if they were synchronous.

  4. It would really depend on the type of project that you are working on. But I guess it will take many years of experience to get the hang of what’s the best practice for non-blocking development

  5. The async programming model on a large scale is simply not maintainable. A thread pool and a few queues are generally all that is needed. Easy to reason about, scales like hell.

    Async Sucks ™

  6. This is a great post and reflects the same thoughts I have about NIO. The issue with NIO for me is that you assume that things can be asynchronous all the time, but this is seldom the case. As an example:
    Fetch id from database
    Use that id to identify data in some service
    This is a simple case but I cannot see why NIO helps me at all in this case.
    Also, advocates for NIO think that because you resort to NIO, your NICs somehow get automagically faster (or more reactive). For me, blocking execution of other threads by spinning on thread NOPs isn’t faster than doing a yield and letting others do their work. You still have to regard the scheduler, which does its job anyway, regardless of your code.

    So somehow we are solving the async problem by treating everything as potentially asynchronous and writing “hard to follow”(TM) code, because we might be able to actually do some asynchronous things, which are still strictly constrained by the particular problem you are solving. And motivating it by “traditional synchronization is hard, so let’s write everything so it’s hard to follow/debug/reason about” – but you still end up doing traditional synchronization with NIO, because the real world is not asynchronous.

    As for NIO servers, JBoss AS has been using netty for low-level messaging since 2004(?).
    Using NIO for strict domains where things are inherently message-based is a good thing, but as a whole, problems are usually strictly sequential.
    /End rant

  7. What async programming tries to solve (at least on the JVM, where threads are available) is issues related to Little’s Law. As long as you’re not hitting Little’s Law’s limits, you really *shouldn’t* see much of a difference between threads and async. But as your machine probably can’t handle 30K threads well, sooner or later you will be hitting those limits.

    See a theoretical analysis here: http://blog.paralleluniverse.co/2014/02/04/littles-law/

    and a benchmark here: http://blog.paralleluniverse.co/2014/05/29/cascading-failures/

    showing some very clear results (the benchmark uses Quasar, so you can keep your simple, blocking, synchronous code, while the library turns that to async code behind the scenes).

  8. Nice Post, I felt the same way and I’m also experimenting with NIO frameworks/toolkits like Netty(Low Level), Vertx(Which uses AKKA, and mix non blocking IO with a very small thread pools, and also allows you to create workers for heavy duty and much more nice feature), AKKA, Nodejs

    I have to admit that I’m feeling kind of pressured to adopt this “new” and confusing way to program…

    First of all, there’s much propaganda out there; they tell you that it is better, faster, more scalable, but without much evidence.

    The benchmarks are mediocre.

    They tend to compare performance in very convenient circumstances:

    1. The Servlet 3 spec allows you to write critical async code when needed, and most of the benchmarks that I saw do not compare against this feature.

    2. Some of those benchmarks block the threads using Thread.sleep to “simulate” the block time of an I/O operation. (In J2EE it is not recommended to block the thread, but if you do, you still do much better than reactive frameworks/toolkits that have a golden rule: “Don’t block the event loop”.) And some do not take advantage of the async feature of Servlet 3.0, like this post: https://dzone.com/articles/performance-comparison-between

    3. Containers are heavier (but pay you back with great automation of the heavy tasks: managed component lifecycles, injection of managed components, security, great APIs, a good level of abstraction over the network-related parts of the HTTP protocol) than small and very thinly layered servers; maybe the comparison and the performance claims in some cases are not related to async – maybe the performance is related to the whole system. Containers have a lot of (but necessary) layers that a server created with Netty, AKKA, Vertx or NodeJS does not have. They are small servers.

    I bet that if you create your own socket-based HTTP server which handles thread pools and also async when you need it, it could be faster than Tomcat, or GlassFish, or other very heavy containers (although Tomcat is middle-heavy).

    A would like to see real benchmark using Jetty with Servlet 3.0 using blocking and async servlets on very heavy duty. Against one of our “reactive friends” and that lady and gentlemen will be more fairer.

    4. But there’s a light at the end of the tunnel – we’ve got people clarifying this topic, like this gentleman:


    Also, projects like Spark Framework and Spring Boot are helping to create self-contained apps and microservice APIs in a more lightweight way than ever before.

    Great article. I thought I was the only one feeling worried about the massive media hype and its impact on our professional life.

    Best regards, Dimitri.

  9. *Corrections:
    “A would like to see real benchmark using Jetty with Servlet 3.0 using blocking and async servlets on very heavy duty. Against one of our “reactive friends” and that lady and gentlemen will be more fairer.” >>> will be fairer…

  10. Corrections No. 2 “Vertx(Which uses AKKA, and mix non blocking IO with a very small thread pools, and also allows you to create workers for heavy duty and much more nice feature)” >> Vertx(Which uses Netty…

  11. I think that much of your perception comes from the fact that we are trying to use libraries to add asynchronous behavior to languages that offer no support for this model. Things would be different if we were talking about Erlang.

    Java and Javascript are languages that were born in a sync world. We can try to add new primitives to those languages, but it will feel strange because they were not designed that way from the beginning, so it will not fit well with other parts. We can try to add that as a library, but it will feel even stranger.

    Also, most of us learned to program using similar languages. We live in a synchronous and blocking world. We have years of experience and training in this mindset. Even if we go learn Erlang now, we’ll probably try to use it to solve the same problems that we’re used to solving in Java. There’s a reason these two languages exist. People who use Erlang are used to solving (or want to solve) different kinds of problems. Try to use it to solve the same problems you already solve satisfactorily and you will find that the new tools are no better than the old ones.

  12. Thanks for the piece! I’m actually working on a web framework for Java/Groovy that leverages Vert.x Web. I found that Vert.x, for all its claims of more throughput and whatnot, led to some nasty-looking code that I would rather avoid having to write or maintain.

    I’ve been experimenting with using Vert.x’s server to handle incoming connections and using request handlers that execute their logic inside a dedicated thread pool. Inside of these handlers, I’ll be able to run traditional synchronous code without having Vert.x complain about blocking the event loop.

    Thus far I’m happy with the result. I’m noticing no performance hits (as a matter of fact I’m seeing some performance gains) and my code is readable, testable, debuggable and manageable.

    I guess the approach of having a non-blocking “connector” and a blocking application is working out great thus far.
