ThreadLocal Variables and Thread Pools – It Can Go Wrong
Many libraries use ThreadLocal variables to store some thread context data. In web applications this context usually corresponds to a single request, since each request is handled by one thread. However, there's a caveat: most containers use a thread pool, which saves the cost of instantiating a new thread for each request.
So both things are good, but when combined a problem arises: if the library does not clean its threadlocals, a later request will be served by a pooled thread that still contains an already filled threadlocal. Two problems with that:
- Memory leaks – Tomcat was accused of leaking on redeploy. The cause was many uncleaned threadlocals, combined with the fact that the thread pool is not recreated on redeploy. In Tomcat 7 this is fixed by clearing all threadlocals and showing a warning. But that happens only on redeploy. Imagine the usual case where the thread local actually stores a Map. Objects are put into the map, and it can grow indefinitely if it is not cleaned.
- Nondeterministic behaviour – a library might expect a clean thread, and treat an empty thread local as meaning "I must do some important initializations". But instead it is served a filled threadlocal, so the initialization does not happen, and old (possibly irrelevant) data is used instead.
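The stale-data problem is easy to reproduce with a plain executor standing in for the container's thread pool. The sketch below is only an illustration: the class name, the CURRENT_USER thread local and the "alice" value are made up for the example, and a pool of one thread is used so that both "requests" are guaranteed to land on the same thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StaleThreadLocalDemo {
    // Hypothetical per-request context that a library might keep.
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two "requests" on a one-thread pool and returns what the second one sees. */
    static String userSeenBySecondRequest() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Request 1 stores a value and never removes it.
            Runnable first = () -> CURRENT_USER.set("alice");
            pool.submit(first).get();

            // Request 2 never sets anything, yet runs on the same pooled thread.
            Callable<String> second = CURRENT_USER::get;
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Prints "Second request sees: alice"
        System.out.println("Second request sees: " + userSeenBySecondRequest());
    }
}
```

The second task finds the first task's value because the thread, and with it the thread local's contents, outlives the task.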
So it is very important to clean these threadlocals. Here are three options:
- Call ThreadLocal.remove() when the code has finished its job. That's the straightforward way, but it is not always possible, because the threadlocal may need to span multiple invocations of a library within the same request.
- Provide a .cleanup() or .close() method – the calling code will be responsible for cleaning all used resources (including threadlocals). Many libraries already have such a method for closing IO resources.
- Provide a servlet filter that will clean up at the end of each request. This does not make the library necessarily dependent on the servlet API, but it improves its usage in a web context. With Servlet 3.0 the filter can even auto-register itself, so this becomes invisible to the user
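The first two options can be combined: a close() method that simply calls ThreadLocal.remove(). Here is a minimal sketch, assuming a hypothetical RequestContext class (the name and its methods are invented for this example and do not come from any real library). Making it AutoCloseable lets the caller use try-with-resources, so the cleanup happens even if the request code throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical library context; the names are illustrative only. */
class RequestContext implements AutoCloseable {
    private static final ThreadLocal<Map<String, Object>> DATA =
            ThreadLocal.withInitial(HashMap::new);

    static RequestContext open() {
        return new RequestContext();
    }

    void put(String key, Object value) {
        DATA.get().put(key, value);
    }

    /** Number of entries visible to the current thread. */
    static int size() {
        return DATA.get().size();
    }

    /** Drops the current thread's map so the pooled thread is handed back clean. */
    @Override
    public void close() {
        DATA.remove();
    }
}

class CleanupDemo {
    /** Runs two "requests" on one pooled thread; the first one cleans up after itself. */
    static int entriesSeenBySecondRequest() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Runnable first = () -> {
                try (RequestContext ctx = RequestContext.open()) {
                    ctx.put("user", "alice");
                } // close() runs here, even if the body throws
            };
            pool.submit(first).get();

            Callable<Integer> second = RequestContext::size;
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Prints "Entries seen by second request: 0"
        System.out.println("Entries seen by second request: " + entriesSeenBySecondRequest());
    }
}
```

A servlet filter for the third option would follow the same shape: call chain.doFilter(request, response) inside a try block and do the cleanup in finally.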
If it is your code, you are fine. But as I mentioned – many libraries have this problem. So go to them and report it.
And if you use libraries that refuse to take measures for this issue, you have another option: first investigate (by looking into the library code, or if it is not open-source, by decompiling it) whether either of the two problems above can happen. (If thread locals are used only for storing date formats, for example, then don't worry.) If a problem does exist, then write a Filter yourself that clears all thread locals, the way Tomcat does it on shutdown. See their code here (the method is called checkThreadLocalsForLeaks()). Of course, obtain all fields in the init() method and store them as instance fields, so that the "heavy" part of the reflection is not executed on every request.
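To give an idea of what such reflection-based cleanup could look like, here is a simplified sketch. It is not Tomcat's code: instead of inspecting individual entries it bluntly nulls the private threadLocals field of java.lang.Thread, which drops every thread local of the current thread, including ones that belong to other libraries. The class name ThreadLocalCleaner is made up. It also relies on a JDK internal: on recent JDKs (16 and later) the field is only accessible if the JVM is started with --add-opens java.base/java.lang=ALL-UNNAMED, so the sketch reports failure instead of throwing when access is denied.

```java
import java.lang.reflect.Field;

class ThreadLocalCleaner {
    // Looked up once and cached; in a Filter this lookup would live in init().
    private static final Field THREAD_LOCALS = lookup("threadLocals");

    private static Field lookup(String name) {
        try {
            Field field = Thread.class.getDeclaredField(name);
            field.setAccessible(true);
            return field;
        } catch (ReflectiveOperationException | RuntimeException e) {
            // Field missing or access denied, e.g. JDK 16+ without --add-opens.
            return null;
        }
    }

    /**
     * Drops all thread-local values of the current thread.
     * Returns false if the JDK internals could not be accessed.
     */
    static boolean clearCurrentThread() {
        if (THREAD_LOCALS == null) {
            return false;
        }
        try {
            THREAD_LOCALS.set(Thread.currentThread(), null);
            return true;
        } catch (IllegalAccessException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ThreadLocal<String> leaky = new ThreadLocal<>();
        leaky.set("stale");
        boolean cleared = clearCurrentThread();
        System.out.println("cleared=" + cleared + ", value=" + leaky.get());
    }
}
```

In a filter, clearCurrentThread() would be called in a finally block after chain.doFilter(request, response), so that it runs at the end of every request on the thread that served it.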
Related discussion can be found here – notable names like Joshua Bloch, Doug Lea and Bob Lee can be seen in this thread