Some issues in the HttpServlet source code

31 12 2006

I’ve recently had to delve into the source code of the javax.servlet.http.HttpServlet implementation that is used in Tomcat 4, Tomcat 5 and Glassfish (and presumably has also been incorporated into various other Servlet API implementations). Somewhat surprisingly, I think I’ve found a number of issues with it:

  • The doHead method for handling HTTP “HEAD” requests calls “doGet” using a NoBodyResponse class to wrap the response so as to suppress the writing of any body content. However, this NoBodyResponse class isn’t a javax.servlet.http.HttpServletResponseWrapper, but is an entirely private inner class within HttpServlet. This arguably violates section SRV 8.2 of the Servlet specification. I guess it shouldn’t really do much harm, though for various reasons I think the lack of any way to access the underlying response might make it harder for some container implementations to handle “HEAD” requests correctly in specific situations (and it’s easy to imagine that any inaccurate responses to “HEAD” requests could go undetected). Now that the Servlet API includes a “standard” HttpServletResponseWrapper class, it would seem relatively straightforward and much cleaner to make the “NoBodyResponse” an HttpServletResponseWrapper subclass. This would also make the HttpServlet code more concise.
  • The NoBodyResponse class doesn’t actually conform to the javax.servlet.ServletResponse definition. It overrides the getWriter and getOutputStream methods and delegates all other API method calls to the underlying response, but this is a somewhat crude attempt at a response wrapper. It does not take into account the effects that getWriter and getOutputStream should have on each other and on the other API methods. Specifically, its getWriter implementation doesn’t throw IllegalStateException if getOutputStream has already been called and vice-versa. Hence you can successfully call both getOutputStream and getWriter on the same instance, even though the ServletResponse API Javadoc says that once you’ve called one, any calls to the other should fail. Similarly, it doesn’t implement the various other restrictions that should be imposed after getWriter has been called (for example, any subsequent calls to setCharacterEncoding should then have no effect). The implication of this is that “HEAD” requests can potentially give responses that are not consistent with the responses for the equivalent “GET” requests. For example, any servlet that mistakenly attempts to call both getWriter and getOutputStream will fail with an exception for a “GET” request, but will complete successfully for an equivalent “HEAD” request.
  • The NoBodyOutputStream private inner class includes error handling that looks-up an error message using a string as a key, but then ignores the result and instead uses a separate, hard-coded string as the actual error message.
  • The NoBodyOutputStream private inner class also includes comments saying that a negative length should “maybe” be an IllegalArgumentException, but the code actually throws an IOException. If this really should be an IllegalArgumentException, it ought to be fixed. If an IOException is OK, or if fixing this now isn’t acceptable in case it breaks existing code, then the comment ought to be removed (or updated to explain this).
  • The doTrace method includes a close of the output stream when it’s finished. However, prior to this it sets the content length and then writes that amount of content, which should always result in the output stream being automatically flushed and closed. Hence the response has already been closed when the call to close occurs. OK, it’s usually fairly safe to assume that repeating a close won’t do any harm. But the output stream is implementation-specific, and there doesn’t seem to be anything that mandates that it should allow duplicate close attempts. In any case, why attempt the close at all if it isn’t needed?
  • Whilst we’re here, the doTrace method ends with a “return;” statement, which seems rather pointless as the last statement of a method. (There are also many other such minor quirks and oddities in the layout of the code, but that’s probably a rather more general issue).

Of these, the second is the most serious problem, though the first is closely related to it. The others have only minor or arguable impact, or are purely “cosmetic”. The Apache Tomcat bug database includes a couple of entries relating to the first issue (see Tomcat bug 22290 and Tomcat bug 10555). However, these have long since been “closed” on the basis that the HttpServlet code is the responsibility of the Servlet API itself rather than Tomcat, and therefore the bug has been passed over to the Servlet API feedback e-mail address. Of course, there’s no sign of this ever being followed-up anywhere. I haven’t yet found anything relating to the other issues on the Tomcat, Glassfish, or Sun bug databases.

Although none of these issues have any great impact, and most are rather trivial, it’s kind of shocking to look at this code and see such apparent carelessness and imprecision. We’re talking about the reference implementation of a fundamental part of one of the most mature and widely used Java APIs, and which underlies pretty much all HTTP servlets and anything built on top of them. The same code has been carried forward through Tomcat 4, Tomcat 5, and into Glassfish, so it has clearly stood the test of time and proven itself to be robust and reliable. So when I look at it, I expect to find a polished jewel of precise engineering (even though I ought to know better by now).

Instead (and here comes a bit of a rant), it looks like exactly the kind of hastily written code that gives us such fragile systems. The sort of code that hasn’t quite been fully thought through, but more-or-less appears to sort-of work, at least for the common cases that actually get used by anyone that matters, at least after fixing whatever critical bugs people have hit, at least as far as anyone can tell, at least at the moment.

It’s easy to see how such code gets hacked together quickly in the early days of an API, and is then carried forward without ever being revisited or cleaned-up. It’s easy to see how the nature of the Servlet specification and its gradual development over several versions can lead to such issues. It’s easy to see how there’s never the time to tidy-up an existing code base (even when there are code-checking tools like FindBugs and PMD that could easily identify some of these problems automatically). It’s easy to see why these issues haven’t been noticed and reported before, or have been reported but haven’t been followed up. But it’s still surprising and rather disappointing to see the general state of the code in something as fundamental as this. Admittedly, it helps explain why we all spend so much time struggling against unexpected bugs and software that never quite seems to work properly, when you consider that most of the code you encounter is probably like this (end of rant… for the time being!).

Anyway, my next step with these particular issues is to raise them on the appropriate bug lists, and where possible see if I can contribute fixes. But that’s a whole issue in itself, and will be the subject of a future article.



Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: