Following on from my post on Gliffy's blog...
On more than a few occasions, I've been faced with making significant refactorings to an existing application. These are changes where we need to overhaul an architectural component without breaking anything or changing the application's features. For an application without any test cases, this is not only scary, but ill-advised.
I believe this is the primary reason that development shops hang on to outdated technology. I got a job at a web development shop after four years of doing nothing but Swing and J2EE. My last experience with Java web development was Servlets, JSPs and taglibs. This company was still using these as the primary components of their architecture. No Struts, no Spring, no SEAM. Why? One reason was that they had no test infrastructure and therefore no ability to refactor anything.
Doing it anyway
Nevertheless, sometimes the benefits outweigh the costs and you really need to make a change. At Gliffy, I was hired to create an API to integrate editing Gliffy diagrams into the workflow of other applications. After a review of their code and architecture, the principals and I decided that the database layer needed an overhaul. It was using JDBC/SQL and had become difficult to change (especially for the new guy: me). I suggested moving to the Java Persistence API (backed by Hibernate), and they agreed. The only problem was how to make sure I didn't break anything. They didn't have automated tests, and I was totally new to the application environment.
They did have test scripts for testers to follow that would hit various parts of the application. Coming from my previous environment, that in and of itself was amazing. Since the application communicates with the server entirely via HTTP POST, and receives mostly XML back, I figured I could manually execute the tests and record them in a way so they could be played back later as regression tests.
This is surprisingly easy thanks to the filtering features of the Servlet specification: a filter can intercept every request and response that passes through the container. The filter code itself is a bit more complex, because I had to create proxy classes for HttpServletRequest and HttpServletResponse. Here's an overview of how everything fits together:
The request proxy had to read everything from the request's input stream, save it, and hand the caller a new stream that would output the same data. It had to do the same thing with the Reader. Using both in the same request is an error (the Servlet API throws IllegalStateException on the second call), and Gliffy's code didn't do that, so this worked well.
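The buffering idea behind the request proxy can be sketched with nothing but java.io. This is a minimal illustration, not the original Gliffy code: the class name RecordedBody and its methods are hypothetical, and in a real filter the stream would come from the wrapped HttpServletRequest and be handed back through an overridden getInputStream() or getReader().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

/** Hypothetical helper: drains a request body once so it can be saved and replayed. */
final class RecordedBody {
    private final byte[] bytes;
    private final Charset charset;

    private RecordedBody(byte[] bytes, Charset charset) {
        this.bytes = bytes;
        this.charset = charset;
    }

    /** Reads the original stream to the end and keeps a private copy of its bytes. */
    static RecordedBody capture(InputStream original, Charset charset) {
        try {
            ByteArrayOutputStream copy = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = original.read(chunk)) != -1) {
                copy.write(chunk, 0, n);
            }
            return new RecordedBody(copy.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A fresh stream over the saved bytes; this is what the servlet would read. */
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }

    /** The same data exposed as a Reader, for servlets that read characters. */
    BufferedReader newReader() {
        return new BufferedReader(new InputStreamReader(newStream(), charset));
    }

    /** The saved body as text, ready to be written into the recorded test. */
    String text() {
        return new String(bytes, charset);
    }
}
```

Because the bytes are held in memory, the body can be read any number of times: once by the servlet and once more when the test is written out. That is fine for small POST bodies like these, but would need rethinking for large uploads.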
The response recorder was a bit trickier, because I needed to save things like status codes and content types. This implementation probably wouldn't work for all clients (for example, it ignores any response headers), but since Gliffy is an OpenLaszlo app, and OpenLaszlo has almost no view into HTTP, this worked well for our purposes. Again, I had to wrap the OutputStream/Writer so I could record what was being sent back.
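The wrapping of the OutputStream can be sketched in a similar way. Again, this is only an illustration under assumptions: RecordingOutputStream and RecordedResponse are hypothetical names, and in a servlet filter the recording stream would sit behind the response wrapper's getOutputStream() while setStatus() and setContentType() calls are copied into the holder.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical recorder: forwards every byte to the client and keeps a copy. */
final class RecordingOutputStream extends OutputStream {
    private final OutputStream client;
    private final ByteArrayOutputStream copy = new ByteArrayOutputStream();

    RecordingOutputStream(OutputStream client) {
        this.client = client;
    }

    @Override
    public void write(int b) throws IOException {
        client.write(b);
        copy.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        client.write(buf, off, len);
        copy.write(buf, off, len);
    }

    @Override
    public void flush() throws IOException {
        client.flush();
    }

    @Override
    public void close() throws IOException {
        client.close();
    }

    /** Everything written so far, for storing in the recorded test. */
    byte[] recordedBytes() {
        return copy.toByteArray();
    }
}

/** Hypothetical holder for the response details the recorder keeps. */
final class RecordedResponse {
    int status = 200;           // assumed default when the servlet never sets a status
    String contentType;         // for example "text/xml"
    byte[] body = new byte[0];

    String bodyAsText() {
        return new String(body, StandardCharsets.UTF_8);
    }
}
```

The client still receives exactly what the servlet wrote; the copy is only a side effect. Response headers are deliberately left out of the holder, matching the limitation described above.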
The filter then needs to create these proxies and pass them into the actual servlet call. After the call to doFilter, we can examine the proxy request/response and record the test. I'll spare you 20 lines of setXXX methods. I created a Java bean class and used XStream to serialize it. I then created another class that runs as a TestNG test to deserialize these files and make the same requests. I record the response and see if it matches.
Running the Tests
There were a few problems with this approach:
- The tests required certain test data to exist.
- Each test potentially modifies the database, meaning the tests have to be run in the order they were created.
- The test results had temporal data in them that, while irrelevant to the tests "passing", complicated exact-match comparisons of results.
I ultimately decided to group my tests into logical areas, and ensure that: a) tests were run in a predictable order, and b) the first test of a group was run against a known dataset. I created a small, but useful, test dataset and created a TestNG test that would do both (a) and (b). It wasn't pretty, but it worked. This clearly isn't the way a unit test framework should be used, and I would call these sorts of tests functional, rather than unit. But since our CI system requires JUnit test results as output, and the JUnit format isn't documented, I might as well use TestNG to handle it for me.
The last problem was making accurate comparisons of results. I did not want to have to parse the XML returned by the server. I settled on some regular expressions that stripped out temporal and transient data not relevant to the test. Both the expected and received content were run through this regexp filter and those results were compared. Parsing the XML might result in better failure messages (right now I have to do a visual diff, which is a pain), but I wasn't convinced that the existing XML diff tools were that useful.
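As a rough sketch of that comparison step: both strings go through the same set of replacements, and only the normalized results are compared. The two patterns below are assumptions made for illustration (an ISO-style timestamp and a sessionId attribute); the post doesn't list the actual expressions, and the class name ResponseNormalizer is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical normalizer: blanks out volatile values before comparing responses. */
final class ResponseNormalizer {
    // Assumed examples of "temporal and transient" data; the real patterns
    // would depend on what the server actually returns.
    private static final Pattern[] VOLATILE = {
        // Timestamps such as 2008-03-01T10:15:30Z or 2008-03-01 10:15:30
        Pattern.compile("\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}:\\d{2}(\\.\\d+)?(Z|[+-]\\d{2}:?\\d{2})?"),
        // The value of a sessionId="..." attribute (attribute name is made up)
        Pattern.compile("(?<=sessionId=\")[^\"]*(?=\")")
    };

    /** Replaces every volatile value with a fixed placeholder. */
    static String normalize(String xml) {
        String out = xml;
        for (Pattern p : VOLATILE) {
            out = p.matcher(out).replaceAll("#");
        }
        return out;
    }

    /** True when the two responses differ only in volatile values. */
    static boolean sameIgnoringVolatile(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }
}
```

When a comparison fails, printing the two normalized strings rather than the raw ones at least keeps the visual diff free of timestamp noise.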
Overall, it worked out great. I was able to completely overhaul the database layer, and the Gliffy client was none the wiser. We were even able to use these tests to remove our dependence on Struts, simplifying the application's deployment (we weren't using many features of Struts anyway). The final validation of these tests actually came recently, when we realized a join table needed to be exposed to our server-side code. This was a major change in two key data containers, and the recorded tests were crucial to finding the bugs this introduced.
So, if you don't have the luxury of automated tests, you can always create them. I did a similar thing with EJB3 using the Interceptors concept.