Better open-source hosting: SourceForge is looking weak

September 17, 2008

I currently host my Vim Javadoc doclet on SourceForge and every time I have to deal with it, it's just a monumental pain. The documentation is insanely long and detailed, the website looks horribly out of date and cruddy, and when compared to stuff like GitHub and Lighthouse, it's almost embarrassing how difficult it is to deal with and how bad the UI is (despite my best efforts, it still insists that the featured download is the vimdoc samples and not the doclet itself. WTF?).

I'm already hosting the code in GitHub and just moved my tickets to Lighthouse. The only thing left is where to host binary downloads and static assets. For RESTUnit, I'm using Google Code, which is pretty easy to deal with (about a zillion times simpler and easier than SourceForge); however, it has no facility for hosting arbitrary HTML. Currently, I'm just using my website for static assets.

While I do like the new Web-2.0 way of doing things (one site like GitHub really focusing on source, another like Lighthouse just does ticketing, etc. and they integrate via web services), I'm not sure where the best place is to host downloads and static assets. I would need programmatic access and some liberal download/diskspace quotas for sure. It would also be nice to be able to connect to other services, for example generate a changelog based on commits or tickets closed since the last release.

Test REST Services

September 12, 2008

In my reply to a post on Tim Bray's blog about using RSpec for testing REST services, I briefly described a project I'm working on, based on the work I've been doing at Gliffy, which is a testing framework for REST services called, unsurprisingly, RestUNIT.

I needed a way to test Gliffy's REST-based integration API, and hand-coding test cases using HTTPClient was just not going to cut it. Further, requests to Gliffy's API require signing (similar to how Flickr does it), and our API was going to support multiple ways of specifying the representation type, as well as tunneling over POST.

So, it occurred to me that there was a lazier way of doing this testing. All I really needed to specify was the relative URL, parameters, headers, method, and expected response. Someone else could do the signing and re-run the tests with the various options (such as specifying the MIME type via the Accept: header, and then again via a file extension in the URL).

I ended up creating a bunch of text files with this information. I then used a Ruby script to generate two things: an XML file that could be deserialized into a Java object useful for testing, and a PHP script to test our PHP client API.

The Ruby script would also do things like calculate the signature (the test text files contained the API key and secret key a Gliffy user would need in order to use the API) and generate some derivative tests (e.g. one using a DELETE, and another tunneling that over POST). The testing engine could generate some additional derivative tests (e.g. GET requests should respond to conditional GETs if the server sent back an ETag or Last-Modified header). All this then runs as a TestNG test.
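
To make that concrete, here's a rough sketch of the kind of test-description object the XML was deserialized into; the class and field names here are made up for illustration, not the actual Gliffy or RestUNIT code:

import java.util.Map;

// Hypothetical test description; field names are illustrative only.
public class RestTestDescription
{
    private String relativeUrl;            // e.g. "accounts/myaccount/diagrams"
    private String httpMethod;             // GET, POST, PUT, or DELETE
    private Map<String,String> headers;    // e.g. Accept: text/xml
    private Map<String,String> parameters; // query or form parameters, pre-signing
    private int expectedStatus;            // e.g. 200
    private String expectedBody;           // the expected representation, if any

    // getters and setters omitted; a serializer can populate these fields directly
}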

The whole thing works well, but is pretty hackish. So, RestUNIT was created as a fresh codebase to create a more stable and useful testing engine. My hope is to specify tests as YAML or some other human-readable markup, instead of XML (which is essentially binary for any real-sized data) and to allow for more sophisticated means of comparing results, deriving tests, and running outside a container (all the Gliffy tests require a specific data set and run in-container).

The test specification format should then be usable to generate tests in any other language (like I did with PHP). I'm working on this slowly in my spare time and trying to keep the code clean and the architecture extensible, but not overly complex.

Schema for REST services

September 11, 2008

I'm currently working on the integration API for Gliffy, which is a REST-based service. The API is fairly stable and we're readying a few ancillary things for release. One of those is the documentation for the API. I found it quite difficult to completely describe the REST services and ultimately ended up creating something that lists out "objects" and "methods", even though the API is not really object-based. For example, the object "Diagram" has a "method" called "list"; to "call" it, you do an HTTP GET to accounts/your account name/diagrams.

The original spec I created to work against (and thus, our initial draft of API documentation) was basically a list of URLs and the HTTP methods they responded to. Not very easy to navigate or understand at a first sitting. Some sort of schema to describe the REST API would have been really helpful (along the lines of an XML Schema). Such a schema could facilitate documentation, testing, and code generation.

As an example, consider some features of the Gliffy API: you can list the users in an account, list the diagrams in an account and reference an individual diagram via id. Here's a YAML-esque description of these services:

accounts:
  kind: literal
  desc: "Reference to all accounts"
  POST:
    desc: "Creates a new account"
    parameters:
      - account_name
          required: true
          desc: "Name of the account you want to create"
      - admin_email
          required: true
          desc: "Email address of an administrator for the new account"
  children:
    account_name:
      kind: variable
      desc: "The name of your account"
      GET:
        desc: "Returns meta-data about the account"
        parameters:
          - show_users
              required: false
              desc: "If true, users are included; if false, they are not"
      children:
        diagrams:
          kind: literal
          desc: "All diagrams in the account"
          POST:
            desc: "Creates a new diagram"
            parameters:
              - diagram_name
                  required: true
                  desc: "Desired name for this diagram"
              - template_id
                  required: false
                  type: numeric
                  desc: "If present, the id of the diagram to copy, instead of using the blank one"
          GET:
            desc: "Gets a list of all diagrams in this account"
          children:
            id:
              kind: variable
              type: numeric
              desc: "The id of a particular diagram"
              GET:
                desc: "Gets the diagram; the requested encoding type will determine the form"
                parameters:
                  - version
                      desc: "The version to get, 1 is the original version.  If omitted, the current version is retrieved"
                      required: false
                      type: numeric
                  - size
                      desc: "For rastered formats, determines the size"
                      type: enumeration
                        - L
                        - M
                        - S
              DELETE:
                desc: "Deletes this image"
        users:
          kind: literal
          desc: "All users in the account"
          GET:
            desc: "Gets a list of all users in the account"

Since "accounts" is the only top-level element, we are saying that every request to this service must start with accounts/. It has one child, which is a variable value for the account name. It is untyped, so any potential string is allowed. That element has two possible children: diagrams and users. diagrams indicates that it responds to the HTTP methods POST and GET. A POST requires the parameter diagram_name, while the parameter version is optional.

A standard format like this could easily be used to generate documentation, expectations, test cases, and even stub code. This format could even be delivered by an OPTIONS call to a resource. I realize there is not much standardization around how to design and implement a REST service, but something like this could at least be a stake in the ground in support of one specific approach.
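
As a sketch of that last idea, here's roughly what serving the description from an OPTIONS call could look like; the servlet, the resource name, and the content type are all assumptions of mine, not anything Gliffy actually does:

import java.io.IOException;
import java.io.InputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SchemaServlet extends HttpServlet
{
    /** Answers OPTIONS with the service description, read from a bundled resource. */
    protected void doOptions(HttpServletRequest request, HttpServletResponse response)
        throws IOException
    {
        response.setContentType("text/plain");
        InputStream schema = getClass().getResourceAsStream("/rest-schema.yaml");
        byte[] buffer = new byte[4096];
        for (int read = schema.read(buffer); read != -1; read = schema.read(buffer))
            response.getOutputStream().write(buffer, 0, read);
        schema.close();
    }
}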

Didn't do Test-Driven Design? Record your test cases later

September 08, 2008

Following on from my post on Gliffy's blog...

On more than a few occasions, I've been faced with making significant refactorings to an existing application. These are things where we need to overhaul an architectural component without breaking anything or changing the application's features. For an application without any test cases, this is not only scary, but ill-advised.

I believe this is the primary reason that development shops hang on to out-dated technology. I got a job at a web development shop after four years of doing nothing but Swing and J2EE. My last experience with Java web development was Servlets, JSPs, and taglibs. This company was still using these as the primary components of their architecture. No Struts, no Spring, no SEAM. Why? One reason was that they had no test infrastructure, and therefore no ability to refactor anything.

Doing it anyway

Nevertheless, sometimes the benefits outweigh the costs and you really need to make a change. At Gliffy, I was hired to create an API to integrate editing Gliffy diagrams into the workflow of other applications. After a review of their code and architecture, the principals and I decided that the database layer needed an overhaul. It was using JDBC/SQL and had become difficult to change (especially to the new guy: me). I suggested moving to the Java Persistence API (backed by Hibernate), and they agreed. The only problem was how to make sure I didn't break anything. They didn't have automated tests, and I was totally new to the application environment.

They did have test scripts for testers to follow that would hit various parts of the application. Coming from my previous environment, that in and of itself was amazing. Since the application communicates with the server entirely via HTTP POST, and receives mostly XML back, I figured I could manually execute the tests and record them in a way that they could be played back later as regression tests.

Recording Tests

This is surprisingly easy thanks to the filtering features of the Servlet specification:

<filter>
  <filter-name>recorder</filter-name>
  <filter-class>com.gliffy.test.online.RecordServletFilter</filter-class>
</filter>

<!-- ... -->

<filter-mapping>
  <filter-name>recorder</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
The filter code is a bit more complex, because I had to create proxy classes for HttpServletRequest and HttpServletResponse. Here's an overview of how everything fits together.

The request proxy had to read everything from the request's input stream, save it, and hand the caller a new stream that would output the same data. It had to do the same thing with the Reader. I'm sure it's an error to use both in the same request, and Gliffy's code didn't do that, so this worked well.

private class RecordingServletRequest extends javax.servlet.http.HttpServletRequestWrapper
{
    BufferedReader reader = null;
    ServletInputStream inputStream = null;

    String readerContent = null;
    byte inputStreamContent[] = null;

    public RecordingServletRequest(HttpServletRequest r) { super(r); }

    /** Reads the entire underlying Reader, saves its contents, and hands the
     *  caller a replacement Reader over the saved copy. */
    public BufferedReader getReader()
        throws IOException
    {
        if (reader == null)
        {
            StringWriter writer = new StringWriter();
            BufferedReader superReader = super.getReader();
            int ch = superReader.read();
            while (ch != -1)
            {
                writer.write(ch);
                ch = superReader.read();
            }
            readerContent = writer.toString();
            // keep the replacement reader around so later calls don't hit the
            // (now exhausted) underlying reader
            reader = new BufferedReader(new StringReader(readerContent));
        }
        return reader;
    }

    /** Same idea for the binary stream: copy it, then replay the copy. */
    public ServletInputStream getInputStream()
        throws IOException
    {
        if (inputStream == null)
        {
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            ServletInputStream superInputStream = super.getInputStream();
            int b = superInputStream.read();
            while (b != -1)
            {
                os.write(b);
                b = superInputStream.read();
            }
            inputStreamContent = os.toByteArray();
            inputStream = new ByteArrayServletInputStream(inputStreamContent);
        }
        return inputStream;
    }
}
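
The ByteArrayServletInputStream used above isn't part of the servlet API and isn't shown in the post; a minimal version (my guess at what's needed) just adapts a ByteArrayInputStream:

private static class ByteArrayServletInputStream extends javax.servlet.ServletInputStream
{
    private final java.io.ByteArrayInputStream bytes;

    public ByteArrayServletInputStream(byte[] content)
    {
        bytes = new java.io.ByteArrayInputStream(content);
    }

    // pre-Servlet-3.1, read() is the only method that must be implemented
    public int read()
    {
        return bytes.read();
    }
}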

The response recorder was a bit trickier, because I needed to save things like status codes and content types. This implementation probably wouldn't work for all clients (for example, it ignores any response headers), but since Gliffy is an OpenLaszlo app, and OpenLaszlo has almost no view into HTTP, this worked well for our purposes. Again, I had to wrap the OutputStream/Writer so I could record what was being sent back.

private class RecordingServletResponse extends HttpServletResponseWrapper
{
    public RecordingServletResponse(HttpServletResponse r)
    {
        super(r);
    }

    int statusCode = HttpServletResponse.SC_OK; // assume OK unless the servlet says otherwise
    StringWriter stringWriter = null;
    ByteArrayOutputStream byteOutputStream = null;
    String contentType = null;

    private PrintWriter writer = null;
    private ServletOutputStream outputStream = null;

    public ServletOutputStream getOutputStream()
        throws IOException
    {
        if (outputStream == null)
        {
            byteOutputStream = new ByteArrayOutputStream();
            outputStream = new RecordingServletOutputStream(super.getOutputStream(),new PrintStream(byteOutputStream));
        }
        return outputStream;
    }

    public PrintWriter getWriter()
        throws IOException
    {
        if (writer == null)
        {
            stringWriter = new StringWriter();
            writer = new RecordingPrintWriter(super.getWriter(),new PrintWriter(stringWriter));
        }
        return writer;
    }

    public void sendError(int sc)
        throws IOException
    {
        statusCode = sc;
        super.sendError(sc);
    }

    public void sendError(int sc, String msg)
        throws IOException
    {
        statusCode = sc;
        super.sendError(sc,msg);
    }

    public void setStatus(int sc)
    {
        statusCode = sc;
        super.setStatus(sc);
    }

    public void setContentType(String type)
    {
        contentType = type;
        super.setContentType(type);
    }
}
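
The RecordingServletOutputStream and RecordingPrintWriter wrappers aren't shown either; they just have to tee whatever the servlet writes into the recording copy. A sketch of my own (a real version would also need to cover println, whose line separator goes straight to the wrapped writer):

private static class RecordingServletOutputStream extends javax.servlet.ServletOutputStream
{
    private final javax.servlet.ServletOutputStream real;
    private final java.io.PrintStream copy;

    public RecordingServletOutputStream(javax.servlet.ServletOutputStream real,
                                        java.io.PrintStream copy)
    {
        this.real = real;
        this.copy = copy;
    }

    // every byte goes to the client and to the in-memory copy
    public void write(int b) throws java.io.IOException
    {
        real.write(b);
        copy.write(b);
    }

    public void flush() throws java.io.IOException { real.flush(); copy.flush(); }
}

private static class RecordingPrintWriter extends java.io.PrintWriter
{
    private final java.io.PrintWriter copy;

    public RecordingPrintWriter(java.io.PrintWriter real, java.io.PrintWriter copy)
    {
        super(real);
        this.copy = copy;
    }

    public void write(int c)                  { super.write(c);       copy.write(c); }
    public void write(char[] b, int o, int l) { super.write(b, o, l); copy.write(b, o, l); }
    public void write(String s, int o, int l) { super.write(s, o, l); copy.write(s, o, l); }
}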

The filter then needs to create these wrappers and inject them into the actual servlet call:

public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
    throws IOException, ServletException
{
    RecordingServletRequest recordingRequest = 
      new RecordingServletRequest((HttpServletRequest)request);
    RecordingServletResponse recordingResponse = 
      new RecordingServletResponse((HttpServletResponse)response);

    chain.doFilter(recordingRequest,recordingResponse);

    // ...examine the wrappers and record the test here (see below)...
}

After the call to doFilter, we can then examine the proxy request/response and record the test. I'll spare you 20 lines of setXXX methods. I created a Java bean class and used XStream to serialize it. I then created another class that runs as a TestNG test to deserialize these files and make the same requests. I record the response and see if it matches.
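
The bean and its serialization aren't shown, but the idea is roughly this; RecordedTest and its fields are hypothetical names, not the actual Gliffy classes, and null checks and the byte-stream case are omitted:

// after chain.doFilter returns, snapshot everything of interest into a bean
RecordedTest test = new RecordedTest();
test.setPath(recordingRequest.getRequestURI());
test.setMethod(recordingRequest.getMethod());
test.setRequestBody(recordingRequest.readerContent);
test.setStatusCode(recordingResponse.statusCode);
test.setContentType(recordingResponse.contentType);
test.setResponseBody(recordingResponse.stringWriter.toString());

// serialize it to disk; XStream turns the bean into XML with one call
String xml = new com.thoughtworks.xstream.XStream().toXML(test);
java.io.FileWriter out = new java.io.FileWriter("recorded-test-" + System.currentTimeMillis() + ".xml");
out.write(xml);
out.close();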

Running the Tests

There were a few problems with this approach:

  • The tests required certain test data to exist
  • Each test potentially modifies the database, meaning the tests have to be run in the order they were created
  • The test results contained temporal data that, while irrelevant to the tests "passing", complicated exact-match comparisons of results

TestNG (and JUnit) are not really designed for this; they are more for proper unit testing, where each test can be run independent of the others and the results compared. While there are facilities for setting up test data and cleaning up, the idea of resetting the database before each of the 300 tests I would record was not appealing. Faking/mocking the database was not an option; I was creating these tests specifically to make sure my changes to the database layer were not causing regressions. I needed to test against a real database.

I ultimately decided to group my tests into logical areas, and to ensure that: a) tests were run in a predictable order, and b) the first test of a group was run against a known dataset. I created a small but useful test dataset and a TestNG test that would do both (a) and (b). It wasn't pretty, but it worked. This clearly isn't the way a unit test framework should be used, and I would call these sorts of tests functional rather than unit tests. But since our CI system requires JUnit test results as output, and the JUnit format isn't documented, I might as well use TestNG to handle that for me.

The last problem was making accurate comparisons of results. I did not want to have to parse the XML returned by the server. I settled on some regular expressions that stripped out temporal and transient data not relevant to the test. Both the expected and received content were run through this regexp filter and those results were compared. Parsing the XML might result in better failure messages (right now I have to do a visual diff, which is a pain), but I wasn't convinced that the existing XML diff tools were that useful.
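
The post doesn't include the actual expressions, but the shape of the idea is simple enough; the element names below are made up for illustration:

// blank out data that legitimately differs from run to run before comparing
private String normalize(String xml)
{
    return xml.replaceAll("<create-date>[^<]*</create-date>", "<create-date/>")
              .replaceAll("id=\"\\d+\"", "id=\"?\"");
}

// both sides get the same treatment, so only meaningful differences fail the test
boolean matches = normalize(expectedResponse).equals(normalize(actualResponse));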

Results

Overall, it worked out great. I was able to completely overhaul the database layer, and the Gliffy client was none the wiser. We were even able to use these tests to remove our dependence on Struts, simplifying the application's deployment (we weren't using many features of Struts anyway). The final validation of these tests actually came recently, when we realized a join table needed to be exposed to our server-side code. This was a major change to two key data containers, and the recorded tests were crucial to finding the bugs it introduced.

So, if you don't have the luxury of automated tests, you can always create them. I did a similar thing with EJB3 using the Interceptors concept.

Daily backups are gonna save my butt

June 01, 2008

I used to never back up. Well, I'd throw some iTunes songs on a DVD every once in a while, but that's about it. Then I started doing some pro audio work for friends' bands and figured it was time to get serious. My home computer, though, never really got the treatment. When I upgraded to Leopard, I reformatted an external drive for Time Machine, but I wanted to have something better, as a double-check on Time Machine. So I did what I do for pro audio, which is to get a FireWire drive the exact size of my main drive and use SuperDuper! to mirror the drive every night, leaving the resulting drive bootable. A plain rsync won't work, for some reason; the drive gets duped, but isn't bootable.

So, last night, I made the mistake of putting an old CD-R with a sticker on it into my slot-loading iMac (my main computer). Now, Apple needs to abandon this god-forsaken idiotic design decision that is literally designed to play Russian roulette every time you stick a disc in. The one thing I will give Windows over Mac: when you put a disc in a Windows box, you have a 100% chance of getting it back out (and a 99% chance that doing so will result in a bootable computer). Well, this disc wouldn't mount and wouldn't eject. A reboot of my computer resulted in...nothing. Gray screen forever. FUCK YOU APPLE. Would it really have been so bad to have a button? A tray that sticks out? Ugh. So now I have to take my iMac to the Apple store to pray to Jobs that they can get the CD out.

Meanwhile, I've got work to do. So I grab my MacBook Pro that I use for pro audio, plug in my trusty FireWire mirror drive, boot, and...voilà! It's like I never left. Thankfully, the only thing I've done on my computer today was check email and surf the web, and since those things are, you know, still on the Internet, I'm back. I don't know how I'm going to sync things back up when I get my box back, but I am thanking GOD right now that I do nightly backups (and that I have another computer to fall back on). I guess when working from home, it's good to have a spare.

Using ThreadLocal and Servlet Filters to cleanly access a JPA EntityManager

May 14, 2008

My current project is slowly moving from JDBC-based database interaction to JPA-based. Following good sense, I’m trying to change things as little as possible. One of those things is that we are deploying under Tomcat and not under a full-blown J2EE container. This means that EJB3 is out. After my post regarding this configuration, I quickly realized that my code started to get littered with:

EntityManager em = null;
try
{
  em = EntityManagerUtil.getEntityManager();
  // do stuff with entity manager
}
finally
{
  try {
    if (em != null) em.close();
  } catch (Throwable t) {
    logger.error("While closing an EntityManager",t);
  }
}

Pretty ugly, and seriously annoying to have to add 13 lines of code to any method that needs to interact with the database. The Hibernate docs suggest using ThreadLocal variables to provide access to the EntityManager throughout the life of a request (which wouldn’t really work for a Swing app, but since this is servlet-based, it should work fine). The ThreadLocal javadocs contain possibly the most annoying example ever, and I didn’t follow how to use it.

Anyway, I finally got around to it, and solved the problem of closing the EntityManager as well, by using a Servlet Filter. I guess this type of thing would normally be solvable by Spring or Guice, but I didn't want to drag all of that into the application to refactor this one thing; I would've easily spent the rest of the day dealing with XML configuration and deployment.

The solution was quite simple:

/** Provides access to the entity manager.  */
public class EntityManagerUtil 
{
    public static final ThreadLocal<EntityManager> 
        ENTITY_MANAGERS = new ThreadLocal<EntityManager>();

    /** Returns the EntityManager assigned to the current thread (i.e. the current request) */
    public static EntityManager getEntityManager()
    {
        return ENTITY_MANAGERS.get();
    }
}
public class EntityManagerFilter implements Filter
{
    private Logger itsLogger = Logger.getLogger(getClass().getName());
    private static EntityManagerFactory theEntityManagerFactory = null;

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
        throws IOException, ServletException
    {
        EntityManager em = null;
        try
        {
            em = theEntityManagerFactory.createEntityManager();
            EntityManagerUtil.ENTITY_MANAGERS.set(em);
            chain.doFilter(request,response);
        }
        finally
        {
            // clear the ThreadLocal even if the servlet threw, so a pooled thread
            // doesn't carry a stale EntityManager into its next request
            EntityManagerUtil.ENTITY_MANAGERS.remove();
            try 
            { 
                if (em != null) 
                    em.close(); 
            }
            catch (Throwable t) { 
                itsLogger.error("While closing an EntityManager",t); 
            }
        }
    }
    public void init(FilterConfig config)
    {
        destroy();
        theEntityManagerFactory = 
          Persistence.createEntityManagerFactory("gliffy");
    }
    public void destroy()
    {
        if (theEntityManagerFactory != null)
            theEntityManagerFactory.close();
    }
}

So, when the web app gets deployed, the entity manager factory is created (and closed when the web app is removed). Each thread that calls EntityManagerUtil to get an EntityManager gets a fresh one that persists for the duration of the request. When the request is completed, the entity manager is closed automatically.
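
The payoff is in the data-access code itself, which shrinks to something like this (the query is just an illustration):

// no try/finally, no close(): the filter owns the EntityManager's lifecycle
EntityManager em = EntityManagerUtil.getEntityManager();
List results = em.createQuery("from Account where name = :name")
                 .setParameter("name", accountName)
                 .getResultList();
// do stuff with the results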

Time Machine almost saved me, but git won out in the end

May 09, 2008

So, I'm working on a project that's using Subversion for version control. My network connection isn't great, plus Subversion is slow, plus git is (so far) pretty awesomely awesome. The way to interact with an SVN repository is via git-svn, which I talked about setting up previously. Everything's been going great; however, I don't frequently commit to Subversion. This week, we started setting up continuous integration for my work, so I did a git-svn dcommit, committing two days' worth of changes. I had forgotten that I had made so many changes (including adding Hibernate support). I misread the commit messages and thought something bad was happening. Control-C. git log. HEAD is recent. Last commit was....yesterday. Oh. Fuck.

I figure git-svn borked something, so I git-reset --hard. No effect. I'm starting to panic now; almost two days of work lost is not something I'm looking forward to. I hastily go into Time Machine and get the previous hour's backup. But I just hate that solution. I have no idea what happened, and my trust in Git (or my ability to use it) has to be restored.

After IM'ing with a co-worker, I got to the bottom of it. It turns out that I wasn't paying attention to how git-svn works. What it does when you do a rebase or dcommit (which implicitly does a rebase) is to first undo all your changes since your last rebase/dcommit and get the changes made to the SVN repository (it even says as much in the first line of the output). It then "replays" your commits to make sure there are no conflicts. By hitting Control-C in the middle of that, I manually caused the same situation that would happen if there were conflicts: Git stops, tells you to resolve conflicts, and asks you to git-rebase --continue. If I had just git-rebase --continue'ed, I would have been fine. Since I did a hard reset, I figured I was fucked.

Enter the log. .git/logs/HEAD contained information about all activity, including my missing commits. I grabbed the version numbers (which, in Git, are hashes of the entire repository), did a git-reset --hard big.honkin.git.hash.version and voilà! Everything's back to how it was (the command ran instantaneously, to boot).

Using Java Persistence with Tomcat and no EJBs

May 08, 2008

The project I'm working on is deployed under Tomcat and isn't using EJBs. The codebase is using JDBC for database access and I'm looking into using some O/R mapping. Hibernate is great, but Java Persistence is more desirable, as it's more of a standard. Getting it to work with EJB3 is dead simple. Getting it to work without EJB was a bit more problematic. The entire application is being deployed as a WAR file. As such, the JPA configuration artifacts weren't getting picked up. Setting aside how absolutely horrendous Java Enterprise configuration is, here's what ended up working for me:
  • Create a persistence.xml file as per standard documentation, leaving out the jta-data-source stanza (I could not figure out how to get Hibernate/JPA to find my configured data source)
  • Create your hibernate.cfg.xml, being sure to include JDBC connection info. This will result in Hibernate managing connections for you, which is fine
  • Create a persistence jar containing:
    • Hibernate config at root
    • persistence.xml in META-INF
    • All classes with JPA annotations in root (obviously in their java package/directory structure)
  • This goes into WEB-INF/lib of the war file (being careful to omit the JPA-annotated classes from WEB-INF/classes)

The first two steps took a while to get to and aren't super clear from the documentation. To use JPA, this (non-production quality) code works:
EntityManagerFactory emf = 
    Persistence.createEntityManagerFactory("name used in persistence.xml");
EntityManager em = emf.createEntityManager(); 

Query query = em.createQuery("from Account where name = :name");
query.setParameter("name",itsAccountName);
List results = query.getResultList();

// do stuff with your results


em.close();
emf.close();
The EntityManagerFactory is supposed to survive the life of the application and not be created/destroyed on every request. I also believe there might be some transaction issues with this, but I can't figure out from the documentation what they are and if they are a big deal for a single-database application. Update: Turns out, it's not quite this simple. Since this configuration is running outside an EJB container, and given Bug #2382, you can query all day long, but you cannot persist. To solve this, you must work in a transaction, like so:
EntityManagerFactory emf = 
    Persistence.createEntityManagerFactory("name used in persistence.xml");
EntityManager em = emf.createEntityManager(); 
EntityTransaction tx = em.getTransaction();

tx.begin();
Query query = em.createQuery("from Account where name = :name");
query.setParameter("name",itsAccountName);
List results = query.getResultList();

// modify your results somehow via persist() or merge()

tx.commit();
em.close();
emf.close();
Again, this is not production code as no error handling has been done at all, but you get the point.
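
For completeness, a version with minimal error handling might look like the following sketch; the main additions are the rollback and the finally block (and the EntityManagerFactory should still be created once and reused, as noted above):

EntityManager em = emf.createEntityManager();
EntityTransaction tx = em.getTransaction();
try
{
    tx.begin();
    // ... queries, persist(), merge() ...
    tx.commit();
}
catch (RuntimeException e)
{
    if (tx.isActive())
        tx.rollback();   // don't leave the transaction dangling
    throw e;
}
finally
{
    em.close();
}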

Git and SVN: connecting git branches to svn branches

April 28, 2008

I'm currently working on a project where Subversion is the CM system of choice. I'd like to use git, as it's faster and doesn't require so much network access. Plus, I'm hoping that when it comes time to merge, I can simplify the entire process by using git's allegedly superior merging technique. At any rate, I've got a branch on SVN to work on, and I want to track both that branch and the entire SVN tree.

Saturday morning, I did a git-svn init from their repository. Today, after lunch, it finished. After doing a git-gc to clean up the checkout, it wasn't clear how to connect branches. The following is what I did (assume my Subversion branch is branches/FOO):
git-checkout -b local-trunk trunk
git branch local-foo FOO
The first command creates a new branch called "local-trunk" starting at "trunk" (which is the remote branch mapping to the Subversion main trunk). The second command creates a new branch called "local-foo", which is rooted at the remote branch "FOO". I have no clue why I couldn't do the same thing twice, as both commands seem to do the same thing (the first switches to the branch "local-trunk" after creating it). But this is what worked for me.

Now, to develop, I git checkout local-foo and commit all day long. A git-svn dcommit will send my changes to Subversion on the FOO branch. I can update the trunk via git checkout local-trunk and git-svn rebase. My hope is that I can merge from the trunk to my branch periodically and then, when my code is merged to the trunk, things will be pretty much done and ready to go. We'll see.

On a side note, the git repository, which contains every revision of every file in the Subversion repository, is 586,696 bytes. The Subversion checkout of just the FOO branch is 1,242,636 bytes; over double the size, and there's still not enough info in that checkout to do a log or diff between versions.

REST Security: Signing requests with secret key, but does it work?

April 21, 2008

Both Amazon Web Services and the Flickr Services provide REST APIs to their services. I’m currently working on developing such a service, and noticed that both use signatures based on a shared secret to provide security (basically using a Hash Message Authentication Code).

It works as follows:

  1. Applications receive a shared secret known only to them and the service provider.
  2. A request is constructed (either a URL or a query string)
  3. A digest/hash is created using the shared secret, based on the request (for Flickr, the parameter keys and values are assembled in a certain way, so that Flickr can easily generate the same string)
  4. The digest is included in the request
  5. The service provider, using the shared secret, creates a digest/hash on the request it receives
  6. If the service provider’s signature matches the one included in the request, the request is serviced
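
As an illustration of steps 3 through 5, here's roughly what the signing side can look like using Java's built-in HMAC support. This is a generic sketch, not the exact algorithm Flickr or Amazon uses, and how the request is canonicalized into a single string is service-specific:

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class RequestSigner
{
    /** Signs an already-canonicalized request string with the shared secret. */
    public static String sign(String canonicalRequest, String sharedSecret) throws Exception
    {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(sharedSecret.getBytes("UTF-8"), "HmacSHA1"));
        byte[] digest = mac.doFinal(canonicalRequest.getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest)
            hex.append(String.format("%02x", b));   // hex-encode the HMAC
        return hex.toString();
    }
}

// e.g. append "&signature=" + RequestSigner.sign("image_id=45&type=jpg", secret) to the request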

It’s actually quite simple, and for one-time requests, it is effective. The problem, however, is that anyone intercepting the request can simply make it again themselves, unless some other state is shared between the client and the service provider. Consider a request for an image. The unsigned request might look like:

/api/images?image_id=45&type=jpg

The signed request, would look like so:

/api/images?image_id=45&type=jpg&signature=34729347298473

So, anyone can then take that URL and request the resource. They don’t need to know the shared secret, or the signature algorithm. This is a bit of a problem. One of the advantages of REST is that URLs that request resources are static and can be cached (much as WWW resources are). So, if I wish to protect the given URL, how can I do so?

HTTP Authentication

The usual answer is HTTP Authentication; the service provider protects the resource, and the client must first log in. Login can be done programmatically, and this basically amounts to sending a second shared secret with the request, one that cannot be easily intercepted. HTTP Auth has its issues, however, and might not be feasible in every context.

Another way to address this is to provide an additional piece of data that makes each request unique and usable only once. To do so requires state to be saved on the client and the server.

Negotiated One-time Token

Authentication can be avoided by using the shared secret to establish a token, usable for one request of the given resource. It would work like this:

  1. Client requests a token for a given resource
  2. Service Provider creates a token (via some uuid algorithm ensuring no repeats) and associates it with the resource
  3. Client creates a second request, as above, for the resource, including the token in the request
  4. Service Provider checks not just for a valid signature, but also that the provided token is associated with the given resource
  5. If so, the token is retired, and the resource data is returned

Here, the URL constructed in step 3 can be used only once. Anyone intercepting the request can’t make it again, without constructing a new one, which they would be unable to do without the shared secret. Further, this doesn’t preclude caching. The main issue here is that since two requests are required, simultaneous access to one resource could result in false errors: if Client A acquires a token, and Client B requests one before Client A uses the token, Client A’s token could be squashed, resulting in an error when he makes his request. The service provider can alleviate this by allowing the issuance of multiple active tokens per resource.

Timestamp

A disadvantage to the One-Time Token method is that it requires two requests of the service provider for every actual request (one to get the token and one to request the resource). A way around that is to include a timestamp in the request. This would work as follows:

  1. Client creates request, including the current time. This request is signed as per above procedure
  2. Service provider validates the request and compares its time with the given timestamp.
  3. If the difference in the service provider’s time and the client’s provided time is within some tolerance, the request is serviced

This obviously requires the two clocks to be vaguely in sync. It also allows the resource to be requested by anyone within the timespan of the tolerance. But it does save the client a second request.
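
The server-side check itself is trivial; a sketch, with an arbitrary tolerance chosen purely for illustration:

// reject requests whose claimed timestamp is too far from the server's clock
private static final long TOLERANCE_MILLIS = 5 * 60 * 1000; // e.g. five minutes

public static boolean withinTolerance(long clientTimestampMillis)
{
    return Math.abs(System.currentTimeMillis() - clientTimestampMillis) <= TOLERANCE_MILLIS;
}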

Self-created One-time Token

This is an amalgam of the Timestamp solution and the Negotiated One-time Token solution. Here, the client creates its own token, as a simple integer of increasing value. The server maintains the last requested value and accepts only requests with a higher number:

  1. Client creates request, using a global long-lived number
  2. Client signs requests and sends it to the service provider
  3. Service provider validates the signature and compares the provided numeric token with the one last used (the tokens can be globally scoped, or scoped for a given resource)
  4. If the provided numeric token is greater than the previous, the request is serviced
  5. The Client increments his numeric token for next time

As with the Timestamp solution, only one request is required. As with the negotiated one-time token solution, the URL can never be used twice. The main issue here is if the client forgets its numeric token. This could be addressed with an additional call to re-establish the token, made only when the Client has determined it no longer knows the last used value.

Unfortunately, this is much more susceptible to race conditions than the Negotiated one-time token. Since the service provider doesn’t know what tokens to expect (only that they should be greater than the last requested one), the client has to ensure that the “create request, submit request, receive response, update local numeric token” cycle is atomic. That is not straightforward.

Update: Got another idea from a co-worker.

Session Token

When a user accesses the system that uses the REST API, they get issued a token (via the REST API). This token is just like a session token, with an inactivity timeout and so forth. The token can also be manually invalidated via the API, so that when a user logs out or completes some logical task, it is no longer usable.

This suffers none of the problems of the other solutions, though it isn’t the most secure. However, the security problem it does have (someone using a valid URL before the session times out) is fairly minor, and the tradeoff (one request per actual request, and no race conditions) probably makes it the best way to go.