Why Github Can Open-Source Their Libraries

November 01, 2009

One thing I love about Github is that they open-source a lot of the internal tools that power the site. What's interesting is that, unlike SourceForge, they open-source little bits and pieces: tiny libraries that each do one specific thing. These things are supremely useful (I use Grit and Jekyll quite often).

This is a huge benefit to them: their products become higher-quality through outside contributions, and their talent pool grows because of what they give back to the community, which positions them as a technical leader and social force among developers. I've often wondered why more companies don't do this, and what's really involved.

There are three main hurdles to overcome in order to do this:

  • Usefulness - do you have code that someone will find useful?
  • Legalish - are you comfortable giving away some part of your company's intellectual property?
  • Technical - does your technical infrastructure support extraction, collaboration, and re-integration?


On usefulness, I think comparing Github to SourceForge makes the point very clear. While SourceForge does more than, say, BERT, it's one huge all-or-nothing proposition; Github has extracted small parts of their infrastructure that are useful outside the realm of "software project hosting", resulting in many useful, fine-grained libraries.


On the legal side, the issue is essentially to determine whether the gains achieved by open-sourcing some of your code are greater than the competitive advantage lost by doing so. Again, the ability to extract small, focused, and useful pieces of functionality is key. Github isn't open-sourcing their entire infrastructure; just the parts that are really useful and not essential to their competitive advantage (though one may argue that their IP has nothing to do with their success).


The technical hurdle is where things get interesting. Extracting a small, useful piece of technology from your application without revealing any trade secrets or other IP can be a challenging task. Add to that the infrastructure needed to manage and incorporate contributions, and your technical setup (not the lawyers!) could be the main barrier to interacting with the community.

My company struggles with this daily, as our tools were just not designed to make extraction and management of dependencies easy. Our problem is almost entirely based on tooling and technology choice (which I'll get into in a future post).
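To make the extraction problem a little more concrete, here is a minimal sketch in Python of one way to split generic code from proprietary code. It is not how Github structures anything; the names `top_n` and `proprietary_score`, and the weighting formula, are all hypothetical. The idea is that the generic piece takes the business-specific rule as a parameter, so the generic piece could be released while the rule stays in the private application.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


# --- Candidate for open-sourcing: generic, knows nothing about the host app ---

def top_n(items: Iterable[T], score: Callable[[T], float], n: int) -> List[T]:
    """Return the n highest-scoring items, best first."""
    return sorted(items, key=score, reverse=True)[:n]


# --- Stays private: the business-specific rule, injected at the call site ---

def proprietary_score(project: dict) -> float:
    # Hypothetical weighting, standing in for whatever a company considers its IP.
    return 2.0 * project["watchers"] + project["forks"]


if __name__ == "__main__":
    projects = [
        {"name": "a", "watchers": 10, "forks": 1},
        {"name": "b", "watchers": 3, "forks": 20},
        {"name": "c", "watchers": 8, "forks": 2},
    ]
    best = top_n(projects, proprietary_score, 2)
    print([p["name"] for p in best])
```

The design choice here is simply dependency injection: because `top_n` never imports anything from the application, it can live in its own repository, accept outside contributions, and be pulled back in as an ordinary dependency.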


The advantages of open-sourcing some of your code are obvious: you can improve the quality of your codebase while improving your standing in the community (which enhances your ability to attract top talent). You just need to make sure your technical infrastructure is built for it, and that the lines of communication between the development staff and the legal staff are clear and functional.

That Github routinely open-sources bits of their code speaks volumes about the choices they've made as an organization, as well as their technical prowess.