Reading a File in Scala vs. Java vs. Ruby
January 26, 2010
Examining the code needed to read a file line by line is a common way to examine the hoops a programming language makes you jump through. While Perl certainly has some one-liners for this, let's start with Ruby, which presents an elegant and clear way of doing it:
It really doesn't get any clearer than that.
Here's the canonical Java way of doing it, complete with plenty of places to introduce bugs:
Yech. The need to call readLine() twice kinda sucks. We could use a do-while, but that requires a second line != null check. Personally, I like to forget the second readLine() and wonder why my code runs forever :) That being said, this was extremely easy to figure out, even the very first time I did it in 1998. The class names are obvious, and the documentation is excellent.
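For the record, the double-readLine() pattern described above looks roughly like this. Treat it as a sketch rather than a definitive listing: the class name ReadLines and the readLines helper are made up for illustration, and it sticks to the plain BufferedReader and FileReader classes from java.io.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ReadLines {
    // Reads every line of the file at `path` into a list. The loop is
    // primed with one readLine() call and advanced by a second one at
    // the bottom of the loop body.
    static List<String> readLines(String path) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new FileReader(path));
        try {
            String line = reader.readLine();   // first readLine()
            while (line != null) {             // null signals end of file
                lines.add(line);               // line ending is already stripped
                line = reader.readLine();      // second readLine(); omit it and the loop never ends
            }
        } finally {
            reader.close();                    // close the reader even if reading throws
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadLines <file>");
            return;
        }
        for (String line : readLines(args[0])) {
            System.out.println(line);
        }
    }
}
```

Delete the second readLine() and you get exactly the runs-forever bug mentioned above, since line never changes once the loop is entered.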
Scala to the rescue, right?
This was a slight pain to figure out. I looked in scala.io and, of the few classes that were there (including a curiously named BytePickle), it appeared as though Source was the class to use. Of course, there's no easy way to create one from the constructor, and the scaladoc doesn't just say "Dude, look at the Source object". Once I looked through the Source object's scaladoc, the solution presented itself.
Of course, unlike every other line-traversing library in the known universe, Source leaves the line endings on. This is thankfully fixed in 2.8 (by which I mean 2.8 breaks 2.7's implementation, which is a strange thing for a point release to do). The real question is: "Is this how I'm supposed to read files in Scala?". With a class called Source?! One with methods like reportError and reportWarning. I guess this is only for writing the Scala compiler? If so, scala.io seems an odd place to put this.
So, my answer is "No, this cannot be how to canonically read files in Scala". Since the Java way kinda, well, sucks, what alternatives are there? There's scalax.io, which seems to implement this as a class called, curiously, FileExtras. I'm not sure if this code is actively maintained, but it's documented in classic Scala style: terse and full of loaded terms like "nonstrict". Nevertheless, there seems to be some code here to easily read a file "the easy way" (despite some distracting names).
This points out a big difference between "Scala the language" and "Scala the library". Scala the language is very interesting and has a lot of potential. Scala the library is schizophrenic at best; it's not sure if it wants to be OO, functional, or what. The documentation ranges from sparse to absent, and the overall designs of the classes and packages range from sublime to baffling. Years different from Java 1.1.