ForComprehensions
The Gist
Sequence Comprehensions describes Scala’s for “loop”
My Interpretation
You can read the Wikipedia entry on list comprehensions if you like, but what we’re talking about here is the many ways in which Scala allows you to traverse and process a list.
We’ve seen some of this in ScalaFunctions and XmlLiterals, but these deal with more specific means of traversing a list (e.g. to find a specific element or map elements to other elements). These are all specialized versions of the so-called “for-comprehension”, which is a fancy name for a for loop.
Suppose we have a list of U.S. states that have been tagged with their general location in the U.S. (e.g. “east” vs. “midwest” vs. “south”). Now suppose we wish to get a list of all the states that aren’t in the east, plus Washington, DC, which, for some reason, we don’t consider east coast.
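A minimal sketch of such a comprehension. The State case class and the sample data are assumptions made for illustration, and plain strings stand in for symbol literals such as 'east, which newer Scala versions no longer accept:

```scala
// Hypothetical model: the State class and the sample data are assumptions.
case class State(code: String, location: String)

object NonEastStates {
  val states = List(
    State("NY", "east"),
    State("IL", "midwest"),
    State("DC", "east"),
    State("TX", "south")
  )

  // Generator (state <- states), guard (if ...), then yield to build a new list.
  val nonEast: List[State] =
    for (state <- states if state.location != "east" || state.code == "DC") yield state

  def main(args: Array[String]): Unit =
    println(nonEast.map(_.code)) // prints List(IL, DC, TX)
}
```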
The expression state <- states is a generator, which assigns each value from states to the value state in turn. Each time this happens, the “guard condition” that starts with if is evaluated. If it evaluates to true (or if there is no guard condition), we “mark” this item for iteration. Once the collection has been processed, we iterate over each “marked” item, executing the body of the for-comprehension (this is the part that comes after the final paren). In this case, we use the yield keyword, which essentially means “yield this value back to the list we are creating”. Yes, we are creating a list here. The end result is a list of the states that matched our guard condition.
Note the subtle difference here; in Java we would iterate over the entire collection; in Scala we are building up a collection over which to iterate via the guard condition and then executing the body of the for-comprehension. This means that, in general, conditions inside the for-comprehension body cannot affect conditions in the guard condition:
var done = false
for (state <- states if !done) {
  if (state.code == "DC") done = true
}
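When we hit DC, we will not stop; this is because the guard condition has already been evaluated for every item in the list before we got to the body. Yikes (see this mind-blowing article on the crazy subtleties of the for loop). Note that this is how guards behaved when they were translated to a strict filter call; since Scala 2.8 they are translated to withFilter, which tests each item just before the body runs on it, so on current versions the body is skipped for every item after DC.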
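One way to check what a given Scala version does is to record the order in which the guard and the body run. The sketch below assumes Scala 2.8 or later, where the guard is checked lazily per item; the state codes and the trace buffer are made up for illustration:

```scala
import scala.collection.mutable.ListBuffer

object GuardOrder {
  // Hypothetical sample data: just the state codes.
  val codes = List("NY", "IL", "DC", "TX")

  // Records every guard check and body execution in the order they happen.
  val trace = ListBuffer.empty[String]

  def run(): List[String] = {
    trace.clear()
    var done = false
    // The guard is a block so that it can log itself before testing `done`.
    for (code <- codes if { trace += s"guard($code)"; !done }) {
      trace += s"body($code)"
      if (code == "DC") done = true
    }
    trace.toList
  }

  def main(args: Array[String]): Unit = run().foreach(println)
}
```

With a lazily checked guard, the trace alternates guard and body entries and ends with guard(TX) with no matching body; with a strict filter, all four guard entries would come first.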
If we don’t want to actually yield a list, we can omit the yield keyword altogether:
for( state <- states if state.location != 'east || state.code == "DC") println(state)
Nested Loops
Instead of nesting for loops, we can simply add generators to the comprehension. Suppose we wish to find pairs of states that could be “sister” states: states that aren’t in the same geographic area.
val sisters = for( state1 <- states; state2 <- states if state1.location != state2.location) yield (state1,state2)
Here, we loop through each pair of states and yield a tuple for every pair whose locations differ.
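For comparison, here is a sketch of the same pairing written both ways: as one comprehension with two generators, and as explicitly nested loops that fill a buffer. The State case class and the sample data are assumptions for illustration:

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical model and sample data, for illustration only.
case class State(code: String, location: String)

object SisterStates {
  val states = List(State("NY", "east"), State("DC", "east"), State("IL", "midwest"))

  // Two generators in one comprehension.
  val sisters: List[(State, State)] =
    for (state1 <- states; state2 <- states if state1.location != state2.location) yield (state1, state2)

  // The same traversal with explicitly nested loops and a mutable buffer.
  def nested: List[(State, State)] = {
    val buf = ListBuffer.empty[(State, State)]
    for (state1 <- states) {
      for (state2 <- states) {
        if (state1.location != state2.location) buf += ((state1, state2))
      }
    }
    buf.toList
  }

  def main(args: Array[String]): Unit =
    println(sisters.map { case (a, b) => a.code + "-" + b.code })
}
```

Note that every pair shows up twice, once in each order, because both generators walk the full list.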
What about map, filter, etc.?
In reality, for comprehensions are translated by the compiler into calls to map, filter, and flatMap (plus foreach when there is no yield; since Scala 2.8, guards use withFilter rather than filter). Section 10.3 of Scala By Example details this.
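As a rough sketch of that translation, the hand-written chain below should produce the same result as the comprehension above it. It assumes Scala 2.8 or later (withFilter for the guard); the integer lists are made-up sample data, and the compiler's actual output differs in minor details:

```scala
object Desugared {
  // Hypothetical sample data.
  val xs = List(1, 2, 3)
  val ys = List(10, 20)

  // Comprehension with two generators and a guard.
  val viaFor: List[(Int, Int)] =
    for (x <- xs; y <- ys if x + y != 12) yield (x, y)

  // Roughly the translated form: flatMap for the outer generator,
  // withFilter for the guard, map for the final yield.
  val viaCalls: List[(Int, Int)] =
    xs.flatMap(x => ys.withFilter(y => x + y != 12).map(y => (x, y)))

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaCalls)
  }
}
```

Without yield, the last step becomes foreach instead of map, which is why such a loop returns nothing.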
My Thoughts on this Feature
I thought I understood this feature until I read this article. Now, I’m a bit lost, mostly as to why it works this way. Perhaps the name for is misleading; in most programming languages, for means “do something for everything in the list”. I guess the definition of the “list” is what’s unclear. I find it hard to know what a particular for statement in Scala will actually do. That doesn’t seem good.
I suppose once I’ve internalized how it works, it will seem more obvious.