The Law of Demeter Creates More Problems Than It Solves
January 22, 2020
Most developers, when invoking the “Law of Demeter” or when pointing out a “Demeter Violation”, do so when a line of code has more than one dot: person.address.country.code. Like the near-pointless SOLID Principles, Demeter, too, suffers from making a vague claim that drives developers to myopically unhelpful behavior.
While writing SOLID is not Solid, I found the backstory and history of the principles really interesting. They were far flimsier than I had expected, and much more vague in their prescription. The problem was in their couching as “principles” and the overcomplex code that resulted from their oversimplification. Demeter is no different. It aims to help us manage coupling between classes, but when blindly applied to core classes and data structures, it leads to convoluted, over-de-coupled code that obscures behavior.
What is this Law of Demeter?
Update Jan 24, 2020: My former colleague Glenn Vanderburg pointed me to what he believes is the source of the “Law of Demeter”, which looks like a fax of the IEEE Software magazine in which it appears! It’s on the Universitatea Politehnica Timisoara’s website.
It does specifically mention object-oriented programming, and it states a few interesting things. First, it mentions pretty explicitly that they have no actual proof this law does what it says it does (maybe then don’t call it a law? I dunno. That’s just me). Second, it presents a much more elaborate and nuanced definition than the page linked below. The definition of terms alone is almost a page long and somewhat dense.
Suffice it to say, I stand even more firm that this should not be called a “Law” and that the way most programmers understand it, by counting dots, is absolutely wrong. This paper is hard to find and pretty hard to read (due to both its text and its presentation). I would be surprised if anyone invoking Demeter in a code review has read and understood it.
End of Update
This page on Northeastern’s website summarizes the Law as stated in the paper above:
Each unit should have only limited knowledge about other units: only units “closely” related to the current unit.
The page then attempts to define “closely related”, which I will attempt to restate without the academic legalese:
- A unit is some method meth of a class Clazz.
- Closely related units are classes that are:
  - other methods of Clazz.
  - passed into meth as arguments.
  - returned by other methods of Clazz.
  - any instance variables of Clazz.
Anything else should not be used by meth. So for example, if meth takes an argument arg, it’s OK to call a method other_meth on arg (arg.other_meth), but it’s not OK to call a method on that (arg.other_meth.yet_another_meth).
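As a rough illustration of that rule, here is a minimal sketch. Clazz, meth, other_meth, and yet_another_meth are the placeholder names from the restatement above; the Arg and Other classes and the violating_meth method are made up for this example and do not come from the Northeastern page.

```ruby
# Hypothetical classes, named after the placeholders used in the text.
class Other
  def yet_another_meth
    "reached through two dots"
  end
end

class Arg
  def other_meth
    Other.new
  end
end

class Clazz
  # Allowed under the Law: arg was passed into meth as an argument,
  # so calling a method directly on it is fine.
  def meth(arg)
    arg.other_meth
  end

  # Not allowed under the Law: the object returned by arg.other_meth
  # is not a closely related unit, yet we call a method on it.
  def violating_meth(arg)
    arg.other_meth.yet_another_meth
  end
end
```

Both methods run without complaint; the Law is a statement about design, not something Ruby enforces.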
It’s also worth pointing out that this “Law” was not developed for the sake of object-oriented programming, but to help define aspect-oriented programming, which you only tend to hear about in Java-land, and even then, not all that much.
That all said, this advice seems reasonable, but it does not really allow for nuance. Yes, we want to reduce coupling, but doing so has a cost (this is discussed at length in the book). In particular, it might be preferable for our code’s coupling to match that of the domain.
It also might be OK to be overly coupled to our language’s standard library or to the framework components of whatever framework we are using, since that coupling mimics the decision to be coupled to a language or framework.
Code Coupling can Mirror Domain Coupling
Consider this object model, where a person has an address, which has a country, which has a code.
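One way to picture that model in code is the following minimal sketch. It assumes the three classes are plain Structs, which is a guess on my part for illustration. The name, postcode, and contractor fields are included only because the examples in this article use them, and the sample values are invented.

```ruby
# A hypothetical version of the object model, using plain Structs.
Country = Struct.new(:code)
Address = Struct.new(:country, :postcode)
Person  = Struct.new(:name, :address, :contractor) do
  # Predicate used by the tax example; simply exposes the stored flag.
  def contractor?
    contractor
  end
end

# Invented sample data.
person = Person.new(
  "Pat Example",
  Address.new(Country.new("UK"), "SW1A 1AA"),
  false
)

# Navigates person -> address -> country -> code.
country_code = person.address.country.code
```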
Suppose we have to write a method to figure out taxes based on the country code of a person. Our method, determine_tax_method, takes a Person as an argument. The basic logic is:
- If a person is in the US and a contractor, we don’t do tax determination.
- If they are in the US and not a contractor, we use the US-based tax determination, which requires a zipcode.
- If they are in the UK, we use the UK-based determination, which requires a postcode.
- Otherwise, we don’t do tax determination.
Here’s what that might look like:
class TaxDetermination
  def determine_tax_method(person)
    case person.address.country.code
    when "US"
      if person.contractor?
        NoTaxDetermination.new
      else
        USTaxDetermination.new(person.address.postcode)
      end
    when "UK"
      UKTaxDetermination.new(person.address.postcode)
    else
      NoTaxDetermination.new
    end
  end
end
If address, country, and code are all methods, then according to the Law of Demeter we have created a violation, because we are depending on the class of an object returned by a method called on an argument. In this case, the return value of person.address is an Address and thus not a “closely related unit”, so calling country on it is not allowed.
But is that really true?
Person has a well-defined type. It is defined as having an address, which is an Address, another well-defined type. That has a country, well-defined in the Country class, which has a code attribute that returns a string. These aren’t objects to which we are sending messages, at least not semantically. These are data structures we are navigating to access data from our domain model. The difference is meaningful!
Even still, it’s hard to quantify the problems with a piece of code. The best way to evaluate a technique is to compare code that uses it to code that does not. So, let’s change our code so it doesn’t violate the Law of Demeter.
A common way to do this is to provide proxy methods on an allowed class to do the navigation for us:
class TaxDetermination
  def determine_tax_method(person)
    case person.country_code
    #           ^^^^^^^^^^^^
    when "US"
      if person.contractor?
        NoTaxDetermination.new
      else
        USTaxDetermination.new(person.postcode)
        #                             ^^^^^^^^
      end
    when "UK"
      UKTaxDetermination.new(person.postcode)
      #                             ^^^^^^^^
    else
      NoTaxDetermination.new
    end
  end
end
How do we implement country_code and postcode?
class Person
  def country_code
    self.address.country.code
  end
  def postcode
    self.address.postcode
  end
end
Of course, country_code now contains a Demeter Violation, because it calls a method on something returned by a closely related unit. Remember, self.address is allowed, and calling methods on self.address is allowed, but that’s it. Calling code on country is the violation. So…another proxy method.
class Person
  def country_code
    self.address.country_code
    #            ^^^^^^^^^^^^
  end
end
class Address
  def country_code
    self.country.code
  end
end
And now we comply with the Law of Demeter, but what have we actually accomplished? All of the methods we’ve been dealing with are really just attributes returning unfettered access to public members of a data structure.
We’ve added three new public API methods to two classes, all of which require tests, which means we’ve incurred both an opportunity cost in making them and a carrying cost in their continued existence.
We also now have two ways to get a person’s country code, two ways to get their postcode, and two ways to get the country code of an address. It’s hard to see this as a benefit.
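To make that duplication concrete, here is a minimal self-contained sketch. It assumes Person, Address, and Country are simple Structs (this article never shows their full definitions), with the three proxy methods from above layered on top. The sample values are invented.

```ruby
# Hypothetical Struct-based stand-ins for the domain classes.
Country = Struct.new(:code)

Address = Struct.new(:country, :postcode) do
  # Proxy method added to satisfy the Law.
  def country_code
    country.code
  end
end

Person = Struct.new(:address) do
  # Proxy methods added to satisfy the Law.
  def country_code
    address.country_code
  end

  def postcode
    address.postcode
  end
end

person = Person.new(Address.new(Country.new("US"), "20500"))

# Each array holds two public routes to the same value.
person_country_codes  = [person.country_code, person.address.country.code]
person_postcodes      = [person.postcode, person.address.postcode]
address_country_codes = [person.address.country_code, person.address.country.code]
```

Every route has to keep working, and keep returning the same thing, for as long as any caller uses it.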
For classes that are really just data structures, especially when they are core domain concepts that drive the reason for our app’s existence, applying the Law of Demeter does more harm than good. And when you consider that most developers who apply it don’t read the backstory and simply count dots in lines of code, you end up with myopically overcomplex code with little demonstrable benefit.
But let’s take this one step further, shall we?
Violating Demeter by Depending on the Standard Library
Suppose we want to send physical mail to a person, but our carrier is a horrible legacy US-centric one that requires being given a first name and last name. We only collected full name, so we fake it out by looking for a space in the name. Anyone with no spaces in their names is handled manually by queuing their record to a customer service person via handle_manually.
class MailSending
  def send_mailer(person)
    fake_first_last = /^(?<first>\S+)\s(?<last>.*)$/
    match_data = fake_first_last.match(person.name)
    if match_data
      legacy_carrier(match_data[:first], match_data[:last])
    else
      handle_manually(person)
    end
  end
end
This has a Demeter violation. A Regexp (created by the /../ literal) returns a MatchData if there is a match. We can’t call methods on an object returned by one of our closely related units’ methods. We can call match on a Regexp, but we can’t call a method on what that returns. In this case, we’re calling [] on the returned MatchData. How do we eliminate this egregious problem?
We can’t make proxy methods for first name and last name in Person, because that method will have the same problem as this one (it also would put use-case specific methods on a core class, but that’s another problem). We really do need to both match a regexp and examine its results. But the Law does not allow for such subtlety! We could create a proxy class for this parsing.
class LegacyFirstLastParser
  FAKE_FIRST_LAST = /^(?<first>\S+)\s(?<last>.*)$/
  def initialize(name)
    @match_data = name.match(FAKE_FIRST_LAST)
  end
  def can_guess_first_last?
    !@match_data.nil?
  end
  def first
    @match_data[:first]
  end
  def last
    @match_data[:last]
  end
end
Now, we can use this class:
class MailSending
  def send_mailer(person)
    parser = LegacyFirstLastParser.new(person.name)
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end
end
Hmm. LegacyFirstLastParser was just plucked out of the ether. It definitely is not a closely-related unit based on our definition. We’ll need to create that via some sort of private method:
class MailSending
  def send_mailer(person)
    parser = legacy_first_last_parser(person.name)
    #        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end
  private
  def legacy_first_last_parser(name)
    LegacyFirstLastParser.new(name)
  end
end
Of course, legacy_first_last_parser has the same problem as send_mailer, in that it pulls LegacyFirstLastParser out of thin air. This means that MailSending has to be given the class, so let’s invert those dependencies:
class MailSending
  def initialize(legacy_first_last_parser_class)
    @legacy_first_last_parser_class = legacy_first_last_parser_class
  end
  def send_mailer(person)
    parser = legacy_first_last_parser(person.name)
    if parser.can_guess_first_last?
      legacy_carrier(parser.first, parser.last)
    else
      handle_manually(person)
    end
  end
  private
  def legacy_first_last_parser(name)
    @legacy_first_last_parser_class.new(name)
  end
end
This change now requires changing every single use of the MailSending class to pass in the LegacyFirstLastParser class. Sigh.
Is this all better code? Should we have not done this because Regexp and MatchData are in the standard library? The Law certainly doesn’t make that clear.
Just as with all the various SOLID Principles, we really should care about keeping the coupling of our classes low and the cohesion high, but no Law is going to guide us to the right decision, because it lacks subtlety and nuance. It also doesn’t provide much help once we have a working understanding of coupling and cohesion. When a team aligns on what those mean, code can be discussed directly. You don’t need a Law to help have that discussion and, in fact, talking about it is a distraction.
Suppose we kept our discussion of send_mailer to just coupling. It’s pretty clear that coupling to the language’s standard library is not a real problem. We’ve chosen Ruby, and switching programming languages would be a total rewrite, so coupling to Ruby’s standard library is fine and good.
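For reference, the standard-library behavior that the original send_mailer leans on can be sketched in a few lines. The regexp is the one from the example; the two sample names are invented for illustration.

```ruby
# Same pattern as in send_mailer.
fake_first_last = /^(?<first>\S+)\s(?<last>.*)$/

# Regexp#match returns a MatchData when the string matches, and
# MatchData#[] reads a named capture group by symbol.
match_data = fake_first_last.match("Ada Lovelace")
first = match_data[:first]
last  = match_data[:last]

# Regexp#match returns nil when there is no match, which is the
# case send_mailer hands off to handle_manually.
no_match = fake_first_last.match("Cher")
```

That is the whole extent of the coupling: two core classes that ship with the language.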
Consider discussing coupling around determine_tax_method. We might have decided that since people, addresses, and countries are central concepts in our app, code that’s coupled to them and their interrelationship is generally OK. If these concepts are stable, coupling to them doesn’t have a huge downside. And the domain should be stable.
Damn the Law.