Writing even more legible SQL
November 04, 2016 📬 Get My Weekly Newsletter ☞
Craig Kerstiens wrote a great short blog post about writing more legible SQL. It’s a great read on an important topic—SQL can be very hard to read and modify—but I don’t think he goes far enough about “one thing per line”, so I’d like to show how I write SQL.
Rules of Thumb
I’m kind of a nut for code typography, but I think it’s important enough to get right and almost always worth going a bit farther than you’d think so code looks great.
- Write your SQL to be understood and read, even at the cost of modification (no leading commas FFS).
- Keywords in all-caps.
- Align, align, align, align.
- Don’t use table aliases unless required for disambiguation.
- Be consistent to a degree you never thought possible.
- Almost always one thing per line. Exceptions are
AND
and not much else.
Example
Craig writes:
SELECT foo,
bar
FROM baz
This is not one thing per line. This is (and is how I wold write that statement):
SELECT
foo,
bar
FROM
baz
I treat each part of a statement like a scoping block:
SELECT
«fields»
FROM
«tables»
WHERE
«where clauses»
ORDER BY
«ordering fields»
GROUP BY
«grouping »
HAVING
«good ole HAVING :)»
; -- semi on the last line only if needed; usually I
-- omit this since it's not needed in code
Let’s take a more extreme case, because when I say “align, align, align, align”, and “be consistent to a degree you never thought possible” I’m not kidding.
A Realer Example
Suppose we have a table transactions
that contains credit card transactions, users
containing, well, users and addresses
containing
addresses (to which a user has a shipping address). We want to get a report of transactions that includes pre-tax amount, tax amount, username, and shipping zipcode. We only want to show successful transactions and only those that used PayPal and only for users who signed up recently.
Here’s how I would write this query:
SELECT
transactions.id AS transaction_id,
transactions.amount AS pre_tax_amount,
transactions.tax AS tax,
users.username AS username,
addresses.zip AS shipping_zip
FROM
transactions
JOIN
users ON user.id = transactions.user_id
JOIN
addresses ON addresses.id = user.shipping_address_id
JOIN
transaction_types ON transaction_types.id = transactions.type_id
WHERE
users.signed_up_at > now() - interval '1 month' AND
transactions.success = true AND
transaction_type.name = 'PayPal'
ORDER BY
transactions.created_at
;
Here’s some great things about the way this SQL is typeset:
- Each line has a lot of context since we aren’t using aliases.
- The
FROM
andWHERE
clauses are easy to take-in as a whole, since we’ve used aggressive typography to line things up. - Similarly, the
SELECT
’s use ofAS
even for fields whose names we aren’t changing means you can easily see all the column names in one place. - The trailing
AND
means an errant copy/paste will cause a syntax error, not an incorrect execution. - In the
JOIN
clauses, the table with the.id
is always on the left, creating a nice rhythm when reading the statement.
When you have a lot of SQL to maintain—and you will if your application does anything complex—extra care toward formatting is crucial. SQL is notoriously hard to test, and the general lack of abstractions available make it hard to organize in any other way.
Also, let’s be honest, it’s fun as hell to bikeshed other people’s coding style :)