Example of Query Clarity with BETWEEN

A bit ago ago I posted some optimizer hints and recommended using them to write more readable queries. Today I stumbled across an example of my own work that I changed. This is not quite a case of using BETWEEN instead of AND statements, but it sort of is. Basically, I needed to reuse this query:

SELECT count(*),city,
CASE WHEN lastLogin < now ()-interval 90 day THEN 90 WHEN lastLogin < NOW()-interval 60 day THEN 60 WHEN lastLogin < NOW()-interval 30 day THEN 30 WHEN lastLogin > NOW()-interval 30 day THEN 0
ELSE -1
END
FROM . . .

I wrote this query less than a month ago.

I looked at this query today and wondered, “why did I not put lastLogin < now ()-interval 60 AND lastLogin>NOW()-interval 90 in there?” I then realized what I did.

Because the CASE statement evaluates in order, to make the query more concise I used a standard procedural programming trick — I put the older cases first. In this way, something older than 90 days gets caught in the first case, something between 60-90 days gets caught in the 2nd case, etc.

This is great, and probably even standard for a procedural language. And in fact, MySQL’s CASE statement works the same.

However, it is much more clear (in my mind) to write:

SELECT count(*),city,
CASE WHEN lastLogin < (NOW()-interval 90 day) THEN 90 WHEN lastLogin BETWEEN (NOW()-INTERVAL 60 DAY) AND (NOW()-INTERVAL 90 DAY) THEN 60 WHEN lastLogin BETWEEN (NOW()-INTERVAL 30 DAY) AND (NOW()-INTERVAL 60 DAY) THEN 30 WHEN lastLogin < (NOW()-INTERVAL 30 DAY) THEN 0 ELSE -1 END FROM . . .

It's more characters, which is less efficient. But I feel it is more readable, because it really does spell out declaratively what I am looking for.

A bit ago ago I posted some optimizer hints and recommended using them to write more readable queries. Today I stumbled across an example of my own work that I changed. This is not quite a case of using BETWEEN instead of AND statements, but it sort of is. Basically, I needed to reuse this query:

SELECT count(*),city,
CASE WHEN lastLogin < now ()-interval 90 day THEN 90 WHEN lastLogin < NOW()-interval 60 day THEN 60 WHEN lastLogin < NOW()-interval 30 day THEN 30 WHEN lastLogin > NOW()-interval 30 day THEN 0
ELSE -1
END
FROM . . .

I wrote this query less than a month ago.

I looked at this query today and wondered, “why did I not put lastLogin < now ()-interval 60 AND lastLogin>NOW()-interval 90 in there?” I then realized what I did.

Because the CASE statement evaluates in order, to make the query more concise I used a standard procedural programming trick — I put the older cases first. In this way, something older than 90 days gets caught in the first case, something between 60-90 days gets caught in the 2nd case, etc.

This is great, and probably even standard for a procedural language. And in fact, MySQL’s CASE statement works the same.

However, it is much more clear (in my mind) to write:

SELECT count(*),city,
CASE WHEN lastLogin < (NOW()-interval 90 day) THEN 90 WHEN lastLogin BETWEEN (NOW()-INTERVAL 60 DAY) AND (NOW()-INTERVAL 90 DAY) THEN 60 WHEN lastLogin BETWEEN (NOW()-INTERVAL 30 DAY) AND (NOW()-INTERVAL 60 DAY) THEN 30 WHEN lastLogin < (NOW()-INTERVAL 30 DAY) THEN 0 ELSE -1 END FROM . . .

It's more characters, which is less efficient. But I feel it is more readable, because it really does spell out declaratively what I am looking for.

Comments are closed.