Skip to content

A more readable alternative to SQL’s CASE expression

Abstract

Introducing SQL’s function decode as a more readable and maybe more performant alternative to the case-expression, usable when the conditions all have a certain structure.

For a certain kind of case-when-then-expression, you can use the SQL function decode as an alternative that I find more readable and that I prefer whenever I can.

Granted, the case expression is much more flexible than decode and can thus be used under many more circumstances, but it’s exactly the limits of decode for which I immediately valued it once I first heard of decode. The restrictions of decode mean that whenever I see decode being used, I know exactly what awaits me reading the coming code and, more importantly, what cannot await me. Whenever I see decode, I can be certain that it’s that particular use case.

At my workplace, I’m the code owner of much of our SQL code and I thus review a lot of SQL code that was written by other people. After I learned about SQL’s decode function, I taught it to my colleagues and am always grateful when they use it.

As a data engineer, part of my job is to tidy up raw/unprocessed/"dirty" data.

Example#

I review many merge requests that contain SQL code like this:

select
  case
    when is_active = 'true' then true
    when is_active = 'false' then false
    else null
  end as is_active
from some_table

The logic is not the point, it’s more about the structure/syntax of the code. In this case-expression, we examine one (and only one) column, is_active, and look out for certain values it might contain. For a set of pre-defined values, we replace the original value with a different value. In this particular example, it bothers us that the column is_active is of type text rather than of type boolean, and we want to fix that. The three expected values would be 'true', 'false' and null. There could also be unexpected strings in that column. As I said, the logic isn’t the point, and there are several approaches to achieve this particular goal. The point in this (maybe contrived) example is to replace the values of one particular column with other values. We want to use the/some values of the given column as "keys" of a pre-defined dictionary and replace all values in the given column either with the according "dictionary value" or a default value (null in this example). Some values of a given column serve as code for another value, similar to an identifier, and we want to de-code that column.

I could replace the code above with this (in my opinion more readable) code:

select
  decode(
    is_active,
    'true', true,
    'false', false
  ) as is_active
from some_table

I know I could have solved that particular problem in different ways. To show you the true value of decode, I’ll show you another example, and then I’ll explain why I like decode so much.

I also didn’t specially handle the else null case, since null is the default and fallback. I could have explicitly handled the "default default" and write this equivalent, more verbose code instead:


    'false', false,
    null
  ) as is_active

Another example#

Consider this code:

select
  case
    when importance = 1 then 'high'
    when importance = 2 then 'medium'
    when importance = 3 then 'low'
    else 'unimportant'
  end as importance
from some_table

We have a column, importance, that obviously contains numbers, and we want to map those numbers to (pre-defined) strings.

Personally, I would write this equivalent code instead:

select
  decode(
    importance,
    1, 'high',
    2, 'medium',
    3, 'low',
    'unimportant'
  ) as importance
from some_table

How decode works#

The arguments the function decode expects can be grouped into basically three groups:

  • the column to be examined
  • at least one pair X,YX, Y of a pre-defined XX and a pre-defined YY where XX will be substituted with YY
  • an optional default value that will replace the column’s value if it’s not one of the pre-defined XX values and will be null if it’s not specified

Why I prefer decode whenever it can be used#

When decode is used, I (as a reviewer) no longer have to pay special attention to the when part to check whether the condition consists of something more than the basic comparison of the column value with a specific value. As soon as I see the keyword decode, I no longer have to remain alert to the possibility that the condition could surprise me; that there could come a case that would require tricky special logic that I could miss if I don’t keep a watchful eye.

The flexibility of the case-statement allows that logic which looks simple at a first glance could contain a twist; that one of the conditions could, for example, be importance = 3 and some_other_column is true instead of just importance = 3 like all the previous conditions; or that a condition could be entirely unrelated to the column importance. When decode is used, I can be sure from the beginning that no such thing can happen.

When my co-workers use decode whenever they can, it makes reviewing their SQL code more pleasant for me. I can be sure that I can skim over the conditions and won’t overlook a special condition because there can’t be a special condition. The limitations of decode simply don’t allow for deviations in the structure of the conditions which means they’ll all have the same structure.

Final words#

Today I learned about the SQL function decode which can be used as a more readable alternative to the case-expression if all conditions follow a certain pattern. Hopefully, you could learn something from this post too. In any case, be well.

Buy Me A Coffee

Comments