Skip to main content

Using Apply To Append Columns

·3 mins

Recently I was working with a complex SQL query that had been in production for some time. It was a situation where a CASE statement was being reused and called multiple times. As a result, with each call the CASE statement had to be rewritten. So the end output looked something like this:

SELECT pk.*
      ,fm.name as move_name
      ,fm.power
      ,fm.accuracy
      -- SOME MORE COMPLEX SQL CASE STATEMENT
      ,CASE WHEN fm.move_type IN ('normal', 'grass')
            THEN IFNULL(fm.power, 0) / (IFNULL(fm.accuracy, 1) * 1.0)
            WHEN fm.move_type = 'flying'
            THEN IFNULL(fm.power, 0) * 2 / (IFNULL(fm.accuracy, 1) * 1.0)
            ELSE IFNULL(fm.power, 0) * 3 / (IFNULL(fm.accuracy, 1) * 1.0) END AS new_var
      -- SOME MORE COMPLEX SQL CASE STATEMENT, AGAIN...
      ,SQRT(
       CASE WHEN fm.move_type IN ('normal', 'grass')
            THEN IFNULL(fm.power, 0) / (IFNULL(fm.accuracy, 1) * 1.0)
            WHEN fm.move_type = 'flying'             
            THEN IFNULL(fm.power, 0) * 2 / (IFNULL(fm.accuracy, 1) * 1.0)
            ELSE IFNULL(fm.power, 0) * 3 / (IFNULL(fm.accuracy, 1) * 1.0) END 
       ) AS new_var_sqrt
       -- ....
       -- ....
       -- MORE VERSIONS OF THE SAME CASE STATEMENT
FROM   pokemon.fact_pokemon pk
JOIN   pokemon.dim_moves mv
ON     pk.id = mv.id
AND    mv.version_group = 'diamond-pearl'
JOIN   pokemon.fact_moves fm
ON     mv.move = fm.name
AND    fm.version = 'diamond-pearl'

This is a bit of an oversimplification (especially using a simple pokemon database), but one can see how repeating a CASE statement can become unwieldy and annoying to handle. One could argue the code shouldn’t have been written this way to begin with, but changing or rewriting production code can be time-consuming and risky. This led me down the path of using a cross apply, so we only had to select the CASE statement once and then we could just call the column name at any time. So the updated code would be something like this:

SELECT pk.*
      ,fm.name as move_name
      ,fm.power
      ,fm.accuracy
      ,nw.new_col
      ,sqrt(nw.new_col) as new_col_sqrt
      -- ....
      -- ....
      -- MORE VERSIONS OF THE new_col 
FROM   pokemon.fact_pokemon pk
JOIN   pokemon.dim_moves mv
ON     pk.id = mv.id
AND    mv.version_group = 'diamond-pearl'
JOIN   pokemon.fact_moves fm
ON     mv.move = fm.name
AND    fm.version = 'diamond-pearl'
CROSS APPLY (
       SELECT CASE WHEN fm.move_type IN ('normal', 'grass')
                   THEN IFNULL(fm.power, 0) / (IFNULL(fm.accuracy, 1) * 1.0)
                   WHEN fm.move_type = 'flying'
                   THEN IFNULL(fm.power, 0) * 2 / (IFNULL(fm.accuracy, 1) * 1.0)
                   ELSE IFNULL(fm.power, 0) * 3 / (IFNULL(fm.accuracy, 1) * 1.0) END  AS new_col
) as nw

We now only have one CASE statement to update and the code becomes a lot easier to understand. Of course, with SQL we are always limited by the functions available to us with the SQL flavor we are using. Nonetheless, remembering something like this can help clean up code and prevent missing END keywords in CASE statements down the road.