Unexpected behavior of binary conversions (COALESCE vs. ISNULL)

2

Can you comment on what approach shown below is preferable? I hope the question will not be blocked as "opinionated". I would like to believe there is an explanation that makes that clear.

Context: I have a code for mirroring 3rd party table contents to my own table (optimization). It worked some time flawlessly until the size/modification of the database reached some threshold.

The optimization is based on row version values of more tables, and remembering the maximum of the values from the source tables. This way I am able to update my local table incrementally, much faster than rebuilding it from time to time from scratch.

The problem started to appear when the row-version value exceeded the 4byte value. After some effort, I have spotted that the upper 4 bytes of the binary(8) value were set to 0. Later, the suspect was found to have a form COALESCE(MAX(row_version), 1).

The COALESCE was used to cover the case when the local table is fresh, containing now data -- for comparing the MAX(row_version) of source tables with something meaningful.

The examples to show the bug: To simulate the last mentioned situation, I want to convert the NULL value of the binary(8) column to 1. I am adding also the ISNULL usage that was added later. The original code contained the COALESCE only.

DECLARE @bin8null binary(8) = NULL

SELECT 'bin NULL' AS the_variable, @bin8null AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8null, 1) AS good_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8null, 1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8null, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8null, 0x1) AS bad_value

(There is a bug in the image coalesce 0x1 + convert fixed later in the code to coalesce 1 + convert, but not fixed in the image.)

The application bug appeared when the binary value was bigger than the part that could be stored in 4 bytes. Here the 0xAAAAAAAA was used. (Actually, the 0x00000001 was the case, and it was difficult to spot that the single 1 was changed to 0.)

DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35

SELECT 'bin' AS the_variable, @bin8 AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8, 1) AS bad_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 1)) AS bad_value
SELECT 'coalesce 0x1 + convert ' AS op, CONVERT(binary(8), COALESCE(@bin8, 0x1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8, 0x1) AS good_value

When executed in Microsoft SQL Server Management Studio on MS-SQL Server 2014, the result looks like this: The results

Description -- my understanding: The COALESCE() seems to derive the type of the result from the type of the last processed argument. This way, the non-NULL binary(8) was converted to int, and that lead to the loss of upper 4 bytes. (See the 2nd and 3rd red bad_value on the picture. The difference between the two cases is only in decimal/hexadecimal form of display.)

On the other hand, the ISNULL() seems to preserve the type of the first argument, and converts the second value to that type. One should be careful to understand that binary(8) is more like a series of bytes. The interpretation as one large integer is only the interpretation. Hence, the 0x1 as the default value does not expand as 8bytes integer and produces bad value.

My solution: So, I have fixed the bug using ISNULL(MAX(row_version), 1). Is that correct?

sql
sql-server
asked on Stack Overflow Aug 20, 2020 by pepr • edited Aug 20, 2020 by Aaron Bertrand

2 Answers

3

Let me start of by saying:

This is not a "bug".

ISNULL and COALESCE are not the same function, and operate quite differently.

ISNULL takes 2 parameters, and returns the second parameter if the first has a value NULL. If the 2 parameters are different datatypes, then the dataype of the first datatype is returned (implicitly casting the second value).

COALESCE takes 2+ parameters, and returns the first non-NULL parameter. COALESCE is a short hand CASE expression, and uses Data Type Precendence to determine the returned data type.

As a result, this is why ISNULL returns what you expect, there is no implicit conversion in your query for the non-NULL variable.

For the COALESCE there is implicit conversion. binary has the lowest precedence of all the data types, with a rank of 30 (at time of writing). The value 1 is an int, and has a precedence of 16; far higher than 30.

As a result COALESCE(@bin8, 1) will implicitly convert the value 0xAAAAAAAA01BB3A35 to an int and then return that value. You see this as SELECT CONVERT(int,0xAAAAAAAA01BB3A35) returns 29047349, which your first "bad" value; it's not "bad", it's correct for what you wrote.

Then for the latter "bad" value, we can convert that int value (29047349) back to a binary, which results in 0x0000000001BB3A35, which is, again the result you get.

TL;DR: checking return types of functions is important. ISNULL returns the data type of first parameter and will implicitly convert the second if needed. For COALESCE it uses Data Type Precedence, and will implicitly convert the returned value to the data type of with the highest precedence of all the possible return values.

answered on Stack Overflow Aug 20, 2020 by Larnu • edited Aug 20, 2020 by Larnu
3

This is not a bug. They're documented to handle data type precedence differently. COALESCE determines the data type of the output based on examining all of the arguments, while ISNULL has a more simplistic approach of inspecting only the first argument. (Both still need to contain values which are all compatible, meaning they are all possible to convert to the determined output type.)

From the COALESCE topic:

Returns the data type of expression with the highest data type precedence.

The ISNULL topic does not make this distinction in the same way, but implicitly states that the first expression determines the type:

replacement_value must be of a type that is implicitly convertible to the type of check_expression.

I have a similar example (and describe several other differences between COALESCE and ISNULL) here. Basically:

DECLARE @int int, @datetime datetime;
SELECT COALESCE(@int, CURRENT_TIMESTAMP);

-- works because datetime has a higher precedence than the chosen output type, int
2020-08-20 09:39:41.763

GO
DECLARE @int int, @datetime datetime;
SELECT ISNULL(@int, CURRENT_TIMESTAMP);

-- fails because int, the first (and chosen) output type, has a lower precedence than datetime
Msg 257, Level 16, State 3
Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.

answered on Stack Overflow Aug 20, 2020 by Aaron Bertrand • edited Aug 20, 2020 by Aaron Bertrand

User contributions licensed under CC BY-SA 3.0