A NULL value in a database record can create some interesting problems for a variety of database operations, but people get especially confused about how to best deal with them for joins. Let's say you have these records in your SalesPerson table:
LastName FirstName Phone
Humphrey John 214.555.1111
Steinberg Michael NULL
Boswell Aaron 214.555.2222
Orrell Brian 214.733.3333
And you have an older version of the table with these records:
LastName FirstName Phone
Humphrey John 972.555.4444
Steinberg Michael NULL
Boswell Aaron 214.555.2222
If you wanted to know which records from the old version were exact matches in the new version, you could join them and see what matches. Since you don't have a unique primary key and because you want to return records that are complete matches, you have to use all fields in the join, which would look like this:
SELECT
SalesPerson.LastName
, SalesPerson.FirstName
, SalesPerson.PhoneNumber
FROM
SalesPerson
JOIN
SalesPerson_OLD
ON SalesPerson.LastName = SalesPerson_OLD.LastName
AND SalesPerson.FirstName = SalesPerson_OLD.FirstName
AND SalesPerson.PhoneNumber = SalesPerson_OLD.PhoneNumber
Only one record is returned from this query: Aaron Boswell is a complete match in both tables. John Humphrey was left out because his phone number changed and Brian Orrell only existed in the new version of the table. But why didn’t the record for Michael Steinberg result in a match?
Of course, the reason is the NULL value for the phone number. The important thing to keep in mind about a NULL value is that it does not just mean “empty” or “nothing.” It can also mean “unknown.” Since 2 unknowns can not be said to be equal, it follows that NULL <> NULL.
So, what’s the right solution? One option is to use ISNULL to convert all NULL values to something that can be compared:
SELECT
SalesPerson.LastName
, SalesPerson.FirstName
, SalesPerson.PhoneNumber
FROM
SalesPerson
JOIN
SalesPerson_OLD
ON ISNULL(SalesPerson.LastName, '') = ISNULL(SalesPerson_OLD.LastName, '')
AND ISNULL(SalesPerson.FirstName, '') = ISNULL(SalesPerson_OLD.FirstName, '')
AND ISNULL(SalesPerson.PhoneNumber, '') = ISNULL(SalesPerson_OLD.PhoneNumber, '')
This was the solution I initially came up with in a real world problem, until Samir pointed me in the right direction. The query gets the results we are expecting (both Aaron and Michael are matches), but it has a couple of major drawbacks:
1. By using a function to convert the data, we are preventing the use of any indexes on these columns for the join.
2. We would implicitly be making a business decision to treat NULLs as Empty Strings. This could lead to false positives if one of the fields was a real empty string and the other started off as NULL. False positives become even more problematic when dealing with numeric and other data types where you would have to treat NULLs as zeroes or some other arbitrary number.
The better option is to check for NULL values in the relevant columns without converting the values. To do that, the query would look like this:
SELECT
SalesPerson.LastName
, SalesPerson.FirstName
, SalesPerson.PhoneNumber
FROM
SalesPerson
JOIN
SalesPerson_OLD
ON (
SalesPerson.LastName = SalesPerson_OLD.LastName
OR (SalesPerson.LastName IS NULL AND SalesPerson_OLD.LastName IS NULL)
)
AND (
SalesPerson.FirstName = SalesPerson_OLD.FirstName
OR (SalesPerson.FirstName IS NULL AND SalesPerson_OLD.FirstName IS NULL)
)
AND (
SalesPerson.PhoneNumber = SalesPerson_OLD.PhoneNumber
OR (SalesPerson.PhoneNumber IS NULL AND SalesPerson_OLD.PhoneNumber IS NULL)
)
This is a lot of code, but only because it is explicit about how you want to use NULLs. For those of you that prefer EXISTS to JOINs for this type of query, you’ll run into the same problem. You’ll have to modify the WHERE clause of the subquery in the same way we adjusted the JOIN criteria above.
A commom problem in Visio is how to make your controls the same size, or to align them properly. Many of us have become addicted to the tools that Visual Studio provides for this functionality. Visio does not provide the same tools, but I did find an easy workaround: use “snap“ more efficiently.
1. Under Tools - Snap & Glue, uncheck all of the “Snap to” options other than “Grid.”
2. Under Tools - Ruler & Grid, set your Horizontal and Vertical Grid spacing to both be “Normal.”
This gives you significantly more control to visually line up your controls and to make them the same size. Its still not as cool or easy to use as what you get with Visual Studio, but it's a start.