A good friend of mine Rick Kruger (@DataOgre|blog) is hosting this month’s T-SQL Tuesday Blog party. The T-SQL Tuesday Blog parties were started by Adam Machanic (@AdamMachanic|blog) back in December of 2009. This month’s topic is on Rube Goldberg Machines. Rick asked us to look through our closets to dig up a skeleton, something that we made T-SQL do, which might make other DBA’s cringe a bit. After hearing this, I knew exactly what I was going to blog about.
I remember when I implemented this feature and shook my head when I came up with the idea of using bitwise matching. The requirements I was given was to match two entities based on assigned lookup values. This would not have been that bad, except the lookup values were not static, they were dynamically entered through a web UI. When I first pitched the idea I used a dating site as a way to explain the logic, so in this blog post I will use that same concept.
Within a dating site one person is matched to another based on activities they both enjoy. Activities can be grouped into different categories. In this example we have two different categories; Exercise and Things to do. People must have at least one matching activity in each category to be a match. We are going to try to match Bob with another person.
People and their activities
- Bob enjoys Walking, Biking, and Camping.
- Sally enjoys Walking and Biking
- Nancy enjoys Camping
- Beth enjoys Walking, Running, Golfing, and Camping.
The following T-SQL implements a few temporary tables we will use to implement the bitwise matching logic.
-- Create the People that can be matched. CREATE TABLE #Person ( PersonID smallint , Name nvarchar(25)); INSERT INTO #Person (Name) VALUES (1, N'Bob'), (2, N'Sally'), (3, N'Nancy'), (4, N'Beth'); -- Create the Activities that people may have in common. CREATE TABLE #Activities ( ActivityID smallint , Activity nvarchar(25) , CategoryID smallint); INSERT INTO #Activities (Activity, CategoryID) VALUES (1, N'Walking', 1), (2, N'Biking', 1), (3, N'Running', 1), (4, N'Yoga', 1), (5, N'Movies', 2), (6, N'Golf', 2), (7, N'Camping', 2); -- Create the Weak Entity Table to store the Persons Activities. CREATE TABLE #PersonActivities ( ActivityID smallint , PersonID smallint); INSERT INTO #PersonActivities (PersonID, ActivityID) VALUES (1, 1), (1, 2), (1, 7), (2, 1),(2, 2), (3, 7), (4, 1),(4, 3),(4, 6),(4, 7);
Using the activities we can assign a bitwise value based on the identity and category.
|Activity||Activity ID||Category ID||Bitwise Value|
|Walking||1||1||21 = 2|
|Biking||2||1||22 = 4|
|Running||3||1||23 = 8|
|Yoga||4||1||24 = 16|
|Movies||1||2||21 = 2|
|Golf||2||2||22 = 4|
|Camping||3||2||23 = 8|
If we summarize the bitwise values we can use the binary and operator (&) to determine if any matches exist. Example:
|Summed Bitwise Value||23||22||21||20|
The value of 6 represents Walking and Biking, the value of 12 represents Biking and Running. The intersection of the two is Biking, this would be the matched value of 4. Being that we have a matched value, the result is greater than 0. Using this logic we can implement the following query.
WITH PeopleSums AS ( SELECT p.Name , p.PersonID , a.CategoryID , BitWiseSum = SUM(POWER(2, pa.ActivityID)) FROM #Person p JOIN #PersonActivities pa ON p.PersonID = pa.PersonID JOIN #Activities a ON a.ActivityID = pa.ActivityID GROUP BY p.Name , p.PersonID , a.CategoryID ) SELECT ps2.Name FROM PeopleSums ps JOIN PeopleSums ps2 ON ps.PersonID != ps2.PersonID AND ps.BitWiseSum & ps2.BitWiseSum <> 0 AND ps.CategoryID = ps2.CategoryID WHERE ps.Name = 'Bob' GROUP BY ps.Name , ps2.Name HAVING COUNT(*) = 2;
This query uses a CTE to calculate and then summarize the bitwise values, grouped by person and category. We then self-reference the CTE using the binary AND operator (&) where the result is anything but zero. This concept can show us who Bob matches, we can use the table below to illustrate this.
|Persons that Match Bob||Category 1||Category 2|
To make sure that we have a match for each category, we do a count and ensure that it matches the number of categories we currently have. And voila Beth is a match for Bob, because Beth is the only candidate that had matches in both categories.