Tuesday 22 January 2013

What is and where is my GUID?

Looking after a SQL database which was created by CRM, leaves you either loving or loathing GUIDs. Personally, I like GUIDs. They aren't suitable all of the time as a unique identifier compared to say, a SMALLINT, but they have their place. I like them because they are unique, not just in my database, but there are so many potential values that any GUID I will have will never be in your database. Ever! This means I could give you my schema, you could populate it with data (including your own GUIDs) and at some point in the future we could merge our two databases together seamlessly. You couldn't do that with integers, even a 64-bit one.

Lets rewind a bit and review what a GUID is, before we start hunting our database for one. GUIDs come in differing forms, feel free to read all about them on Wikipedia.

In SQL Server a GUID is a 16-byte hexadecimal number with a datatype of UNIQUEIDENTIFIER, and we can generate a new one using the NEWID() function. When queried, the GUID is represented in groups of bytes in a 4-2-2-2-6 pattern. e.g.:


DECLARE @myGUID UNIQUEIDENTIFIER = NEWID()
SELECT @myGUID

Result:

------------------------------------
399E4198-490B-4A32-B824-F01DC785B56F

Great, so we have a GUID. Now what?

In my job, I am often presented with a GUID, and asked what it is. Very often, someone will have been investigating a log file and found a GUID in an error message, but not know what object type it belongs to, just that it exists somewhere in the database.

So I created a little script to spit out some SQL that would generate a query for every table/column combination in the entire database where the column datatype is UNIQUEIDENTIFER. Be sure to extend the Query Analyzer output to more than the default 256 characters per column and to send the results to text so that you can copy the script to the clipboard.

The SQL is:


SELECT 'PRINT ''Table: ' + OBJECT_NAME(C.[object_id]) + ' Column: ' + C.[name] + ''';
SELECT * FROM ' + OBJECT_NAME(C.[object_id]) + ' WITH (NOLOCK) WHERE ' + C.[name] + '=''' + CAST(@myGUID AS CHAR(36)) + ''';' AS [SQL]
FROM sys.columns C
INNER JOIN sys.types T ON C.user_type_id = T.user_type_id
WHERE T.[name] = 'UNIQUEIDENTIFIER'
AND OBJECTPROPERTY(C.[object_id], N'IsUserTable') = 1
ORDER BY OBJECT_NAME(C.[object_id]), C.[name]

The output is:

PRINT 'Table: myTable1 Column: Col1';
SELECT * FROM myTable1 WITH (NOLOCK) WHERE Col1='954FF5D8-3442-4CD5-B25C-323A616CBD61';
PRINT 'Table: myTable2 Column: Col2';
SELECT * FROM myTable2 WITH (NOLOCK) WHERE Col2='954FF5D8-3442-4CD5-B25C-323A616CBD61';
...

Now paste the SQL into a new query window, and leave it spin for a while. As the column is going to be an identifier of some sort, the chances are it will be indexed so shouldn't take too long. Once all the queries of run, you should be able to search the results for your GUID, and find all the instances of it. This will then show you what table(s) it is found and and in which column. You may find it in a relational table joining two tables together that you hadn't even previously considered. I find that a lot when I inherit a Dynamics CRM database and am presented with this problem.

Of course there is plenty of room for refinement. I could have put each row into its own SQL variable and executed it, I could have put a READ UNCOMMITTED at the front of the query instead of putting a NOLOCK on each query, or set the ROWCOUNT for the query batch (in case the GUID is something like a CRM object owner and appears thousands or millions of times). But I'll leave that to you, now you know what the gaps are and what is unique to your database. Happy hunting!










No comments:

Post a Comment