About a year ago I was working on a project that required that data be extracted from one database and inserted into an database. Not the most earth shattering concept. This is done quite often and doesn’t necessarily require a discussion of foreign keys.
So why talk about them… well… in this case the source and the destination databases were identical. They had the same table and foreign key structure. But no reason to be sold on this discussion quite yet… because it would be east to just build the ETL process around this. It would need to be built with the ordering of the tables controlled with precedence constraints or inserts statements in the proper order.
So here’s the rub… this process needed to move data between hundreds of tables. When hundreds of tables are in the discussion, I lose interest very quickly in trying to determine by hand how to order the tables. I did in this case and looked for a method to accomplish this through foreign key meta data.
And of course, there wouldn’t be a post if there wasn’t an answer. Well, maybe I’d post a question, but I not so nanner nanner. The solution uses a function with a query to get the necessary data. There might be better ways to do this, but this way worked quite well and had pretty decent performance. One variation of this used a single CTE that I let run for a couple hours before I decided it was running sub-optimal.
There are two caveats with this solution:
- Self-referencing foreign keys will cause a recursion error. When row A needs to be inserted before row B can’t really be helped through metadata. Since the environment I was working in didn’t have this issue, the solution didn’t address it either.
- Circular-referencing foreign keys will also cause a recursion error. There are situations where table A references table B which references table C which then references table A. I run into these from time to time but not in the environment that this was designed for.
With the caveats out of the way, here is the solution:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[ListFKParents]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT')) DROP FUNCTION [dbo].[ListFKParents] GO /*================================================================================ Function: dbo.ListFKParents Author: Jason Strate Date: July 23, 2009 Synopsis: This function returns a comma separated list of parents for a table. ================================================================================ Revision History: Date: By Description ---------------------------------------------------------------------------------- ================================================================================*/ CREATE FUNCTION [dbo].[ListFKParents] (@ObjectID int) RETURNS table AS RETURN ( WITH cteParentChild (Object_ID, parent_Object_ID, referenced_Object_ID, name) AS ( SELECT fk.parent_Object_ID, t1.name, fk.referenced_Object_ID, t2.name FROM sys.foreign_keys fk INNER JOIN sys.tables t1 ON fk.parent_Object_ID = t1.Object_ID INNER JOIN sys.tables t2 ON fk.referenced_Object_ID = t2.Object_ID WHERE t1.Object_ID = @ObjectID UNION ALL SELECT fk.parent_Object_ID, t1.name, fk.referenced_Object_ID, t2.name FROM sys.foreign_keys fk INNER JOIN sys.tables t1 ON fk.parent_Object_ID = t1.Object_ID INNER JOIN sys.tables t2 ON fk.referenced_Object_ID = t2.Object_ID INNER JOIN cteParentChild cte ON cte.referenced_Object_ID = fk.parent_Object_ID WHERE fk.referenced_Object_ID fk.parent_Object_ID ) SELECT (SELECT ',' + CONVERT(varchar(50), x.referenced_Object_ID) as [text()] FROM cteParentChild x WHERE x.referenced_Object_ID @ObjectID ORDER BY x.referenced_Object_ID FOR XML PATH ('')) + ',' AS FKParents ) GO IF OBJECT_ID('tempdb..#ParentChild') IS NOT NULL DROP TABLE #ParentChild CREATE TABLE #ParentChild ( object_id int ,name sysname ,parents varchar(max) ) --List all tables and return a list of all of the parent tables to the table INSERT INTO #ParentChild SELECT object_id, name, x.FKParents FROM sys.tables t CROSS APPLY dbo.ListFKParents(object_id) x IF OBJECT_ID('tempdb..#TableOrder') IS NOT NULL DROP TABLE #TableOrder CREATE TABLE #TableOrder ( object_id int ,name sysname ,rank int ) DECLARE @Loop int SET @Loop = 0 --Insert all tables with out parents as 0-level rank INSERT INTO #TableOrder SELECT object_id, name, @Loop FROM #ParentChild WHERE parents IS NULL WHILE 1=1 BEGIN SET @Loop = @Loop + 1 --Remove any items that are in #TableOrder and in #ParentChild DELETE FROM #ParentChild WHERE object_id IN (SELECT object_id FROM #TableOrder) ;WITH cteParentChild (varchar_object_id, object_id, name, parents) AS ( --Select items that match to parent #TableOrder in #ParentChild SELECT DISTINCT '%,' + CONVERT(varchar(50),t.object_id) + ',%', t.* FROM #ParentChild t INNER JOIN ( SELECT '%,' + CONVERT(varchar(50),object_id) + ',%' as object_id FROM #TableOrder ) x ON parents LIKE x.object_id ) --Insert items into that have all parents in #TableOrder INSERT INTO #TableOrder SELECT pc1.object_id, pc1.name, @Loop FROM cteParentChild pc1 LEFT OUTER JOIN cteParentChild pc2 ON pc1.parents LIKE pc2.varchar_object_id WHERE pc2.varchar_object_id IS NULL IF @@RowCount = 0 BREAK END SELECT * FROM #TableOrder ORDER BY rank, name