A few weeks back I was working on a process that would utilize foreign keys to crawl from table to table to export data from a production database to an archive database. More on some of what went into that process probably at some point in the future.
When the project was released to production a strange issue occurred. I was getting duplicates in a few of the tables that previously didn’t cause duplicates to occur. I went through the code and was able to determine that the issue was tied to duplicated INSERT statements that derived from foreign keys.
To solve the issue I ended up creating the query below that reviewed the foreign keys on the tables to determine if there were duplicates in the database.
;WITH ForeignKeys1 AS ( SELECT fk.object_id ,OBJECT_NAME(fk.parent_object_id) as table_name ,fk.name as foreign_key_name ,fk.create_date ,(SELECT CAST(parent_object_id as varchar(50)) + SPACE(1) + CAST(parent_column_id as varchar(50)) + SPACE(1) + CAST(referenced_object_id as varchar(50)) + SPACE(1) + CAST(referenced_column_id as varchar(50)) as [data()] FROM sys.foreign_key_columns fkc WHERE fk.object_id = fkc.constraint_object_id ORDER BY constraint_column_id FOR XML PATH('')) foreign_key FROM sys.foreign_keys fk ) ,ForeignKeys2 AS ( SELECT object_id, table_name, foreign_key_name, foreign_key ,ROW_NUMBER() OVER (PARTITION BY foreign_key ORDER BY create_date) as ForeignKeyRank FROM ForeignKeys1 ) SELECT x.table_name as [Table Name] ,x.foreign_key_name as [Foreign Key] ,y.foreign_key_name as [Foreign Key Duplicate] FROM ForeignKeys2 x INNER JOIN ForeignKeys2 y ON x.foreign_key = y.foreign_key AND x.ForeignKeyRank > y.ForeignKeyRank
Hope someone else finds some use out of this.
8 thoughts on “Find Duplicate Foreign Keys”
Hit this issue the other week on a new system that we are building, something went awry during a build. Done a bit more digging and SQL Server lets you create multiple foreign keys on a column, they can even reference different tables which I find a bit strange.
That’s an interesting way to find this issue. I guess if the database is leveraging sub-types and super-types in the table design, then you’d see foreign key relationships like this. I’d imagine this would kill performance on deletes.
Why wouldn’t you just do something like:
SELECT OBJECT_NAME(parent_object_id) AS ‘Parent’,
OBJECT_NAME(referenced_object_id) AS ‘References’,
COUNT(referenced_object_id) AS ‘Count”’
GROUP BY parent_object_id, referenced_object_id
HAVING COUNT(referenced_object_id) > 1
ORDER BY OBJECT_NAME(parent_object_id)
That wouldn’t work in situations where the same parent table was connected to multiple columns in the same table. For instance, this would happen fairly often in data warehouses where fact tables were related across multiple columns to the date dimension.
Somehow I missed this comment, but the reason that script doesn’t work is it doesn’t take into account the tables between the two tables. For instance, a date dimension would likely have multiple foreign key relationships between itself and the fact table.
Comments are closed.