The foreign key constraint is an important aspect of database design. This article explains why.
Foreign key constraint advantages
The purpose of a foreign key constraint is to enforce referential integrity, but including foreign keys in your database design can also bring performance benefits.
First, let’s look at an example of how they are used in database design.
Here are my two tables.
[sourcecode language='sql']-- Parent table: one row per account
CREATE TABLE Accounts
(
    ID INT PRIMARY KEY IDENTITY(1,1),
    Name VARCHAR(100)
)
GO
-- Child table: every order must reference an existing account
CREATE TABLE Orders
(
    ID INT PRIMARY KEY IDENTITY(1,1),
    OrderDate DATETIME DEFAULT(GETDATE()),
    AccountID INT NOT NULL CONSTRAINT FKAccountID REFERENCES Accounts(ID)
)
GO[/sourcecode]
Now we’ll insert some data.
[sourcecode language='sql']INSERT INTO Accounts(Name)
VALUES('Test Company 1'),('Test Company 2');
INSERT INTO Orders(AccountID)
VALUES
(1),(2),(1),(2),(1),(2),(1),(2),(1),(2),(1),(2)
,(1),(2),(1),(2),(1),(2),(1),(2),(1),(2),(1),(2);[/sourcecode]
Let’s have a quick look at the data.
[sourcecode language='sql']SELECT * FROM Accounts;
SELECT TOP (5) * FROM Orders;[/sourcecode]
ID          Name
----------- ---------------
1           Test Company 1
2           Test Company 2

(2 row(s) affected)

ID          OrderDate               AccountID
----------- ----------------------- -----------
1           2011-12-04 11:03:08.533 1
2           2011-12-04 11:03:08.533 2
3           2011-12-04 11:03:08.533 1
4           2011-12-04 11:03:08.533 2
5           2011-12-04 11:03:08.533 1

(5 row(s) affected)
So we have inserted rows into table “Orders” which relate to rows in “Accounts” through the AccountID and ID columns respectively. No problems with that. What happens if we try to insert a new row into “Orders” for an account which does not exist in “Accounts”?
[sourcecode language='sql']INSERT INTO Orders(AccountID)
VALUES(3);[/sourcecode]
We get an error.
Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__Orders__AccountI__0AD2A005".
The conflict occurred in database "DBADiaries", table "dbo.Accounts", column 'ID'.
The statement has been terminated.
So the foreign key constraint is doing its job and only allowing recognized account ids to be added to the “Orders” table.
Now let’s say that, for whatever reason, someone attempts to remove a row from table “Accounts” which has related rows in table “Orders”:
[sourcecode language='sql']DELETE Accounts WHERE ID = 2;[/sourcecode]
We get an error.
Msg 547, Level 16, State 0, Line 1
The DELETE statement conflicted with the REFERENCE constraint "FK__Orders__AccountI__0AD2A005".
The conflict occurred in database "DBADiaries", table "dbo.Orders", column 'AccountID'.
The statement has been terminated.
The constraint was created without ON DELETE CASCADE, so the default NO ACTION behaviour applies. As well as stopping bad data getting into the table, the foreign key constraint prevents referenced rows from being deleted, which in this case is exactly what I need it to do.
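To confirm which delete behaviour a constraint was created with, the sys.foreign_keys catalog view can be queried. This is a minimal sketch, assuming the example tables live in the dbo schema (as the error messages above suggest); a value of NO_ACTION would correspond to the behaviour seen here.

[sourcecode language='sql']-- List the foreign keys defined on Orders with their delete/update rules
SELECT name,
       delete_referential_action_desc,
       update_referential_action_desc
FROM sys.foreign_keys
WHERE parent_object_id = OBJECT_ID('dbo.Orders');[/sourcecode]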
Foreign key constraint performance benefits
How can a foreign key constraint benefit performance? Let’s have a look at a simple example using the tables created previously.
Activate “Include Actual Execution Plan” in Management Studio using either Ctrl + M or the button on the toolbar. Run a simple query that checks for rows in table “Orders” which relate to a row in table “Accounts”, and then check the execution plan:
[sourcecode language='sql']SELECT *
FROM Orders O
WHERE EXISTS (SELECT * FROM Accounts A WHERE A.ID = O.AccountID);[/sourcecode]
Execution plan output:
Now we will remove the foreign key constraint:
[sourcecode language='sql']ALTER TABLE Orders DROP CONSTRAINT FKAccountID;[/sourcecode]
Rerun the preceding SQL statement and check the execution plan again: it has changed.
So why is it different? Without the constraint, the optimizer has to execute the EXISTS part of the query because it can no longer be sure that every AccountID in “Orders” has a matching row in “Accounts”. With the foreign key in place, the optimizer could trust it and therefore did not need to check table “Accounts” at all when returning rows from “Orders”, because a row cannot be stored in “Orders” unless a valid reference exists in “Accounts”.
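Another way to observe the difference, besides reading the graphical plan, is to compare which tables are reported as read. The following is a sketch only, using the standard SET STATISTICS IO option: run it once with the constraint in place and once without, and compare the tables listed on the Messages tab. Based on the explanation above, “Accounts” would be expected to appear only when the constraint is missing or untrusted.

[sourcecode language='sql']-- Report per-table read counts for the statements that follow
SET STATISTICS IO ON;

SELECT *
FROM Orders O
WHERE EXISTS (SELECT * FROM Accounts A WHERE A.ID = O.AccountID);

-- Switch the reporting off again
SET STATISTICS IO OFF;[/sourcecode]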
Could a foreign key constraint become untrusted?
The answer is yes.
For example, you might decide to disable a foreign key when loading large amounts of data. It is easier to batch insert rows into a database without foreign keys enabled, because the load order of the tables no longer matters.
An untrusted foreign key means that the second execution plan is used for the query, which will not perform as fast as the first. If your tables contain lots of rows, this could make a massive difference to performance.
For the purposes of this explanation, I have re-added the FKAccountID foreign key constraint and then run this statement to disable it:
[sourcecode language='sql']ALTER TABLE Orders NOCHECK CONSTRAINT FKAccountID;[/sourcecode]
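The statement used to re-add the constraint is not shown in the article. A sketch of what that step might look like, assuming the same constraint name and columns as the original table definition:

[sourcecode language='sql']-- Re-create the foreign key that was dropped earlier (run before the NOCHECK statement).
-- A newly added constraint is validated against existing rows by default, so it starts out trusted.
ALTER TABLE Orders
ADD CONSTRAINT FKAccountID FOREIGN KEY (AccountID) REFERENCES Accounts(ID);[/sourcecode]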
So how do you tell whether your foreign key is trusted? Run this query:
[sourcecode language='sql']SELECT Name, Is_Not_Trusted
FROM sys.foreign_keys
WHERE Name = 'FKAccountID';[/sourcecode]
Which outputs this information.
Name                 Is_Not_Trusted
-------------------- --------------
FKAccountID          1

(1 row(s) affected)
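To correct this, run the following SQL. WITH CHECK validates the existing rows against the constraint and CHECK CONSTRAINT re-enables it; both are needed for the constraint to become trusted again:
[sourcecode language='sql']ALTER TABLE Orders WITH CHECK CHECK CONSTRAINT FKAccountID;[/sourcecode]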
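If rows that violate the constraint were loaded while it was disabled, the WITH CHECK statement fails with a foreign key conflict error instead of re-trusting the constraint. A hedged sketch for finding such orphaned rows beforehand, using the example tables from this article:

[sourcecode language='sql']-- Orders whose AccountID has no matching row in Accounts
SELECT O.ID, O.AccountID
FROM Orders O
WHERE NOT EXISTS (SELECT 1 FROM Accounts A WHERE A.ID = O.AccountID);[/sourcecode]

Any rows returned would need to be corrected or removed before the constraint can be validated.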
You could also look for all untrusted foreign keys in your database as part of a performance tuning exercise.
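As a starting point for that exercise, the same catalog view can be queried without the name filter. This is a minimal sketch; the column aliases are my own choice.

[sourcecode language='sql']-- All foreign keys in the current database that the optimizer cannot trust
SELECT OBJECT_SCHEMA_NAME(fk.parent_object_id) AS SchemaName,
       OBJECT_NAME(fk.parent_object_id)        AS TableName,
       fk.name                                 AS ConstraintName,
       fk.is_disabled                          AS IsDisabled
FROM sys.foreign_keys AS fk
WHERE fk.is_not_trusted = 1
ORDER BY SchemaName, TableName, ConstraintName;[/sourcecode]

A constraint can show IsDisabled = 0 and still be untrusted, for example if it was re-enabled without WITH CHECK.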
So a foreign key constraint has advantages and should be part of your design to ensure that you have a consistent database and to help ensure that the database performs optimally.
Michal Pawlikowski says
Don’t forget about the other side of using FKs. Under a heavy write load on a table (inserts, updates), integrity checks force the engine to validate every inserted or updated value, which can slow the database down dramatically. This is why many DW projects implement validation (integrity checks) in ETL as part of a transform, and why the DW database design doesn’t allow FKs at all (or accepts them only at a minimal level).
But for a small or moderately loaded database, the cost of checking constraints is lower than the performance gained for SELECT statements.
admin says
That’s great information Michal, thanks.
Mumtaz Alam says
I agree with applying minimal amount of FK in database… but then how do you normalize tables and relate two tables to retrieve the data (using a join condition?)
Andy Hayes says
“I agree with applying minimal amount of FK in database”. Why would you go with a minimal approach using foreign keys only in certain places in your database design?
Mumtaz Alam says
If I don’t want to use an FK relation between two tables, can I create a column with the same name and the same data type in both tables and use it in the JOIN condition? Is that the right approach?
Please explain whether I am right or wrong.
Andy Hayes says
Yes. If data types are not the same then the server has to convert them during the query or they have to be explicitly converted in code. Either of these has a performance overhead because of the conversion.
Mumtaz Alam says
thanks…
Mumtaz Alam says
I am working as a database developer. Two months ago I switched from Oracle to SQL Server.
Which data type is suitable for a primary key column? I prefer the INT data type, but I have seen some databases use the FLOAT data type for the primary key column.
Please advise me about the primary key column data type.