Foreign Key Constraint to better performance

At the time of software development, we usually do various types of input validations. At web application client end, at server end, at business logic, at database layer etc. To validate input data at database label is done with database constraints. By using this we can prevent invalid data to enter in database and thus to keep our database in a good state. Database constraints are many types. Primary Key, Foreign Key, Check, Default and Unique.

Today I am talking about Foreign Key constraint. Foreign Key constraint is a special type of check constraint whose value should match with the value of the reference table column. So when you are inserting, updating and deleting from your table, DBMS actually checks the Foreign Key value to check it is valid (match with the reference column) or not. Depending on that your DBMS will act. At the time of DML it is taking slightly overhead for the Foreign Key validation. But not only that, Foreign Key constraint can be gain some performance benefit also.

To show you an example suppose we have one master table and one child table with Primary Key and Foreign Key relation. I am using SQL Server. The master table is Publishers and the child table is Books. One publisher can have many books, one book can have one publisher. So it is a one-to-many relation between publisher and book. In table you can see Publisher Id is the Primary Key of Publishers table and Publisher Id is the Foreign Key in Books table.

CREATE TABLE [dbo].[Publishers] (
    [PublisherId] INT          IDENTITY (1, 1) NOT NULL,
    [FirstName]   VARCHAR (10) NOT NULL,
    [MiddleName]  VARCHAR (10) NULL,
    [LastName]    VARCHAR (10) NOT NULL,
    
    -- Primary Key.
    CONSTRAINT [pkPublishers] PRIMARY KEY CLUSTERED ([PublisherId] ASC)
);
CREATE TABLE [dbo].[Books] (
    [BookId]        INT             IDENTITY (1, 1) NOT NULL,
    [Title]         VARCHAR (10)    NOT NULL,
    [PublishDate]   DATE            NOT NULL,
    [NumberOfPages] INT             NOT NULL,
    [Price]         DECIMAL (10, 2) NOT NULL,
    [Comments]      VARCHAR (50)    NULL,
    [PublisherId]   INT             NOT NULL,
    
    CONSTRAINT [pkBooks] PRIMARY KEY CLUSTERED ([BookId] ASC),
    
    -- Foreign Key which refer Publishers Primary Key.
    CONSTRAINT [fkBooksWithPublishers] FOREIGN KEY ([PublisherId]) 
        REFERENCES [dbo].[Publishers] ([PublisherId])
);

Now because we have the Primary Key and Foreign Key relationship between these two tables, we can gain performance benefit sometimes. Lets run the following SQL query.

SELECT B.Title AS [Book Title]
FROM BOOKS AS B
WHERE EXISTS(SELECT * FROM PUBLISHERS AS P WHERE B.PublisherId = P.PublisherId)

If we check the Execution Plan, we can see that actually SQL Server do not access the Publishers table at all. DBMS know that is has a Foreign Key relation, so be sure that every Foreign Key column match with Primary Key column, so its do not fetch that. We get slightly performance benefit.

Primary Foreign Key - Performance Benefit

Primary Foreign Key – Performance Benefit

Conclusion

We should use database constraint where it is necessary. It helps us to maintain our database in a good state, prevent invalid data to get entry to database and performance benefit sometimes.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s