Inner Join vs Outer Join Performance

At work, a colleague and I discussed the performance of inner joins and against outer joins, particularly in the case where both types of joins would return the same number of rows. So if it turned out that you always had a match for each inner record, would you pay a penalty for having a left join? In most cases an outer join would return more records, since you would usually have unmatched records.
I posted a script on GitHub to create two tables that would be joined on RecordId.

create table dbo.Test1(
RecordId int not null identity(1,1) primary key clustered,
CreatedDate datetime not null default getdate()
);

create table dbo.Test2(
RecordId int null,
ItemId tinyint null,
CreatedDate datetime not null default getdate()
);

I created two million records in Test1, and then 5 Test2 records for each one in Test1. The Test2.RecordId column was left nullable, so we wouldn’t give the optimizer any hints as far as unmatched records.
So next we return all records in two queries, one with an inner join and one with an outer join.

select *
from dbo.Test1 as a
join dbo.Test2 as b
	on b.RecordId = a.RecordId;

select *
from dbo.Test1 as a
left join dbo.Test2 as b
	on b.RecordId = a.RecordId;

The two queries run in the same amount of time with virtually the same execution plan. Since Test1 has a clustered index, the key column is already sorted, so a Merge Join is used. So in this case, we’re running down the clustered index and finding a match for each record in the Test2 table, so it makes sense that the same amount of work would be done.
Dropping the clustered primary key on Test1 results in a different execution plan, where a Hash join is used instead of the Merge join, but the run time is still the same for both queries.
Deleting every 4th row from the detail table resulted in the inner join running slightly faster, but it was a 52-49 split in the execution plan.
I was a little surprised at first there wouldn’t be a difference in query performance between the two join types, but now it makes sense that the optimizer would go through the same plan for both types of join in this case. Inner vs Outer join does make a lot of difference logically in the records returned, but if both joins will return the same number of records then the plans turn out to be the same and the execution time is the same as well.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: