Home > Ask the Data management / BI Experts > DBMS and data warehousing Questions & Answers > Definition of primary, super, foreign and candidate key in the DBMS
Ask The Data Management Expert: Questions & Answers
EMAIL THIS

Definition of primary, super, foreign and candidate key in the DBMS

Mark Whitehorn EXPERT RESPONSE FROM: Mark Whitehorn

Pose a Question
Other Data Management Categories
Meet all Data Management Experts
Become an Expert for this site


Tips, expert advice and sample chapters
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


>
QUESTION POSED ON: 07 October 2009
What is the primary key, super key, foreign key and candidate key in the DBMS?


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
DBMS and data warehousing
What is the difference between a logical and physical warehouse design?
What are some emerging data warehouse and DBMS trends?
How to get data/database independence with a three-tier architecture
How to select an MPP database: DB2 vs. Teradata
What comes first — the data mart or the data warehouse?
What are the top database management systems (DBMS)?
What is the role of DBMS in RDBMS?
Is an Inmon-modeled BI system, like Madison, the future of data warehousing?
What are the benefits and disadvantages of a RDBMS?
Are there benefits to using both Teradata and a DB2 database?

Data warehouse basics
What is the difference between a logical and physical warehouse design?
Best practices for data warehouse access and reports
Similarities and differences between ROLAP, MOLAP and HOLAP
An introduction to data warehousing and decision support systems
Data warehouse architectures, concepts and phases
Advantages of the multidimensional database model and cube modeling
Data warehousing, data mining and data querying: Terms and definitions
What is an operational data store vs. a data warehouse?
Why you should build a data warehouse
Data mart vs. data warehouse

Data warehouse strategy
Advantages and disadvantages of XML shredding
How to shred XML with the DB2 XMLTABLE function
Shredding XML docs into relational tables with annotated XML schemas
Examples of single and bulk XML shredding of XML documents
What are some emerging data warehouse and DBMS trends?
Improving ODBC application performance and coding
Selecting ODBC functions for optimized SQL statements
Guidelines for managing data updates to optimize ODBC performance
How to capture metadata information, ETL rules with CA Erwin Data Modeler
Designing for performance: Strategic database application deployments

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


A primary key is a column (or columns) in a table that uniquely identifies the rows in that table.

CUSTOMERS
CustomerNoFirstNameLastName
1SallyThompson
2SallyHenderson
3HarryHenderson
4SandraWellington

For example, in the table above, CustomerNo is the primary key.

The values placed in primary key columns must be unique for each row: no duplicates can be tolerated. In addition, nulls are not allowed in primary key columns.

So, having told you that it is possible to use one or more columns as a primary key, how do you decide which columns (and how many) to choose?

Well there are times when it is advisable or essential to use multiple columns. However, if you cannot see an immediate reason to use multiple columns, then use one. This isn't an absolute rule, it is simply advice. However, primary keys made up of single columns are generally easier to maintain and faster in operation. This means that if you query the database, you will usually get the answer back faster if the tables have single column primary keys.

Next question — which column should you pick? The easiest way to choose a column as a primary key (and a method that is reasonably commonly employed) is to get the database itself to automatically allocate a unique number to each row.

In a table of employees, clearly any column like FirstName is a poor choice since you cannot control employee's first names. Often there is only one choice for the primary key, as in the case above. However, if there is more than one, these can be described as 'candidate keys' — the name reflects that they are candidates for the responsible job of primary key.

Now, given the definition above, that a primary key can be made up of more than one column and must uniquely identify the rows, we could choose, for example, EmployeeNo AND FirstName as the primary key. That fulfils the requirement but it is clearly foolish because we are adding complexity for no gain. It is also a great example of a super key. Super, in this case, is not a synonym for 'great' but a contraction of supernumerary. Super keys are to be avoided.

Foreign keys are columns that point to primary key columns. So, for example, OrderNo is the primary key of the table ORDERS below and CustomerNo is a foreign key that points to the primary key in the CUSTOMERS table.

ORDERS
OrderNoEmployeeNoCustomerNoSupplierPriceItem
1142 Harrison$235Desk
241Ford$234Chair
3168 Harrison$415Table
42112Ford$350Lamp
5342Ford$234Chair
62112Ford$350Lamp
7242 Harrison$235Desk

If we want to be really pedantic, foreign keys don't have to point to a primary key. The only true requirement of the column at which a foreign key points is that it must contains unique values. Imagine, for example, that our employee tables looked like this:

EMPLOYEES
SSecurityNo EmployeeNo FirstNameLastName DateOfBirth DateEmployed
AF-234323341MannyTomanny12 Apr 196601 May 1999
DQ-654444442RosanneKolumns21 Mar 197701 Jan 2000
GF-543545433CasKade01 May 197701 Apr 2002
JK-343334324NormaLyzation03 Apr 196601 Apr 2002
VB-485654445JuanTomani12 Apr 196601 Apr 2002
FG-235665536DelEats01 May 196701 May 2004

The social security number is actually the primary key of the table but we issue each employee with a number which is also unique. Under these circumstances, ORDERS.EmployeeNo can be a foreign key pointing to EMPLOYEES.EmployeeNo; even though that column is not a primary key. So the actual rule is slightly more subtle. A foreign key must point to a candidate key.

A candidate key is a column that meets all of the requirements of a primary key. In other words, it has the potential to be a primary key.

Having told you this, I can also tell you that I cannot remember the last time I saw this done in a real, live, production database. In practice, foreign keys are almost invariably pointed at primary keys.

Most of this answer was shamelessly plagiarized from Inside Relational Databases, a book by by Mark Whitehorn and Bill Marklyn, published by Springer Verlag. However, since Mark is one of the authors of the book, the chances of him suing himself are low.




Search and Browse the Expert Answer Center
Search and browse more than 25,000 question and answer pairs from more than 250 TechTarget industry experts.
Browse our Expert Advice

About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts