Skip links

Composite Primary Keys in database design

Life gets sometimes a little bit more complicated, than one would expect. Database design is no exception, and therefore you need to think outside the box from time to time and come up with new ideas. As the title suggests, I am going to show you a more complex way to create your table identifiers, i.e. Primary Keys.

Most usualy, the data rows stored in tables are identified by unique identifier called Primary Key. Primary Key is a column, by which you can distinctly identify the data, you were looking for. As you would expect, the column should be an integer, bigint, long or really any other numerical data type depending on the size of your data and database vendor you are using.

In fact, you can have multiple Primary Keys in one table, in which case the combination of these Primary Keys if called Composite Primary Key. A Composite Primary key is a combination of two or more columns in a table, that can be used to uniquely identify each row in the table. Uniqueness is only guaranteed when the columns are combined; when taken individually the columns do not guarantee uniqueness.

An example is worth a thousand words

Ok, so what are the cases for using a Composite Primary Key instead of a simple single column Primary Key, probably set as an auto-increment? Let’s imagine that you are designing a table for a financial institution, where transactions are being stored. In this case, you need to have a possibility to store a so-called “pool transaction” or “batch payment”, where multiple transactions are being stored under the same “main identifier”.

In the example below, you can see that the first two columns id_transaction and id_transaction_2 form together a Composite Primary Key. Then, when you look at the transaction where id_transaction = 6 you will notice, that this is the case of a pool transaction with a total amount of 4 000 (2000 + 1200 + 800) probably made from different client accounts.

id_transactionid_transaction_2transaction_datesettlement_dateamountid_currency
112014-01-202014-01-202000,001
212014-01-282014-01-281000,001
312014-02-032014-02-034500,001
412015-08-112015-08-11300,001
422015-08-112015-08-115000,001
512015-08-112015-08-112000,001
612016-06-272016-06-272000,001
622016-06-272016-06-271200,001
632016-06-272016-06-27800,001
712016-11-162016-11-162800,001

Now you can see, how the uniqueness is guaranteed only when both columns id_transaction and id_transaction_2 are being considered.

It’s nothing new

Sure this concept is not a breakthrough and lot’s of you would design the table in other manner, which is completely ok. For instance, you could set the id_transaction as an auto-increment Primary Key and hence making it a unique identifier for each transaction and then storing the “main identifier” (of the superior transaction) under id_transaction_2.

This approach would be sufficient in case, you wouldn’t need to store a complete history of changes made to the transaction, because once you would make a change to the transaction, either you would lost the former data or create a new record with new id_transaction, which is not a good practice in my opinion. I will show you the way I prefer storing a complete history of record changes in some future post.

Always think forward

There’s a lot’s of other use cases, where Composite Primary Key is the best way to go when designing your new table, but be aware that there are also lot’s of cases, where it is not appropriate. It complicates things a bit and hence you should always think forward, what data are you storing, do you want to store the data in one table with multiple column identifier or store the data in multiple tables? It depends.

Leave a comment

Name*

Website

Comment