What are aggregate tables and aggregate fact tables?

Here you'll see examples and read definitions of aggregate tables and aggregate fact tables. You'll also learn how to use aggregate tables in data warehouses and databases.

Mark Whitehorn, University of Dundee

Published: 24 Mar 2010

In databases, what are aggregate tables and aggregate fact tables?

Aggregate tables, in general, are simply database tables that contain aggregated values. OK, I admit it: that answer is accurate but useless. So let's try again, and this time we'll use a fact table as an example. Imagine that you have a fact table like this in which the granularity is date, product and customer:

Customer ID	Item No	Order Date	Unit Sales	Profit
1	143	04/05/2012	1	1.52
1	150	04/05/2012	3	3.9
1	8	10/06/2012	1	2.48
1	77	10/06/2012	1	1.37
1	92	10/06/2012	1	1.33
1	95	10/06/2012	1	2.87
1	18	28/07/2012	1	2.3
1	37	28/07/2012	1	1.03
1	61	28/07/2012	1	2.01
1	83	28/07/2012	1	2.37
1	120	28/07/2012	1	2.24
1	8	13/08/2012	1	2.48
1	58	13/08/2012	1	2.79
1	100	16/12/2012	1	2.18
1	122	16/12/2012	1	2.47
1	83	12/05/2013	1	2.37
1	148	12/05/2013	1	2.17
1	37	29/11/2013	1	1.03
1	116	29/11/2013	1	2.03
2	46	07/04/2012	1	2.24
2	110	07/04/2012	1	2.84
2	49	06/05/2012	1	2.2
2	60	06/05/2012	1	1.91
2	26	22/05/2012	1	1.69

Every value that you see for "unit sales" is the number of units of a particular product that were sold to a particular customer on a particular day. (Here, to help make the point of aggregation slightly clearer, I've shown the day of sale as an actual date rather than as a pointer. And yes, I realize that these European-style dates are still in the future – but we'll treat them as the past for purposes of this demonstration.)

We could run queries against this fact table, and it would return data. For example, we could ask for the total unit sales of Item No. 150 to Customer 1 on May 4, 2012, and the answer we'd get is three.

We could also run a query that returns the total sales of Item No. 150 to the same customer not just for the 4th of May but for that entire month. In order to do this, the system would scan the fact table looking for 31 separate date entries listing sales of that product to Customer 1 and then aggregate the unit-sale values from the returned rows. That would be a relatively slow process.

An alternative is to create a fact table that already contains one or more levels of aggregation. For example, we could aggregate this fact table by month. That would involve finding all the sales of Item No. 150 to Customer 1 in, say, January 2012, aggregating the data, and putting the results in a single row. We then, of course, would have to do the same for the other months, the other products and the other customers.

The new table would look like this:

Customer ID	Item No	Order date	Unit Sales	Profit
1	150	Jan 2012	11	11.52
1	150	Feb 2012	3	3.9
1	150	March 2012	2	2.48
1	150	April 2012	1	1.37
1	150	May 2012	1	1.33
1	150	June 2012	2	2.87

Now when we had a query that looked for a monthly sales total, we could run it against the aggregate fact table and find the answer much more rapidly. And of course, this isn't the only aggregate fact table that we could generate. For example, if we knew the counties that different customers were located in, we could aggregate their data on that basis.

We could also aggregate by both date and customer up to the level of the month and county, as in this table excerpt (with Herefordshire being a county in England, for the uninitiated):

Customer ID	Item No	Order date	Unit Sales	Profit
Herefordshire	150	Jan 2012	111	121.52
Herefordshire	150	Feb 2010	63	73.9

In other words, we could create a series of aggregate fact tables, and when a query comes in, we could run it against the appropriate aggregated table.

By now, it is (hopefully) clear where this is all leading to: An aggregate fact table is simply one that combines multiple rows of data, loses some detail and aggregates numerical values.

What are aggregate tables and aggregate fact tables?

Here you'll see examples and read definitions of aggregate tables and aggregate fact tables. You'll also learn how to use aggregate tables in data warehouses and databases.

Dig Deeper on Database management

mean time to detect (MTTD)

decimal

Future workplace planning emphasizes collaboration, flexibility

CXL: A Basic Tutorial

Related Q&A from Mark Whitehorn

DBMS keys: 8 types of keys defined

Do predictive analytics projects require data scientists?

Key issues to consider when building a data warehouse