What is the difference between the persistent staging area (PSA) and the operational data store (ODS)? What is the definition of each?
An operational data store (ODS) is a store that users actually process against, usually for reporting, although the processing can be more in-depth than that. An ODS then moves the data on to the data warehouse and, in this respect, it's similar to a staging area.
A staging area is simply a landing ground where (depending on your approach) a little or a lot of transformation is done to the data in preparation for loading into the data warehouse.
In a persistent staging area, historical data is not aged out of the staging area. In other words, you maintain history in the staging area as well as (most likely) in the data warehouse itself. Persistence is usually paired with a strategy of triage, which means you are "over-sourcing" the source environment, pulling more than the data warehouse currently needs, so that you have history in the staging area. Should you ever need it in the data warehouse, you can load it from staging. Source environments change their data over time, and a persistent staging area isolates you from those changes. I've always viewed this as a gamble, and it's one I've never seen pay off.
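To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class names (TransientStaging, PersistentStaging), the methods (land, load_into, reload_into) and the sample rows are all hypothetical, and plain in-memory lists stand in for real staging and warehouse tables.

```python
from dataclasses import dataclass, field


@dataclass
class TransientStaging:
    """Hypothetical staging area that is emptied after every warehouse load."""
    rows: list = field(default_factory=list)

    def land(self, batch):
        self.rows.extend(batch)

    def load_into(self, warehouse):
        warehouse.extend(self.rows)
        self.rows.clear()  # history is aged out of staging after each load


@dataclass
class PersistentStaging:
    """Hypothetical staging area that keeps every row it has ever landed."""
    rows: list = field(default_factory=list)
    loaded: int = 0  # number of staged rows already sent to the warehouse

    def land(self, batch):
        self.rows.extend(batch)

    def load_into(self, warehouse):
        warehouse.extend(self.rows[self.loaded:])  # send only the new rows
        self.loaded = len(self.rows)               # nothing is deleted

    def reload_into(self, warehouse):
        """Rebuild the warehouse from staged history, without the source."""
        warehouse.clear()
        warehouse.extend(self.rows)
        self.loaded = len(self.rows)


# Two extracts of the same (made-up) customer record, one per month.
batches = [[("cust-1", "2009-01")], [("cust-1", "2009-02")]]

transient, persistent = TransientStaging(), PersistentStaging()
warehouse_a, warehouse_b = [], []

for batch in batches:
    transient.land(batch)
    transient.load_into(warehouse_a)
    persistent.land(batch)
    persistent.load_into(warehouse_b)

# Simulate losing the warehouse contents, then recover from staging alone.
warehouse_b.clear()
persistent.reload_into(warehouse_b)
```

In this sketch both warehouses receive the same rows during normal loading. The difference only shows up afterwards: the transient staging area is empty, while the persistent one still holds both extracts and can repopulate the warehouse without going back to the source. That recovery path is the payoff the persistent approach is betting on, at the cost of storing the history twice.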
This was first published in February 2009