Why are temporal extensions in PostgreSQL important? Quite simply, managing time data is one of the most common requirements, and current general-purpose database systems don’t provide us with the basic tools to do it. Every general-purpose DBMS falls short both in terms of usability and performance when trying to manage temporal data.
What is already done?
- PERIOD data type, which can represent anchored intervals of time; that is, a chunk of time with a definite beginning and a definite end (in contrast to a SQL INTERVAL, which is not anchored to any specific beginning or end time).
- Critical for usability because it acts as a set of time, so you can easily test for containment and other operations without using awkward constructs like BETWEEN or lots of comparisons (and keeping track of inclusivity/exclusivity of boundary points).
- Critical for performance because you can index the values for efficient “contains” and “overlaps” queries (among others).
- Temporal Keys (called Exclusion Constraints, and will be available in the next release of PostgreSQL, 9.0), which can enforce the constraint that no two periods of time (usually for a given resource, like a person) overlap. See the documentation (look for the word “EXCLUDE”), and see my previous articles (part 1 and part 2) on the subject.
- Critical for usability to avoid procedural, error-prone hacks to enforce the constraint with triggers or by splitting time into big chunks.
- Critical for performance because it performs comparably to a UNIQUE index, unlike the other procedural hacks which are generally too slow to use for most real systems.
What needs to be done?
- Range Types — Aside from PERIOD, which is based on TIMESTAMPTZ, it would also be useful to have very similar types based on, for example, DATE. It doesn’t stop there, so the natural conclusion is to generalize PERIOD into “range types” which could be based on almost any subtype.
- Range Keys, Foreign Range Keys — If Range Types are known to the Postgres engine, that means that we can have syntactic sugar for range keys (like temporal keys, except for any range type), etc., that would internally use Exclusion Constraints.
- Range Join — If Range Types are known to the Postgres engine, there could be syntactic sugar for a “range join,” that is, a join based on “overlaps” rather than “equals”. More importantly, there could be a new join type, a Range Merge Join, that could perform this join efficiently (without a Range Merge Join, a range join would always be a nested loop join).
- Simple table logs — The ability to easily create an effective “audit log” or similar trigger-based table log, that can record changes and be efficiently queried for historical state or state changes.
I’ll be speaking on this subject (specifically, the new Exclusion Constraints feature) in the upcoming PostgreSQL Conference EAST 2010 (my talk description) in Philadelphia later this month and PGCon 2010 (my talk description) in Ottawa this May. In the past, these conferences and others have been a great place to get ideas and help me move the temporal features forward.
The existing features have been picking up a little steam lately. The temporal-general mailing list has some traffic now — fairly low, but enough that others contribute to the discussions, which is a great start. I’ve also received some great feedback from a number of people, including the folks at PGX. There’s still a ways to go before we have all the features we want, but progress is being made.