The solution ORMs provide is to “map” an object class, which is a type (or domain), onto a table, which is a relation variable (a.k.a. relvar). This supposedly abstracts away an “impedance mismatch” between the two. ORMs are already off to a bad start, mapping a type to a variable, but I’ll continue.
The real impedance mismatch is a set of very fundamental differences between application data and data in a well-designed relational database.
Application data:
- is ephemeral in nature (that is, temporary)
- is subject to changes in structure between revisions of the same application
- is subject to changes in semantics (meaning) between revisions of the same application
- has a meaning that is highly context-sensitive
- is highly dependent on physical structure
- is not held all at once: only part of the data the application has processed is available at any moment, so contradictions are often impossible to detect (for instance, something as simple as a duplicate serial number)
Data in a well-designed database, by contrast:
- has a definite predicate that maps the data to real-world facts independent of the application
- is kept for as long as each fact is still believed to be true and relevant
- is context-insensitive
- is independent of any physical structure or implementation
- is held all at once, allowing detection of contradictions through complex logical relationships, enforced by declarative constraints
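To make that last point concrete, here is a minimal sketch of the duplicate serial number case, using Python's built-in sqlite3 module. The `device` table, its columns, and the `record_device` helper are hypothetical names chosen for illustration, and a real design would carry many more constraints than a single key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The key constraint is declared once, in the database, and is checked
# against every row the database holds, whichever application wrote it.
conn.execute("""
    CREATE TABLE device (
        serial_number TEXT PRIMARY KEY,
        model         TEXT NOT NULL
    )
""")


def record_device(serial_number, model):
    """Assert a new fact; return None on success or an error message
    if the fact contradicts what the database already holds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO device (serial_number, model) VALUES (?, ?)",
                (serial_number, model),
            )
        return None
    except sqlite3.IntegrityError as exc:
        return f"rejected {serial_number!r}: {exc}"


first = record_device("SN-1001", "widget")   # accepted
second = record_device("SN-1001", "gadget")  # contradicts the first fact
print(first)
print(second)
```

The application never has to load every existing serial number to notice the clash; the database already holds them all, so the second insert is refused.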
Some people like to think that an ORM is something that should just be plugged in to add “persistence” without requiring database design. This idea views the difference between application data and relational data as an impedance mismatch that can be overcome with clever code. The entire premise behind that idea is that an RDBMS is a legacy technology, or that the only reason an RDBMS is needed is for performance, reliability, stability, backup procedures, and other services that any DBMS provides.
In reality, the impedance mismatch is the difficulty in mapping application data to facts in the real world. The process of inserting new information into a database is not just the process of making that data persistent; it is the process of reconciling that new information, through automated logical inferences, with all the other data in the system and, if a logical contradiction is detected, rejecting the new information with a meaningful error.
Back to ORMs. Why “map” the object, and not just store it directly? Modern relational databases support a wide range of types, and also the ability to declare your own types, including sophisticated types (like an object). You can define input and output routines (for transmission to/from the client application), and then all of the operators on that type. There is some duplication in first defining the object in the application, and then defining it again in the database, but I don’t think that’s the only reason.
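As a rough illustration of what input and output routines for a type look like, here is a sketch using the adapter and converter hooks in Python's sqlite3 module. This is only a client-side approximation: SQLite itself has no user-defined types, whereas a database with real type support would hold the type and its operators on the server. The `Point` class, the `landmark` table, and the `x;y` text format are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def adapt_point(point):
    """Output routine: turn a Point into the text form that gets stored."""
    return f"{point.x};{point.y}"


def convert_point(raw):
    """Input routine: rebuild a Point from the stored bytes."""
    x, y = raw.split(b";")
    return Point(float(x), float(y))


sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)

# PARSE_DECLTYPES makes the module look at the declared column type
# ("point") to decide which converter to apply when reading.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE landmark (name TEXT PRIMARY KEY, location point)")
conn.execute("INSERT INTO landmark VALUES (?, ?)", ("origin", Point(0.0, 0.0)))

row = conn.execute(
    "SELECT location FROM landmark WHERE name = ?", ("origin",)
).fetchone()
loaded = row[0]
print(loaded)
```

Note that the object goes in and comes out whole, but the database knows nothing about what a `Point` means, which is the concern the next paragraph raises.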
If you put the entire object into one field, you can see right away that no meaning has been stored. Moreover, you will realize soon after that the implementation of the object has been solidified in the database, and therefore it’s no longer so easy to change the implementation details of the application. However, when you dump the internals of an object into separate fields in a table, there is the illusion that the meaning of that data has been stored as well, and the illusion that the data is independent of the implementation. This is the same illusion as when an object is serialized as XML: it looks like there’s meaning there, but there really isn’t. It’s just serialized data made from some internal application state, with no meaning outside of that context.
If you actually take the time to design predicates that have meaning in the real world, and from which inferences can be made when logically combined with other predicates, and then map the application data to real-world facts that match those predicates, only then are you free from the implementation details of the specific revision of the specific application that inserted the data. Only then do you have enough information to make automated inferences from that data.
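Here is a small sketch of what that can look like in practice, again with Python's sqlite3 module. The two tables and the predicates written above them are hypothetical; the point is only that each row is a statement about the world, and that combining predicates yields new statements and exposes contradictions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

conn.executescript("""
    -- Predicate: "The device with serial number SERIAL_NUMBER
    --             is of model MODEL."
    CREATE TABLE device (
        serial_number TEXT PRIMARY KEY,
        model         TEXT NOT NULL
    );

    -- Predicate: "The device with serial number SERIAL_NUMBER
    --             was sold to CUSTOMER."
    CREATE TABLE sale (
        serial_number TEXT PRIMARY KEY REFERENCES device (serial_number),
        customer      TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO device VALUES ('SN-1001', 'widget')")
conn.execute("INSERT INTO device VALUES ('SN-1002', 'gadget')")
conn.execute("INSERT INTO sale VALUES ('SN-1001', 'Alice')")

# Inference by join: "CUSTOMER was sold a device of model MODEL."
owned_models = conn.execute("""
    SELECT sale.customer, device.model
    FROM sale JOIN device USING (serial_number)
    ORDER BY sale.customer
""").fetchall()
print(owned_models)

# A sale of a device that is not known to exist contradicts the
# combined predicates, so the database refuses it.
try:
    conn.execute("INSERT INTO sale VALUES ('SN-9999', 'Bob')")
    unknown_device_rejected = False
except sqlite3.IntegrityError:
    unknown_device_rejected = True
print(unknown_device_rejected)
```

Nothing in the schema refers to classes, fields, or any particular revision of the application that inserts the rows.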
The point of all this is not that users of ORMs are necessarily wrong. My point is that there is no “holy grail” ORM solution that will solve these problems for you. Often, ORMs get in the way when you try to solve those problems yourself. Recognize the limits of an ORM and work within them, as long as doing so doesn’t sacrifice data integrity.