Why Normalization Failed to Become the Ultimate Guide for Database Designers?
While trying to find marshall's claim that Alberto Mendelzon says the universal relation is an idea re-invented once every 3 years (and later finding a quote by Jeffrey Ullman that the universal relation is re-invented 3 times a year), I stumbled across a very provocative rant by a researcher/practitioner: Why Normalization Failed to Become the Ultimate Guide for Database Designers? by Martin Fotache. It shares an interesting wealth of experience and knowledge about logical design. The author is obviously well-read and unlike usual debates I've seen about this topic, presents the argument thoroughly and comprehensively.
The abstract is:
The body of the paper presents an explanation for why practitioners have rejected normalization. The author also shares his opinion on potentially underexplored ideas as well, drawing from an obviously well-researched depth of knowledge. In recent years, some researchers, such as Microsoft's Pat Helland, have even said Normalization is for sissies (only to further this with later formal publications such as advocating we should be Building on Quicksand). Yet, the PLT community is pushing for the exact opposite. Language theory is firmly rooted in formal grammars and proven correct 'tricks' for manipulating and using those formal grammars; it does no good to define a language if it does not have mathematical properties ensuring relaibility and repeatability of results. This represents and defines real tension between systems theory and PLT.
I realize this paper focuses on methodologies for creating model primitives, comparing mathematical frameworks to frameworks guided by intuition and then mapped to mathematical notions (relations in the relational model), and some may not see it as PLT. Others, such as Date, closely relate understanding of primitives to PLT: Date claims the SQL language is to blame and have gone to the lengths of creating a teaching language, Tutorial D, to teach relational theory. In my experience, nothing seems to effect lines of code in an enterprise system more than schema design, both in the data layer and logic layer, and often an inverse relationship exists between the two; hence the use of object-relational mapping layers to consolidate inevitable problems where there will be The Many Forms of a Single Fact (Kent, 1988). Mapping stabilizes the problem domain by labeling correspondances between all the possible unique structures. I refer to this among friends and coworkers as the N+1 Schema Problem, as there is generally 1 schema thought to be canonical, either extensionally or intensionally, and N other versions of that schema.
Question: Should interactive programming languages aid practitioners in reasoning about their bad data models, (hand waving) perhaps by modeling each unique structure and explaining how they relate? I could see several reasons why that would be a bad idea, but as the above paper suggests, math is not always the best indicator of what practitioners will adopt. It many ways this seems to be the spirit of the idea behind such work as Stephen Kell's interest in approaching modularity by supporting evolutionary compatibility between APIs (source texts) and ABIs (binaries), as covered in his Onward! paper, The Mythical Matched Modules: Overcoming the Tyranny of Inflexible Software Construction. Similar ideas have been in middleware systems for years and are known as wrapper architecures (e.g., Donâ€™t Scrap It, Wrap It!), but haven't seen much PLT interest that I'm aware of; "middleware" might as well be a synonym for Kell's "integration domains" concept.
Active forum topics
New forum topics