Login or Sign Up to become a member!
LessThanDot Site Logo

LessThanDot

A Technical Community for IT Professionals

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Highly Rated Users

Forum
No Posts Rated

Top 50
Given
Received

Links

Wiki
Blog

Forum Statistics

Users
Members:
1547
Members Online:
3
Guests Online:
7

Total Post History
Posts:
80398
Topics:
18391

7-Day Post History
New Posts:
21
New Topics:
5
Active Topics:
9

Our newest member
gauthya

Other

FAQ
All times are UTC [ DST ]

Google Ads

Thoughts About Database Organization

Microsoft SQL Server
Please wait...

Thoughts About Database Organization

Postby Emtucifor on Sat Mar 05, 2011 1:22 am

To a user, database indexes are sort of behind-the-scenes type things that in many ways function transparently.

But in reality, an index is just a table with the special case that it has a one-to-one relationship with the rows (or a subset of the rows) in another table, and has the special requirement of matching rows/columns between the tables being forced to have the same values (you can almost think of them as reference types, at least in the way they function).

So, what if there was a database that had all indexes exposed as their own tables?

For some reason what made me think of this is a little settings table I created:

AppSettingID tinyint
Descr varchar(128)
Value nvarchar(200)
ValueTypeID tinyint

The description and ValueTypeID are not needed during value lookup time, just during data setup time (where valuetypeid enforces number or text or whatever) or when someone is reviewing the settings such as in a report.

So I made the table clustered on AppSettingID, unique on Descr, and then added a nonclustered unique index on AppSettingID with the Value column included.

So most requests to the table for actually looking up setting values for application function will hit the nonclustered index only. I realize that I probably will never have more than a handful of app settings anyway, so it makes little difference, but the concepts are still interesting to me.

The interesting part was that in a sense, I'm treating the nonclustered index as the "main" table, really. Then, I have a separate one-to-one table with the extra information that isn't looked up so often, only the engine enforces the data so there can never be any denormalization.

This got me thinking about what it could be like if you could make the "clustered" index be what is now the nonclustered index, and the additional information actually be in a separate table, with some kind of referential constraint that forces them to act as if they were a single (though horizontally split) table.

Being able to move seldom-used columns out of a table into a second, lower-priority table, but have the table and all its segments (partitions/parts/subtables) still function as if they were a single table would be immensely valuable.

Some adaptation of this could also support supertype/subtype relationships, where without a view and without triggers and special management to enforce it, rows in two different tables could be treated as a single row in one table.

For example:

Supertype Table: Party
PartyID
Name
CreateDate

Subtype Table: Person

PartyID
SSN
Gender

Subtype Table: Company
PartyID
CEOPartyID
TIN

Normally, to see all the attributes of a person we'd have to do this:

  1. SELECT
  2.    P.PartyID,
  3.    P.Name,
  4.    N.SSN,
  5.    N.Gender
  6. FROM
  7.    Party P
  8.    INNER JOIN Person N ON P.PartyID = N.PartyID

And we'd need to do something similar for company. We also have the problem of having to manually create a Party row with its values first, then insert a row to one of the subtype tables with the PartyID. To delete, we'd have to do so in reverse order or have CASCADE DELETE on. And to update, we'd have to go through more conniptions, even with CASCADE UPDATE on we couldn't update the child table's value.

Instead, if columns could be moved to "lower-priority" tables, if rows were not created in these tables at all when all values in them are NULL, and if referential constraints could refer to these tables as well as "real" tables, we could do something much more elegant. Notice here that I'm implicitly creating Party and Company tables in this script:

  1. CREATE TABLE Party (
  2.    PartyID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
  3.    Name NVARCHAR(128) NOT NULL,
  4.    SSN CHAR(11) NULL INSIDE Person REFERENCING Party,
  5.    Gender CHAR(1) NOT NULL INSIDE Person REFERENCING Party,
  6.    CEOPartyID NULL INSIDE Company REFERENCING Party CONSTRAINT FK_Party_CEOPartyID FOREIGN KEY REFERENCES Person(PartyID),
  7.    TIN CHAR(11) NULL INSIDE Company REFERENCING Party
  8. )

Behind the scenes, of course, Party and Company have the PartyID of the Party hidden in them.

Then, SELECT * FROM Party would yield all columns in the table and the subtables. If SSN and Gender were NULL for one row, there would be no row for this in the Person subtable at all.

SELECT PartyID, Name, SSN, Gender FROM Party would be equivalent to my above query and would have to do a join.

SELECT PartyID, SSN, Gender FROM Party would be equivalent to SELECT * FROM Person and would only have to hit the Person table.

Do you see that there's a foreign key constraint in there referring to Person.PartyID (not Party.PartyID)? This is a very substantial part of why people use subtype tables in the first place, so that with referential constraints we can restrict to only Parties of a certain type.

Another thing that could be done is have sub-subtype tables. If more columns were added to Party INSIDE Employee REFERENCING Person we could add yet more details that are only specific to some people, but not all in the database. Careful column selection would enable the engine to only read the tables it needs to read from to do its job, and we could do even more selective referential integrity against Employee.PartyID.

Yet, in cases where we just want any old person or company (such as when placing an order) we can use a foreign key to Party.PartyID.

One thing to work out here, though, is whether the relationships between the various peer subtables are mutually exclusive or not. Could a Party be both a Person and a Company? Actually, that may be true, but in some supertype/subtype relationships the subtypes would be mutually exclusive. Having some way to distinguish this would be useful.

Perhaps an advance creation of a named container would be required, so that instead of just being able to implicitly create a Person segment with INSIDE Person we'd have to do something like this:

  1. ALTER TABLE Party ADD SUBTYPEGROUP PartyType
  2. ALTER TABLE Party ADD SUBTYPE Person AS PartyType
  3. ALTER TABLE Party ADD SUBTYPE Company AS PartyType
  4. ALTER TABLE Party ADD SSN CHAR(11) NOT NULL INSIDE Person
  5. ALTER TABLE Party ADD TIN CHAR(1) NOT NULL INSIDE Company

Note that SSN NOT NULL means not null is required only if a row exists in Person at all. If all the columns of Person are NULL this means the values don't exist, rather than that they are present with NULLs. So I supposed it could be confusing to have a NULL in a NOT NULL column, but we can already get this when we do a LEFT JOIN, anyway. For a composite table like this, people would just have to learn what it means.

Then, a constraint to make the PartyTypes mutually exclusive would look like this:

  1. ALTER TABLE Party ADD CONSTRAINT EX_PartyID_PartyType EXCLUSIVE (PartyID, PartyType)


Additionally, queries with explicit reference to the PartyType column would be optimized:

  1. SELECT * FROM Party WHERE PartyType IS Person
  2. SELECT * FROM Party WHERE PartyType IS Person OR PartyType IS Company
  3. SELECT * FROM Party WHERE PartyType ISIN (Person, Company)

Of course there would be a lot of details to work out, but I love thinking about possibilities. :)

Also, I almost forgot to say that INSERTs, UPDATEs and DELETEs would be really easy. Yeah, you couldn't insert to the Person table directly, but that's not a problem:

  1. INSERT Party(Name, SSN, Gender) VALUES ('Joe', '123-45-6789', 'M')

And boom, you've got your Person subtype row already created. If you try to insert values to SSN and TIN and you have a unique partytype constraint going, you'd get an error about duplicate PartyTypes and the insert would fail just the same as any other constraint violation.

You could convert a person to a company like so:

  1. UPDATE Party SET SSN = NULL, Gender = NULL, TIN = SSN WHERE PartyID = 3

This would be allowed and would delete the row from Person, add one to Company, and leave Party unchanged.

With some polish, would this be worthy of a blog post?
God cries a little bit every time someone builds a database.
User avatar
Emtucifor
Guru
Guru
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030
 
Posts: 2835
Joined: Fri May 30, 2008 9:30 pm
Location: Bellingham, WA
Unrated

Re: Thoughts About Database Organization

Postby SQLDenis on Sat Mar 05, 2011 10:24 pm

Interesting concept, perhaps you should make this into a blog post(like you mentioned) to get some feedback
User avatar
SQLDenis
LTD Admin
LTD Admin
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
 
Posts: 21712
Joined: Wed Oct 10, 2007 6:43 pm
Location: Princeton, New Jersey, USA,World, Solar System, Milky Way, Universe and Beyond
Unrated

Re: Thoughts About Database Organization

Postby Emtucifor on Sat Mar 05, 2011 11:40 pm

It's really vertical partitioning, now that I think about it...
God cries a little bit every time someone builds a database.
User avatar
Emtucifor
Guru
Guru
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030
 
Posts: 2835
Joined: Fri May 30, 2008 9:30 pm
Location: Bellingham, WA
Unrated

Re: Thoughts About Database Organization

Postby SQLDenis on Sun Mar 06, 2011 12:06 am

Emtucifor wrote:It's really vertical partitioning, now that I think about it...



which also could (or could not) be called further normalization of a normalized set of tables
User avatar
SQLDenis
LTD Admin
LTD Admin
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
LTD Gold - Rating: 3454LTD Gold - Rating: 3454LTD Gold - Rating: 3454
 
Posts: 21712
Joined: Wed Oct 10, 2007 6:43 pm
Location: Princeton, New Jersey, USA,World, Solar System, Milky Way, Universe and Beyond
Unrated

Re: Thoughts About Database Organization

Postby Thirster42 on Mon Mar 07, 2011 4:17 pm

tldr: eric makes my brian hurt trying to understand him sometimes
Ask three DBAs a question, and you’ll get four different answers.

-Brent Ozar
User avatar
Thirster42
Guru
Guru
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
 
Posts: 4639
Joined: Tue Nov 11, 2008 5:49 pm
Location: Jefferson City, Mo
Unrated

Re: Thoughts About Database Organization

Postby Emtucifor on Mon Mar 07, 2011 6:30 pm

Ouch! Sorry to make your brian hurt. There are times I wonder if I even have a brian.
God cries a little bit every time someone builds a database.
User avatar
Emtucifor
Guru
Guru
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030LTD Gold - Rating: 1030
LTD Gold - Rating: 1030
 
Posts: 2835
Joined: Fri May 30, 2008 9:30 pm
Location: Bellingham, WA
Unrated

Re: Thoughts About Database Organization

Postby Thirster42 on Mon Mar 07, 2011 6:51 pm

Emtucifor wrote:Ouch! Sorry to make your brian hurt. There are times I wonder if I even have a brian.


<O_O>
Ask three DBAs a question, and you’ll get four different answers.

-Brent Ozar
User avatar
Thirster42
Guru
Guru
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
LTD Silver - Rating: 517LTD Silver - Rating: 517LTD Silver - Rating: 517
 
Posts: 4639
Joined: Tue Nov 11, 2008 5:49 pm
Location: Jefferson City, Mo
Unrated

Re: Thoughts About Database Organization

Postby damber on Mon Mar 07, 2011 7:44 pm

Thirster42 wrote:
Emtucifor wrote:Ouch! Sorry to make your brian hurt. There are times I wonder if I even have a brian.


<O_O>

I swapped my brian for a geoff - he's less sensitive to this kind of thing, so less complaints all round - and brian is now leading a better life because of it.
a smile is worth a thousand kind words, so smile, it's easy! :-)


CODE: $5
WORKING CODE: $500
PROPERLY DESIGNED & WORKING CODE: Priceless
User avatar
damber
LTD Admin
LTD Admin
LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660
LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660LTD Silver - Rating: 660
 
Posts: 3134
Joined: Tue Oct 09, 2007 1:48 pm
Location: North Wales, UK
Unrated