Question

I need to store contact information for users. I want to present this data on the page as an hCard and downloadable as a vCard. I'd also like to be able to search the database by phone number, email, etc.

What do you think is the best way to store this data? Since users could have multiple addresses, etc., complete normalization would be a mess. I'm thinking about using XML, but I'm not familiar with querying XML db fields. Would I still be able to search for users by contact info?

I'm using SQL Server 2005, if that matters.


Solution

Consider two tables for People and their addresses:

People (pid, prefix, firstName, lastName, suffix, DOB, ... primaryAddressTag )

AddressBook (pid, tag, address1, address2, city, stateProv, postalCode, ... )

The primary key (which uniquely identifies each and every row) of People is pid. The PK of AddressBook is the composite of pid and tag: (pid, tag).

Some example data:

People

1, Kirk

2, Spock

AddressBook

1, home, '123 Main Street', Iowa

1, work, 'USS Enterprise NCC-1701'

2, other, 'Mt. Selaya, Vulcan'

In this example, Kirk has two addresses: one 'home' and one 'work'. One of those two can (and should) be noted as a foreign key (like a cross-reference) in People in the primaryAddressTag column.

Spock has a single address with the tag 'other'. Since that is Spock's only address, the value 'other' ought to go in the primaryAddressTag column for pid=2.

This schema has the nice effect of preventing the same person from duplicating any of their own addresses by accidentally reusing tags, while at the same time allowing all other people to use any address tags they like.

Further, with FK references in primaryAddressTag, the database system itself will enforce the validity of the primary address tag (via something we database geeks call referential integrity) so that your -- or any -- application need not worry about it.
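
To make the two guarantees above concrete, here is a minimal sketch of the People/AddressBook schema. It is an assumption-laden stand-in, not the answer's actual DDL: it uses SQLite through Python's standard `sqlite3` module rather than SQL Server 2005, trims the columns down to a few, and leaves primaryAddressTag nullable so a person can be inserted before any address exists. On SQL Server the referenced table has to exist first, so the People constraint would probably be added with ALTER TABLE once both tables are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE People (
    pid               INTEGER PRIMARY KEY,
    lastName          TEXT NOT NULL,
    primaryAddressTag TEXT,  -- NULL until the person has an address
    -- the primary address must be one of this person's own AddressBook rows
    FOREIGN KEY (pid, primaryAddressTag) REFERENCES AddressBook (pid, tag)
);
CREATE TABLE AddressBook (
    pid      INTEGER NOT NULL REFERENCES People (pid),
    tag      TEXT    NOT NULL,
    address1 TEXT    NOT NULL,
    PRIMARY KEY (pid, tag)   -- one row per person per tag
);
""")

conn.execute("INSERT INTO People (pid, lastName) VALUES (1, 'Kirk')")
conn.execute("INSERT INTO AddressBook VALUES (1, 'home', '123 Main Street')")
conn.execute("INSERT INTO AddressBook VALUES (1, 'work', 'USS Enterprise NCC-1701')")
conn.execute("UPDATE People SET primaryAddressTag = 'home' WHERE pid = 1")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Reusing Kirk's 'home' tag violates the composite primary key.
duplicate_tag_rejected = rejected(
    "INSERT INTO AddressBook VALUES (1, 'home', 'Somewhere else')")
# Pointing the primary address at a tag Kirk does not have violates the FK.
unknown_primary_rejected = rejected(
    "UPDATE People SET primaryAddressTag = 'beach' WHERE pid = 1")
```

Both rejections come from the database itself, which is the point of the design: the application never has to check tags by hand.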

OTHER TIPS

Why would complete normalization "be a mess"? This is exactly the kind of thing that normalization makes less messy.

Don't be afraid of normalizing your data. Normalization, like John mentions, is the solution not the problem. If you try to denormalize your data just to avoid a couple joins, then you're going to cause yourself serious trouble in the future. Trying to refactor this sort of data down the line after you have a reasonable size dataset WILL NOT BE FUN.

I strongly suggest you check out Highrise from 37signals. It was recently recommended to me when I was looking for an online contact manager. It does so much right. Actually, my only objection so far with the service is that I think the paid versions are too expensive -- that's all.

As things stand today, I do not fit into a flat address profile. I have 4-5 e-mail addresses that I use regularly, 5 phone numbers, 3 addresses, several websites and IM profiles, all of which I would include in my contact profile. If you're starting to build a contact management system now and you're unencumbered by architectural limitations (think Gmail contacts being keyed to a single email address), then do your users a favor and make your contact structure as flexible (normalized) as possible.

Cheers, -D.

I'm aware of SQLite, but that doesn't really help - I'm talking about figuring out the best schema (regardless of the database) for storing this data.

Per John, I don't see what the problem with a classic normalised schema would be. You haven't given much information to go on, but you say that there's a one-to-many relationship between users and addresses, so I'd plump for a bog standard solution with a foreign key to the user in the address relation.

If you assume each user has one or more addresses, telephone numbers, etc., you could have a 'Users' table and an 'Addresses' table (containing a primary key and then a non-unique reference to Users), and the same for phone numbers. Allowing multiple rows with the same UserID foreign key makes querying 'all addresses for user X' quite simple.
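
As a rough illustration of that layout, the sketch below builds Users, Addresses and Phones tables and runs both kinds of lookup: all addresses for a user, and users by phone number. It uses SQLite via Python's `sqlite3` module as a stand-in for whatever database you actually run; the column names, the index name and the phone numbers are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    UserID INTEGER PRIMARY KEY,
    Name   TEXT NOT NULL
);
CREATE TABLE Addresses (
    AddressID INTEGER PRIMARY KEY,
    UserID    INTEGER NOT NULL REFERENCES Users (UserID),  -- non-unique FK
    Street    TEXT NOT NULL
);
CREATE TABLE Phones (
    PhoneID INTEGER PRIMARY KEY,
    UserID  INTEGER NOT NULL REFERENCES Users (UserID),    -- non-unique FK
    Number  TEXT NOT NULL
);
-- hypothetical index so that lookups by number do not scan the table
CREATE INDEX idx_phones_number ON Phones (Number);

INSERT INTO Users VALUES (1, 'Kirk'), (2, 'Spock');
INSERT INTO Addresses (UserID, Street) VALUES
    (1, '123 Main Street'), (1, 'USS Enterprise NCC-1701'), (2, 'Mt. Selaya, Vulcan');
INSERT INTO Phones (UserID, Number) VALUES
    (1, '555-0100'), (1, '555-0101'), (2, '555-0199');
""")


def addresses_for(user_id):
    """All addresses for user X: a single filter on the foreign key."""
    rows = conn.execute(
        "SELECT Street FROM Addresses WHERE UserID = ? ORDER BY AddressID",
        (user_id,))
    return [street for (street,) in rows]


def users_with_phone(number):
    """Search users by contact info: join from the detail table back to Users."""
    rows = conn.execute(
        "SELECT u.Name FROM Users u JOIN Phones p ON p.UserID = u.UserID "
        "WHERE p.Number = ?",
        (number,))
    return [name for (name,) in rows]
```

An Emails table would follow exactly the same pattern as Phones.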

I don't have a script, but I do have MySQL that you can use. Before that I should mention that there seem to be two logical approaches to storing vCards in SQL:

  1. Store the whole card and let the database search (possibly) huge text strings, then process them in another part of your code or even client side, e.g.:

    CREATE TABLE IF NOT EXISTS vcards (
    name_or_letter varchar(250) NOT NULL,
    vcard text NOT NULL,
    timestamp timestamp default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
    PRIMARY KEY (name_or_letter)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

This is probably easy to implement (depending on what you are doing with the data), though your searches are going to be slow if you have many entries. If this is just for you then this might work (though if it is any good, then it is never just for you). You can then process the vCard client side or server side using some beautiful module that you share (or that someone else shared with you).
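
To show what searching in this first approach looks like, here is a small sketch using SQLite through Python's `sqlite3` module (an assumption made so the example is runnable; the table above is MySQL). The sample card and the helper `find_cards_containing` are invented for illustration. The search is a substring match over the whole card text, and a pattern with a leading wildcard generally cannot use an ordinary index, which is where the slowness comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vcards (name_or_letter TEXT PRIMARY KEY, vcard TEXT NOT NULL)")

# A made-up vCard 3.0 body stored as one text value.
card = "\r\n".join([
    "BEGIN:VCARD",
    "VERSION:3.0",
    "N:Kirk;James;;;",
    "FN:James Kirk",
    "EMAIL;TYPE=INTERNET:kirk@example.com",
    "END:VCARD",
]) + "\r\n"
conn.execute("INSERT INTO vcards VALUES (?, ?)", ("kirk", card))


def find_cards_containing(fragment):
    """Substring search over the raw card text (scans every stored card).

    Note: % and _ inside `fragment` are treated as LIKE wildcards here;
    a real implementation would want to escape them.
    """
    pattern = "%" + fragment + "%"
    rows = conn.execute(
        "SELECT name_or_letter FROM vcards WHERE vcard LIKE ?", (pattern,))
    return [name for (name,) in rows]
```

It also cannot tell an email address from the same text appearing in a note, since the database sees only one opaque string.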

  2. Break the card up into its parts and store those. I've watched vCard evolve and know that there is going to be some change at /some/ time in the future, so I use three tables.

The first is the card (this mostly links back to my existing tables; if you don't need this then yours can be a cut-down version). The second is the card definitions (which seem to be called the profile in vCard speak). The last is all the actual data for the cards.

Because I let DBIx::Class (yes, I'm one of those) do all of the database work, this three-table layout seems to work rather well for me. Obviously you can tighten up the types to match RFC 2426 more closely, but for the most part each piece of data is just a text string.

The reason that I don't normalize out the address from the person is that I already have an address table in my database and these three are just for non-user contact details.

    CREATE TABLE `vCards` (
      `card_id` int(255) unsigned NOT NULL AUTO_INCREMENT,
      `card_peid` int(255) DEFAULT NULL COMMENT 'link back to user table',
      `card_acid` int(255) DEFAULT NULL COMMENT 'link back to account table',
      `card_language` varchar(5) DEFAULT NULL COMMENT 'en en_GB',
      `card_encoding` varchar(32) DEFAULT 'UTF-8' COMMENT 'why use anything else?',
      `card_created` datetime NOT NULL,
      `card_updated` datetime NOT NULL,
      PRIMARY KEY (`card_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='These are the contact cards';

    CREATE TABLE vCard_profile (
      vcprofile_id int(255) unsigned NOT NULL AUTO_INCREMENT,
      vcprofile_version enum('rfc2426') DEFAULT 'rfc2426' COMMENT 'defaults to vCard 3.0',
      vcprofile_feature char(16) COMMENT 'FN to CATEGORIES',
      vcprofile_type enum('text','bin') DEFAULT 'text' COMMENT 'if it is too large for vcd_value then use vcd_bin',
      PRIMARY KEY (`vcprofile_id`)
    ) COMMENT 'These are the valid types of card entry';

    -- The id column is left out so AUTO_INCREMENT assigns it
    -- (passing '' for an integer column is rejected in strict SQL mode).
    INSERT INTO vCard_profile (vcprofile_version, vcprofile_feature, vcprofile_type) VALUES
      ('rfc2426','FN','text'), ('rfc2426','N','text'), ('rfc2426','NICKNAME','text'), ('rfc2426','PHOTO','bin'),
      ('rfc2426','BDAY','text'), ('rfc2426','ADR','text'), ('rfc2426','LABEL','text'), ('rfc2426','TEL','text'),
      ('rfc2426','EMAIL','text'), ('rfc2426','MAILER','text'), ('rfc2426','TZ','text'), ('rfc2426','GEO','text'),
      ('rfc2426','TITLE','text'), ('rfc2426','ROLE','text'), ('rfc2426','LOGO','bin'), ('rfc2426','AGENT','text'),
      ('rfc2426','ORG','text'), ('rfc2426','CATEGORIES','text'), ('rfc2426','NOTE','text'), ('rfc2426','PRODID','text'),
      ('rfc2426','REV','text'), ('rfc2426','SORT-STRING','text'), ('rfc2426','SOUND','bin'), ('rfc2426','UID','text'),
      ('rfc2426','URL','text'), ('rfc2426','VERSION','text'), ('rfc2426','CLASS','text'), ('rfc2426','KEY','bin');

    CREATE TABLE vCard_data (
      vcd_id int(255) unsigned NOT NULL AUTO_INCREMENT,
      vcd_card_id int(255) NOT NULL,
      vcd_profile_id int(255) NOT NULL,
      vcd_prof_detail varchar(255) COMMENT 'work,home,preferred,order for e.g. multiple email addresses',
      vcd_value varchar(255),
      vcd_bin blob COMMENT 'for when varchar(255) is too small',
      PRIMARY KEY (`vcd_id`)
    ) COMMENT 'The actual vCard data';

This isn't the best SQL but I hope that helps.
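
For anyone wondering how the three tables get used together, here is a hedged sketch of searching by one field and turning a card's rows back into vCard text. It is not the answer author's code: it uses SQLite through Python's `sqlite3` module instead of MySQL and DBIx::Class, keeps only a handful of columns, and the helpers `add_value`, `find_cards` and `render_vcard` are hypothetical names. The output is deliberately simplified (no value escaping, no line folding, no binary fields, and no check that required properties such as N are present), so treat it as a starting point rather than a conforming RFC 2426 writer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vCards (card_id INTEGER PRIMARY KEY);
CREATE TABLE vCard_profile (
    vcprofile_id      INTEGER PRIMARY KEY,
    vcprofile_feature TEXT NOT NULL
);
CREATE TABLE vCard_data (
    vcd_id          INTEGER PRIMARY KEY,
    vcd_card_id     INTEGER NOT NULL REFERENCES vCards (card_id),
    vcd_profile_id  INTEGER NOT NULL REFERENCES vCard_profile (vcprofile_id),
    vcd_prof_detail TEXT,
    vcd_value       TEXT
);
INSERT INTO vCard_profile (vcprofile_feature) VALUES ('FN'), ('TEL'), ('EMAIL');
INSERT INTO vCards (card_id) VALUES (1);
""")


def add_value(card_id, feature, value, detail=None):
    """Store one property of a card, looking up the profile id by feature name."""
    conn.execute(
        "INSERT INTO vCard_data "
        "(vcd_card_id, vcd_profile_id, vcd_prof_detail, vcd_value) "
        "SELECT ?, vcprofile_id, ?, ? FROM vCard_profile "
        "WHERE vcprofile_feature = ?",
        (card_id, detail, value, feature))


add_value(1, "FN", "James Kirk")
add_value(1, "EMAIL", "kirk@example.com", "work")
add_value(1, "TEL", "555-0100", "home")


def find_cards(feature, value):
    """Return the ids of cards that have this exact value for this feature."""
    rows = conn.execute(
        "SELECT d.vcd_card_id FROM vCard_data d "
        "JOIN vCard_profile p ON p.vcprofile_id = d.vcd_profile_id "
        "WHERE p.vcprofile_feature = ? AND d.vcd_value = ?",
        (feature, value))
    return [card_id for (card_id,) in rows]


def render_vcard(card_id):
    """Assemble a simplified vCard 3.0 text from the card's text rows."""
    rows = conn.execute(
        "SELECT p.vcprofile_feature, d.vcd_prof_detail, d.vcd_value "
        "FROM vCard_data d "
        "JOIN vCard_profile p ON p.vcprofile_id = d.vcd_profile_id "
        "WHERE d.vcd_card_id = ? ORDER BY d.vcd_id",
        (card_id,))
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    for feature, detail, value in rows:
        name = feature if detail is None else f"{feature};TYPE={detail}"
        lines.append(f"{name}:{value}")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
```

Because each property is its own row, a search by email or phone is an equality match on vcd_value rather than a scan through whole cards.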

Licensed under: CC-BY-SA with attribution