Question

Say I have a large table that holds the user's info and another table that holds several locations. Then I use another table that holds the user_id and the location_id.

In order to retrieve the data I have to use a LEFT JOIN query. Doesn't that make retrieval slower than having it all in one table? E.g., I could store the location as text in the same table.

EDIT: Here is an example.

CREATE TABLE `user` (
  `id` int(11) NOT NULL,
  `name` varchar(45) DEFAULT NULL,
  `gender` enum('M','F') DEFAULT NULL
);

CREATE TABLE `user_location` (
  `user_id` int(11) NOT NULL,
  `location_id` int(11) NOT NULL
);

CREATE TABLE `location` (
  `id` int(11) NOT NULL,
  `location` varchar(45),
  `parent_id` varchar(45) 
);

Note: Please assume that all related fields are properly indexed between them.

Edit: I currently have a large database with users that retrieve their location via a junction table as described above. I was asked to optimize the database because the search results are slow. I've added memcache and it improved significantly but now I am just wondering about Left Joins.

For example, the current query is something like this:

SELECT * FROM user
LEFT JOIN user_location
ON user_location.user_id = user.id
LEFT JOIN location
ON location.id = user_location.location_id;

And that is just to get the location. They have several other fields that are retrieved through junctions and they are all needed to view a user's profile. We have phone numbers, addresses, passwords, D.O.B and many others all in different tables.

In order for me to create a page for the user profile I have to send the server a large query. Now after the first time it gets cached and it's fine. But I was just wondering why would someone build their database like that?


Solution

If you put everything in one table, you will have a bigger, redundant table.

If all the tables are properly indexed, the three-table solution will be fast, because each query reads only a small number of rows.
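One way to check this on your own data is to run the profile lookup for a single user through EXPLAIN. The following is only a sketch: it assumes the three tables from the question, a primary key on user.id and location.id, and an index on user_location that starts with user_id. None of those indexes are shown in the question, and the id value 42 is made up.

```sql
-- Assumes PRIMARY KEY on user.id and location.id, plus an index on
-- user_location beginning with user_id (assumptions, not in the question).
EXPLAIN
SELECT u.id, u.name, l.location
FROM `user` AS u
LEFT JOIN user_location AS ul ON ul.user_id = u.id
LEFT JOIN location AS l ON l.id = ul.location_id
WHERE u.id = 42;   -- hypothetical user id
```

If the indexes are in place, the rows column of the EXPLAIN output should show a small estimate for each table rather than a full table scan.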

OTHER TIPS

Junction tables are a very standard practice in relational database design. It's covered in database 101. If you have a many-to-many relationship between two entities, the standard way to represent them is with three tables.

Two of the tables are entity tables, with a primary key. A junction table lies between them (logically) and contains two foreign keys, one that references each entity table. Often, these two foreign keys will be the only two columns in the junction table.
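A minimal sketch of that three-table layout in MySQL might look like the following. The column types, AUTO_INCREMENT, and the explicit FOREIGN KEY constraints are assumptions for illustration; the question's schema declares neither keys nor constraints, and the entity tables are trimmed to a couple of columns.

```sql
-- Entity tables: each has its own primary key.
CREATE TABLE `user` (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  name VARCHAR(45) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE location (
  id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
  location VARCHAR(45) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Junction table: one foreign key per entity table and nothing else.
-- Column types must match the referenced primary keys.
CREATE TABLE user_location (
  user_id     INT UNSIGNED NOT NULL,
  location_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (user_id, location_id),
  FOREIGN KEY (user_id)     REFERENCES `user` (id),
  FOREIGN KEY (location_id) REFERENCES location (id)
) ENGINE=InnoDB;
```

The composite primary key also prevents the same user/location pair from being stored twice.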

I can't understand why anyone would ask whether or not this is a good practice, unless they have never covered database 101.

"Please assume that all related fields are properly indexed between them." No, I won't do that. I see too many users who have never heard of "composite" indexes, much less understand their importance.

In particular, you should have:

CREATE TABLE user_location(
    # No surrogate id for this table
    user_id     MEDIUMINT UNSIGNED NOT NULL,   -- For JOINing to one table
    location_id MEDIUMINT UNSIGNED NOT NULL,   -- For JOINing to the other table
    # Include other fields specific to the 'relation'
    PRIMARY KEY(user_id, location_id),            -- When starting with user
    INDEX      (location_id, user_id)             -- When starting with location
) ENGINE=InnoDB;
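
As a rough illustration of why both orderings matter, here are two lookups that go through the junction table in opposite directions. The `user` and location tables are the ones from the question (assumed to have primary keys on id), and the literal ids are hypothetical.

```sql
-- Starting from a user: the optimizer can use PRIMARY KEY (user_id, location_id).
SELECT l.location
FROM user_location AS ul
JOIN location AS l ON l.id = ul.location_id
WHERE ul.user_id = 42;        -- hypothetical user id

-- Starting from a location: the optimizer can use INDEX (location_id, user_id).
SELECT u.name
FROM user_location AS ul
JOIN `user` AS u ON u.id = ul.user_id
WHERE ul.location_id = 7;     -- hypothetical location id
```

In both cases the index contains every junction-table column the query touches, so the junction rows can be read from the index alone.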

Further notes are in my blog.

Your approach to the database is wrong. A table is not a bag of fields where you add or remove columns without any criteria; a database structure is the result of an analysis. This part of the database comes from specific requirements: a user lives in one or more locations, and one or more users can live in the same location. A user is described by a name and a gender. A location is identified by an id. From these requirements you identify two entities: users and locations. Since the association between those entities is many-to-many, transforming the conceptual ER schema into a relational one gives you (mathematically) a specific table for user locations, composed of (at least) two foreign keys that point to the two entities. Since the name can't be used as a primary key (because people can have the same name), you use an id (probably auto-incremented).

If you have an EAV, you just have to INSERT a default location with location_id = 0 or 1 whose description is Undefined or Not Set in your locations table. Then create a trigger that inserts a row with the user_id and that default location_id into the junction table whenever a user is added.

So you don't need a LEFT JOIN, which makes the search slow; a plain JOIN is enough. If the user has location_id = 0 or 1 (whichever you chose), the query returns the default location name.
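
A sketch of that idea in MySQL, using the table names from the question, could look like this. The trigger name, the 'Not Set' label, and the choice of id 1 for the placeholder are all made up for illustration.

```sql
-- Hypothetical placeholder row; id 1 is an arbitrary choice.
INSERT INTO location (id, location) VALUES (1, 'Not Set');

-- Give every newly inserted user the placeholder location.
CREATE TRIGGER user_default_location
AFTER INSERT ON `user`
FOR EACH ROW
  INSERT INTO user_location (user_id, location_id) VALUES (NEW.id, 1);

-- Users created after the trigger exists always have at least one
-- junction row, so an inner JOIN no longer drops them.
SELECT u.name, l.location
FROM `user` AS u
JOIN user_location AS ul ON ul.user_id = u.id
JOIN location AS l ON l.id = ul.location_id;
```

Users that already exist without a location would need a one-off backfill, and the placeholder row should be removed once a real location is assigned.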

By the way, LEFT JOIN performance depends on your indexes. If you have indexes on those fields, I don't see a problem, assuming your users table isn't big.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange