Question

I'm enrolled in DBM/BI certificate program (crash course more like) and I decided to embark on an independent project to sort of implement everything i'm learning in real time. Long story short, Ill be analyzing data (boxofficemojo.com) about the top grossing 130 movies from the last 13 years ( using MySQL server/workbench. ). First i'd like to map out a schema and then do some data mining/visualization. Here's how i've split it up so far:

"Movies"
 Movie_ID (Primary )
 Dom_Revenue
 Int_Revenue
 OpWe_Revenue
 Budget


"Rating"
Rating_ID (P)
Rating

"Release"
Release_ID (P)
Year
Month
Day
Movie_ID (F)

"Cast"
Director_Gender (P)
Lead_Gender (P)
Director_Name
Director_Name
Movie_ID (F)

"Studio"
Studio_ID (P)
Studio_Name

and these are my relationships so far:

rating to movies - one to many ( many movies can be rated R , a movie can only have 1 rating )
release to movies - one to many ( many movies can be released on the same weekend, a movie can only be released once)
cast to movies - one to many (directors/actors can make many movies, a movie can only have one cast)
studio to movies - many to many (movies can be attached to more than one studio, a studio can make more than one movie)

I know the schema is most likely not 100% correct so should i include the primary keys from all the other tables as foreign keys in the "movies" table? and how are my relationships?

thanks in advance

Was it helpful?

Solution

This is related to the first answer by Leo but I'll be more specific and I add more observations.

First, Release attributes are functionally dependent on Movie_ID (or Movies in general) so it should not be a separate Entity.

Second, and in relation to the first, you have Year, Month and Day in your Release Entity why not make it as Release_Date which has Year, Month and Day anyway? Then you could make again your Release attributes as part of your Movie.

Third, and in relation to the first why not add a Movie_Title field?

So, in all-in-all then you could have the following schema:

"Movies"
Movie_ID (Primary )
Movie_Title
Dom_Revenue
Int_Revenue
OpWe_Revenue
Budget
Release_Date

You could easily query movies that are release in a certain Year like:

SELECT Movie_Title, Year(Release_Date) as Release_Year
FROM Movies
WHERE Year(Release_Date) = 2011

Or you could count it also by Year (or by Month)

SELECT Year(Release_Date) as Release_Year, COUNT(*) Number_of_Movies_in_a_Year
FROM Movies
GROUP BY Year(Release_Date)
ORDER BY Year(Release_Date)

Fourth, in your Cast entity you said "Directors/Actors can make many movies, a movie can only have one cast". But looking at your Cast you have a Movie attribute which is a FK (Foreign Key) from Movies and that means by the way that a Movie could have many Cast because the FK is always in the many side. And besides this entity is almost like a violation of the 4NF (Fourth Normal Form). So, the best way probably to do this is to make specialization in your Cast table and relate it to Movies table so that it would have One-to-Many relationship or a Cast or Director could have many movies. So, it would look like this:

 "Cast"
 Cast_ID (PK)
 Cast_Name
 Cast_Gender
 Cast_Type (values here could either be Director or Lead or could be simply letters like D or L)

And your Movies table could now be changed to like this:

"Movies"
Movie_ID (Primary )
Movie_Title
Dom_Revenue
Int_Revenue
OpWe_Revenue
Budget
Release_Date
Lead_ID (FK)
Cast_ID (FK)

Lastly, you said "movies can be attached to more than one studio, a studio can make more than one movie". A Many-to-many relationship usually has a bridge table to create the many-to-many relationship between entities. So, let's say you have a Studio_Movie entity/table as your bridge table then you will have like this:

"Studio_Movie"
Studio_ID (PK, FK1)
Movie_ID (PK, FK2)

OTHER TIPS

it looks ok for me.

I just think the "release" entity maybe a little bit overkill (what's the use to know what movies were released at the same time?) so I think it could just be a set of movie attributes.

And also your "cast" entity has two directors. Maybe you could normalize that and keep only 1 director (since movie 1<-->N director, it's just a matter of adding relationships)

About FKs, yes, you should add them. Your relationships look fine.

Good luck.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top