Question

I want to make a database where I record the guests of each episode of a TV show. I want to be able to compute the following statistics:

Which guest had the most appearances?

Which two (or three) guests appeared together the most?

How many unique guests did the show have each month/year?

What is the gender distribution among the guests?

Which show had the highest number of guests?

What is the average number of guests on each episode of the show?

I also collect the description of each episode, so I want to find the most common topic across episodes, which topics have which guests, etc. The topics are retrieved from keywords in the episode description. An episode has anywhere from 2 to 10 topics.

...and possibly more statistics that I find I want along the way.

However, how should I model it? Since each guest can have hundreds of appearances, I don't think a relational database where I map guest to episode like this is smart:

| Guest Name | | Episode appeared in |

The second column would end up holding many values.

Note that each episode can have ten to twenty guests as well.

Is a FACT table a better design, where I have the following:

|Date/Episode number | | Guest ID | 

together with a separate guest info table?


Solution

[Image: schema diagram, not preserved]

This is what I recommend for your requirements

OTHER TIPS

Given the requirements, and since at some point you may want additional information that isn't listed here, I would go with a fully normalized approach.
Three tables: Guests, Episodes, and Episode_Guests. If you want to do this for more than one show, add another table for Shows (or series).
As Paparazzi mentioned, the Guests table should contain sex and the Episodes table should contain a date. If you are tracking multiple shows, the Episodes table should also have a foreign key back to the Shows table.
The Episode_Guests table records every instance of a guest appearing on an episode, so all it needs is a foreign key to Guests and another to Episodes.
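To make the normalized layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (air_date, description and so on) and the sample rows are my own assumptions for illustration; only the table names come from the description above. It answers two of the questions: the guest with the most appearances and the average number of guests per episode.

```python
import sqlite3

# In-memory database; table names follow the answer, column names are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shows (show_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE guests (guest_id INTEGER PRIMARY KEY, name TEXT NOT NULL, sex TEXT);
CREATE TABLE episodes (
    episode_id  INTEGER PRIMARY KEY,
    show_id     INTEGER NOT NULL REFERENCES shows(show_id),
    air_date    TEXT NOT NULL,          -- ISO date, e.g. '2020-01-05'
    description TEXT
);
-- One row per guest appearance; the composite key prevents duplicates.
CREATE TABLE episode_guests (
    episode_id INTEGER NOT NULL REFERENCES episodes(episode_id),
    guest_id   INTEGER NOT NULL REFERENCES guests(guest_id),
    PRIMARY KEY (episode_id, guest_id)
);
""")

# Made-up sample data.
conn.execute("INSERT INTO shows VALUES (1, 'Example Show')")
conn.executemany("INSERT INTO guests VALUES (?, ?, ?)",
                 [(1, "Alice", "F"), (2, "Bob", "M"), (3, "Carol", "F")])
conn.executemany("INSERT INTO episodes VALUES (?, 1, ?, NULL)",
                 [(1, "2020-01-05"), (2, "2020-01-12"), (3, "2020-02-02")])
conn.executemany("INSERT INTO episode_guests VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 2)])

# Guest with the most appearances.
most_appearances = conn.execute("""
    SELECT g.name, COUNT(*) AS appearances
    FROM episode_guests eg
    JOIN guests g ON g.guest_id = eg.guest_id
    GROUP BY g.guest_id
    ORDER BY appearances DESC
    LIMIT 1
""").fetchone()

# Average guests per episode (episodes with no guests are not counted here).
avg_guests = conn.execute("""
    SELECT AVG(n)
    FROM (SELECT COUNT(*) AS n FROM episode_guests GROUP BY episode_id)
""").fetchone()[0]

print(most_appearances, avg_guests)
```

A guest with hundreds of appearances is simply hundreds of narrow rows in episode_guests, which is exactly what this kind of table is meant for.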

It is a simple many-to-many table:

episodeID 
guestID

The episode table should have a date

The guest table should have a sex
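As a rough sketch of how the many-to-many table supports the statistics in the question, the following uses Python's sqlite3 module. The table names (episode, guest, episode_guest), the date and sex column names, and the sample rows are assumptions for illustration. The pair query is a self-join of the link table on the same episode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE episode (episodeID INTEGER PRIMARY KEY, date TEXT NOT NULL);
CREATE TABLE guest   (guestID INTEGER PRIMARY KEY, name TEXT, sex TEXT);
CREATE TABLE episode_guest (
    episodeID INTEGER NOT NULL REFERENCES episode(episodeID),
    guestID   INTEGER NOT NULL REFERENCES guest(guestID),
    PRIMARY KEY (episodeID, guestID)
);
""")

# Made-up sample data.
conn.executemany("INSERT INTO episode VALUES (?, ?)",
                 [(1, "2020-01-05"), (2, "2020-01-12"), (3, "2020-02-02")])
conn.executemany("INSERT INTO guest VALUES (?, ?, ?)",
                 [(1, "Alice", "F"), (2, "Bob", "M"), (3, "Carol", "F")])
conn.executemany("INSERT INTO episode_guest VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (3, 2)])

# Pair of guests that appeared together most often.
# a.guestID < b.guestID counts each unordered pair once.
top_pair = conn.execute("""
    SELECT a.guestID, b.guestID, COUNT(*) AS together
    FROM episode_guest a
    JOIN episode_guest b
      ON a.episodeID = b.episodeID AND a.guestID < b.guestID
    GROUP BY a.guestID, b.guestID
    ORDER BY together DESC, a.guestID, b.guestID
    LIMIT 1
""").fetchone()

# Unique guests per month, using the episode date.
unique_per_month = conn.execute("""
    SELECT strftime('%Y-%m', e.date) AS month,
           COUNT(DISTINCT eg.guestID) AS unique_guests
    FROM episode_guest eg
    JOIN episode e ON e.episodeID = eg.episodeID
    GROUP BY month
    ORDER BY month
""").fetchall()

# Gender distribution over distinct guests.
by_sex = conn.execute(
    "SELECT sex, COUNT(*) FROM guest GROUP BY sex ORDER BY sex"
).fetchall()

print(top_pair, unique_per_month, by_sex)
```

For groups of three, the same idea extends with a third copy of the link table and a.guestID < b.guestID < c.guestID style conditions.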

The following table structure is what I would use.

It covers all the information you're indicating you capture, and allows scalability if your shows/episodes grow, without having to make many modifications.

Guest
-GuestId (PK)
-Gender
-GuestName (Optional, as long as you have a way of identifying ID-Name)

Show
-ShowID (PK)
-ShowName

Episode
-EpisodeID (PK)
-ShowID
-EpisodeDate
-TopicType (Used for a generalization/grouping of topics)
-TopicDescription

Appearances
-AppearanceID (Optional PK if Guest appearance should be logged 2+ per episode)
-EpisodeID (PK)
-GuestID (PK)

Then you simply add records to each table individually, ideally with foreign key constraints so that you cannot add an Appearance for a guest or episode, or an Episode for a show, that doesn't exist yet.
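Here is a small sketch of those constraints, assuming SQLite through Python's sqlite3 module (other databases enforce foreign keys by default; SQLite needs the PRAGMA shown). The try_insert helper and the sample rows are hypothetical, and the topic columns are left out for brevity. The composite primary key on Appearances corresponds to the variant without the optional AppearanceID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Guest (GuestID INTEGER PRIMARY KEY, Gender TEXT, GuestName TEXT);
CREATE TABLE Show  (ShowID INTEGER PRIMARY KEY, ShowName TEXT NOT NULL);
CREATE TABLE Episode (
    EpisodeID   INTEGER PRIMARY KEY,
    ShowID      INTEGER NOT NULL REFERENCES Show(ShowID),
    EpisodeDate TEXT NOT NULL
);
CREATE TABLE Appearances (
    EpisodeID INTEGER NOT NULL REFERENCES Episode(EpisodeID),
    GuestID   INTEGER NOT NULL REFERENCES Guest(GuestID),
    PRIMARY KEY (EpisodeID, GuestID)
);
""")

# Made-up sample data.
conn.execute("INSERT INTO Guest VALUES (1, 'F', 'Alice')")
conn.execute("INSERT INTO Show VALUES (1, 'Example Show')")
conn.execute("INSERT INTO Episode VALUES (1, 1, '2020-01-05')")


def try_insert(sql, params):
    """Hypothetical helper: return True if the insert succeeds,
    False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert_appearance = "INSERT INTO Appearances (EpisodeID, GuestID) VALUES (?, ?)"

ok_valid = try_insert(insert_appearance, (1, 1))           # guest and episode exist
ok_missing_guest = try_insert(insert_appearance, (1, 99))  # no such guest
ok_duplicate = try_insert(insert_appearance, (1, 1))       # same pair again
ok_missing_show = try_insert(
    "INSERT INTO Episode VALUES (?, ?, ?)", (2, 42, "2020-01-12")  # no such show
)

print(ok_valid, ok_missing_guest, ok_duplicate, ok_missing_show)
```

If a guest should be allowed to appear more than once in the same episode, the composite key would be replaced by the separate AppearanceID key mentioned above.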

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange