Question

I'm trying to create a database schema in which I can store statistics for a sports league where each game has team statistics.

I have the following:

Home Team | Away Team | Venue | Home stat #1 | Away stat #1 | Home stat #2 | Away stat #2 | ... | Home stat #n | Away stat #n

There are many more than two stats. For example, one stat may be goals and another may be shots on goal. The same categories are recorded for both the home and away teams, but the values are not correlated (e.g., there is no way to find the number of goals the home team scored by knowing how many the away team scored).

What is the best schema to store this in?

At the moment, I was thinking:

Teams(TeamID, TeamName)
Venues(VenueID, VenueName)
Games(GameID, HomeTeamID, AwayTeamID, VenueID)
Stats(GameID, TeamID, Stat#1, Stat#2, ... , Stat#n) 

This avoids having to duplicate each statistic for the home team and away team in different columns, which is what I'd have to do if I wanted to include everything in the 'Games' table. I'm unsure if this is good schema design though and would appreciate any feedback.

Solution

In the interest of actually answering your question, and not just commenting on it, here are some additional thoughts.

From your description this is how your tables look:

[Diagram: GameStats DB tables]

Personally, I try to use natural keys as much as I can, but I can see in this case the Games table would end up with a composite key of home_team_id, away_team_id, venue_id and an additional game_date to make sure a game row is unique.

This would then have the knock-on effect of needing a silly number of foreign keys in the Stats table; so stick with a surrogate key.

Using a surrogate game_id means it's possible to create duplicate games, so add a unique index on home_team_id, away_team_id, venue_id and the additional field game_date. Then you have a simpler design that still protects against duplicates.
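To make the surrogate key plus unique index idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The snake_case table names, the two example stat columns (goals, shots_on_goal), the sample rows and the add_game helper are all assumptions for illustration; only the key columns come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

# Table and column names are illustrative assumptions, not a fixed design.
conn.executescript("""
CREATE TABLE teams (
    team_id   INTEGER PRIMARY KEY,
    team_name TEXT NOT NULL UNIQUE
);
CREATE TABLE venues (
    venue_id   INTEGER PRIMARY KEY,
    venue_name TEXT NOT NULL UNIQUE
);
CREATE TABLE games (
    game_id      INTEGER PRIMARY KEY,  -- surrogate key
    home_team_id INTEGER NOT NULL REFERENCES teams(team_id),
    away_team_id INTEGER NOT NULL REFERENCES teams(team_id),
    venue_id     INTEGER NOT NULL REFERENCES venues(venue_id),
    game_date    TEXT    NOT NULL,
    -- the natural key, kept as a uniqueness guard against duplicate games
    UNIQUE (home_team_id, away_team_id, venue_id, game_date)
);
CREATE TABLE stats (
    game_id       INTEGER NOT NULL REFERENCES games(game_id),
    team_id       INTEGER NOT NULL REFERENCES teams(team_id),
    goals         INTEGER NOT NULL,   -- example stat columns (assumed)
    shots_on_goal INTEGER NOT NULL,
    PRIMARY KEY (game_id, team_id)    -- one stats row per team per game
);
""")

conn.executemany("INSERT INTO teams (team_id, team_name) VALUES (?, ?)",
                 [(1, "Home FC"), (2, "Away United")])
conn.execute("INSERT INTO venues (venue_id, venue_name) VALUES (1, 'Main Ground')")


def add_game(home_team_id, away_team_id, venue_id, game_date):
    """Insert a game and return its surrogate game_id (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO games (home_team_id, away_team_id, venue_id, game_date) "
        "VALUES (?, ?, ?, ?)",
        (home_team_id, away_team_id, venue_id, game_date),
    )
    return cur.lastrowid


game_id = add_game(1, 2, 1, "2024-01-15")

# Stats need only the single-column game_id as a foreign key.
conn.executemany(
    "INSERT INTO stats (game_id, team_id, goals, shots_on_goal) VALUES (?, ?, ?, ?)",
    [(game_id, 1, 3, 9), (game_id, 2, 1, 5)],
)

# Inserting the same fixture again violates the unique constraint.
try:
    add_game(1, 2, 1, "2024-01-15")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The stats table references games through one column instead of four, which is the simplification the surrogate key buys, while the UNIQUE constraint does the job the composite natural key would otherwise have done.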

As I said in my comment, if you envisage adding many more stats, or adding them frequently, it would be better to make them row data to avoid updating the model too often and having to rewrite your queries. If you get to that situation, post another question :)
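For reference, "row data" here could look something like the sketch below, again using Python's sqlite3 module. The stat_types and game_stats tables, their columns, and the record_stat and stats_for helpers are hypothetical names chosen for the example; foreign keys to the games and teams tables are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (game, team, stat) instead of one column per stat.
# All names here are illustrative assumptions.
conn.executescript("""
CREATE TABLE stat_types (
    stat_type_id INTEGER PRIMARY KEY,
    stat_name    TEXT NOT NULL UNIQUE
);
CREATE TABLE game_stats (
    game_id      INTEGER NOT NULL,
    team_id      INTEGER NOT NULL,
    stat_type_id INTEGER NOT NULL REFERENCES stat_types(stat_type_id),
    stat_value   INTEGER NOT NULL,
    PRIMARY KEY (game_id, team_id, stat_type_id)
);
""")


def record_stat(game_id, team_id, stat_name, value):
    """Store one stat value; a new stat name becomes a row, not a new column."""
    conn.execute("INSERT OR IGNORE INTO stat_types (stat_name) VALUES (?)",
                 (stat_name,))
    conn.execute(
        "INSERT INTO game_stats (game_id, team_id, stat_type_id, stat_value) "
        "SELECT ?, ?, stat_type_id, ? FROM stat_types WHERE stat_name = ?",
        (game_id, team_id, value, stat_name),
    )


def stats_for(game_id, team_id):
    """Return {stat_name: value} for one team in one game."""
    rows = conn.execute(
        "SELECT t.stat_name, s.stat_value "
        "FROM game_stats s JOIN stat_types t USING (stat_type_id) "
        "WHERE s.game_id = ? AND s.team_id = ?",
        (game_id, team_id),
    )
    return dict(rows.fetchall())


record_stat(1, 1, "goals", 3)
record_stat(1, 1, "shots_on_goal", 9)
record_stat(1, 2, "goals", 1)
record_stat(1, 2, "corners", 4)   # a brand-new stat, no ALTER TABLE needed

home_stats = stats_for(1, 1)
away_stats = stats_for(1, 2)
```

The trade-off is that every stat shares one value column and type, and reading a full game back requires a join or a pivot, so this layout mainly pays off when the set of stats changes often.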

Licensed under: CC-BY-SA with attribution