Scheduling Employees - what data structure to use?

https://stackoverflow.com/questions/1634248

06-07-2019

Question

I'm trying to write simple employee scheduling software for about 10-20 people in my software development company. After some consideration, I settled on writing a web app in Python, Ruby or PHP with a Postgres/MySQL database. While designing the database models, I began to wonder what data structure would actually be best for this kind of application.

What it will look like

The month view of the app would look similar to this:

 OCTOBER    1 2 3 4 5 6 7 8 9 ...
John Apple  M M A A N N O O O ...
Daisy Pear  O O O M M A A N N ...
Steve Cat   A A N N O O O M M ...
Maria Dog   N N O O O M M A A ...

where M -> for Morning shift; A -> Afternoon shift etc. (letters can be changed to codes)

What data structure or database design would be best for this? I was thinking about storing one string per user (at most 31 characters, one character per day), similar to "MMAANNOOOAAMMNNAAOO..."; a Month table would contain such a string for each employee.

What would you suggest?

Solution

I would go with a three-table Kimball star schema (Date, Employee, Schedule), because sooner or later you will be asked to create (demanding) reports out of this. Who worked the most nights? Who worked the most weekends? Who never works weekends? Why am I always scheduled for Friday afternoon? On which day of the week are certain employees most likely not to show up? And so on.

Tables would be:

TABLE dimDate (KeyDate, FullDate, DayOfWeek, DayNumberInWeek, IsHoliday,... more here)
You can pre-fill the dimDate table for 10 years or so; you may need to tweak the "IsHoliday" column from time to time.
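
Pre-filling could look something like the sketch below. It uses Python with an in-memory SQLite database purely for illustration. The YYYYMMDD format for KeyDate, the Year column (which the report query in this answer filters on), and the 2009-2018 range are assumptions, not part of the original design.

```python
import sqlite3
from datetime import date, timedelta

def date_rows(start, end):
    """Yield one dimDate row per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield (
            int(day.strftime("%Y%m%d")),  # KeyDate as YYYYMMDD (assumed format)
            day.isoformat(),              # FullDate
            day.strftime("%A"),           # DayOfWeek, e.g. "Thursday"
            day.isoweekday(),             # DayNumberInWeek, Monday = 1
            day.year,                     # Year (assumed extra column)
            "No",                         # IsHoliday, adjusted by hand later
        )
        day += timedelta(days=1)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dimDate (KeyDate INTEGER PRIMARY KEY, FullDate TEXT, "
    "DayOfWeek TEXT, DayNumberInWeek INTEGER, Year INTEGER, IsHoliday TEXT)"
)
# Ten years of dates in one go.
conn.executemany(
    "INSERT INTO dimDate VALUES (?, ?, ?, ?, ?, ?)",
    date_rows(date(2009, 1, 1), date(2018, 12, 31)),
)
# Holidays are tweaked afterwards, for example New Year's Day 2009.
conn.execute("UPDATE dimDate SET IsHoliday = 'Yes' WHERE KeyDate = 20090101")
conn.commit()
```

Ten years is only a few thousand rows, so the table stays tiny and there is little reason not to generate it up front.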

Employee table also changes (relatively) rarely.
TABLE dimEmployee (KeyEmployee, FirstName, LastName, Age, ... more here)

The Schedule table is where you would fill in the work schedule. I have also suggested "HoursOfWork" for each shift; this way it is easy to aggregate hours in reports, like: "How many hours did John Doe work last year on holidays?"

TABLE factSchedule (
    KeySchedule,  -- surrogate PK
    KeyDate,      -- FK to dimDate table
    KeyEmployee,  -- FK to dimEmployee table
    Shift,        -- shift number (degenerate dimension)
    HoursOfWork   -- number of work hours in that shift
)

Instead of having the surrogate KeySchedule, you could also combine KeyDate, KeyEmployee and Shift into a composite primary key, so that the database itself makes sure you cannot schedule the same person on the same shift on the same day. If you keep the surrogate key, check this in the application layer instead.
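
As a rough sketch of the composite-key variant, again using Python and in-memory SQLite for illustration only: the column types, the sample rows and the schedule() helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE dimDate (KeyDate INTEGER PRIMARY KEY);
CREATE TABLE dimEmployee (
    KeyEmployee INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT
);
CREATE TABLE factSchedule (
    KeyDate     INTEGER NOT NULL REFERENCES dimDate (KeyDate),
    KeyEmployee INTEGER NOT NULL REFERENCES dimEmployee (KeyEmployee),
    Shift       INTEGER NOT NULL,
    HoursOfWork REAL    NOT NULL,
    PRIMARY KEY (KeyDate, KeyEmployee, Shift)  -- replaces KeySchedule
);
INSERT INTO dimDate VALUES (20091001);
INSERT INTO dimEmployee VALUES (1, 'John', 'Apple');
""")

def schedule(key_date, key_employee, shift, hours):
    """Insert one shift; return False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO factSchedule VALUES (?, ?, ?, ?)",
                (key_date, key_employee, shift, hours),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = schedule(20091001, 1, 1, 8.0)        # accepted
duplicate = schedule(20091001, 1, 1, 8.0)    # same person, day and shift
other_shift = schedule(20091001, 1, 2, 8.0)  # same day, different shift
unknown = schedule(20091001, 99, 1, 8.0)     # employee 99 does not exist
```

With this layout the duplicate and the unknown employee are both rejected by the database, so the application only has to handle the error.
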
When querying, join tables like:

SELECT SUM(s.HoursOfWork)
FROM factSchedule AS s
JOIN dimDate AS d ON s.KeyDate = d.KeyDate
JOIN dimEmployee AS e ON s.KeyEmployee = e.KeyEmployee
WHERE e.FirstName = 'John'
  AND e.LastName = 'Doe'
  AND d.Year = 2009
  AND d.IsHoliday = 'Yes';
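
To try the query out, here is a small self-contained sketch in Python with in-memory SQLite. The trimmed-down tables, the sample rows and the holiday_hours() helper are made up for illustration; the COALESCE is an addition so that an employee with no matching rows yields 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dimDate (KeyDate INTEGER PRIMARY KEY, Year INTEGER, IsHoliday TEXT);
CREATE TABLE dimEmployee (
    KeyEmployee INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT
);
CREATE TABLE factSchedule (
    KeyDate INTEGER, KeyEmployee INTEGER, Shift INTEGER, HoursOfWork REAL
);

INSERT INTO dimDate VALUES
    (20090101, 2009, 'Yes'), (20090102, 2009, 'No'),
    (20091225, 2009, 'Yes'), (20100101, 2010, 'Yes');
INSERT INTO dimEmployee VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Roe');
INSERT INTO factSchedule VALUES
    (20090101, 1, 1, 8.0),   -- John, 2009 holiday: counted
    (20090102, 1, 1, 8.0),   -- John, ordinary day: skipped
    (20091225, 1, 3, 7.5),   -- John, 2009 holiday: counted
    (20100101, 1, 1, 8.0),   -- John, holiday in another year: skipped
    (20090101, 2, 2, 6.0);   -- someone else: skipped
""")

def holiday_hours(first_name, last_name, year):
    """Total hours an employee worked on holidays in the given year."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(s.HoursOfWork), 0)
        FROM factSchedule AS s
        JOIN dimDate AS d ON s.KeyDate = d.KeyDate
        JOIN dimEmployee AS e ON s.KeyEmployee = e.KeyEmployee
        WHERE e.FirstName = ? AND e.LastName = ?
          AND d.Year = ? AND d.IsHoliday = 'Yes'
        """,
        (first_name, last_name, year),
    ).fetchone()
    return row[0]

total = holiday_hours("John", "Doe", 2009)
```

Only the two 2009 holiday shifts for John Doe contribute to the total; the other rows are filtered out by the joins and the WHERE clause.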

If using MySQL it is OK to use MyISAM for storage engine and implement your foreign keys (FK) as "logical only" -- use the application layer to take care of referential integrity.

Hope this helps.

OTHER TIPS

A quick answer first:

EmployeeID
Date
ShiftType

That said, the best database design largely depends on what you're going to do with the data. If all you need to do is store the records and display them in a table similar to your example, your approach (while not elegant) would work.

However, if you're going to retrieve the data or run reports, you're going to want something a little more structured than a string where each character represents the type of shift assignment.
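
To illustrate the difference, here is a minimal Python sketch that unpacks one of those month strings into one record per day, which is roughly what any report would have to do first. The expand_month_string() name and the sample values are hypothetical.

```python
from collections import Counter
from datetime import date

def expand_month_string(employee_id, year, month, codes):
    """Turn a one-character-per-day string into (employee, date, shift) rows."""
    return [
        (employee_id, date(year, month, day), code)
        for day, code in enumerate(codes, start=1)
    ]

# First nine days of John Apple's October from the question.
rows = expand_month_string(1, 2009, 10, "MMAANNOOO")

# Once the data is row-shaped, counting shifts per type is straightforward.
shift_counts = Counter(code for _, _, code in rows)
```

If the rows are stored that way in the first place, the database can do this kind of counting itself instead of the application parsing strings.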

I'd suggest a more normalized database, e.g. a table for persons and one that holds the shift information for each person and date.
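
The month view from the question can still be built from such rows. Below is a small Python sketch, assuming the rows have already been fetched as (name, day of month, shift code) tuples; the month_grid() helper and the "-" placeholder for unscheduled days are made up for the example.

```python
from collections import defaultdict

# Rows as they might come back from the normalized shift table.
rows = [
    ("John Apple", 1, "M"), ("John Apple", 2, "M"), ("John Apple", 3, "A"),
    ("Daisy Pear", 1, "O"), ("Daisy Pear", 2, "O"), ("Daisy Pear", 3, "O"),
]

def month_grid(rows, days):
    """Pivot (name, day, code) rows into one text line per employee."""
    by_employee = defaultdict(dict)
    for name, day, code in rows:
        by_employee[name][day] = code
    width = max(len(name) for name in by_employee)  # assumes rows is non-empty
    lines = []
    for name, shifts in by_employee.items():
        # "-" marks days with no shift record (an assumed convention).
        cells = " ".join(shifts.get(day, "-") for day in range(1, days + 1))
        lines.append(f"{name.ljust(width)}  {cells}")
    return lines

grid = month_grid(rows, 4)
print("\n".join(grid))
```

So the display format does not have to dictate the storage format; the string-like view is just one presentation of the rows.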

Licensed under: CC-BY-SA with attribution
