Question

I'm just wondering, as a hypothetical example, what would be the best way to lay out a table for the following scenario:

Let's say I'm writing an app for tracking student attendance. At the beginning of every year, I want to add all the students in (I'll do this manually - now, should a student ID be assigned to each one here? Let's call that table Students). Now, each day, I'm going to display all the Students in table Students and will allow the user to choose attendance.

So, how should I lay my table out? (If you don't understand what I mean, I mean what data should be entered in each column, row...) For example, maybe have a Students table with Student IDs and for each student every day create a new row in the Attendance table with Column 1: Student ID, Column 2: Date, Column 3: Status (present/absent). However, that doesn't seem to be very efficient. What do you think?

UPDATE: From all these first answers, it seems that one student is in each row in the student-attendance table (where he/she is designated as present/absent), but what if I were to include more than one student id per row, say for all students absent on some day? Would that be better or worse (probably it's ambiguous)? Actually, I'm starting to think efficiency would be decreased because the only actions this move even helps can be easily accomplished already. Hmm...

Solution

STUDENTS table

  • STUDENT_ID, pk
  • FIRST_NAME
  • LAST_NAME

STUDENT_ATTENDANCE table

  • STUDENT_ID, pk, fk
  • ABSENT_DATE, pk

There's no need for an IS_ABSENT column: the presence of a row indicates both that the student was absent and on what date. There are likely to be fewer days absent than attended, so store only the absent dates.

Making the primary key a composite of the two columns ensures that you won't have duplicates.
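
As a rough sketch of the layout above, here is how the two tables might look in SQLite, driven from Python's built-in sqlite3 module. The table and column names come from the answer; the column types, the ISO date strings and the sample student are assumptions made only for illustration.

```python
import sqlite3

# In-memory database; a real app would pass a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Column types are assumptions; the answer only names the columns and keys.
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
CREATE TABLE student_attendance (
    student_id  INTEGER NOT NULL REFERENCES students(student_id),
    absent_date TEXT    NOT NULL,            -- ISO date string, e.g. 2024-09-02
    PRIMARY KEY (student_id, absent_date)    -- composite key blocks duplicates
);
""")

conn.execute("INSERT INTO students VALUES (1, 'Ada', 'Lovelace')")
conn.execute("INSERT INTO student_attendance VALUES (1, '2024-09-02')")

# Recording the same absence twice violates the composite primary key.
try:
    conn.execute("INSERT INTO student_attendance VALUES (1, '2024-09-02')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

absence_count = conn.execute(
    "SELECT COUNT(*) FROM student_attendance"
).fetchone()[0]
```

The second insert fails with an IntegrityError, so the table still holds a single row for that student and date.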

"what if I were to include more than one student id per row, say for all students absent on some day? Would that be better or worse"

Then you are storing the additional student_ids either as a comma-separated list in a single column or as an additional column for each additional student_id. Additional columns for each student_id would never work: you'd be adding a column for every new student, every year. Concatenating a list of student_ids is more realistic, but pulling out details will be a pain if you want to report on a specific student or group of students. Because of character limits on the column, a single column also runs the risk of not being able to hold every student_id that could be absent on a given day.
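
To make the reporting pain concrete, here is a small sketch of the comma-separated variant. The daily_absences table, its columns and the sample ids are hypothetical, invented only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per day, all absent ids packed into one string.
conn.execute(
    "CREATE TABLE daily_absences (absent_date TEXT PRIMARY KEY, student_ids TEXT)"
)
conn.execute("INSERT INTO daily_absences VALUES ('2024-09-02', '11,25,310')")
conn.execute("INSERT INTO daily_absences VALUES ('2024-09-03', '1,25')")

# A naive substring search for student 1 also matches ids 11 and 310.
naive = conn.execute(
    "SELECT absent_date FROM daily_absences "
    "WHERE student_ids LIKE '%1%' ORDER BY absent_date"
).fetchall()

def days_absent(conn, student_id):
    """Correct lookup: every list has to be split and checked in application code."""
    rows = conn.execute(
        "SELECT absent_date, student_ids FROM daily_absences ORDER BY absent_date"
    )
    return [day for day, ids in rows if str(student_id) in ids.split(",")]

correct = days_absent(conn, 1)
```

The LIKE query reports both days for student 1, although that student was absent only on the second one. Getting the right answer means reading every row and parsing the strings outside the database.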

I recommend using the STUDENT_ATTENDANCE table I suggested.

OTHER TIPS

The key point about database design is to provide a model with integrity. So in your example you wouldn't want to record student absences for dates which fall in weekends, holidays or inset days. So you also need a CALENDAR table. STUDENT_ABSENCE would be an intersection table between STUDENT and CALENDAR. That is, it would have foreign keys to both an ID in the STUDENT table and a DAY in the CALENDAR.

This may seem like over-engineering but pretty much everything that happens in a school involves scheduling, so a CALENDAR is essential. You might as well use it as much as possible, to build the best model you can.

Also, consider what other attributes the STUDENT_ABSENCE table needs. Off the top of my head, you might want to record whether the absence was notified in advance (e.g. for a family holiday during term-time), whether the absence was approved, and whether the absence was due to sickness.
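
One possible shape for this, sketched with Python's sqlite3 module. The table names follow the answer; the column names (notified_in_advance, approved, sickness), their types and the record_absence helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the FKs

conn.executescript("""
CREATE TABLE student  (id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE calendar (day TEXT PRIMARY KEY);  -- holds school days only
CREATE TABLE student_absence (
    student_id          INTEGER NOT NULL REFERENCES student(id),
    day                 TEXT    NOT NULL REFERENCES calendar(day),
    notified_in_advance INTEGER NOT NULL DEFAULT 0,  -- assumed flag columns
    approved            INTEGER NOT NULL DEFAULT 0,
    sickness            INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (student_id, day)
);
INSERT INTO student  VALUES (1, 'Ada');
INSERT INTO calendar VALUES ('2024-09-02');  -- a Monday in term
""")

def record_absence(conn, student_id, day):
    """Hypothetical helper: True if stored, False if the database rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO student_absence (student_id, day) VALUES (?, ?)",
                (student_id, day),
            )
        return True
    except sqlite3.IntegrityError:
        return False

weekday_ok = record_absence(conn, 1, "2024-09-02")
weekend_ok = record_absence(conn, 1, "2024-09-07")  # a Saturday, not in calendar
```

Because the Saturday has no row in calendar, the foreign key rejects the absence, so the rule lives in the model instead of in application checks.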

I'd have a table MissedClasses with StudentID as a foreign key, the date, the course, and perhaps the period, and perhaps another column for excused or not. Place an entry if they didn't attend.

My reasoning: Hopefully most will attend most of the classes, so you only need to keep track of the misses.

If you combine multiple values into a list of values and store that in a single cell, your table is no longer in first normal form, as originally defined by Codd. You can conform to first normal form as redefined by Date by storing a table inside a table. Most newbies don't do that. They usually mash the list of values into a comma separated character string, and store the whole list as if it were a single atomic value.

What they find later on is that they can no longer exploit the power of relational operators, particularly the join, in order to express complex operations in a simple manner. This usually costs the newbie more than "inefficiency" would have. Even if you do put a table inside a table, you're going to find that doing routine things with the data is much harder than it has to be.
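
For comparison, with one student per row the usual reports are each a single join. This is a minimal sketch reusing the STUDENTS and STUDENT_ATTENDANCE names from the accepted answer; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
);
CREATE TABLE student_attendance (
    student_id  INTEGER REFERENCES students(student_id),
    absent_date TEXT,
    PRIMARY KEY (student_id, absent_date)
);
INSERT INTO students VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO student_attendance VALUES
    (1, '2024-09-02'), (2, '2024-09-02'), (1, '2024-09-03');
""")

# Who was absent on a given day?
absent_on_day = conn.execute("""
    SELECT s.last_name
    FROM student_attendance AS a
    JOIN students AS s ON s.student_id = a.student_id
    WHERE a.absent_date = ?
    ORDER BY s.last_name
""", ("2024-09-02",)).fetchall()

# How many days has each student missed? LEFT JOIN keeps students with none.
absences_per_student = conn.execute("""
    SELECT s.last_name, COUNT(a.absent_date)
    FROM students AS s
    LEFT JOIN student_attendance AS a ON a.student_id = s.student_id
    GROUP BY s.student_id
    ORDER BY s.last_name
""").fetchall()
```

Neither query needs any string handling, and both keep working unchanged as students and dates are added.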

Most of the good suggestions you have gotten involve decomposing tables in order to achieve a normalized schema. That's usually the best plan for you to follow, until you learn at some later time when to break the rules of normalization and when to follow them.

This isn't written for a typical 12 year old. You don't sound like a typical 12 year old. So I'm trying to give you a leg up on the fundamentals of good database design, instead of letting you learn bad database design in high school, and then have to unlearn that and start all over again later.

I would have a student table with student ids and information specific to each student like name, grade, etc. Then have a student_attendance table with student id, date, status(present/absent).

Although this does collect a lot of rows, it is really not that much data, and it allows you to run several kinds of attendance reports very easily.
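
As an illustration of such a report, here is a sketch of the student_attendance table with an explicit status column, again using Python's sqlite3 module. The column types, the CHECK constraint and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_attendance (
    student_id INTEGER NOT NULL,
    date       TEXT    NOT NULL,
    status     TEXT    NOT NULL CHECK (status IN ('present', 'absent')),
    PRIMARY KEY (student_id, date)
);
INSERT INTO student_attendance VALUES
    (1, '2024-09-02', 'present'), (1, '2024-09-03', 'absent'),
    (2, '2024-09-02', 'present'), (2, '2024-09-03', 'present');
""")

# Days present versus days recorded, per student.
# In SQLite a comparison evaluates to 0 or 1, so SUM counts the matching rows.
report = conn.execute("""
    SELECT student_id,
           SUM(status = 'present') AS days_present,
           COUNT(*)                AS days_recorded
    FROM student_attendance
    GROUP BY student_id
    ORDER BY student_id
""").fetchall()
```

Each tuple in report holds a student id, the days present and the days recorded, which is enough to derive an attendance rate.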

You would not want to put more than one student id in a single row because, although you would have fewer rows, you would have just as much data, and querying the table and reporting on it would be awkward.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow