Question

I have two tables (timesheet and tasks) each contains an hour value column "allocated hours" and "actual hours" of which I am trying to get the sum of both of these values. also the timesheet table contains a integer value for "staff_id" which corresponds to the "assigned_to" in the task table

the task table contains:

task_id INT(11)
assigned_to INT(11)
date_start DATE
hrs DECIMAL (10,0)

the timesheet table contains:

timesheet_id (int)
name varchar(100)
hours decimal(10,0)
staff_id(INT 11)

my query looks like:

    SELECT
        timesheet.staff_id,
        task.assigned_to,
        SUM(task.hrs) AS assigned_hrs,
        timesheet.name,
        SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    INNER JOIN task
    ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.name

which will (incorrectly) result in:

staff_id       |assigned_to |assigned_hrs    | name.         |  actual_hours |
---------------|------------|----------------|---------------|---------------|
4              |4           | 1364           | John Smith    |52          
2              |2           | 80             | Jane Doe      |14.5        
6              |6           | 454            | Test User 1   |40          
9              |9           | 262            | Test User 2   |4           

The above is what I am trying to get, However all of the results are correct but John Smith's assigned hours get doubled. I know it has to do with the "Grouping Pitfall" as described here:

http://wikido.isoftdata.com/index.php/The_GROUPing_pitfall

but I just go cross eyed trying to figure this out. can someone point me in the right direction?

(edit again) If I run a query just on the task table:

    SELECT
    task.assigned_to,
    SUM(task.hrs) AS allocated_hrs
    FROM task
    GROUP BY task.assigned_to

It (correctly) results in:

assigned_to | allocated_hrs |
----------------------------
4           |    682
7           |    378
2           |    40
6           |    227
9           |    262

you can see that the user ID of "4" which is John Smith has doubled (and also ID 6)

running a query on just the timesheet table :

    SELECT
    timesheet.name,
    SUM(timesheet.hours) AS actual_hours
    FROM timesheet
    GROUP BY timesheet.name

correctly results in :

    name    |  Actual_hrs
    -------------------------
    Jane Doe   | 19.5
    John Smith | 6.5
    Test User1 | 4
    Test User2 | 5

running the query supplied by JoachimL results in :

    staff_id |  assigned_to |   assigned_hrs |  name |  actual_hours
    ----------------------------------------------------------------------
    2   2   40  Jane Doe    19.5
    4   4   24  John Smith  6.5
    4   4   7   John Smith  6.5
    4   4   21  John Smith  6.5
    4   4   210 John Smith  6.5
    4   4   28  John Smith  6.5
    4   4   91  John Smith  6.5
    6   6   14  Test User 1 8
    6   6   91  Test User 1 8
    6   6   28  Test User 1 8
    6   6   3   Test User 1 8
    9   9   24  Test User 2 1
    9   9   91  Test User 2 1
    9   9   56  Test User 2 1

Here's a fiddle http://sqlfiddle.com/#!2/ef680

Was it helpful?

Solution

SELECT x.*
     , SUM(y.hrs) n
  FROM
     ( SELECT t.staff_id
            , t.name
            , SUM(t.hours) actual_hours
         FROM timesheet t
        GROUP 
           BY t.staff_id
     ) x
  JOIN task y
    ON y.assigned_to = x.staff_id
 GROUP
    BY staff_id;

http://sqlfiddle.com/#!2/ef680/14

OTHER TIPS

No comment privs...

Does ID 4 and 6 have two rows in timesheet? The others just one? Then task.hrs would be doubled.

Something like this should avoid that. If task_id is unique you don't have to sum that. (test data would help)

EDIT

SELECT
        ts.staff_id,
        task.assigned_to,
        task.hrs AS assigned_hrs,
        ts.name,
        ts.actual_hours
    FROM task
    INNER JOIN (SELECT staff_id, name, SUM(hours) as actual_hours FROM timesheet GROUP BY staff_id, name) as ts
    ON ts.staff_id = task.assigned_to

The above: group timesheet by staff_id/name Then join with tasks, which should be just one row per task

    SELECT
        timesheet.staff_id,
        task.assigned_to,
        SUM(task.hrs) AS assigned_hrs,
        timesheet.name,
        SUM(timesheet.hours) AS actual_hours
    FROM task
    LEFT JOIN timesheet ON timesheet.staff_id = task.assigned_to
    GROUP BY timesheet.staff_id

Try a LEFT JOIN and make sure you group by a UNIQUE field. "name" may not be unique.

Note: the LEFT JOIN will leave out any timesheets that are not assigned to a task. You can reverse this by SELECT FROM timesheet LEFT JOIN task instead.

Edit: See this answer: Select multiple sums with MySQL query and display them in separate columns

Sorry, no comment privileges yet.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top