Question

I have a Google spreadsheet maintaining a history of events and their dates of occurence like so (column B is start date, column C is end date):

Sheet A
| | A      | B         | C         |
|1| Event1 | 1/15/2013 | 2/20/2013 |
|2| Event2 | 3/1/2013  | 3/10/2013 |
|3| Event3 | 5/7/2013  | 5/9/2013  |

What I want to do is be able to find out, given another date range, how many total days of overlap there were with any of those events. For example, if I had a date range of 2/1/2013 thru 3/15/2013, then the calculation ought to be able to determine there were 20 days of overlap with Event1, 10 days of overlap with Event2, and 0 days of overlap with Event3. Put it all together and the value I want to display is 30.

Here's an example second sheet showing the values I want to calculate as column C for a couple example date ranges:

Sheet B
| | A         | B         | C  |
|1| 2/1/2013  | 3/15/2013 | 30 |
|2| 3/20/2013 | 3/25/2013 | 0  |
|3| 5/1/2013  | 5/7/2013  | 1  |

The problem I'm having is that while I can write cell expressions to show that SUMs and FILTERs could almost get me there, the last step I try (and I've tried a number of ways to get this to work) always seem to reduce my calculations over all events to a single scalar before I can sum it per event.

For example, if I write something like the below for Sheet B's C1 cell as one step to figuring out how to calculate this:

=SUM(MIN(B1,'Sheet A'!C:C) - MAX(A1,'Sheet A'!B:B))

Then the SUM clearly includes negative values where the events don't overlap with the time period in question. (You can adjust the formula and expand it out for each row of Sheet A to see the unsummed values.)

However, if I try to filter those non-intersecting rows out with something like the below I just get an '#N/A' result. I'm guessing this is because the FILTER function doesn't like to use the same ranges in the values being summed and the criteria, but that's just a guess:

=SUM(FILTER( MIN(B1,'Sheet A'!C:C)-MAX(A1,'Sheet A'!B:B) ; MIN(B1,'Sheet A'!C:C)>MAX(A1,'Sheet A'!B:B) ))

If I instead try to use a MAX() function, or an IF(), to weed out the negative values resulting from the non-intersecting rows, the SUM returns 0. I think this is because the MAX and IF are reducing the data ranges I want to sum over to a single scalar variable prior to the summing being done. i.e. the SUM() operates on a single scalar.

=SUM( MAX(0, MIN(B1,'Sheet A'!C:C) - MAX(A1,'Sheet A'!B:B)) )

or

=SUM(IF( MIN(B1,'Sheet A'!C:C)>MAX(A1,'Sheet A'!B:B), MIN(B1,'Sheet A'!C:C) - MAX(A1,'Sheet A'!B:B), 0))

Any ideas how to calculate what I want? The only other idea I have is to explode Sheet A out so that each row covers only a single day. Then finding the sum of intersecting days should be a bit easier, but the effort to add new events to Sheet A goes up immensely.

Was it helpful?

Solution

After trying many different ways to get Google Spreadsheet to allow IFs, MAXs, MINs, SUMs, FILTERs, SUMIFs, ARRAYFORUMLAs, etc. to be evaluated per row across the columnar range (multiple single-column ranges were involved in the calc, but they were all the same size and everything should work if I could get Google to treat each row as a DB-entry type thing,) I gave up and refactored my table of events so that a single event spanned multiple rows and each row represented a single calendar day. That is, the above Sheet A now looks like:

Sheet A
|  | A      | B         |
|1 | Event1 | 1/15/2013 |
...
|17|        | 1/31/2013 |
[18|        | 2/1/2013  |
...
|38|        | 2/20/2013 |
|39| Event2 | 3/1/2013  |
...
|49|        | 3/10/2013 |
|50| Event3 | 5/7/2013  |
|51|        | 5/8/2013  |
|52|        | 5/9/2013  |

This way there were no complex range vs range intersection calculations so a simple FILTER() based on whether the event day was within the querying range period worked.

OTHER TIPS

To get the SUM of overlaps between a given date range (start date in SheetB!A2, end date in SheetB!B2, end dates are INCLUSIVE), and an array of start and INCLUSIVE end dates in SheetA!B2:B and SheetA!C2:C respectively, use the following formula in C2:

=SUM(ArrayFormula(IF((SheetA!$B$2:$B="")+(SheetA!$C$2:$C=""),"",IF((TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$C$2:$C,$B2+0*SheetA!$C$2:$C}),"Select "&"MIN(Col"&JOIN(", MIN(Col",SEQUENCE(ROWS(SheetA!$C$2:$C))&")"),0)),2))-TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$B$2:$B,$A2+0*SheetA!$B$2:$B}),"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(ROWS(SheetA!$B$2:$B))&")"),0)),2)))>=0,(TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$C$2:$C,$B2+0*SheetA!$C$2:$C}),"Select "&"MIN(Col"&JOIN(", MIN(Col",SEQUENCE(ROWS(SheetA!$C$2:$C))&")"),0)),2))-TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$B$2:$B,$A2+0*SheetA!$B$2:$B}),"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(ROWS(SheetA!$B$2:$B))&")"),0)),2)))+1,0))))

If your end dates are EXCLUSIVE use:

=SUM(ArrayFormula(IF((SheetA!$B$2:$B="")+(SheetA!$C$2:$C=""),"",IF((TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$C$2:$C,$B2+0*SheetA!$C$2:$C}),"Select "&"MIN(Col"&JOIN(", MIN(Col",SEQUENCE(ROWS(SheetA!$C$2:$C))&")"),0)),2))-TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$B$2:$B,$A2+0*SheetA!$B$2:$B}),"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(ROWS(SheetA!$B$2:$B))&")"),0)),2)))>0,(TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$C$2:$C,$B2+0*SheetA!$C$2:$C}),"Select "&"MIN(Col"&JOIN(", MIN(Col",SEQUENCE(ROWS(SheetA!$C$2:$C))&")"),0)),2))-TRANSPOSE(INDEX(ArrayFormula(QUERY(TRANSPOSE({SheetA!$B$2:$B,$A2+0*SheetA!$B$2:$B}),"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(ROWS(SheetA!$B$2:$B))&")"),0)),2))),0))))

You can drag it down to get the numb ers for other rows in SheetB.

The problem of calculating the MAX/MIN for an array by rows is solved here: https://stackoverflow.com/a/64314420/3525368

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top