Question

I'm trying to compare teams' compositions to known configurations in order to see where I might have a problem:

[image: screenshot of the worksheet]

The trial columns are to be compared against the different scenarios to see whether a column is a superset of a particular scenario ("error" being the default).

Can it be done using INDEX+MATCH/LOOKUP, or do I have to write a VBA macro?


EDIT: I've updated the question with a worksheet containing input data.

Worksheet: https://drive.google.com/file/d/0BxwDbXStIEAsUmpONHp1RVRzR2s/edit?usp=sharing
GitHub Gist: https://gist.github.com/lucasg/11177852 (Python script for data generation; the xlwt module is needed to create Excel workbooks).

I've simplified the problem using soccer teams: given 7 positions (1 goalie, 2 defenders, 2 midfielders and 2 forwards) and a list of player attendance for certain weekends, I would like to know whether I'm going to be able to field a full team or whether I'll have to forfeit the match for lack of key players.

The positions:

styles = {
    "Goalkeeper": ["Goalkeeper"],
    "Defender": ["Centre back", "Wing"],
    "Midfielder": ["Centre midfield", "Wide"],
    "Forward": ["Centre forward", "Winger"]
}

Most football players can play only one position, but some are more versatile and can play any position within their own zone (defense, midfield or attack).

Example of a team (18 players):

example_players = {
    "Forward": [
        [1, "Winger"], 
        [2, "Winger"], 
        [3, "Centre forward"], 
        [4, "Centre forward"]
    ], 
    "Defender": [
        [5, "Centre back"], 
        [6, "Centre back", "Wing"], 
        [7, "Centre back", "Wing"], 
        [8, "Wing"], 
        [9, "Centre back"]
    ], 
    "Goalkeeper": [
        [10, "Goalkeeper"], 
        [11, "Goalkeeper"]
    ], 
    "Midfielder": [
        [12, "Centre midfield"], 
        [13, "Centre midfield"], 
        [14, "Wide", "Centre midfield"], 
        [15, "Centre midfield"], 
        [16, "Centre midfield"], 
        [17, "Wide", "Centre midfield"], 
        [18, "Wide", "Centre midfield"]
    ]
}

To keep it simple, I need at least one person in each zone (goal, defense, midfield, attack) to be able to play; the most comfortable situation is one person in each of the 7 positions.

Example scenarios:

"no_defense_4": ["Goalkeeper", "Wide", "Winger"],
"no_attack_1": ["Goalkeeper", "Centre midfield", "Centre back"],

Now, given a list of a hundred weekends and the list of the presence/absence of players, I want to know the resulting situation for each weekend.

I would prefer a formula-based solution, since the worksheet will be uploaded to and used in Google Drive.


Solution

You can represent sets as bit vectors and then use the bit operators "equal" or "AND" to test which sets match. A bit vector representation solves the problems of ordering and duplicate values automatically: the position of each value in the bit vector is fixed, and each bit is set only once, regardless of how many times the value appears in the column that defines the set.

A simple-to-use bit vector representation for Excel, including the operators OR, AND and NOT, is described here: http://chandoo.org/wp/2011/07/29/bitwise-operations-in-excel/#comment-207723

For example, the following formula

=POWER(10;0)*MIN(COUNTIF($B$3:$B$12;"T1");1)+POWER(10;1)*MIN(COUNTIF($B$3:$B$12;"T2");1)+POWER(10;2)*MIN(COUNTIF($B$3:$B$12;"S");1)+POWER(10;3)*MIN(COUNTIF($B$3:$B$12;"PL");1)+POWER(10;4)*MIN(COUNTIF($B$3:$B$12;"CC");1)+POWER(10;5)*MIN(COUNTIF($B$3:$B$12;"GC");1)

converts the values in the range $B$3:$B$12 into a bit vector with bits 0..5, each stored as one decimal digit, where a bit is set if any cell in the range equals the corresponding value:

  • bit 0 = T1
  • bit 1 = T2
  • bit 2 = S
  • bit 3 = PL
  • bit 4 = CC
  • bit 5 = GC

You can add more bits with other values easily by following the same copy/paste pattern.
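To make the encoding easier to follow outside the spreadsheet, here is a minimal Python sketch of what the formula above computes. The names `VALUES` and `encode` are hypothetical, chosen only for illustration; the digit order follows the bit list above.

```python
# Hypothetical helper mirroring the spreadsheet formula: each known value
# owns one decimal digit, which is 1 if the value occurs at least once.
VALUES = ["T1", "T2", "S", "PL", "CC", "GC"]  # bit 0 .. bit 5, as listed above


def encode(cells, values=VALUES):
    """Return the decimal-digit 'bit vector' for a column of cells."""
    # min(count, 1) plays the role of MIN(COUNTIF(...);1) in the formula.
    return sum(10 ** bit * min(cells.count(v), 1) for bit, v in enumerate(values))


# Order and duplicates do not matter: both columns give the same vector.
a = encode(["S", "T1", "S", "GC"])
b = encode(["GC", "T1", "S"])
print(a, b, a == b)  # 100101 100101 True
```

Values that are not in `VALUES` are simply ignored, just as COUNTIF ignores cells that do not match.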

So to check whether a certain column matches a certain scenario, just compare the bit vectors. Use an expression like IF(x=y;"warn2";IF(..)), substituting the bit vector of the column for x and the bit vector of the warn2 scenario for y.

If partial matching is needed, you can use the bitwise AND operator as defined in the above article.
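For the superset test asked about in the question, a column covers a scenario exactly when (column AND scenario) equals the scenario. The sketch below illustrates this in Python on decimal-digit vectors like the ones produced above. It is my own digit-wise AND, not the operator from the linked article, and the names `digit_and` and `covers` are hypothetical.

```python
def digit_and(x, y):
    """Digit-wise AND of two decimal-encoded bit vectors (digits are 0 or 1)."""
    result, place = 0, 1
    while x > 0 and y > 0:
        if x % 10 == 1 and y % 10 == 1:
            result += place
        x, y, place = x // 10, y // 10, place * 10
    return result


def covers(column, scenario):
    """True if every bit set in `scenario` is also set in `column`."""
    return digit_and(column, scenario) == scenario


column = 100111    # T1, T2, S and GC present
scenario = 100001  # scenario requires T1 and GC
print(covers(column, scenario))  # True
print(covers(scenario, column))  # False
```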

Unlike a VBA-based solution, this solution requires some copy/paste discipline: when a new trial column or a new scenario is added, a few expressions have to be copied and pasted and a few have to be updated.

A VBA-based solution might solve this maintenance problem for you automatically by using auto-detected CurrentRegions, with all the necessary logic hidden behind a single macro click.

EDIT: The bit vector concept applied to the new soccer team dataset

Worksheet: https://docs.google.com/spreadsheet/ccc?key=0AtZPnBk7a3pvdHcyWDV6ZFFoUTNyWWF0bjl3VFpaRkE&usp=drive_web#gid=0

Since one player may be assigned different positions, the exact team setup on a given day is ambiguous. I have therefore simplified the problem: instead of "present" or "absent", I expect each cell of the table to contain the player's position. This should not be hard to achieve: if you know which positions a player can play, you can restrict the valid values accordingly. For players 14, 17 and 18, for example, the valid values are Midfielder, Centre midfield and Wide; an empty cell or any other value is not counted. The list of valid values can be configured for each cell using "Data validation" rules. The abstract role Midfielder stands for "this player can play as a midfielder, but the exact position is not known yet".

To represent positions I use a bit vector calculated with this formula:

=POWER(10;6)*MIN(COUNTIF(D2:ZZ2;"Goalkeeper");1)+POWER(10;5)*MIN(COUNTIF(D2:ZZ2;"Centre back");1)+POWER(10;4)*MIN(COUNTIF(D2:ZZ2;"Wing");1)+POWER(10;3)*MIN(COUNTIF(D2:ZZ2;"Centre midfield");1)+POWER(10;2)*MIN(COUNTIF(D2:ZZ2;"Wide");1)+POWER(10;1)*MIN(COUNTIF(D2:ZZ2;"Centre forward");1)+POWER(10;0)*MIN(COUNTIF(D2:ZZ2;"Winger");1)

The formula calculates a bit vector from the range D2:ZZ2 in such a way that each position in the range is counted only once and each position has a fixed place in the final vector. It is useful to set the number format of the vector to the custom numeric format 0000000. For example, a row containing Wide, Winger and Goalkeeper, in any order and with any number of repeats, evaluates to the vector 1000101, where the leftmost bit 6 stands for Goalkeeper and bit 2, third from the right, stands for Wide. The most comfortable situation is the one where the bit vector evaluates to 1111111. The only purpose of this bit vector is to detect the comfortable situation.
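As a cross-check of the 7-digit position vector, here is a small Python sketch. The names `POSITIONS`, `position_vector` and `is_comfortable` are hypothetical; the digit order is the one used in the formula, from Goalkeeper (bit 6) down to Winger (bit 0).

```python
# Positions from the most significant digit (bit 6) to the least (bit 0),
# in the same order as the spreadsheet formula.
POSITIONS = ["Goalkeeper", "Centre back", "Wing", "Centre midfield",
             "Wide", "Centre forward", "Winger"]


def position_vector(row):
    """7-digit vector: a digit is 1 if the position appears at least once."""
    present = set(row)
    return sum(10 ** (len(POSITIONS) - 1 - i)
               for i, name in enumerate(POSITIONS) if name in present)


def is_comfortable(row):
    """True when all 7 positions are covered (vector 1111111)."""
    return position_vector(row) == 1111111


row = ["Wide", "Winger", "Goalkeeper", "Wide"]
print(f"{position_vector(row):07d}")  # 1000101
print(is_comfortable(row))            # False
print(is_comfortable(POSITIONS))      # True
```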

For matching scenarios to team setups I use another vector composed of 4 digits with this meaning:

  • leftmost digit 3 - number of goalies (at most 1 counts)
  • digit 2 - number of defenders (at most 2 counts)
  • digit 1 - number of midfielders (at most 2 counts)
  • rightmost digit 0 - number of forwards (at most 2 counts)

The formula to calculate this vector for the range D2:ZZ2 looks like this:

=POWER(10;3)*MIN(COUNTIF(D2:ZZ2;"Goalkeeper");1)+POWER(10;2)*MIN(COUNTIF(D2:ZZ2;"Defender")+COUNTIF(D2:ZZ2;"Centre back")+COUNTIF(D2:ZZ2;"Wing");2)+POWER(10;1)*MIN(COUNTIF(D2:ZZ2;"Midfielder")+COUNTIF(D2:ZZ2;"Centre midfield")+COUNTIF(D2:ZZ2;"Wide");2)+POWER(10;0)*MIN(COUNTIF(D2:ZZ2;"Forward")+COUNTIF(D2:ZZ2;"Centre forward")+COUNTIF(D2:ZZ2;"Winger");2)

It is useful to set the number format of the vector to the custom numeric format 0000. The same formula can calculate the 4-digit vector both for a team setup and for a scenario.

Besides concrete position names, it also counts abstract position names such as Defender.

For example in a row containing Centre back,Centre back,Goalkeeper,Goalkeeper,Goalkeeper,Defender,Defender,Midfielder,Midfielder,Winger the vector looks like 1221.
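The same calculation can be sketched in Python to verify the example. `ZONES` and `zone_vector` are hypothetical names; the zone membership and the caps are taken from the formula above.

```python
# (names counted for the zone, cap), leftmost digit first.
ZONES = [
    (["Goalkeeper"], 1),
    (["Defender", "Centre back", "Wing"], 2),
    (["Midfielder", "Centre midfield", "Wide"], 2),
    (["Forward", "Centre forward", "Winger"], 2),
]


def zone_vector(row):
    """4-digit vector: capped head count per zone (goal, def, mid, attack)."""
    vector = 0
    for names, cap in ZONES:
        count = sum(row.count(name) for name in names)
        vector = vector * 10 + min(count, cap)
    return vector


row = ["Centre back", "Centre back", "Goalkeeper", "Goalkeeper", "Goalkeeper",
       "Defender", "Defender", "Midfielder", "Midfielder", "Winger"]
print(f"{zone_vector(row):04d}")  # 1221
```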

There are (1+1)*(2+1)*(2+1)*(2+1) = 54 different possible scenarios. I assume each of them is listed in the constraints sheet. You should be able to generate them all in Python quite easily.
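One possible way to enumerate the 54 vectors in Python is sketched below. `all_zone_vectors` is a hypothetical name, and scenario names are left out because the source does not list them.

```python
from itertools import product


def all_zone_vectors():
    """All 4-digit vectors: goalies 0-1, other zones 0-2 each."""
    return [g * 1000 + d * 100 + m * 10 + f
            for g, d, m, f in product(range(2), range(3), range(3), range(3))]


vectors = all_zone_vectors()
print(len(vectors))                       # 54
print([f"{v:04d}" for v in vectors[:4]])  # ['0000', '0001', '0002', '0010']
```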

There are 2 sheets: constraints, with the scenarios, and events, with the days and team setups. The lookup formula takes the vector calculated for the team setup in row 2, searches the constraints sheet for a row with exactly the same vector, and returns the scenario name from that row. It looks like this:

=IFERROR(VLOOKUP($A2;constraints!$A:$B;2;FALSE);"?")

  • $A2 - the cell containing the 4-digit vector formula for the team setup
  • constraints!$A - the column of the scenarios sheet containing the 4-digit vector formula for each scenario
  • constraints!$B - the column of the scenarios sheet containing the scenario name, i.e. the thing you are looking for
  • 2 - the index of column constraints!$B within the lookup range constraints!$A:$B
  • FALSE - requests an exact match, so the lookup column does not have to be sorted
  • ? - the fallback value if no matching scenario is found (should not occur)
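In Python terms, the exact-match lookup with a fallback behaves roughly like a dictionary lookup. In this sketch the `constraints` dict is a stand-in for the constraints sheet and `lookup_scenario` is a hypothetical name. The two entries use the example scenario names from the question, with vectors I derived by applying the 4-digit formula to their position lists.

```python
# Stand-in for the constraints sheet:
# 4-digit vector (column A) -> scenario name (column B).
constraints = {
    1011: "no_defense_4",  # Goalkeeper, Wide, Winger
    1110: "no_attack_1",   # Goalkeeper, Centre midfield, Centre back
}


def lookup_scenario(vector, table=constraints, fallback="?"):
    """Exact-match lookup, like IFERROR(VLOOKUP(...;FALSE);"?")."""
    return table.get(vector, fallback)


print(lookup_scenario(1011))  # no_defense_4
print(lookup_scenario(1222))  # ?
```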

The Google Docs link above contains the formulas, 3 example days and 11 example scenarios.

If anything is unclear, let me know and I'll improve the answer, as the Google Docs link will vanish some day.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow