Question

How can I create a pandas data frame using all possible combinations of factors?

factor1 = ['a','b']
factor2 = ['x','y','z']
factor3 = [1, 2]
val = 0

This is what I'm aiming for:

   factor1 factor2  factor3  val
      a       x        1      0
      a       y        1      0
      a       z        1      0
      a       x        2      0
      a       y        2      0
      a       z        2      0   
      b       x        1      0
      b       y        1      0
      b       z        1      0
      b       x        2      0
      b       y        2      0
      b       z        2      0

With such a small number of factors this could be done manually, but as the number increases it would be practical to have a slightly more automated way to construct it.


Solution

This is what list comprehensions are for.

factor1 = ['a','b']
factor2 = ['x','y','z']
factor3 = [1, 2]
val = 0

combs = [ (f1, f2, f3, val)
    for f1 in factor1
    for f2 in factor2
    for f3 in factor3 ]
# [ ('a', 'x', 1, 0),
#   ('a', 'x', 2, 0),
#   ('a', 'y', 1, 0),
#   ('a', 'y', 2, 0),
#   ... etc

Replace (f1, f2, f3, val) with whatever you want to use to print the table, or print it directly from the list of tuples.

Mathematically, this is known as the Cartesian product.
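Since the question asks about a growing number of factors, here is a minimal sketch of the same Cartesian product using the standard library's itertools.product, which avoids writing one `for` clause per factor. The `factors` dict is a hypothetical container introduced here for illustration; it is not part of the original answer.

```python
from itertools import product

# Hypothetical container: maps each column name to its list of levels.
factors = {
    'factor1': ['a', 'b'],
    'factor2': ['x', 'y', 'z'],
    'factor3': [1, 2],
}
val = 0

# product(*lists) yields one tuple per combination, with the last list
# varying fastest, which matches the nested comprehension above.
combs = [
    dict(zip(factors, values), val=val)
    for values in product(*factors.values())
]
```

Adding another factor then only requires another entry in `factors`. The resulting list of dicts can be passed to pd.DataFrame if a data frame is needed.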

Other tips

Since I want a pandas data frame I actually created a list of dictionaries (in order to have column names):

import pandas as pd

combs = [ {'factor1':f1, 'factor2':f2, 'factor3':f3, 'val':val} for f1 in factor1 for f2 in factor2 for f3 in factor3 ]
df = pd.DataFrame(combs)
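As a possible alternative that stays inside pandas, the combinations can also be built with pd.MultiIndex.from_product and converted to columns. This is a sketch, not the original poster's method; it assumes a pandas version that provides MultiIndex.to_frame with the index argument.

```python
import pandas as pd

factor1 = ['a', 'b']
factor2 = ['x', 'y', 'z']
factor3 = [1, 2]
val = 0

# Build every combination of the factor levels as a MultiIndex.
index = pd.MultiIndex.from_product(
    [factor1, factor2, factor3],
    names=['factor1', 'factor2', 'factor3'],
)

# Turn the index levels into ordinary columns, then add the constant column.
df = index.to_frame(index=False)
df['val'] = val
```

The row order here has factor3 varying fastest, so the frame may need a sort_values call if the exact ordering shown in the question matters.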
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow