Question

I'm struggling to swap values between two columns of a dataframe that looks like this:

rs649071 rs640249 0.265 0.49 
rs647621 rs640249 0.227 0.34 
rs644339 rs640249 0.116 0.08 
rs641563 rs640249 1.0 33.96 
rs640249 rs11073074 0.248 0.77 
rs640249 rs11637397 0.194 0.68 

The idea is to check whether each cell of column 2 is rs640249 and, if it is not, swap it with the corresponding string from column 1, so that rs640249 always ends up in column 2. The final result would look like this:

rs649071 rs640249 0.265 0.49 
rs647621 rs640249 0.227 0.34 
rs644339 rs640249 0.116 0.08 
rs641563 rs640249 1.0 33.96 
rs11073074 rs640249 0.248 0.77 
rs11637397 rs640249 0.194 0.68 

I was trying to iterate over tuples; however, tuples do not support item assignment.

rscode='rs640249'
for inf in LDfiles:
    df = read_csv(inf, sep='\t', skiprows=1, names=['A', 'B', 'C'])
    for tup in df.itertuples():
        if tup[2] != rscode:
            tup[1], tup[2] = tup[2], tup[1]
        print(tup)

Solution 2

For future reference, here is a possible solution:

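    for row_index, row in df.iterrows():
        if row['L1'] == 'rs640249':
            # Read both values first so the swap never sees a half-updated row.
            first, second = row['L1'], row['L2']
            # DataFrame.set_value was removed in pandas 1.0; .at replaces it.
            df.at[row_index, 'L1'] = second
            df.at[row_index, 'L2'] = first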

Best,

OTHER TIPS

One way to do this is to use apply:

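def my_fun(row):
    # Swap the two values when the target ID is in col1.
    if row['col1'] == 'rs640249':
        return row['col2'], row['col1']
    else:
        return row['col1'], row['col2']

# result_type='expand' turns each returned tuple into two columns;
# assigning to the two columns only keeps the rest of df intact.
df[['col1', 'col2']] = df.apply(my_fun, axis=1, result_type='expand')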
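
To try the apply approach end to end, here is a minimal, self-contained sketch built on a few rows from the question. The column names col1, col2 and r2 and the function name swap_if_first are assumptions made for illustration; the question does not name its columns.

```python
import pandas as pd

# Sample rows taken from the question; the column names are assumptions.
df = pd.DataFrame(
    {
        "col1": ["rs649071", "rs640249", "rs640249"],
        "col2": ["rs640249", "rs11073074", "rs11637397"],
        "r2": [0.265, 0.248, 0.194],
    }
)


def swap_if_first(row, rscode="rs640249"):
    # Return the pair reordered so that rscode ends up second.
    if row["col1"] == rscode:
        return row["col2"], row["col1"]
    return row["col1"], row["col2"]


# result_type="expand" spreads each returned tuple over two columns,
# so the result can be written back to col1 and col2 only.
swapped = df.apply(swap_if_first, axis=1, result_type="expand")
df[["col1", "col2"]] = swapped.to_numpy()
print(df)
```

Writing the result back to the two ID columns only leaves the numeric column untouched. The to_numpy() call is a defensive choice so that the assignment is purely positional.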

If both values are stored as a tuple in a single column, you can still use apply:

def my_fun2(row, colID):
    if row[colID][0] == 'rs640249':
        return row[colID][::-1] #reverse the tuple
    else:
        return row[colID]

df[colID] = df.apply(lambda x: my_fun2(x, colID), axis=1)

Note: since my_fun2 returns a single value, apply returns a Series this time, so we assign the result to a single column.

Example:

df
#                             0
# 0    ('rs649071', 'rs640249')
# 1  ('rs640249', 'rs11073074')

df[0] = df.apply(lambda x: my_fun2(x,0), axis=1)
#                             0
# 0    ('rs649071', 'rs640249')
# 1  ('rs11073074', 'rs640249')

Why don't you try something like this, with array operations:

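condition = df['L1'] == 'rs640249'
tmp = df['L1'].copy()
# Use .loc instead of chained indexing (df['L1'][condition] = ...),
# which may assign to a temporary copy and leave df unchanged.
df.loc[condition, 'L1'] = df.loc[condition, 'L2']
df.loc[condition, 'L2'] = tmp[condition]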
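
The same mask-based idea can be written without a temporary column. The sketch below is one possible version, not the only way to do it: the helper name move_id_to_second and the column names C and D are hypothetical, and the rows are a subset of the question's data.

```python
import pandas as pd


def move_id_to_second(df, first, second, rscode):
    """Swap `first` and `second` on rows where `first` equals rscode."""
    out = df.copy()
    mask = out[first] == rscode
    # to_numpy() drops the column labels, so pandas assigns by position
    # instead of aligning the labels back and undoing the swap.
    out.loc[mask, [first, second]] = out.loc[mask, [second, first]].to_numpy()
    return out


# Rows from the question; "C" and "D" are placeholder names for the
# two numeric columns.
ld = pd.DataFrame(
    {
        "L1": ["rs649071", "rs641563", "rs640249", "rs640249"],
        "L2": ["rs640249", "rs640249", "rs11073074", "rs11637397"],
        "C": [0.265, 1.0, 0.248, 0.194],
        "D": [0.49, 33.96, 0.77, 0.68],
    }
)
result = move_id_to_second(ld, "L1", "L2", "rs640249")
print(result)
```

Working on a copy keeps the input frame unchanged, which makes the helper safe to call inside a loop over several files.
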
Licensed under: CC-BY-SA with attribution