Question

I have a dataframe that looks like this:

       site  Active
0     deals  Active
1     deals  Active
2     deals  Active
3  discount  Active
4  discount  Active

I don't want to drop the duplicate rows. Instead, I want to change the value in the Active column based on the site column: for each site that appears more than once, only the last occurrence should stay Active, and all earlier occurrences should become InActive.

Expected

       site    Active
0     deals  InActive
1     deals  InActive
2     deals    Active
3  discount  InActive
4  discount    Active

Solution

I would do this manually. First, let us build the set of row indices whose state must remain active. To do this, I iterate over all rows and record, for each site, the index of a row marked 'Active'. A later occurrence overwrites an earlier one, so only the last active row per site is kept.

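# Map each site to the index of its last row marked 'Active'.
last_active = {}
for i, row in df.iterrows():
    if row['Active'] == 'Active':
        last_active[row['site']] = i  # later rows overwrite earlier ones
# A set gives fast membership tests in the next step.
keep_active = set(last_active.values())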
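
If the loop turns out to be slow on a large frame, the same set of indices can probably be built without iterrows. This is only a sketch: it rebuilds the example frame from the question so that it runs on its own, and the name active_rows is mine, not part of the original answer.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "site": ["deals", "deals", "deals", "discount", "discount"],
    "Active": ["Active"] * 5,
})

# Consider only rows currently marked 'Active', as the loop does.
active_rows = df[df["Active"] == "Active"]

# Keep the last row per site and collect the surviving index labels.
keep_active = set(active_rows.drop_duplicates(subset="site", keep="last").index)
```

For the example frame this should leave the labels of the last 'deals' row and the last 'discount' row in keep_active.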

Now I assign 'Active' to the rows whose index is in keep_active and 'InActive' to all others. The result goes into a new column, refined_active, so the original Active column is left untouched.

df['refined_active'] = df.apply(lambda x: 'Active' if x.name in keep_active else 'InActive', axis=1)
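
As an alternative to the two steps above, the whole thing can likely be done in one vectorized pass with DataFrame.duplicated. This is a sketch under one assumption that the loop version does not make: every row starts out as 'Active', as in the question's sample, so only the position within each site matters. The variable name has_later_duplicate is mine.

```python
import numpy as np
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "site": ["deals", "deals", "deals", "discount", "discount"],
    "Active": ["Active"] * 5,
})

# With keep="last", duplicated() is True for every row that has a later
# row with the same site, and False for the last occurrence of each site.
has_later_duplicate = df.duplicated(subset="site", keep="last")

# Earlier occurrences become 'InActive'; the last one stays 'Active'.
df["refined_active"] = np.where(has_later_duplicate, "InActive", "Active")
```

If some rows could already be inactive in the input, this shortcut would mark the last row of a site 'Active' regardless, so the filter on the Active column from the loop version would need to be added back.
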
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange