Question

I have a variable actor which is a string and contains values like "military forces of guinea-bissau (1989-1992)" and a large range of other different values that are fairly complex. I have been using grep() to find character patterns that match different types of actors. For example I would like to code a new variable actor_type as 1 when actor contains "military forces of", doesn't contain "mutiny of", and the string variable country is also contained in the variable actor.

I am at a loss as to how to conditionally create this new variable without resorting to some type of horrible for loop. Help me!

Data looks roughly like this:

|   | actor                                              | country         |
|---+----------------------------------------------------+-----------------|
| 1 | "military forces of guinea-bissau"                 | "guinea-bissau" |
| 2 | "mutiny of military forces of guinea-bissau"       | "guinea-bissau" |
| 3 | "unidentified armed group (guinea-bissau)"         | "guinea-bissau" |
| 4 | "mfdc: movement of democratic forces of casamance" | "guinea-bissau" |

Solution

If your data is in a data.frame df:

> ifelse(!grepl('mutiny of' , df$actor) & grepl('military forces of',df$actor) & apply(df,1,function(x) grepl(x[2],x[1])),1,0)
[1] 1 0 0 0

grepl returns a logical vector, and ifelse turns the combined condition into 1/0. The result can be assigned to a new column, e.g. df$actor_type.

Breaking that apart:

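!grepl('mutiny of', df$actor) and grepl('military forces of', df$actor) satisfy your first two requirements. The last piece, apply(df,1,function(x) grepl(x[2],x[1])), goes row by row and greps for country in actor. Note that it relies on actor and country being the first and second columns of df, and that grepl treats country as a regular expression by default.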
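If the positional indexing or the regex matching is a concern, here is a sketch of a variant that refers to the columns by name and matches country as a literal string. It swaps apply for mapply and adds fixed = TRUE. The df below is toy data rebuilt from the question's table, and has_country is just a helper name I made up:

```r
# Toy data mirroring the four example rows in the question (an assumption
# about how the real data.frame is laid out).
df <- data.frame(
  actor = c(
    "military forces of guinea-bissau",
    "mutiny of military forces of guinea-bissau",
    "unidentified armed group (guinea-bissau)",
    "mfdc: movement of democratic forces of casamance"
  ),
  country = rep("guinea-bissau", 4),
  stringsAsFactors = FALSE
)

# Row-wise check: is country a literal substring of actor?
# fixed = TRUE turns off regex interpretation of the country name.
has_country <- mapply(
  function(pattern, text) grepl(pattern, text, fixed = TRUE),
  df$country, df$actor,
  USE.NAMES = FALSE
)

# Combine the three conditions and store them as 1/0.
df$actor_type <- as.integer(
  grepl("military forces of", df$actor, fixed = TRUE) &
    !grepl("mutiny of", df$actor, fixed = TRUE) &
    has_country
)

df$actor_type
```

On the example rows this should give the same 1 0 0 0 as above. The fixed = TRUE part only matters if a country name contains regex metacharacters such as parentheses or dots.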

Licensed under: CC-BY-SA with attribution