Question

I have a dataset in Stata that looks like this:

entityID    indicator    indicatordescr    indicatorvalue
1           gdp          Gross Domestic    100
1           pop          Population        15
1           area         Area              50
2           gdp          Gross Domestic    200
2           pop          Population        10
2           area         Area              300

and there is a one-to-one mapping between values of indicator and values of indicatordescr.

I want to reshape it to wide, i.e. to:

entityID    gdp     pop     area
1           100     15      50
2           200     10      300

where I would like the variable label of gdp to be "Gross Domestic", that of pop to be "Population" and that of area to be "Area".

Unfortunately, as I understand it, it is not possible to assign the values of indicatordescr as value labels of indicator, so reshape can't transform these value labels into variable labels.

I have looked at this: Bring value labels to variable labels when reshaping wide

and this: http://www.stata.com/support/faqs/data-management/apply-labels-after-reshape/

but did not understand how to apply those to my case.

NB: the variable labeling after reshape must be done programmatically, because indicator and indicatordescr have many values.


Solution

"String labels" here is informal; Stata does not support value labels for string variables. However, what is wanted here is that the distinct values of a string variable become variable labels on reshaping.

Various work-arounds exist. Here's one: put the information in the variable name and then take it out again.

clear 
input entityID  str4 indicator   str14 indicatordescr    indicatorvalue
1           gdp          "Gross Domestic"    100
1           pop          "Population"        15
1           area         "Area"              50
2           gdp          "Gross Domestic"    200
2           pop          "Population"        10
2           area         "Area"              300
end 

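* combine name and description, e.g. gdp_Gross_Domestic
* (this assumes indicator itself contains no underscore)
gen what = indicator + "_"  + subinstr(indicatordescr, " ", "_", .)  
keep entityID what indicatorvalue 
reshape wide indicatorvalue , i(entityID) j(what) string 

* variables are now named like indicatorvaluegdp_Gross_Domestic
foreach v of var indicator* {
    * turn underscores into spaces; the first word becomes the new name
    local V : subinstr local v "_" " ", all
    local new : word 1 of `V' 
    rename `v' `new'
    * everything after the first space is the description
    local V = substr("`V'", strpos("`V'", " ") + 1, .)
    label var `new' "`V'"
}

* strip the indicatorvalue prefix, leaving gdp, pop and area
renpfix indicatorvalue 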
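
Whether name length will be a problem can be checked before reshaping. The lines below are only a sketch: they assume the stub is indicatorvalue (14 characters) and rely on Stata's limit of 32 characters for a variable name. They would go after the gen what line and before reshape wide.

```stata
* Sketch: run after -gen what- and before -reshape wide-
* reshape builds each new name as "indicatorvalue" + what,
* and a Stata variable name can hold at most 32 characters
count if length("indicatorvalue" + what) > 32
if r(N) > 0 {
    display as error "some combined names would exceed 32 characters"
}
```

If the count is above zero, reshape would fail on those names.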

EDIT If the length of variable names bites, try another work-around:

clear 
input entityID  str4 indicator   str14 indicatordescr    indicatorvalue
1           gdp          "Gross Domestic"    100
1           pop          "Population"        15
1           area         "Area"              50
2           gdp          "Gross Domestic"    200
2           pop          "Population"        10
2           area         "Area"              300
end 

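* store each distinct (indicator, description) pair in Mata before indicatordescr is dropped
mata : sdata = uniqrows(st_sdata(., "indicator indicatordescr")) 
keep entityID indicator indicatorvalue 
reshape wide indicatorvalue , i(entityID) j(indicator) string 
renpfix indicatorvalue 
* label each new variable with its stored description; char(34) is the double quote character
mata : for(i = 1; i <= rows(sdata); i++) stata("label var " + sdata[i, 1] + "  " + char(34) + sdata[i,2] + char(34))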

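LATER EDIT Although the above is called a work-around, it is a much better solution than the previous one: the descriptions never pass through variable names, so they are not subject to the 32-character limit on names or to the rules about which characters a name may contain.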
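
The same idea can also be sketched without Mata, by holding the descriptions in local macros while the data are reshaped. This is only a sketch under some assumptions: the macro names lab_gdp, lab_pop and so on are made up for illustration, each value of indicator is assumed short enough that lab_ plus the value fits within the 31-character limit on local macro names, and the one-to-one mapping between indicator and indicatordescr is assumed to hold. It starts from the same input data as above.

```stata
* Sketch: store each description in a local macro, reshape, then label
levelsof indicator, local(inds)
foreach ind of local inds {
    * with a one-to-one mapping this returns exactly one description
    levelsof indicatordescr if indicator == "`ind'", local(d) clean
    local lab_`ind' "`d'"
}

keep entityID indicator indicatorvalue
reshape wide indicatorvalue , i(entityID) j(indicator) string
renpfix indicatorvalue

* apply the stored descriptions as variable labels
foreach ind of local inds {
    label var `ind' "`lab_`ind''"
}
```

Local macros exist only within a single run, so these lines need to be run together as one block of a do-file rather than line by line.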

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow