Carrying string labels of string variable after reshape
Question
I have a dataset in Stata that looks like this:
entityID  indicator  indicatordescr  indicatorvalue
1         gdp        Gross Domestic  100
1         pop        Population      15
1         area       Area            50
2         gdp        Gross Domestic  200
2         pop        Population      10
2         area       Area            300
and there is a one-to-one mapping between values of indicator and values of indicatordescr.
I want to reshape it to wide, i.e. to:
entityID gdp pop area
1 100 15 50
2 200 10 300
where I would like the variable label of gdp to be "Gross Domestic", that of pop to be "Population", and that of area to be "Area".
Unfortunately, as I understand it, it is not possible to assign the values of indicatordescr as value labels of indicator, so reshape can't transform these value labels into variable labels.
I have looked at this question: Bring value labels to variable labels when reshaping wide
and this FAQ: http://www.stata.com/support/faqs/data-management/apply-labels-after-reshape/
but did not understand how to apply them to my case.
NB: the variable labeling after reshape must be done programmatically, because indicator and indicatordescr have many values.
Solution
"String labels" here is informal; Stata does not support value labels for string variables. However, what is wanted here is that the distinct values of a string variable become variable labels on reshaping.
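To make the first point concrete, here is a minimal sketch of what happens when a value label is attached to a string variable. The two-observation dataset and the label name descr are made up for illustration; the attempt is wrapped in capture so that the expected error does not stop a do-file.

```stata
clear
input str4 indicator
gdp
pop
end

* descr is a hypothetical value label, defined only for this illustration
label define descr 1 "Gross Domestic" 2 "Population"

* value labels map numbers to text, so attaching one to a string
* variable should be rejected; capture keeps the do-file running
capture noisily label values indicator descr
display "return code: " _rc
```

A nonzero return code here is the reason the descriptions have to travel through the reshape by some other route.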
Various work-arounds exist. Here's one: put the information in the variable name and then take it out again.
clear
input entityID str4 indicator str14 indicatordescr indicatorvalue
1 gdp "Gross Domestic" 100
1 pop "Population" 15
1 area "Area" 50
2 gdp "Gross Domestic" 200
2 pop "Population" 10
2 area "Area" 300
end
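* pack the short name and the description into one string usable in a variable name
gen what = indicator + "_" + subinstr(indicatordescr, " ", "_", .)
keep entityID what indicatorvalue
reshape wide indicatorvalue , i(entityID) j(what) string
* each new variable is named indicatorvalue + short name + "_" + description
foreach v of var indicator* {
    * turn the underscores back into spaces
    local V : subinstr local v "_" " ", all
    * the first word is the (still prefixed) short name
    local new : word 1 of `V'
    rename `v' `new'
    * everything after the first space is the description
    local V = substr("`V'", strpos("`V'", " ") + 1, .)
    label var `new' "`V'"
}
* strip the indicatorvalue prefix from the variable names
renpfix indicatorvalue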
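Because this work-around carries the whole description inside the variable name, it depends on Stata's 32-character limit on variable names. A minimal sketch of a check that could be run before the reshape, using the same example data; the variable name namelen is introduced here only for illustration:

```stata
clear
input entityID str4 indicator str14 indicatordescr indicatorvalue
1 gdp "Gross Domestic" 100
1 pop "Population" 15
1 area "Area" 50
2 gdp "Gross Domestic" 200
2 pop "Population" 10
2 area "Area" 300
end

gen what = indicator + "_" + subinstr(indicatordescr, " ", "_", .)

* length of the name that reshape wide would have to create
gen namelen = strlen("indicatorvalue" + what)
summarize namelen, meanonly
display "longest resulting name: " r(max) " characters (limit is 32)"

* stop here if any name would be too long
assert namelen <= 32
```

In the example data the longest name, indicatorvaluegdp_Gross_Domestic, only just fits. The check covers length only; descriptions containing characters other than letters, digits, spaces and underscores would also produce invalid names.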
EDIT If the length of variable names bites, try another work-around:
clear
input entityID str4 indicator str14 indicatordescr indicatorvalue
1 gdp "Gross Domestic" 100
1 pop "Population" 15
1 area "Area" 50
2 gdp "Gross Domestic" 200
2 pop "Population" 10
2 area "Area" 300
end
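* save the distinct (indicator, description) pairs in Mata before they are dropped
mata : sdata = uniqrows(st_sdata(., "indicator indicatordescr"))
keep entityID indicator indicatorvalue
reshape wide indicatorvalue , i(entityID) j(indicator) string
renpfix indicatorvalue
* issue one label var command per pair; char(34) is the double-quote character
mata : for(i = 1; i <= rows(sdata); i++) stata("label var " + sdata[i, 1] + " " + char(34) + sdata[i,2] + char(34))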
LATER EDIT Although the above is called a work-around, it is a much better solution than the previous one.
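The same idea (save the mapping before the reshape, apply it afterwards) can also be sketched without Mata by holding each description in a local macro. This is a variant under stated assumptions, not part of the answer above: the macro names lbl_gdp, lbl_pop and so on are made up here, and the sketch assumes every value of indicator is a legal variable name short enough (27 characters or fewer) for the lbl_ prefix to fit in a local macro name.

```stata
clear
input entityID str4 indicator str14 indicatordescr indicatorvalue
1 gdp "Gross Domestic" 100
1 pop "Population" 15
1 area "Area" 50
2 gdp "Gross Domestic" 200
2 pop "Population" 10
2 area "Area" 300
end

* store each description in a local macro named after its indicator;
* with a one-to-one mapping, repeated rows just rewrite the same text
forvalues i = 1/`=_N' {
    local ind = indicator[`i']
    local lbl_`ind' = indicatordescr[`i']
}

* remember the distinct indicator values for the labeling loop
levelsof indicator, local(inds)

keep entityID indicator indicatorvalue
reshape wide indicatorvalue , i(entityID) j(indicator) string
renpfix indicatorvalue

* apply the saved descriptions as variable labels
foreach ind of local inds {
    label var `ind' "`lbl_`ind''"
}

describe
```

Local macros vanish when the do-file ends, so all of this has to run in one go.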