Question

I have imputed missing values using Amelia, thereby creating 5 multiply imputed datasets. Now I would like to split these imputed datasets, e.g. one set for year >= 1990 and one set for year <= 1990. Any ideas how I can do so? Many thanks!

library(Amelia)
data(freetrade)
freetrade$year #splitting variable

#Imputation of missing data
a.out <- amelia(freetrade, m=5, ts="year", cs="country")

#split of created dataset?

Solution

Amelia returns an object that contains a list of data frames (one for each imputation). You can see the structure of this object with str().

> library(Amelia)
> data(freetrade)
> 
> a.out <- amelia(freetrade, m=5, ts="year", cs="country")
-- Imputation 1 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 2 --

  1  2  3  4  5  6  7  8  9 10 11 12 13

-- Imputation 3 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

-- Imputation 4 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 5 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20


> str(a.out)
List of 12
 $ imputations:List of 5
  ..$ imp1:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 30.6 22.4 41.3 26.8 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp2:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 33.6 59.7 41.3 18.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp3:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 48.5 32.9 41.3 47.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp4:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 18.4 45.5 41.3 16.9 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp5:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 15.3 44.4 41.3 40.1 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..- attr(*, "class")= chr [1:2] "mi" "list"
 $ m          : num 5
 $ missMatrix : logi [1:171, 1:10] FALSE FALSE FALSE FALSE FALSE FALSE ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:10] "year" "country" "tariff" "polity" ...
 $ overvalues : NULL
 $ theta      : num [1:9, 1:9, 1:5] -1 -0.08456 -0.03404 -0.00193 0.06483 ...
 $ mu         : num [1:8, 1:5] -0.08456 -0.03404 -0.00193 0.06483 -0.11178 ...
 $ covMatrices: num [1:8, 1:8, 1:5] 0.7881 -0.1869 -0.0531 0.2121 -0.0819 ...
 $ code       : num 1
 $ message    : chr "Normal EM convergence."
 $ iterHist   :List of 5
  ..$ : num [1:15, 1:3] 44 34 25 28 26 25 24 22 20 14 ...
  ..$ : num [1:13, 1:3] 44 27 24 22 22 21 18 17 14 11 ...
  ..$ : num [1:19, 1:3] 44 34 29 27 26 26 25 24 23 21 ...
  ..$ : num [1:15, 1:3] 44 34 27 28 23 24 23 23 19 19 ...
  ..$ : num [1:20, 1:3] 44 32 30 27 24 23 23 23 23 21 ...
 $ arguments  :List of 22
  ..$ idvars      : NULL
  ..$ logs        : NULL
  ..$ ts          : num 1
  ..$ cs          : num 2
  ..$ empri       : NULL
  ..$ tolerance   : num 1e-04
  ..$ polytime    : NULL
  ..$ splinetime  : NULL
  ..$ lags        : NULL
  ..$ leads       : NULL
  ..$ intercs     : logi FALSE
  ..$ sqrts       : NULL
  ..$ lgstc       : NULL
  ..$ noms        : NULL
  ..$ ords        : NULL
  ..$ priors      : NULL
  ..$ autopri     : num 0.05
  ..$ bounds      : NULL
  ..$ max.resample: num 100
  ..$ startvals   : num 0
  ..$ overimp     : NULL
  ..$ emburn      : num [1:2] 0 0
  ..- attr(*, "class")= chr [1:2] "ameliaArgs" "list"
 $ orig.vars  : chr [1:10] "year" "country" "tariff" "polity" ...
 - attr(*, "class")= chr "amelia"

From here you can see that the "imputations" element of your a.out object contains your data frames, so you can reference each of your imputations from there. For example, a.out$imputations[[1]]$year will give you the years from your first imputation. If you would like to do that across every imputation, you can use an apply function or a loop. To illustrate this, consider:

> sapply(a.out$imputations,function(x) head(x$year))
     imp1 imp2 imp3 imp4 imp5
[1,] 1981 1981 1981 1981 1981
[2,] 1982 1982 1982 1982 1982
[3,] 1983 1983 1983 1983 1983
[4,] 1984 1984 1984 1984 1984
[5,] 1985 1985 1985 1985 1985
[6,] 1986 1986 1986 1986 1986
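
The "apply function or loop" choice mentioned above can be sketched side by side. This is only an illustration: `imps` is a small made-up stand-in for a.out$imputations (so it runs without Amelia), and the names `first_years` and `first_years_loop` are hypothetical.

```r
# Stand-in for a.out$imputations: a named list of data frames (hypothetical toy data)
imps <- list(
  imp1 = data.frame(year = 1988:1992, tariff = c(30, 22, 41, 27, 31)),
  imp2 = data.frame(year = 1988:1992, tariff = c(34, 60, 41, 18, 31))
)

# sapply version: returns a matrix with one column per imputation
first_years <- sapply(imps, function(x) head(x$year, 3))

# Equivalent explicit loop, collecting one vector per imputation in a list
first_years_loop <- list()
for (nm in names(imps)) {
  first_years_loop[[nm]] <- head(imps[[nm]]$year, 3)
}
```

With the real object, replacing `imps` by a.out$imputations should behave the same way, since it is also a named list of data frames.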

EDIT: I just re-read your question and saw that you're actually looking for something more specific. You can take what's above and apply it to make subsets of each data frame, doing something like lapply(a.out$imputations,function(x) x[x$year > 1990,]). Note that > and < both leave out the year 1990 itself; use >= or <= on whichever side should include it. I'm not sure how you would like to combine these imputed datasets (split by years greater than/less than 1990), but if you just want to append all rows together, rbind() will do the trick (if not, let me know how you'd like to and I can probably recommend a solution):

> df1 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year > 1990,]))
> df2 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year < 1990,]))
> head(df1)
        year  country  tariff polity      pop   gdp.pc intresmi   signed fiveop     usheg
imp1.11 1991 SriLanka 26.9000      5 17247000 597.6987 2.285213 1.000000   12.8 0.2589872
imp1.12 1992 SriLanka 25.0000      5 17405000 618.3329 2.877877 0.515665   13.1 0.2623017
imp1.13 1993 SriLanka 24.2000      5 17628420 652.6205 4.280361 0.000000   13.2 0.2812928
imp1.14 1994 SriLanka 26.0000      5 17865000 680.0408 4.389912 0.000000   13.2 0.2783585
imp1.15 1995 SriLanka 20.0000      5 18112000 707.6591 3.995919 0.000000   13.2 0.2627195
imp1.16 1996 SriLanka 20.5646      5 18300000 727.0039 3.676763 0.000000   13.2 0.2681700
> head(df2)
       year  country   tariff polity      pop   gdp.pc intresmi signed fiveop     usheg
imp1.1 1981 SriLanka 30.56693      6 14988000 461.0236 1.937347      0   12.4 0.2593112
imp1.2 1982 SriLanka 22.39382      5 15189000 473.7634 1.964430      0   12.5 0.2558008
imp1.3 1983 SriLanka 41.30000      5 15417000 489.2266 1.663936      1   12.3 0.2655022
imp1.4 1984 SriLanka 26.81580      5 15599000 508.1739 2.797462      0   12.3 0.2988009
imp1.5 1985 SriLanka 31.00000      5 15837000 525.5609 2.259116      0   12.3 0.2952431
imp1.6 1986 SriLanka 17.76314      5 16117000 538.9237 1.832549      0   12.5 0.2886563
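
If you would rather keep the imputations separate instead of appending them, one possible variant is to build one list of data frames per period. The sketch below again uses a made-up stand-in list `imps` in place of a.out$imputations; the names `early`, `late`, `late_stacked` and the `imp` column are hypothetical, and the choice to put 1990 in the later period is an assumption.

```r
# Stand-in for a.out$imputations (hypothetical toy data)
imps <- list(
  imp1 = data.frame(year = 1988:1992, tariff = c(30, 22, 41, 27, 31)),
  imp2 = data.frame(year = 1988:1992, tariff = c(34, 60, 41, 18, 31))
)

# One list of data frames per period; >= keeps 1990 in the later period
early <- lapply(imps, function(x) x[x$year <  1990, ])
late  <- lapply(imps, function(x) x[x$year >= 1990, ])

# If a single stacked data frame is preferred, add a column recording
# which imputation each row came from before binding the rows
late_stacked <- do.call(
  rbind,
  Map(function(d, nm) cbind(imp = nm, d), late, names(late))
)
```

Keeping `early` and `late` as lists leaves each imputed dataset intact, so the same analysis can later be run on every element with lapply().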
Licensed under: CC-BY-SA with attribution