Question

I have a data frame that I want to reshape. My reshape code:

matchedlong <- reshape(matched, direction = 'long',
                       varying = c(29:33, 36:39),
                       v.names = c("Math34", "TFCIn"),
                       times = 2006:2009, idvar = "schoolnum")

In matched, columns 36 to 39 are logical (TRUE/FALSE), but in matchedlong they have somehow turned into numbers, with no clear pattern to the values.

What is causing this?

Sample data:

example.data <- structure(list(Grade_Range_2008 = structure(c(14L, 14L, 40L,
40L, 36L, 13L), .Label = c("3-5, UE", "4-5, UE", "4-8, UE, US",
"5-10, UE, US", "5-8, 10, UE, US", "5-8, UE, US", "5-9, UE, US",
"6-11, US", "6-12, UE, US", "6-7, UE, US", "6-8, 10, UE, US",
"6-8, UE", "6-8, UE, US", "6-9, UE, US", "6, UE", "7-10, US",
"7-8, US", "8-Jun", "8-May", "K-3", "K-3, UE", "K-4, UE", "K-5",
"K-5, UE", "K-6, UE", "K-8", "K-8, UE", "K-8, UE, US", "K, 2-5, UE",
"N/A", "PK-3, UE", "PK-4, UE", "PK-5, 10, UE", "PK-5, 7-9, UE, US",
"PK-5, 8, UE", "PK-5, UE", "PK-6, 10, UE", "PK-6, UE", "PK-8, UE",
"PK-8, UE, US"), class = "factor"), X__of_Yrs_in_school = c(0L,
0L, 0L, 0L, 0L, 0L), Total_Enrollment_2008 = c(348L, 444L, 636L,
495L, 319L, 410L), Free_Lunch_pct_2008 = c(75L, 89L, 94L, 89L,
89L, 91L), Reduced_Lunch_pct_2008 = c(6L, 6L, 3L, 4L, 5L, 4L),
    Stability_pct_2008 = c(89L, 93L, 100L, 98L, 92L, 81L),
Limited_Eng__Prof__pct_2008 = c(8L,
    20L, 8L, 10L, 19L, 19L), Am__Ind_pct_2008 = c(1L, 2L, 0L,
    2L, 0L, 2L), Black_pct_2008 = c(41L, 39L, 28L, 33L, 32L,
    38L), Hispanic_pct_2008 = c(55L, 59L, 70L, 61L, 65L, 57L),
    Asian_pct_2008 = c(2L, 1L, 0L, 2L, 1L, 1L), White_pct_2008 = c(2L,
    0L, 1L, 2L, 1L, 2L), Multi_pct_2008 = c(0L, 0L, 0L, 0L, 0L,
    0L), w_o_Valid_Cert__N_2008 = c(4L, 0L, 1L, 0L, 1L, 1L),
    w_o_Valid_Cert__pct_2008 = c(11L, 0L, 2L, 0L, 3L, 5L),
Teaching_Out_of_Certification_N_ = c(7L,
    7L, 2L, 13L, 3L, 4L), Teaching_Out_of_Certification_pc = c(20L,
    15L, 4L, 25L, 9L, 18L), X_3_yrs__Exp_N_2008 = c(12L, 13L,
    5L, 12L, 5L, 5L), X_3_yrs__Exp_pct_2008 = c(34L, 28L, 11L,
    24L, 15L, 23L), Masters_Plus_N_2008 = c(6L, 11L, 15L, 10L,
    16L, 8L), Masters_Plus___2008 = c(17L, 23L, 32L, 20L, 47L,
    36L), Core_Classes_N_2008 = c(78L, 142L, 49L, 91L, 22L, 49L
    ), Core_Not_Taught_by_HQ_Teachers_p = c(23L, 6L, 2L, 24L,
    9L, 20L), Number_of_Classes_N_2008 = c(93L, 193L, 56L, 119L,
    33L, 68L), Clases_Not_taught_by_App__Cert__ = c(18L, 18L,
    2L, 37L, 3L, 13L), Clases_Not_taught_by_App__Cert_0 = c(19L,
    9L, 4L, 31L, 9L, 19L), Turnover_Rate_of_Teachers_with__ = c(31L,
    56L, 20L, 32L, 0L, 50L), Turnover_Rate_all_Teachers_pct_2 = c(42L,
    29L, 17L, 30L, 14L, 49L), Math_Level_3_4_pct_2006 = c(5.1,
    16.4, 58.2, 34.4, 48.9, 12.4), Math_Level_3_4_pct_2007 = c(15.2,
    22.1, 65.7, 29.9, 70.5, 22.6), Math_Level_3_4_pct_2008 = c(29.9,
    43.2, 69.8, 41.2, 78.9, 38.5), Math_Level_3_4_pct_2009 = c(50.7,
    49.7, 80.7, 47.1, 83.9, 51.6), Att__pct_2005 = c(0.83, 0.86,
    0.89, 0.9, 0.89, 0.87), Susp__pct_2005 = c(6L, 15L, 1L, 4L,
    0L, 3L), schoolnum = c(4013, 4045, 4096, 4101, 4102, 4117
    ), In_2006 = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
    In_2007 = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE), In_2008 = c(FALSE,
    FALSE, FALSE, FALSE, FALSE, FALSE), In_2009 = c(FALSE, FALSE,
    FALSE, FALSE, FALSE, FALSE), weights = c(1, 1, 1, 1, 1, 1
    )), .Names = c("Grade_Range_2008", "X__of_Yrs_in_school",
"Total_Enrollment_2008", "Free_Lunch_pct_2008", "Reduced_Lunch_pct_2008",
"Stability_pct_2008", "Limited_Eng__Prof__pct_2008", "Am__Ind_pct_2008",
"Black_pct_2008", "Hispanic_pct_2008", "Asian_pct_2008", "White_pct_2008",
"Multi_pct_2008", "w_o_Valid_Cert__N_2008", "w_o_Valid_Cert__pct_2008",
"Teaching_Out_of_Certification_N_", "Teaching_Out_of_Certification_pc",
"X_3_yrs__Exp_N_2008", "X_3_yrs__Exp_pct_2008", "Masters_Plus_N_2008",
"Masters_Plus___2008", "Core_Classes_N_2008",
"Core_Not_Taught_by_HQ_Teachers_p",
"Number_of_Classes_N_2008", "Clases_Not_taught_by_App__Cert__",
"Clases_Not_taught_by_App__Cert_0", "Turnover_Rate_of_Teachers_with__",
"Turnover_Rate_all_Teachers_pct_2", "Math_Level_3_4_pct_2006",
"Math_Level_3_4_pct_2007", "Math_Level_3_4_pct_2008",
"Math_Level_3_4_pct_2009",
"Att__pct_2005", "Susp__pct_2005", "schoolnum", "In_2006", "In_2007",
"In_2008", "In_2009", "weights"), row.names = c(1L, 4L, 7L, 8L,
11L, 12L), class = "data.frame")

Solution

A column must be all of one data type; you can't mix logical and numeric.

I'm not sure how you would do a "long" analysis on several different data types, because the stacked columns usually hold the same variable measured under different groupings. If you need to, try converting your logical values to numeric first (with as.numeric).
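As a minimal sketch of that conversion, assuming a hypothetical toy data frame (toy and flag_cols are illustrative names, not part of the question's data), the logical flag columns can be turned into 0/1 before reshaping:

```r
# Hypothetical toy data; column names mimic the In_<year> flags in the question.
toy <- data.frame(schoolnum = c(4013, 4045),
                  In_2006 = c(TRUE, FALSE),
                  In_2007 = c(FALSE, TRUE))

# A single vector holds one type: combining logical and numeric coerces to numeric.
mixed <- c(TRUE, 5.1)
class(mixed)  # "numeric"

# Convert the logical flag columns to 0/1 so they can share a column with numbers.
flag_cols <- c("In_2006", "In_2007")
toy[flag_cols] <- lapply(toy[flag_cols], as.numeric)
str(toy)
```

After this step TRUE is stored as 1 and FALSE as 0, so the flags can be recovered later with as.logical if needed.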

Although you're not using the reshape package, Hadley Wickham makes this point in his discussion of the melt() function, which performs the same task (see this paper, for instance):

In the current implementation [of melt], there is only one assumption that melt makes: all measured values must be of the same type, e.g., numeric, factor, date. We need this assumption because the molten data is stored in an R data frame, and the value column can be only one type. Most of the time this is not a problem as there are few cases where it makes sense to combine different types of variables in the cast output.

Edit:

I think you may be trying to do two things at once. Is this what you want?

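# Reshape the Math columns and the In columns separately, then merge.
a <- reshape(example.data[, -(36:39)], direction = "long", varying = 29:32,
             v.names = "Math34", times = 2006:2009, idvar = "schoolnum")
# After dropping columns 29:32, the In_* columns shift from 36:39 to 32:35.
b <- reshape(example.data[, -(29:32)], direction = "long", varying = (36:39) - 4,
             v.names = "TFCIn", times = 2006:2009, idvar = "schoolnum")
merged <- merge(a, b)  # named 'merged' to avoid masking base::c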
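
A possible alternative to the two-step approach, sketched on hypothetical toy data rather than the asker's data frame: reshape() also accepts varying as a list with one vector of column names per long-format variable, so each output column draws from a single set of wide columns. The reshape() help page notes that when varying is a single vector, the order is expected to be like x.1, y.1, x.2, y.2, so a vector listing all the Math columns followed by all the In columns would interleave the two sets. The column names and timevar = "year" below are assumptions for illustration.

```r
# Hypothetical toy data with two measures (numeric and logical) over two years.
toy <- data.frame(schoolnum = c(4013, 4045),
                  Math_2006 = c(5.1, 16.4),
                  Math_2007 = c(15.2, 22.1),
                  In_2006   = c(FALSE, TRUE),
                  In_2007   = c(TRUE, FALSE))

# One list element per long-format variable, each ordered by time.
long <- reshape(toy, direction = "long",
                varying = list(c("Math_2006", "Math_2007"),
                               c("In_2006", "In_2007")),
                v.names = c("Math34", "TFCIn"),
                timevar = "year", times = 2006:2007,
                idvar = "schoolnum")

str(long)  # Math34 stays numeric, TFCIn stays logical
```

With the list form, the mapping from wide columns to long columns is explicit and does not depend on column order.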
Licensed under: CC-BY-SA with attribution