How to obtain a new table after filtering only one column in an existing table in R?

StackOverflow https://stackoverflow.com/questions/21747287

  •  10-10-2022

Question

I have a data frame with 20 columns. I need to filter (remove noise from) one column. After filtering with the convolve function I get a new vector of values. Many values in the original column become NA due to the filtering process. The problem is that I need the whole table (for later analysis) with only those rows where the filtered column has values, but I can't bind the filtered column to the original table because the two have different numbers of rows. Let me illustrate using the 'age' column of the 'Orange' data set in R:

> head(Orange)
  Tree  age circumference
1    1  118            30
2    1  484            58
3    1  664            87
4    1 1004           115
5    1 1231           120
6    1 1372           142

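The convolve filter used:

smooth <- function(x, D, delta) {
  # exponentially decaying weights over a window of 2 * D + 1 points
  z <- exp(-abs(-D:D / delta))
  # weighted moving average: filtered values divided by the sum of the weights
  r <- convolve(x, z, type = "filter") / convolve(rep(1, length(x)), z, type = "filter")
  # drop D more values from each end
  r <- head(tail(r, -D), -D)
  r
}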

Filtering the 'age' column

age2 <- smooth(Orange$age, 5,10)
data.frame(age2)

The age column has 35 rows and the age2 column has 15. The original data set has 2 more columns and I would like to work with them as well. I only need the 15 rows of each column that correspond to the 15 rows of age2. The filter here removed the first and last ten values from the age column. How can I apply the filter so that I get a truncated data set with all columns and only the filtered rows?


Solution

You need to work out how the rows of the two vectors line up. One option is to pad age2 with NAs at both ends so that it is as long as the data frame, assign it with Orange$age2 <- age2, and then call na.omit(Orange). Or, equivalently, perhaps this is what you are looking for?

df <- tail(head(Orange, -10), -10)    # chop off the first and last 10 observations
df$age2 <- age2
df

   Tree  age circumference      age2
11    2 1004           156  915.1678
12    2 1231           172  876.1048
13    2 1372           203  841.3156
14    2 1582           203  911.0914
15    3  118            30  948.2045
16    3  484            51 1008.0198
17    3  664            75  955.0961
18    3 1004           108  915.1678
19    3 1231           115  876.1048
20    3 1372           139  841.3156
21    3 1582           140  911.0914
22    4  118            32  948.2045
23    4  484            62 1008.0198
24    4  664           112  955.0961
25    4 1004           167  915.1678
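
The NA-padding route mentioned at the start of this answer could look roughly like the sketch below. It assumes the filter drops the same number of values at each end (which is the case for smooth() as defined in the question); the names pad and dat are only illustrative.

```r
# Smoothing function from the question, repeated so the example runs on its own
smooth <- function(x, D, delta) {
  z <- exp(-abs(-D:D / delta))
  r <- convolve(x, z, type = "filter") / convolve(rep(1, length(x)), z, type = "filter")
  head(tail(r, -D), -D)
}

age2 <- smooth(Orange$age, 5, 10)

# Assumption: the same number of values is lost at the start and at the end
pad <- (nrow(Orange) - length(age2)) / 2

dat <- Orange                                    # work on a copy of the data
dat$age2 <- c(rep(NA, pad), age2, rep(NA, pad))  # NA where the filter gave no value
dat <- na.omit(dat)                              # keep only rows with a filtered value
```

Because the padded vector has the same length as the data frame, the assignment is well defined, and na.omit() then leaves the same 15 rows as the head()/tail() approach.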

Edit: If you know the first and last x observations will be removed then the following works:

x <- 10    # must equal the number of values the filter drops at each end (10 for age2 above)
df <- tail(head(Orange, -x), -x)     # chop off the first and last x observations 
df$age2 <- age2
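
If the number of dropped values is not known in advance, it could be derived from the two lengths instead of being hard-coded. This is only a sketch: trim_to_filtered is a hypothetical helper, it assumes the filter trims symmetrically, and the filtered vector below is a stand-in moving average rather than the output of smooth().

```r
# Hypothetical helper: trim a data frame symmetrically so that it matches
# a shorter filtered vector, then attach that vector as a new column
trim_to_filtered <- function(data, filtered, name) {
  dropped <- nrow(data) - length(filtered)
  # Assumption: the filter removes equally many values from both ends
  stopifnot(dropped >= 0, dropped %% 2 == 0)
  x <- dropped / 2
  keep <- seq_along(filtered) + x        # row positions that survived filtering
  out <- data[keep, , drop = FALSE]
  out[[name]] <- filtered
  out
}

# Stand-in for a filtered column: plain 21-point moving average (length 15)
filtered <- rowMeans(embed(Orange$age, 21))
df <- trim_to_filtered(Orange, filtered, "age2")
```

The stopifnot() check makes the helper fail early when the length difference is odd, which would indicate that the symmetric-trim assumption does not hold.
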
License: CC-BY-SA with attribution
Not affiliated with StackOverflow