Question

I have a data.frame of values with samples taken at intervals that were not exact hours. The samples form oscillating waves of unknown amplitude and period. I would like to estimate the value at every exact hour.

      hours value
60 63.06667 22657
61 64.00000 21535
62 64.93333 20797
63 65.86667 20687
64 66.80000 20129
65 67.73333 19671
66 68.66667 19066
67 69.60000 19534
68 70.53333 19994
69 71.46667 19575
70 72.40000 21466

Is there a way in R I can fit a curve to this data set, and then sample it at my given points (63,64,65,66...)? I'm aware of spline() but don't know how to make it give me exact integer values for 'hours'.

Edit: To clarify, this is the resulting data frame I wish to have (with dummy entries for 'value')

   hours value
63.00000 22800
64.00000 21535
65.00000 20780
66.00000 20500
67.00000 20011
68.00000 ...
69.00000 ...
70.00000 ...
71.00000 ...
72.00000 ...
73.00000 ...

Code to recreate data:

structure(list(hours = c(63.06666647, 63.9999998, 64.93333313, 
65.86666646, 66.79999979, 67.73333312, 68.66666645, 69.59999978, 
70.53333311, 71.46666644, 72.39999977), value = c(22657L, 21535L, 
20797L, 20687L, 20129L, 19671L, 19066L, 19534L, 19994L, 19575L, 
21466L)), .Names = c("hours", "value"), row.names = 60:70, class = "data.frame")

Solution

Build on Sean's answer, but use splinefun() to create your own interpolation function. Call that function on a vector of the exact hour values to get the interpolated value at each one. The examples on the ?splinefun help page are pretty clear.
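As a minimal sketch of the splinefun() approach, assuming the data frame from the question is stored in D (the names f, whole_hours and result are only illustrative). It uses splinefun()'s default method and restricts the whole hours to those inside the sampled range, because hours 63 and 73 would be extrapolated rather than interpolated:

```r
# Data from the question
D <- data.frame(
  hours = c(63.06666647, 63.9999998, 64.93333313, 65.86666646, 66.79999979,
            67.73333312, 68.66666645, 69.59999978, 70.53333311, 71.46666644,
            72.39999977),
  value = c(22657L, 21535L, 20797L, 20687L, 20129L, 19671L, 19066L, 19534L,
            19994L, 19575L, 21466L)
)

# Build an interpolating cubic spline that passes through every sample
f <- splinefun(D$hours, D$value)

# Whole hours that fall inside the sampled range (64 to 72 for this data)
whole_hours <- seq(ceiling(min(D$hours)), floor(max(D$hours)))

# Evaluate the spline at the exact hours
result <- data.frame(hours = whole_hours, value = f(whole_hours))
print(result)
```

For a one-off calculation, spline(D$hours, D$value, xout = whole_hours) should give the same values without creating a function. Note that an interpolating spline reproduces every sample exactly, noise included, so it does no smoothing.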

OTHER TIPS

A simple approach with the spline fit might be:

D <- structure(list(hours = c(63.06666647, 63.9999998, 64.93333313, 
65.86666646, 66.79999979, 67.73333312, 68.66666645, 69.59999978, 
70.53333311, 71.46666644, 72.39999977), value = c(22657L, 21535L, 
20797L, 20687L, 20129L, 19671L, 19066L, 19534L, 19994L, 19575L, 
21466L)), .Names = c("hours", "value"), row.names = 60:70, class = "data.frame")

sm <- smooth.spline(D$hours, D$value, spar = 0.5)

Use whatever value of the smoothing parameter spar you prefer.

plot(D$hours, D$value)
lines(sm, col = "red")

The fitted y-values at the original (non-integer) sample times are stored in the smooth spline object:

sm$y
[1] 22421.54 21682.93 21023.05 20469.70 19998.72 19634.10 19448.09 19506.52 19783.97
[10] 20251.24 20891.14
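Since sm$y only holds the fit at the original sample times, getting values at exact hours needs one more step. A minimal sketch, assuming the same data and spar = 0.5 as above, is to pass the desired hours to predict(); the names whole_hours, pred and result are only illustrative, and the range 64:72 is chosen to stay inside the sampled interval:

```r
# Data from the question
D <- data.frame(
  hours = c(63.06666647, 63.9999998, 64.93333313, 65.86666646, 66.79999979,
            67.73333312, 68.66666645, 69.59999978, 70.53333311, 71.46666644,
            72.39999977),
  value = c(22657L, 21535L, 20797L, 20687L, 20129L, 19671L, 19066L, 19534L,
            19994L, 19575L, 21466L)
)

# Fit the smoothing spline as before
sm <- smooth.spline(D$hours, D$value, spar = 0.5)

# Evaluate the fitted curve at exact hours within the sampled range
whole_hours <- 64:72
pred <- predict(sm, x = whole_hours)  # list with components x and y

result <- data.frame(hours = pred$x, value = pred$y)
print(result)
```

Unlike an interpolating spline, the smoothed curve does not pass through the samples, so the value at hour 64 will generally differ from the measured 21535.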
Licensed under: CC-BY-SA with attribution