Get x intercept given data points in R

Question 1

Finding threshold crossings without smoothing rarely gives well-defined values. Play with the following:

data(sunspots)
sunspots = as.numeric(sunspots)
smoothover = 21 # Try smaller values here to see the failure
y = filter(sunspots,rep(1/smoothover,smoothover),circular=TRUE)
plot(y)
thresh =30
abline(h=thresh)
cross = which(diff(sign(y-thresh))!=0)-1
rug(cross)

If you want to get "exact" threshold crossings (what is exact if smoothing is so important?) you should use the above approximation first to get a region around the crossing and do a quadratic or linear interpolation afterwards, e.g. with 5 points.

Question 2

To see where your function y=f(x) (which I assume to be continuous) intersects a horizontal line y=h, you can look for sign changes in f(x)-h. The locations where this sign changes are locations where the function crosses the line. Unless you know more details about your function, so that you can interpolate between data points, this is probably the best you can hope for.

To compute sign changes in R, you can use code like this:

h <- 1000
d <- y - h
signChanges <- (d[1:(length(d)-1)] * d[2:length(d)]) <= 0

The resulting array will have one less element than your x array, as each element corresponds to an interval between two adjacent x values. If you have a data point exactly on the line, then the array will contain two subsequent TRUE values. If this is a problem, then you could insert

d <- ifelse(d == 0, 1, d)

which moves the data points slightly off the line.

To obtain corresponding x values, you can use the center of each interval:

x2 <- (x[c(signChanges, FALSE)] + x[c(FALSE, signChanges)])/2

Or you could perform a linear interpolation between the two adjacent data points:

left <- c(signChanges, FALSE)
right <- c(FALSE, signChanges)
t <- (h - y[left])/(y[right] - y[left])
x2 <- (1 - t)*x[left] + t*x[right]

Question 3

To get actual points of intersection, you could use a couple of R's spatial packages:

library(sp)
library(rgeos)  ## for gIntersection()

## Wrap your line up as a SpatialLines object
x <- c(1, 2, 3, 4)
y <- c(1000, 1200, 900, 700)
SL1 <- SpatialLines(list(Lines(Line(cbind(x,y)), "A")))

## Create a horizontal SpatialLines object
SL2 <- SpatialLines(list(Lines(Line(cbind(range(x), 1010)), "B")))

## Find their point(s) of intersection
coordinates(gIntersection(SL1, SL2))
#          x    y
# 1 1.050000 1010
# 1 2.633333 1010

Question 4

I would put your data in a single data.frame:

dat = data.frame(x, y)

Then you can subset the dataset to keep only the values where y == 1000:

dat[dat$y == 1000,]

Note that this only works if the y values you use are actually in the dataset. If you pick a y value not literally in de dataset (e.g. y = 996.345) you'll have to do some interpolation of the timeseries, see e.g. this SO post, or by fitting some spline function (this will probably work best on the range x = [150,900]).