Question

Question parts:

  1. Is there a "Julia way" to implement a sliding window?
  2. What is needed in Julia to ignore NaNs?

There is a matrix with 264 recording points and 200 time points (stored below as 200 rows by 264 columns, since cor correlates columns). I want to get the median correlation of each recording point with every other point over a 10-sample window.

I've tried this the MATLAB way (tm) by creating a 3-D 264x264x10 array where the third dimension holds the correlations for each window. In MATLAB, I would do median(cors,3), much like Julia can do mean(cors,3), but Julia's median does not support a dimension argument. It looks like mapslices(median,cors,3) might be what I want, but some recording points have NaNs. In R, I might look to na.omit() or function options like na.rm=T, but I don't see an equivalent for Julia.

#oned=readdlm("10152_20111123_preproc_torque.1D")
oned=rand(200,264); oned[:,3]=NaN; oned[:,200]=NaN
windows=10
samplesPerWindow=size(oned,1)/windows
cors=zeros(size(oned,2),size(oned,2),windows)
for i=1:windows
 startat=(i-1)*windows+1
 endat=i*windows
 corofsamples=cor(oned[startat:i*windows,:])
 cors[:,:,i]= corofsamples
end
med = mapslices(median,cors,3) # fail b/c NaN

Solution

Here's one approach, which uses functions to encapsulate parts of the task. By creating a specialized version of the median function that ignores NaN, it's easier to use mapslices:

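using Statistics  # cor and median live here in Julia 1.x

# Correlate the columns of `oned` within each of `windows` consecutive,
# non-overlapping blocks of rows.
function findcors(oned, windows)
    # div keeps this an integer so it can be used in index ranges
    samplesPerWindow = div(size(oned, 1), windows)

    cors = zeros(size(oned, 2), size(oned, 2), windows)

    for i = 1:windows
        startat = (i - 1) * samplesPerWindow + 1
        endat = i * samplesPerWindow
        corofsamples = cor(oned[startat:endat, :])
        cors[:, :, i] = corofsamples
    end

    return cors
end

# Median of the finite entries of A; NaN if there are none.
function nanmedian(A)
    cleanA = A[isfinite.(A)]
    if isempty(cleanA)
        return NaN
    else
        return median(cleanA)
    end
end

oned = rand(200, 264)
oned[:, 3] .= NaN
oned[:, 200] .= NaN

cors = findcors(oned, 10)

med = mapslices(nanmedian, cors, dims=3)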
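
To see what the mapslices call does, here is a small sketch on a hand-made 2x2x3 array. It assumes Julia 1.x syntax (`using Statistics`, `isfinite.` broadcasting, the `dims` keyword), and the array `small` is made up purely for illustration:

```julia
using Statistics  # median lives here in Julia 1.x

# Same idea as nanmedian above: median of the finite entries, NaN if none.
function nanmedian(A)
    cleanA = A[isfinite.(A)]
    return isempty(cleanA) ? NaN : median(cleanA)
end

# Hypothetical 2x2x3 array: three "windows" of a 2x2 matrix.
small = zeros(2, 2, 3)
small[1, 1, :] = [1.0, 1.0, 1.0]
small[1, 2, :] = [0.1, NaN, 0.5]   # one window is NaN
small[2, 1, :] = [NaN, NaN, NaN]   # no usable window at all
small[2, 2, :] = [0.2, 0.4, 0.9]

# Apply nanmedian to every slice that runs along the third dimension.
med = mapslices(nanmedian, small, dims=3)

size(med)             # (2, 2, 1): the third dimension is kept with length 1
med[1, 2, 1]          # median of the two finite values 0.1 and 0.5
isnan(med[2, 1, 1])   # true, because every window was NaN
```

If a plain 2-D matrix is more convenient, `dropdims(med, dims=3)` removes the trailing length-1 dimension.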

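I believe your original code was using the wrong window length inside the main loop: it computed samplesPerWindow but then stepped by windows, so the ten windows only covered the first 100 of the 200 rows. The loop above steps by samplesPerWindow instead, which hopefully fixes that.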
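
Note that the loop above splits the rows into non-overlapping blocks. If an overlapping sliding window is what is wanted, one possible sketch follows. It is not part of the original answer: `slidingcors`, `width`, and `step` are hypothetical names, and Julia 1.x syntax is assumed. The window start advances by `step` rows at a time, and `view` avoids copying each block of rows:

```julia
using Statistics  # cor lives here in Julia 1.x

# Correlate the columns of `data` inside every window of `width` rows,
# advancing the window start by `step` rows each time.
function slidingcors(data, width; step=1)
    nrows, ncols = size(data)
    starts = 1:step:(nrows - width + 1)
    cors = zeros(ncols, ncols, length(starts))
    for (k, s) in enumerate(starts)
        # view refers to the rows s..s+width-1 without copying them
        cors[:, :, k] = cor(view(data, s:(s + width - 1), :))
    end
    return cors
end

data = rand(200, 5)            # 200 time points, 5 recording points
cors = slidingcors(data, 10)   # one 10-sample window per possible start row
size(cors)                     # (5, 5, 191)
```

Passing `step=width` gives non-overlapping blocks again, and the resulting array can be fed to mapslices with nanmedian in the same way as before.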

The DataFrames package provides an NA value and tools to ignore NA, but still needs to clean up its median function to exploit those tools.
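
In more recent Julia versions, a comparable missing-value marker is available without DataFrames: `missing`, together with `skipmissing` to drop such entries. The following is a sketch under that assumption (Julia 1.x); `missingmedian` is a hypothetical helper, not a library function:

```julia
using Statistics  # median lives here in Julia 1.x

# Median that skips `missing` entries; returns `missing` if nothing is left.
function missingmedian(A)
    kept = collect(skipmissing(A))
    return isempty(kept) ? missing : median(kept)
end

# Turn NaN markers into `missing` so that skipmissing can drop them.
x = [0.2, NaN, 0.8, NaN, 0.5]
xm = [isnan(v) ? missing : v for v in x]

m1 = missingmedian(xm)   # median of 0.2, 0.8 and 0.5
m2 = missingmedian(Union{Missing,Float64}[missing, missing])   # missing
```

The explicit emptiness check is there because `median` throws an error on an empty collection.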

Licensed under: CC-BY-SA with attribution