I have a Deedle Frame<DateTime,string>. The columns contain float values and are dense (no missing values).
I need to build the data frame from a string [] and then:
- Build a 2D Matrix with the whole data
- Build a Series, Series<DateTime,Matrix<float,CpuLib>>, collapsing each row into a 1xn matrix
In my case, I am experimenting with FCore by StatFactory, but I may use another linear algebra library in the future.
My concern is that I need to make sure that the order of rows and columns is not changed in the process.
Data Frame Construction
I fetch the data using the following.
I notice that the order of columns is different from the initial list of tickers.
Why is that? Would using Array.Parallel.map change the order?
/// get the selected tickers in a DataFrame from a DataContext
let fetchTickers tickers joinKind =
    let getTicker ticker =
        query {
            for row in db.PriceBarsDay do
            where (row.Ticker = ticker)
            select row }
        |> Seq.map (fun row -> row.DateTime, float row.Close)
        |> dict
    tickers
    |> Array.map (fun ticker -> getTicker ticker) // returns a dict(DateTime, ClosePrice)
    |> Array.map (fun dictionary -> Series(dictionary))
    |> Array.map2 (fun ticker series -> [ticker => series] |> frame ) tickers
    |> Array.reduce (fun accumFrame frame -> accumFrame.Join(frame, joinKind))
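On the Array.Parallel.map question: as far as the F# core library goes, Array.Parallel.map stores each result at the index of its input, so only the order of evaluation is unspecified, not the order of the output array. A minimal sketch of that property, where the ticker names and the fetch function are hypothetical stand-ins for the database query:

```fsharp
// Hypothetical ticker list; the names are illustrative only
let tickers = [| "MSFT"; "AAPL"; "GOOG"; "IBM" |]

// Stand-in for the per-ticker query: pairs the ticker with its length
let fetch (ticker: string) = ticker, ticker.Length

let sequential = tickers |> Array.map fetch
let parallelResult = tickers |> Array.Parallel.map fetch

// Element i of each result corresponds to element i of the input,
// so the two arrays compare equal element by element
let sameOrder = (sequential = parallelResult)
printfn "same order: %b" sameOrder
```

This does not explain why the joined frame ends up with a different column order, and it says nothing about whether db can be queried safely from several threads; both would need to be checked separately.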
Data frame to 2D matrix
In order to build the matrix I use the code below. Mapping on the array of column names (selectedCols) ensures that the order of columns is not shifted. I run unit tests on the order of rows using Array.map and everything looks fine, but I would like to know:
- Is there a consistency check in the library that would ensure I do not run into an ordering issue?
- I suppose Array.Parallel.map would also preserve the order of columns.
Here is the code:
/// Build a matrix
let buildMatrix selectedCols (frame: Frame<DateTime, String>) =
    let matrix =
        selectedCols
        |> Array.map (fun colname -> frame.GetSeries(colname))
        |> Array.map (fun serie -> Series.values serie)
        |> Array.map (fun aSeq -> Seq.map unbox<float> aSeq)
        |> Array.map (fun aSeq -> Matrix(aSeq) )
        |> Array.reduce (fun acc matrix -> acc .| matrix)
    matrix.T
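One way to make the ordering assumption explicit is to copy the data into a plain float[,] first and fail loudly if any column is shorter than the row index. The sketch below is only an illustration: the sample frame, the dates and the helper name toArray2DInOrder are hypothetical, and the resulting array would still have to be handed to the matrix library of choice. It relies on Series.values skipping missing values, which is why the length check matters even for a frame that is expected to be dense.

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// Hypothetical sample frame; column "B" is deliberately listed before "A"
let dates = [ DateTime(2014, 1, 1); DateTime(2014, 1, 2); DateTime(2014, 1, 3) ]
let sample : Frame<DateTime, string> =
    frame [ "B" => series (List.zip dates [ 10.0; 11.0; 12.0 ])
            "A" => series (List.zip dates [ 1.0; 2.0; 3.0 ]) ]

/// Copy the selected columns into a float[,] with rows in frame order
/// and columns in the order of selectedCols (hypothetical helper).
let toArray2DInOrder (selectedCols: string[]) (df: Frame<DateTime, string>) =
    let rowKeys = df.RowKeys |> Array.ofSeq
    let columns =
        selectedCols
        |> Array.map (fun name ->
            let col = df.GetColumn<float>(name)
            // Series.values skips missing values, so a shorter column
            // would silently misalign rows; fail loudly instead.
            let values = col |> Series.values |> Array.ofSeq
            if values.Length <> rowKeys.Length then
                failwithf "Column %s has missing values" name
            values)
    Array2D.init rowKeys.Length selectedCols.Length (fun i j -> columns.[j].[i])

// Columns are requested as A, B even though the frame stores B first
let data = toArray2DInOrder [| "A"; "B" |] sample
```

Because every column is looked up by name from selectedCols, the column order of the result does not depend on the column order of the frame.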
Data Frame to Time Series of Row Matrices
I build the time series of row matrices with the code below.
- Keeping the data in the Series should ensure that the order of rows is preserved.
- How can I filter the columns and ensure that the column order is exactly as in the array of column names passed on to the function?
Here is the code:
// Time series of row matrices - it'll be used to run a simulation
let timeSeriesOfMatrix frame =
    frame
    |> Frame.filterRows (fun day target -> day >= startKalman)
    |> Frame.mapRowValues ( fun row -> row.Values |> Seq.map unbox<float> )
    |> Series.mapValues( fun row -> Matrix(row) )
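For the column filtering question, one possible approach is to read each row by column name, in the order of the array passed in, instead of relying on row.Values. The following is a sketch under assumptions: the sample frame, the cutoff startDate and the helper name rowsInOrder are hypothetical, and the rows are left as float[] so that the conversion to a matrix type stays with whichever linear algebra library is used.

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// Hypothetical sample frame; column "B" is deliberately listed before "A"
let dates = [ DateTime(2014, 1, 1); DateTime(2014, 1, 2); DateTime(2014, 1, 3) ]
let sample : Frame<DateTime, string> =
    frame [ "B" => series (List.zip dates [ 10.0; 11.0; 12.0 ])
            "A" => series (List.zip dates [ 1.0; 2.0; 3.0 ]) ]

// Hypothetical cutoff standing in for startKalman
let startDate = DateTime(2014, 1, 2)

/// Series of rows as float[], keeping only selectedCols, in that order
/// (hypothetical helper).
let rowsInOrder (selectedCols: string[]) (df: Frame<DateTime, string>) =
    df
    |> Frame.filterRows (fun day _ -> day >= startDate)
    |> Frame.mapRowValues (fun row ->
        // Look up each value by column name so the output order
        // follows selectedCols rather than the frame's column order
        selectedCols |> Array.map (fun name -> row.GetAs<float>(name)))

let rows = rowsInOrder [| "A"; "B" |] sample
```

A final Series.mapValues step could then wrap each float[] in a row matrix, as in the original function.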
Many thanks.
PS: I kept all three scenarios together because I believe that seeing the three examples side by side will help other users and myself understand how the library works better than discussing each case separately.