Question

I'm using a Java program to extract some data points, and am planning on using scipy to determine the correlation coefficients. I plan on extracting the data into a csv-style file. How should I format each corresponding dataset, so that I can easily read it into scipy?


Solution

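Make each dataset a column, so that all the datasets combined form a single CSV file with one row per observation. Read it as a 2D array with numpy.genfromtxt() (passing delimiter=","), then call numpy.corrcoef() to get the correlation coefficients. Because numpy.corrcoef() treats each row as a variable by default, pass rowvar=False for this column layout, or transpose the array first.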
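As a rough sketch of the NumPy route, assuming a comma-delimited file with a single header row: the column names a, b, c and the sample values below are made up for illustration, and io.StringIO stands in for the file your Java program would write.

```python
import io
import numpy as np

# Hypothetical CSV content: one header row, one column per dataset,
# one row per observation. Replace io.StringIO(...) with a file path.
csv_text = """a,b,c
1.0,2.0,9.0
2.0,4.0,7.0
3.0,6.0,4.0
4.0,8.0,1.0
"""

# skip_header=1 drops the header line so the result is a plain 2D float array.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# rowvar=False tells corrcoef that each column (not each row) is a variable.
corr = np.corrcoef(data, rowvar=False)
print(corr)
```

The result is a square matrix where entry [i, j] is the Pearson correlation between column i and column j, so the diagonal is all ones.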

Note: you should also consider using the same data layout with pandas. Read the CSV into a DataFrame with pandas.read_csv() and get the correlation coefficients with .corr().
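A comparable sketch with pandas, again assuming a header row and using invented column names and values; io.StringIO is only a stand-in for the real file path.

```python
import io
import pandas as pd

# Hypothetical CSV content: the header row supplies the column labels.
csv_text = """a,b,c
1.0,2.0,9.0
2.0,4.0,7.0
3.0,6.0,4.0
4.0,8.0,1.0
"""

# read_csv uses the first line as column names by default.
df = pd.read_csv(io.StringIO(csv_text))

# DataFrame.corr() computes pairwise Pearson correlation between columns by default.
corr_df = df.corr()
print(corr_df)
```

Here the resulting matrix is labelled with the column names from the header, which can make it easier to see which pair of datasets each coefficient belongs to.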

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow