Question

I am currently using Python pandas and want to know if there is a way to pass data from pandas into Julia DataFrames and vice versa. (I think you can call Python from Julia with PyCall, but I am not sure whether it works with dataframes.) Is there a way to call Julia from Python and have it take in pandas dataframes, without saving to another file format like CSV?

When would it be advantageous to use Julia DataFrames rather than pandas, other than for extremely large datasets and code with many loops (like neural networks)?

Solution

There is a library developed for this.

PyJulia is a library for interfacing with Julia from Python 2 and 3:

https://github.com/JuliaLang/pyjulia

It is experimental, but it works to some extent.
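
As a rough idea of what calling Julia from Python could look like, here is a minimal sketch. It assumes PyJulia, a Julia installation and the DataFrames.jl package are all set up. The helper names `frame_to_columns` and `julia_dataframe_from` are made up for illustration. `from julia import Main`, `Main.eval(...)` and assigning to `Main.<name>` come from PyJulia's documented interface; that the Python dict arrives in Julia as a Dict with string keys that `DataFrame(...)` accepts is an assumption I have not verified on every version.

```python
def frame_to_columns(df):
    """Turn a pandas-style DataFrame into a plain dict of column name -> list.

    Only relies on `df.columns` and `df[name].tolist()`, which pandas provides.
    """
    return {str(name): df[name].tolist() for name in df.columns}


def julia_dataframe_from(df):
    """Hypothetical helper: build a Julia DataFrame from a pandas DataFrame.

    Needs PyJulia, Julia and DataFrames.jl; nothing is written to disk.
    """
    # Imported lazily so frame_to_columns stays usable without Julia installed.
    from julia import Main

    Main.eval("using DataFrames")
    # Assigning to Main.<name> defines a variable in Julia's Main module.
    Main.cols = frame_to_columns(df)
    # Assumption: `cols` is a Dict with string keys on the Julia side.
    # Note that a Dict does not keep the original column order.
    return Main.eval("DataFrame(cols)")
```

With pandas installed, the call would be something like `julia_dataframe_from(pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]}))`. Going through plain lists copies the data, so this is only reasonable for frames that fit comfortably in memory.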

Secondly, Julia also has a front end for pandas, Pandas.jl:

https://github.com/malmaud/Pandas.jl

It looks to be just a wrapper for pandas, but you might be able to run multiple functions at once using Julia's parallel features.

As for which is better: so far pandas has faster I/O. According to this, reading CSV in Julia is slow compared to Python.

OTHER TIPS

I'm a novice at this sort of thing but have been using both lately. Truth be told, they seem quite comparable, but there is far more documentation, Stack Overflow questions, etc. pertaining to pandas, so I would give it a slight edge. Do not let that discourage you, however, because Julia has some amazing functionality that I'm only beginning to understand. With large datasets, say over a couple of gigabytes, both packages are pretty slow, but again pandas seems to have a slight edge (by no means would I consider my benchmarking definitive).

Without a more nuanced understanding of what you are trying to achieve, it's difficult for me to envision a circumstance where you would even want to call a pandas function while working with a Julia DataFrame, or vice versa. Unless you are doing something pretty cerebral or working with really large datasets, I can't see going too wrong with either.

When you say "output the data", what do you mean? Couldn't you write the pandas data object to a file and then open and manipulate that file in a Julia DataFrame (as you mention)? Again, unless you have a really good machine, reading gigabytes of data into either pandas or a Julia DataFrame is tedious and can be prohibitively slow.

Licensed under: CC-BY-SA with attribution