Why is PyTorch faster than sklearn models?

https://datascience.stackexchange.com/questions/76515

machine-learning
random-forest
scikit-learn
pytorch
machine-learning-model

12-12-2020

Question

I recently learned about the Hummingbird library for Python. I trained a RandomForest on a 10M-sized dataset with 2 labels. With sklearn, inference took 450 ms. After converting the same model to PyTorch, inference takes 128 ms on CPU.

If both are running on the CPU, why is Hummingbird's PyTorch model faster than the sklearn model?

I do not understand what Hummingbird does to my sklearn model to increase its speed.

Solution

It is difficult to answer your question without access to your code. The best way to understand the difference is to profile both versions and see where the bottlenecks are for your specific problem.

For this, you can use Python profiling tools such as:

cProfile (standard library)
line_profiler (third-party, line-by-line timings)
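
As a rough illustration of the cProfile approach, here is a minimal sketch that uses only the standard library. The function `slow_predict` is a hypothetical stand-in for your own inference call (for example `model.predict(X)`), and `profile_call` is a helper name invented for this example; neither comes from sklearn, PyTorch, or Hummingbird.

```python
import cProfile
import io
import pstats


def slow_predict(rows):
    # Hypothetical stand-in for model.predict(X); replace with your real call.
    return [sum(value * value for value in row) > 1.0 for row in rows]


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    # Collect the report in a string, sorted by cumulative time per function.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Assumed toy input; use your real feature matrix instead.
rows = [[0.1 * i, 0.2 * i] for i in range(1000)]
predictions, report = profile_call(slow_predict, rows)
print(report)
```

Running the same helper once on the sklearn model's predict call and once on the converted model's predict call should show which functions account for the time in each case. Compiled code called from Python shows up only as a single call in the report, so the output may point to where the time goes without explaining why.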

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange