Question

I recently learned about the Hummingbird library for Python. I trained a RandomForest on a 10M-sized dataset with 2 labels. With sklearn, inference took 450 ms. After converting the same model to PyTorch, inference on the CPU takes 128 ms.

If both are running on the CPU, why is Hummingbird's PyTorch model faster than the sklearn model?

I don't understand what Hummingbird does to my sklearn model to increase its speed.

Solution

It is difficult to answer your question without access to your code. The best way to understand the difference is to profile both versions and see where the bottlenecks are for your specific problem.

For this, you can use one of the profiling modules available for Python:

  1. cProfile (part of the standard library)
  2. line_profiler (a third-party, line-by-line profiler)
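
As a rough illustration of the first option, here is a minimal sketch that runs a prediction call under cProfile and prints the most expensive functions. The `predict` function and the `rows` data are hypothetical stand-ins, not your model; in practice you would pass your own `model.predict` and input (once for the sklearn model, once for the converted one) and compare the two reports.

```python
import cProfile
import io
import pstats


def predict(rows):
    """Hypothetical stand-in for model.predict(X); replace with your own call."""
    return [1 if sum(row) > 1.0 else 0 for row in rows]


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Collect the statistics into a string instead of printing directly,
    # sorted by cumulative time and limited to the `top` entries.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Made-up input data, only here so the sketch runs on its own.
rows = [[0.2 * i, 0.5] for i in range(10)]
labels, report = profile_call(predict, rows)
print(report)
```

The report lists call counts and cumulative time per function, which should show whether the time goes into the tree traversal itself or into surrounding overhead such as input validation or data conversion. For line-level detail, line_profiler is typically used by decorating the function with `@profile` and running the script through `kernprof -l -v`.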
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange