Question

I developed a classification model for a telecom client that classifies customers as dual-SIM or non-dual-SIM. After many iterations, the best precision we can achieve is 60%, but the contract's acceptance criterion is 75% precision. The client measures precision from campaign results: they call a sample of the customers the model flags and explicitly ask whether they are dual-SIM.

Facts we know:

  • The learning curve shows that more training data won't help.
  • We have included all variables that have been used in this market before (and everything else we could think of), but nothing improved the model.
  • The training data cover less than 1% of the population we are generalizing to (and we can't get more than that).
  • The random baseline is 35%, so we get a lift of roughly 1.7 (0.60 / 0.35), but the client won't accept that.
  • We have tried many iterations, from the simplest models to the most complex, and this is the best we can get.
  • The distributions of dual-SIM and non-dual-SIM customers across the variables are not very different.
  • The languages used were SQL and R.
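For context, the precision and lift figures above come from the client's campaign measurement: call a sample of flagged customers, count how many confirm they are dual-SIM, and compare that hit rate to the population base rate. A minimal sketch of that calculation (all numbers here are hypothetical, chosen only to illustrate the arithmetic, not the client's actual campaign data):

```python
def precision_and_lift(called_outcomes, base_rate):
    """Compute campaign precision and lift over random calling.

    called_outcomes: one bool per flagged customer the campaign called
                     (True = customer confirmed they are dual-SIM).
    base_rate: fraction of dual-SIM customers in the whole population.
    """
    precision = sum(called_outcomes) / len(called_outcomes)
    lift = precision / base_rate  # improvement over calling at random
    return precision, lift

# Hypothetical campaign: 10 calls, 6 confirmed dual-SIM
outcomes = [True] * 6 + [False] * 4
p, lift = precision_and_lift(outcomes, base_rate=0.35)
print(f"precision={p:.2f}, lift={lift:.2f}")  # precision=0.60, lift=1.71
```

This makes explicit why a 60% precision against a 35% base rate is still well short of the contractual 75%, even though it nearly doubles the hit rate of random calling.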

So the question is: what else can I do to demonstrate that the model has value, without reaching 75% precision?

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange