Question

Earlier this week I asked this question (please review):

Imagine StackOverflow started offering a subscription where companies could buy X number of impressions per month for a set of tags.

For example, you might be interested in running an ad with 50000 impressions on the following tags: javascipt, typescript, react, nextjs, nodejs, webpack.

StackOverflow "promises" you will get 50000 impressions, but they are free to distribute them across the above tags however they desire (could even be 100% to one tag).

One of the solutions provided suggested using the Hopcroft-Karp algorithm, which I have since been able to successfully implement.

However, I'm having trouble scaling this solution to work for my use case. The bottle neck is actually in the setup of the graph more-so than the algorithm performance.

The problem is the massive number of edges that must be initialized. Take the example below:



==Inventory==

Tag          Units 

javascipt    2 
typescript   4 
react        3 
nextjs       3   

== Subscriptions ==  

Account  Units   Tags
Acme     7       javacript, typescript, react, nextjs

Stark    4       javacript, typescript

If we create a vertex to represent every unit, then we would end up with a graph that has 101 edges (7*11 + 4*6)!

enter image description here

The problem I'm working on is orders of magnitude larger, so this quickly becomes unrealistic.

As suggest, I can reduce the number of vertex by scaling the units, but I lose resolution and that's actually somewhat important in this case (I can maybe scale by 100x max).

I was thinking there must be a more efficient way to do this, since I'm essentially duplicating vertex for every unit. I came up with the following bipartite graph with weighted edges and vertex values:

enter image description here

The goal would be to to adjust the weights of the edges so that:

  • On the left, the sum of edges is equal the value of the vertex
  • On the right, the sum of edges is less than or equal to value of the vertex

Below is an example of one of the solutions:

Solution to bipartite graph

I'm guessing (hoping) this is already a well known problem, for which there is also a well documented solution (ideally one that's not NP-hard)!

Maybe it's even possible to modify the Hopcroft-Karp algorithm to work for this?

Was it helpful?

Solution

I've got bad news. There's no hope for an algorithm whose worst-case running time is polynomial in the size of the weighted graph (unless P = NP, which seems unlikely).

Your problem is as hard as the knapsack problem, which has no polynomial-time algorithm (unless P = NP) when the input is represented in binary. The knapsack problem has a pseudopolynomial time algorithm, which means that there is an algorithm whose running time is polynomial in the numbers (but not polynomial-time in the size of the input). In your setting, that turns into an algorithm whose running time is polynomial in the size of the unweighted graph (with a massive number of edges) but not polynomial in the size of the weighted graph (with a small number of edges).

You could try expressing this as an integer linear programming (ILP) instance, and then solve with an ILP solver. It's possible that might yield a more efficient solution. The core idea is to introduce an integer variable $x_{i,j}$ that counts the number of impressions provided to company $i$ in tag $j$; then you can introduce a few linear inequalities to capture the problem's requirements.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top