Question

I'm trying to classify and come up with a reasonable solution for the following problem (abstracted from a real world problem).

Problem

Imagine StackOverflow started offering a subscription where companies could buy X number of impressions per month for a set of tags.

For example, you might be interested in running an ad with 50000 impressions on the following tags: javascipt, typescript, react, nextjs, nodejs, webpack.

StackOverflow "promises" you will get 50000 impressions, but they are free to distribute them across the above tags however they desire (could even be 100% to one tag).

How should StackOverflow optimize the distribution so that they can sell/deliver the maximum number of impressions?

For simplicity, assume:

  • we know the number of impressions each tag receives per month ahead of time and it doesn't change
  • each impressions only belongs to a single tag

My Research

I believe this might be similar to the Knapsack or Bin packing problems.

The simplest way to approach this is with a "first-fit" (or maybe better described as "first come, first-serve") algorithm:

  1. Get all active "subscriptions" (impressions & tags), ordered by Created Date ASC
  2. For each "subscription", grab the first tag and "consume" as many impressions as we can. If the subscription still has remaining impressions, move to the next tag.
  3. Repeat with next subscription

That might play out something like this:

==Inventory==

Tag          Impressions
javascipt    2000
typescript   5000
react        5000
nextjs       5000


== Subscriptions ==

Acme:  8000  (javacript, typescript, react, nextjs)
Stark: 5000  (javacript, typescript)

== Allocation Results ==

Acme: 
 - javacript:  2000
 - typescript: 5000
 - react:      1000

Stark: 
 - javacript:  0
 - typescript: 0

The optimization for this solution is clearly horrible. In the example above, while we had more than enough impressions to go around, we allocated the only ones that "Stark" could purchase before we got around to them.

Is there an official name to this problem and what does the optimal solution look like?

Was it helpful?

Solution

I think this is actually "just" maximum cardinality bipartite matching, which can be solved optimally in $O(|E|\sqrt{|V|})$ time using the Hopcroft-Karp algorithm.

On one side, you have a vertex for each demanded impression (say, customer A demands 50000 of either "javascript" or "typescript", and customer B demands 10000 "javascript", so 60000 vertices in all), and on the other side you have a vertex for each supplied impression (say, 30000 labelled "javascript" and 15000 labelled "typescript", so 45000 vertices in all). Draw an edge from each demanded impression to every supplied impression of a type that satisfies its impression type constraints (so there would be 45000 edges leaving each of customer A's vertices, and 15000 edges leaving each of customer B's), and solve.

If you have thousands of impressions, you may wind up with millions of edges, which might take too long. If an approximate answer suffices, I would suggest dividing all numbers through by some constant, e.g. 1000, and then multiplying them back up again afterwards. (If this constant divides all input numbers, the solution will remain optimal.)

OTHER TIPS

If you had only one tag, this would become a knapsack problem: in particular, determining whether you could sell all the impressions would be exactly the subset problem. Therefore, the problem is NP-hard.

A plausible approach to solve it in practice would be to use integer linear programming.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top