Question

I'm looking for an algorithm to split a bipartite graph into subgraphs with a specific constraint. I'm not sure if any existing algorithms solve my problem or not.

I have an undirected bipartite graph where the nodes are customers ($C$) and services ($S$). I want to split this into several smaller subgraphs, limiting the amount of services in each subgraph to some maximum number $N$. Unfortunately, looking for disconnected subgraphs is not sufficient because the graph connectivity is too high, so I think I will need to duplicate services.

Formally, I want a set of subgraphs such that:

  • Each customer $c \in C$ appears in exactly one subgraph
  • All edges appear in exactly one subgraph (the one in which their customer appears)
  • Each service $s \in S$ can appear in any number of subgraphs (it's okay to duplicate services to help the split)
  • Each subgraph should have at most $N$ services (where $N$ is a given constant that is guaranteed to be larger than the highest number of services connected to any single customer)
  • The subgraphs should have as many customers as possible (without this restriction, it's trivially solvable by putting each customer in their own subgraph with a copy of their services). That can be a heuristic rather than formally proven.

Can anyone suggest an algorithm for doing this? The number of nodes is not huge (roughly 1000 customers, 100 services, with each customer connecting to 5 or less services) so brute force approaches or those with bad big-O scaling may be suitable.

Was it helpful?

Solution

It smells like it might be NP-hard.

One plausible approach is to use a SAT solver or ILP solver. Suppose you decide you want to have at most $m$ subgraphs. Then you can have boolean variables $x_{i,k},y_{j,k}$ indicating that customer $i$ goes into subgraph $k$ and service $j$ goes into subgraph $k$. You can obtain a bunch of constraints (clauses) based on your requirements and then ask the SAT solver or ILP solver to find a feasible solution. The worst-case running time is exponential. This might or might not work well enough in your situation.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top