How is this function defined? Is it pre-computed? If so you can iterate over the computed function representation and retrieve pairs of words based on your criterion. If not, you have no other option but to compute the function between all pairs of words (This is necessary in your graph approach). Instead of the Bron-Kerbosch algorithm, I would go for a randomization+approximation strategy like,
http://dimacs.rutgers.edu/Workshops/Challenge10/abstracts.html#ovelgonne
http://dl.acm.org/citation.cfm?id=1933306.1934471
It is based on an modularity maximization approach. Modularity is the ratio between number of outgoing edges in a cluster to the number of edges within the cluster. In your case, you will be looking for the clusters with ratio 0, and choose the biggest one of them. This algorithm is really fast and works for most datasets that I have worked with. Though a modularity based method might be overkill for this problem, IMHO I feel this is a nice way to think about the problem + the implementation of this algorithm is available online (C implementation by the authors of the paper).