How does Mercurial's bisect work when the range includes branching?

https://stackoverflow.com/questions/13187988

29-07-2021

Question

If the bisect range includes multiple branches, how does hg bisect's search work? Does it effectively bisect each sub-branch (which I would think is inefficient)?

For instance, borrowing, with gratitude, a diagram from an answer to this related question: what if the bisect got to changeset 7 on the "good" right-side branch first?

@  12:8ae1fff407c8:bad6
|
o  11:27edd4ba0a78:bad5
|
o    10:312ba3d6eb29:bad4
|\
| o  9:68ae20ea0c02:good33
| |
| o  8:916e977fa594:good32
| |
| o  7:b9d00094223f:good31
| |
o |  6:a7cab1800465:bad3
| |
o |  5:a84e45045a29:bad2
| |
o |  4:d0a381a67072:bad1
| |
o |  3:54349a6276cc:good4
|/
o  2:4588e394e325:good3
|
o  1:de79725cb39a:good2
|
o  0:2641cc78ce7a:good1

Will it then look only between 7 and 12 (using "dumb" numerical order), missing the real first-bad changeset that we care about? Or is it smart enough to use the full topology and to know that the first bad changeset could be below 7 on the right-side branch, or could still be anywhere on the left-side branch?

The purpose of my question is both (a) just to understand the algorithm better, and (b) to understand whether I can liberally extend my initial bisect range without thinking hard about what branch I go to. I've been in high-branching bisect situations where it kept asking me after every test to extend beyond the next merge, so that the whole procedure was essentially O(n). I'm wondering if I can just throw the first "good" marker way back past some nest of merges without thinking about it much, and whether that would save time and give correct results.

Solution

To quote from Mercurial: The Definitive Guide:

The hg bisect command is aware of the “branchy” nature of a Mercurial project's revision history, so it has no problems dealing with branches, merges, or multiple heads in a repository. It can prune entire branches of history with a single probe, which is how it operates so efficiently.

The code that does the work is in hbisect.py, and it actually looks at the descendant and ancestor trees of each node whose state has been determined.

It looks to me like the next changeset to test is chosen by weighing how "central" it is in the graph of those yet to be tested (i.e. bisecting by ancestors vs. non-ancestors, rather than by chronology):

    x = len(a) # number of ancestors
    y = tot - x # number of non-ancestors
    value = min(x, y) # how good is this test?
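To see how that scoring rule plays out on the graph from the question, here is a minimal sketch. It is not Mercurial's actual code: the PARENTS table is transcribed by hand from the diagram, the helper names (ancestors, candidates, pick_next) are made up for illustration, and the candidate set is assumed to be simply "ancestors of the bad changeset that are not ancestors of any good changeset".

```python
# Parent table for the 13-changeset graph in the question (10 is the merge).
PARENTS = {
    0: [], 1: [0], 2: [1],
    3: [2], 4: [3], 5: [4], 6: [5],   # left-side branch
    7: [2], 8: [7], 9: [8],           # right-side branch
    10: [6, 9], 11: [10], 12: [11],
}


def ancestors(parents, rev):
    """Return rev together with all of its ancestors."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def candidates(parents, good, bad):
    """Changesets that could still be the first bad one (simplified)."""
    known_good = set()
    for g in good:
        known_good |= ancestors(parents, g)
    return ancestors(parents, bad) - known_good


def pick_next(parents, good, bad):
    """Pick the candidate that splits the remaining graph most evenly."""
    cands = candidates(parents, good, bad)
    tot = len(cands)
    best_rev, best_value = None, -1
    for rev in sorted(cands):
        x = len(ancestors(parents, rev) & cands)  # ancestors still in play
        y = tot - x                               # non-ancestors still in play
        value = min(x, y)                         # how good is this test?
        if value > best_value:
            best_rev, best_value = rev, value
    return best_rev, best_value
```

Under these assumptions, pick_next(PARENTS, {0}, 12) returns (6, 6), and after additionally marking 7 as good, pick_next(PARENTS, {0, 7}, 12) returns (6, 4). Marking 7 good only removes 7 and its ancestors from consideration; changesets 3 through 6 on the left-side branch remain candidates, so the real first-bad changeset 4 is not lost.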

Licensed under: CC-BY-SA with attribution