How to manage concurrent development with mercurial?

https://stackoverflow.com/questions/3719019

02-10-2019
|

Question

This is a best practice question, and I expect the answer to be "it depends". I just hope to learn more real world scenarios and workflows.

First of all, I'm talking about different changes for the same project, so no subrepo please.

Let's say you have your code base in an hg repository. You start to work on a complicated new feature A, then a complicated bug B is reported by your trusted tester (you have testers, right?).

It's trivial if (the fix for) B depends on A. You simlply ci A then ci B.

My question is what to do when they are independent (or at least it seems now).

I can think of the following ways:

Use a separate clone for B.
Use anonymous or named branches, or bookmarks, in the same repository.
Use MQ (with B patch on top of A).
Use branched MQ (I'll explain later).
Use multiple MQ (since 1.6)

1 and 2 are covered by an excellent blog by @Steve Losh linked from a slightly related question.

The one huge advantage of 1 over the other choices is that it doesn't require any rebuild when you switch from working on one thing to the other, because the files are physically separated and independent. So it's really the only choice if, for example, A and/or B touches a header file that defines a tri-state boolean and is included by thousands of C files (don't tell me you haven't seen such a legacy code base).

3 is probably the easiest (in terms of setup and overhead), and you can flip the order of A and B if B is a small and/or urgent fix. However it can get tricky if A and B touches the same file(s). It's easy to fix patch hunks that failed to apply if A and B changes are orthogonal within the same file(s), but conceptually it's still a bit risky.

4 can make you dizzy but it's the most powerful and flexible and scalable way. I default hg qinit with -c since I want to mark work-in-progress patches and push/pull them, but it does take a conceptual leap to realize that you can branch in MQ repo too. Here are the steps (mq = hg --mq):

hg qnew bugA; make changes for A; hg qref
mq branch branchA; hg qci
hg qpop; mq up -rtip^
hg qnew bugB; make changes for B; hg qref
mq branch branchB; hg qci
To work on A again: hg qpop; mq up branchA; hg qpush

It seems crazy to take so many steps, and whenever you need to switch work you must hg qci; hg qpop; mq up <branch>; hg qpush. But consider this: you have several named release branches in the same repository, and you need to work on several projects and bug fixes at the same time for all of them (you'd better get guaranteed bonus for this kind of work). You'd get lost very soon with the other approaches.

Now my fellow hg lovers, are there other/better alternatives?

(UPDATE) qqueue almost makes #4 obsolete. See Steve Losh's elegant description here.

Solution 2

It seems like there's no more or better choices than the ones I listed in the question. So here they are again.

Use one clone per project.
- Pros: total separation, thus no rebuild when switching projects.
- Cons: toolchain needs to switch between two clones.
Use anonymous or named branches, or bookmarks, in the same repository.
- Pros: standard hg (or any DVCS) practice; clean and clear.
- Cons: must commit before switching and rebuild after.
Use MQ with one patch (or multiple consecutive patches) per project.
- Pros: simple and easy.
- Cons: must qrefresh before switching and rebuild after; tricky and risky if projects are not orthogonal.
Use one MQ branch (or qqueue in 1.6+) per project.
- Pros: ultra flexible and scalable (for the number of concurrent projects)
- Cons: must qrefresh and qcommit before switching and rebuild after; feels complicated.

Like always, there's no silver bullet, so pick and choose the one right for the job.

(UPDATE) For anyone who's in love with MQ, using MQ on top of regular branches (#2 + #3) is probably the most common and preferable practice.

If you have two concurrent projects with baseline on two branches (for example next release and current release), it's trivial to hop between them like this:

hg qnew; {coding}; hg qrefresh; {repeat}
hg qfinish -a
hg update -r <branch/bookmark/rev>
hg qimport -r <rev>; {repeat}

For the last step, qimport should add a -a option to import a line of changesets at once. I hope Meister Geisler notices this :)

OTHER TIPS

I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code. Whether to have one clone or two sitting on your disk is generally an easy one, given my working style, at least:

Does your project lack a build process, so that you can test and run things right from the source code? Then I will be tempted to have just one clone, and hg up back and forth when I need to work on another branch.
But if you have a buildout, virtualenv, or other structure that gets built, and that might diverge between the two branches, then doing an hg up then waiting for the build process to re-run can be a big pain, especially if things like setting up a sample database are involved. In that case I would definitely use two clones, one sitting at the tip of trunk, and one sitting at the tip of the emergency feature branch.

So the question is, at the point when you are told to stop working on feature A, and begin independent feature B, what alternative options are there, for: How to manage concurrent development with mercurial?

Let's look at the problem with concurrency removed, the same way you write threaded code- define a simple work flow for solving any problem given to you, and apply it to each problem. Mercurial will join the work, once it's done. So, programmer A will work on feature A. Programmer B will work on feature B. Both just happen to be you. (If only we had multi-core brains:)

I would always use named branches, because that lets Mercurial do its job: to keep your project history, and to remember why you made which changes in what order to your source code.

I agree with Brandon's sentiment, but I wonder if he overlooked that feature A has not been tested? In the worst case, the code compiles and passes unit tests, but some methods implement the previous requirements, and some methods implement the new ones. A diff against the previous check-in is the tool I would use to help me get back on track with feature A.

Is your code for feature A at a point when you would normally check it in? Switching from feature A to working on feature B is not a reason to commit code to the head or to a branch. Only check in code that compiles and passes your tests. My reason is, if programmer C needs to begin feature C, a fresh checkout of this branch is no longer the best place to start. Keeping your branch heads healthy, means you can respond quickly, with more reliable bug fixes.

The goal is to have your (tested and verified) code running, so you want all your code to end up merged into the head (of your development and legacy branches). My point seems to be, I've seen branching used inefficiently: code becomes stale and then not used, the merge becomes harder than the original problem.

Only your option 1 makes sense to me. In general:

You should think your code works, before someone else sees it.
Favor the head over a branch.
Branch and check-in if someone else is picking up the problem.
Branch if your automated system or testers need your code only.
Branch if you are part of a team, working on a problem. Consider it the head, see 1-4.

With the exception of config files, the build processes should be a checkout and a single build command. It should not be any more difficult to switch between clones, than for a new programmer to join the project. (I'll admit my project needs some work here.)

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow