June 20, 2014
Back when DVCSes were all new and yet unexplored, one of the touted benefits was the ability to work offline, be it because you are on a plane or because your code hosting site went down. While true, you still need to communicate with the outside world, and when your Internet connection gets all cranky, this can become an ordeal of its own.
If you have ever used Mercurial over an unreliable connection, you might be wondering why it can’t resume broken clones and why, oh why, it can’t do partial pulls and pushes. The reason for the former is purely technical and will be explained shortly, and the latter is simply not the case.
Mercurial communicates changesets by streaming changelog data, followed by manifest data, followed by file data. This minimizes seeking for both client and server, and maximizes on-the-wire compression.
This design has two implications for the viability of “resume”.
First, if a clone or pull operation gets interrupted in the middle, odds are that not a single valid changeset was ever transferred (since you might be missing the associated manifest and filelogs), so there’s no consistent point to resume an interrupted operation from.
Second, if a repository gets a new commit between starting a pull and resuming it, the contents of the new stream will change at various places, with new changesets, manifests and filelogs appearing in the middle of the stream. This makes it impossible to implement something similar to byte serving, where a client can request a portion of a response. Nor is it possible to resume a transfer at an arbitrary changeset.
With that in mind, here are all the ways that can help you work around broken connections.
Depending on how flaky your connection is, there are two options for performing initial clones.
First, you can try so-called “streaming clones”. These minimize the time to first byte (TTFB), but generally require a bit more data to be transferred.
Here’s how to do a streaming clone:
$ hg clone --uncompressed http://live.hglabhq.com/hg/hgsharp/hgsharp
Note that HgLab can enforce this behavior for all compatible clients.
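Since a broken clone cannot be resumed, the only recourse is to start it again. Here’s a minimal sketch of a retry wrapper that does this for you. The retry function and its arguments are made up for illustration and are not part of Mercurial; it simply re-runs whatever command it is given.

```shell
# retry MAX DELAY COMMAND...
# Runs COMMAND until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
# (Hypothetical helper, not a Mercurial feature.)
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
        # Sleep only if another attempt is coming
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

With this in place, retry 5 30 hg clone --uncompressed http://live.hglabhq.com/hg/hgsharp/hgsharp would try the clone up to five times, half a minute apart. If a failed attempt leaves a partial destination directory behind, remove it before the next try.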
Your second option will be an hg clone --rev operation, followed by a number of incremental pulls. This behaves similarly to cloning a repository in some distant past and doing occasional updates.
$ hg clone --rev 5 http://live.hglabhq.com/hg/hgsharp/hgsharp
With incremental pulls you can get the entire repository on a changeset-by-changeset basis, thus reducing the amount of data transferred during each operation to an absolute minimum:
$ cd hgsharp
$ hg pull --rev 10 http://live.hglabhq.com/hg/hgsharp/hgsharp
$ hg pull --rev 20 http://live.hglabhq.com/hg/hgsharp/hgsharp
$ hg pull --rev 21 http://live.hglabhq.com/hg/hgsharp/hgsharp
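Typing these by hand gets old quickly, so the stepping can be scripted. What follows is only a sketch: pull_in_steps is a hypothetical helper, HG is an assumed override for the hg executable, and LAST is assumed to be a revision number you already know exists on the server.

```shell
# pull_in_steps URL STEP LAST
# Pulls revisions STEP, 2*STEP, ... and finally LAST from URL, one pull at a
# time. Stops at the first failed pull and reports which revision failed.
# HG is a hypothetical override for the hg executable (defaults to hg).
pull_in_steps() {
    url=$1
    step=$2
    last=$3
    # A non-positive step would never reach LAST
    [ "$step" -gt 0 ] || return 2
    rev=$step
    while [ "$rev" -lt "$last" ]; do
        "${HG:-hg}" pull --rev "$rev" "$url" || {
            echo "pull of revision $rev failed" >&2
            return 1
        }
        rev=$((rev + step))
    done
    # The final pull picks up LAST itself
    "${HG:-hg}" pull --rev "$last" "$url" || {
        echo "pull of revision $last failed" >&2
        return 1
    }
}
```

Run from inside the clone, pull_in_steps http://live.hglabhq.com/hg/hgsharp/hgsharp 10 21 issues the same three pulls as above. A smaller step means less data at risk per operation, at the cost of more round trips.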
In true Mercurial spirit (which means, among other things, being consistent throughout the entire system), pushes work very similarly. To get all fancy, here’s a bit of revset-fu to do incremental pushes:
$ hg push --rev "limit(sort(draft(), rev), 2)"
This sends the first two draft revisions to the remote repository. Of course, this works just as well:
$ hg push --rev 42
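The revset version lends itself to a loop that keeps pushing small batches until nothing is left. This is a sketch under one important assumption: the remote is a publishing repository, so pushed changesets turn public and drop out of draft(). The push_in_batches helper, the HG override and the MAX_ROUNDS safety limit are all invented for this example.

```shell
# push_in_batches BATCH [MAX_ROUNDS]
# Pushes draft changesets to the default remote, BATCH at a time, lowest
# revision numbers first, until no draft changesets remain.
# Assumes a publishing remote, so pushed changesets stop being draft.
# HG is a hypothetical override for the hg executable (defaults to hg).
push_in_batches() {
    batch=$1
    rounds=${2:-100}
    while [ "$rounds" -gt 0 ]; do
        # Lowest-numbered draft changeset still waiting to be pushed, if any
        next=$("${HG:-hg}" log --rev "limit(sort(draft(), rev), 1)" --template "{rev}\n")
        if [ -z "$next" ]; then
            return 0
        fi
        # Stop on the first failed push instead of hammering the server
        "${HG:-hg}" push --rev "limit(sort(draft(), rev), $batch)" || return 1
        rounds=$((rounds - 1))
    done
    echo "gave up: draft changesets remain" >&2
    return 1
}
```

Calling push_in_batches 2 repeats the push shown earlier until the draft set is empty. If the connection drops, the function returns a non-zero status and can simply be run again, since whatever was already pushed is no longer draft.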
There’s one command to watch out for when you are struggling to stay connected: hg incoming. Here’s why you should care.
What hg incoming does is show which changesets from the remote repository are not yet in your local one. The way it works is it pulls all the missing parts of history locally, displays the log and then disposes of the bytes it had received. This means that, over a broken connection, it can very well fail or waste precious bandwidth.
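If you do want a preview, hg incoming accepts a --bundle option that stores the downloaded changesets in a file, which you can then pull from locally instead of downloading everything a second time. A small sketch of that sequence follows; incoming_then_pull and the HG override are hypothetical names, and the bundle file name is up to you.

```shell
# incoming_then_pull URL BUNDLE
# Previews incoming changesets while saving them to BUNDLE, then pulls from
# that local file so the data crosses the network only once.
# HG is a hypothetical override for the hg executable (defaults to hg).
incoming_then_pull() {
    url=$1
    bundle=$2
    # --bundle keeps the downloaded changesets instead of discarding them
    if "${HG:-hg}" incoming --bundle "$bundle" "$url"; then
        # Pull from the bundle file on disk: no second download
        "${HG:-hg}" pull "$bundle"
    else
        echo "nothing incoming, or the transfer failed" >&2
        return 1
    fi
}
```

For example, incoming_then_pull http://live.hglabhq.com/hg/hgsharp/hgsharp incoming.hg, run from inside your clone. The download itself can still break halfway, so on a really bad connection the incremental pulls described earlier remain the safer bet.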
Hopefully, the next time you find yourself in a situation like this you’ll know exactly what to do to get the work done.