Fossil and NetBSD

The tech-repository list has been silent for a long time. Initial conversions from CVS to Git and Mercurial highlighted various issues with the conversion tools. They also provided some insight into scalability issues with the two existing major systems. In July, bsdtalk ran an interesting interview with D. Richard Hipp about Fossil.

Fossil has some interesting properties. One is the license, which makes it easy to integrate into the base system without too much discussion of legal issues. Another is that it is pretty self-contained. This brings up the most important question: how does Fossil scale?

To answer that question, it was first necessary to get a repository into it. The first try used Fossil's Git import tool on the original conversion from CVS to Git I created in 2008. This resulted in some bug fixes to make Fossil's stat page 64-bit clean, but it also provided the insight that the branch handling in the svn->git step was messed up and that Fossil has at least one major issue in terms of repository size.

Based on the known issues of the cvs2svn tool and the problems other conversion tools have, I started to hack up some conversion logic myself. Testing it on a private part of the NetBSD repository, which contains e.g. the master version of pkg-vulnerabilities, resulted in improvements to the delta caching in Fossil. This file accounts for about a third of all revisions in that sub-repository and itself has over 3000 (CVS) revisions. At the time, Fossil used plain recursion to handle delta chains, and some code paths used two stack frames per entry in the delta chain. This caused Fossil to crash due to the default stack size limit on NetBSD. The smallstack branch of Fossil fixes this by using tail recursion where possible and otherwise moving the necessary unwind data to the heap.
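To illustrate the shape of that fix, here is a minimal sketch in C (Fossil's implementation language, though this is not Fossil's actual code). Collecting the chain into an explicit, heap-allocated list keeps stack usage constant no matter how many revisions are in the chain, while a naive recursive walk burns one or two stack frames per entry. The `apply_delta` routine here is a trivial stand-in for a real delta-application primitive.

    /* A minimal sketch, not Fossil's actual code: expanding a delta
     * chain iteratively instead of recursively. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct rec {
        struct rec *parent;     /* NULL if this record stores full content */
        const char *payload;    /* full content (root) or a delta (others) */
    };

    /* Stand-in: a real implementation would patch `base` with the
     * delta.  For this toy, the "delta" is simply the new full text. */
    static char *apply_delta(const char *base, const char *delta)
    {
        (void)base;
        return strdup(delta);
    }

    /* Iterative expansion: gather the chain into a heap-allocated
     * array, then apply deltas root-to-leaf.  Stack usage is O(1)
     * instead of O(chain length). */
    static char *expand(struct rec *r)
    {
        size_t n = 0, cap = 16;
        struct rec **chain = malloc(cap * sizeof *chain);
        for (; r != NULL; r = r->parent) {
            if (n == cap)
                chain = realloc(chain, (cap *= 2) * sizeof *chain);
            chain[n++] = r;
        }
        char *cur = strdup(chain[n - 1]->payload);   /* root: full text */
        for (size_t i = n - 1; i-- > 0; ) {          /* walk leaf-ward */
            char *next = apply_delta(cur, chain[i]->payload);
            free(cur);
            cur = next;
        }
        free(chain);
        return cur;
    }

    int main(void)
    {
        struct rec v1 = { NULL, "rev 1" };
        struct rec v2 = { &v1,  "rev 2" };
        struct rec v3 = { &v2,  "rev 3" };
        char *text = expand(&v3);
        puts(text);                                  /* prints "rev 3" */
        free(text);
        return 0;
    }

With over 3000 revisions of pkg-vulnerabilities in a single chain, the difference between per-entry and constant stack usage is exactly what made the default limit a problem.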

With that issue fixed, it was possible to import both pkgsrc and src into Fossil. Many operations like checkout are speedy. I haven't done much benchmarking yet, but things like "fossil status" and the built-in web UI were fast enough. One major factor affecting clone and initial import was the "rebuild" step, which (re)creates the internal metadata cache: it took 10h on a fast machine.

The main issue with rebuild is that it has to parse all the manifests (the description of a change set). For pkgsrc and src, manifest size is dominated by the list of files in a revision with their associated SHA1 hashes. A typical working copy of pkgsrc has over 60000 files, so the resulting manifest is in the megabyte range, and every single manifest has to be hashed (with both SHA1 and MD5) at least once and parsed. A longer discussion with Richard produced two ideas to try. The first was to split the file list into smaller clusters referenced by hash; if a cluster is unchanged compared to the parent revision, it can be reused. This cut the size of the temporary database used for the import from 200GB to between 5GB and 8GB, depending on the number of clusters. The obvious downside is the large increase in the number of artifacts in the resulting Fossil database. The second idea, which can now be found in the experimental branch, was to turn the manifest itself into a kind of patch by encoding only the list of files changed relative to a baseline manifest. This approach yielded similar savings for the intermediate database. Initial testing showed that the rebuild time went down from 10h to 1h -- much better, but still not in the expected range.
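As a rough sketch of that second idea (a hypothetical encoding, not Fossil's actual manifest syntax), the following walks two sorted (filename, hash) lists and emits only the entries that differ from the baseline, so the tens of thousands of unchanged pkgsrc files cost nothing per revision:

    /* A minimal sketch, assuming sorted (filename, hash) arrays: emit
     * a delta manifest listing only files that differ from a baseline,
     * plus removals.  The "F name" card format is invented for
     * illustration. */
    #include <stdio.h>
    #include <string.h>

    struct file_ent { const char *name; const char *hash; };

    static void emit_delta(const struct file_ent *base, size_t nb,
                           const struct file_ent *cur, size_t nc)
    {
        size_t i = 0, j = 0;
        while (i < nb || j < nc) {
            int cmp;
            if (i == nb) cmp = 1;            /* only current entries left */
            else if (j == nc) cmp = -1;      /* only baseline entries left */
            else cmp = strcmp(base[i].name, cur[j].name);

            if (cmp < 0) {                   /* file removed since baseline */
                printf("F %s -\n", base[i].name);
                i++;
            } else if (cmp > 0) {            /* file added */
                printf("F %s %s\n", cur[j].name, cur[j].hash);
                j++;
            } else {                         /* same name: emit only on change */
                if (strcmp(base[i].hash, cur[j].hash) != 0)
                    printf("F %s %s\n", cur[j].name, cur[j].hash);
                i++; j++;
            }
        }
    }

    int main(void)
    {
        struct file_ent base[] = {
            { "Makefile", "aaaa" },
            { "README",   "bbbb" },
            { "main.c",   "cccc" },
        };
        struct file_ent cur[] = {
            { "Makefile", "aaaa" },   /* unchanged: omitted from delta */
            { "main.c",   "dddd" },   /* changed */
            { "util.c",   "eeee" },   /* added */
        };
        emit_delta(base, 3, cur, 3);  /* README was removed */
        return 0;
    }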

A combination of optimizations, like using a faster SHA1 implementation and tweaking the manifest parser, resulted in at least a 10% improvement. An additional change to the manifest cache brought the time down to 10min. That's still not optimal, but for something normally done only during a clone, it is not a major blocking point.
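Conceptually, a manifest cache is just memoization of the expensive hash-and-parse step. Here is a toy sketch with invented names, using a simple direct-mapped table; whatever structure Fossil actually uses, the effect is the same: a repeated lookup skips the parse entirely.

    /* A toy sketch of the idea behind a manifest cache: keep recently
     * parsed manifests keyed by record id so rebuild does not re-parse
     * (and re-hash) the same multi-megabyte manifest repeatedly.
     * Names and sizes here are invented, not Fossil's internals. */
    #include <stdio.h>
    #include <stdlib.h>

    #define CACHE_SLOTS 64

    struct manifest {
        int rid;    /* record id of the manifest artifact */
        /* parsed cards (file list, parents, ...) would live here */
    };

    /* Stand-in for the expensive path: hashing and parsing a manifest
     * that can be megabytes of text for a tree like pkgsrc. */
    static struct manifest *parse_manifest(int rid)
    {
        printf("parsing manifest %d\n", rid);
        struct manifest *m = calloc(1, sizeof *m);
        m->rid = rid;
        return m;
    }

    /* Direct-mapped cache: each rid maps to one slot; a conflicting
     * entry is simply evicted.  A hit skips the parse entirely. */
    static struct manifest *cache[CACHE_SLOTS];

    static struct manifest *manifest_get(int rid)
    {
        struct manifest **slot = &cache[(unsigned)rid % CACHE_SLOTS];
        if (*slot != NULL && (*slot)->rid == rid)
            return *slot;                   /* cache hit */
        free(*slot);                        /* evict old entry */
        *slot = parse_manifest(rid);        /* miss: parse once, keep it */
        return *slot;
    }

    int main(void)
    {
        manifest_get(42);   /* prints "parsing manifest 42" */
        manifest_get(42);   /* served from the cache, no parse */
        return 0;
    }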

At this point the major road block for Fossil is the run time of "fossil clone". In terms of I/O, it isn't too bad: 36MB sent to the server and 530MB received is acceptable for the repository size. The problem is that the operation takes 90min on a fast machine when cloning from localhost, and only 25min of that is actually spent in the process.