Full support for NetBSD/ARM committed.
Following the recent flag day to move (e)ARM from EHABI to normal Itanium-style unwinding, as well as the import of recent LLVM/Clang, the EABI variants of the ARM ports are now fully supported with Clang. In short, just setting MKLLVM=yes and HAVE_LLVM=yes for build.sh will switch over to the BSD-licensed toolchain. Testing, especially on ARMv4, is very welcome.
Martin Husemann is still working on the networking support for the Cubietruck, but once that is done we plan to set up a bunch of Cubietruck boards for bulk building pkgsrc. Stay tuned.
I'm investigating MIPS now, but that is a bit more tricky.
A minor update fixing the handling of path names with spaces as well as delayed vendor branches.
The new snapshot of cvs2fossil fixes two issues seen in the NetBSD repositories:
Both changes have been in use for the NetBSD conversions for a while.
NetBSD/AMD64 has been supported by LLVM and Clang for quite some time now. There are a few regressions in the ATF tests compared to GCC, but they don't look serious.
Recent work for NetBSD/i386 involved fixing a number of nasty little ABI bugs, where NetBSD and other ELF systems differ in the details. It is now comparable to the status of AMD64.
I started working on support for LLVM and Clang as the system compiler for NetBSD in 2010. The reach-over framework was committed last February. Unlike GCC and PCC, I haven't imported the source code yet. It would take up a lot of space in the repository and in working copies, as well as increase the overhead of keeping the copy in NetBSD synchronized with upstream trunk. As such, interested parties have to run the "checkout" target in src/external/bsd/llvm to get a fresh copy from svn, and re-run the target whenever the in-tree version changes. After that, all that is needed is setting MKLLVM=yes and HAVE_LLVM=yes to build the system with Clang.
In my own ATF runs I currently have 8 failures for the Clang world. 5 of the 8 failures are also seen in the ATF runs of the GCC world in the same VirtualBox environment. The remaining cases are as follows:
For i386, there is one problem in libm, where expf(3) seems to give wrong results; at least, that is why the sinhf regression test fails. I haven't looked into this further.
I haven't run any benchmarks yet, so no numbers on code size, execution speed or even build time. The LLVM build is a debug build, so it is slower than necessary.
I'm also running irregular pkgsrc builds with Clang. There are a number of common issues:
Help in cutting down the number of trivial build failures would definitely be appreciated. At the moment, almost 1000 packages fail for various reasons; many of them can be fixed in a matter of minutes.
Update: The correct option is HAVE_LLVM and not HAVE_CLANG.
Check out https://github.com/jsonn/src and https://github.com/jsonn/pkgsrc!
After some initial issues, the git conversion seems to be stable now. Both src (https://github.com/jsonn/src) and pkgsrc (https://github.com/jsonn/pkgsrc) can be found on GitHub.
Fun fact about git: pushing trunk from src alone and the other refs with push --all afterwards requires 50% more space than doing the push --all in the first place.
One issue remains. I can't do incremental exports from fossil to git, because git can't find objects it has imported earlier. I'm not the first person to experience this, but it seems no one cares enough to fix it.
New server with better connection for the fossil repositories, a mirror and initial addition of git support
The fossil repositories (src and pkgsrc) have moved to a new server, so at least upstream bandwidth is no longer a problem. For cloning, it is still recommended to fetch the database files directly from ftp.NetBSD.org. Zafer Aydoğan also provides mirrors at src and pkgsrc.
The src repository has been rebuilt from scratch after many cleanups to the branching. It should be in pretty good shape now. Special thanks to S.P. Zeidler for the assistance in messing with cvs.NetBSD.org.
At this point, I believe the conversion to be stable and don't plan any more repository changes. If you run across inconsistencies in the conversion, please mail me though.
I've also started to integrate the git export. This still has a lot more overhead than necessary. The incremental fast-import in git doesn't work, so it has to write all blobs and commits on every pass, adding another 20min or so to the conversion rounds. The result can be found on github. I'm interested in feedback here to decide whether to ask github for more space to add src as well.
Update: there seems to be a small bug with the conversion of file deletes, so the git repo will be nuked and rebuilt soonish.
Update 2: fixed and repushed
Bug fix release for cvs2fossil to get stable conversion for pkgsrc
The new snapshot of cvs2fossil fixes three important issues:
I've used this and also changed the conversion to create the newer sub-second timestamp format. To fully exploit the bug fixes, the two NetBSD repositories (pkgsrc and src) have been recreated from scratch. I've decided to skip the top-level directory now as well, so src/Makefile is now just Makefile and src/bin/cat/cat.c can be found in bin/cat/cat.c.
Better diagnostic for Attic conflicts and support for commitid
There is another snapshot of cvs2fossil. This snapshot fixes two small issues:
A small bug fix for cvs2fossil to handle "cvs delete" on branches correctly.
I'm proud to release a new snapshot of cvs2fossil. This fixes an annoying little bug reported by the Tcl/Tk community.
If the first commit of a branch is in state "dead", this can be the result of two different things:
In the first case, the time of the commit equals the branch point; in the second case, it doesn't. The old version didn't check this and handled the second case like the first one.
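To make the check concrete, here is a minimal sketch in C of the distinction described above -- purely illustrative, with made-up names, not cvs2fossil's actual code:

    #include <stdbool.h>
    #include <time.h>

    /*
     * A dead first revision on a branch only falls into the first case
     * when its commit time matches the branch point exactly; any later
     * timestamp indicates a real "cvs delete" on the branch.
     */
    static bool
    dead_rev_is_branchpoint_artifact(time_t dead_rev_time, time_t branchpoint_time)
    {
        return dead_rev_time == branchpoint_time;
    }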
In the two months since the last update, the Fossil conversion utility has seen quite a number of improvements.
A public mirror of the repository conversion is provided.
Since the initial release of the Fossil converter, a lot has changed:
The code can be found here.
The processing time is still around 5h for src, but I'm currently running the machine with less memory. On my laptop with SSD, it needs 3h for a full conversion.
I am providing a public mirror of the conversion. It is updated around 3-4 times a day. Please avoid cloning directly from that site as I am a bit bandwidth constrained. You can fetch the repository directly from ftp.netbsd.org (which is much faster) and pull changes from my server afterwards. Please also note that the leaf versions are sometimes not completely stable due to incomplete rsync of large commits or if a branch was created without a commit between runs. When I see such an instance, I will close the wrong leaf.
Initial release of cvs to fossil conversion routines
Three weeks ago I wrote about the fossil tests. Quite a few things have changed inside fossil, and I have been working on reimplementing the Python parts of the conversion tools in C as well as improving the performance.
The code doesn't have any fancy build system and a few (Net)BSD features are used, so don't expect it to work out of the box on anything else. Most RCS files should be handled without intervention. A few limitations are known and not handled automatically:
I've been slowly fixing up various issues exposed by this tool in the NetBSD repository. Processing src from its CVS form to the Fossil repository takes around 5h on an AMD Opteron 1389. The majority of the time is spent in the commit building step, which is primarily IO bound and a potential place for further investigation. The ultimate goal is a bit-exact import of all major branches, which seems to (almost) be the case now.
Large scale repositories create interesting issues for version control systems. How does fossil cope?
The tech-repository list has been silent for a long time. Initial conversions from CVS to git and Mercurial highlighted various issues with the conversion tools. They also provided some insight into scalability issues with the two existing major players. In July, bsdtalk ran an interesting interview with D. Richard Hipp about fossil.
Fossil has some interesting properties. One of them is the license, making it easy to integrate into the base system without too much discussion about legal issues. Another is that it is pretty self-contained. This brings up the most important question: how does fossil scale?
To answer that question, it was important to get a repository into it first. The first try used the Git import tool for fossil on the original conversion from CVS to Git I created in 2008. This resulted in some bug fixes to make the stat page in fossil 64bit clean, but it also provided the insight that the branch handling in svn->git was messed up and that there is at least one major issue in fossil in terms of repository size.
Based on the known issues of the cvs2svn tool and the problems other conversion tools have, I started to hack up some conversion logic myself. Testing it on a private part of the NetBSD repository, which contains e.g. the master version of pkg-vulnerabilities, resulted in improvements to the delta caching in fossil. This file makes up about 1/3 of all revisions in that sub-repository and itself has over 3000 (CVS) revisions. At the time, fossil was using plain recursion to handle delta chains, in some code paths using two stack frames for one entry in the delta chain. This resulted in crashes of fossil due to the default stack size limit in NetBSD. It was fixed in the smallstack branch of fossil by using tail recursion where possible and otherwise moving the necessary unwind data to the heap.
With that issue fixed, it was possible to import both pkgsrc and src into fossil. Many operations like checkout are speedy. I haven't done much benchmarking yet, but things like "fossil status" or the built-in web UI were fast enough. One major factor impacting clone and initial import was the "rebuild" step, which (re)creates the internal metadata cache and took 10h on a fast machine.
The main issue with rebuild is that it has to parse all the manifests (the descriptions of a change set). For pkgsrc and src, this is dominated by the list of files in a revision with the associated SHA1 hashes. A typical working copy of pkgsrc has over 60000 files, so the resulting manifest is in the area of megabytes. Every single manifest has to be hashed (with both SHA1 and MD5) at least once and parsed. A longer discussion with Richard resulted in two different ideas to be tried. The first idea was to cluster the file list into smaller parts referenced by hash. If one part wasn't changed compared to the parent revision, it could be reused. This cut the size of the temporary database used for the import from 200GB to between 5GB and 8GB, depending on the number of clusters. The obvious downside is the large increase in the number of artifacts in the resulting fossil database. The second idea, which can now be found in the experimental branch, was to make the manifest itself a kind of patch by encoding only the list of changed files relative to a baseline. This approach resulted in similar savings for the intermediate database. Initial testing showed that the rebuild time went down from 10h to 1h -- much better, but still not in the expected range.
A combination of optimizations like using a faster SHA1 implementation and tweaking the manifest parser resulted in at least a 10% improvement. An additional change to the manifest cache brought the time down to 10min. That's still not optimal, but for something normally only done during clone, it is not a major blocking point.
At this point, the major road block for fossil is the run time of "fossil clone". In terms of IO, it isn't too bad: 36MB sent to the server and 530MB received is acceptable for the repository size. The problem is that the operation takes 90min on a fast machine cloning from localhost, and only 25min of that is actually spent in the process.
When AMD introduced Long Mode, aka the 64bit extension, they retired the segmentation implementation to switch to a flat memory model. The entire concept of segmentation? No! The FS and GS registers still work somewhat like before, just not entirely. Why is this important? Modern Linux binaries require support for Thread Local Storage (TLS), and that uses FS and GS.
The x86 platform has two layers of virtualisation. The lower layer is the paging, which translates logical to physical addresses. The upper layer turns a pointer and a base register into a logical address. The base register is called a segment register and is either implied or specified by an instruction prefix. This started as a cute hack to support 1MB of memory with 16bit registers in the 8086 and became more complex with the introduction of Protected Mode in the 80286. In Protected Mode, the value of the segment register is actually a selector used to index the Global Descriptor Table or, in some cases, the per-process equivalent. This table contains a base and limit as well as type and access information. The CPU enforces the limit on access, i.e. the pointer value must not exceed the limit. The logical address is the sum of the pointer and the base.
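As a toy model of the Protected Mode translation -- purely illustrative, since the real descriptor entry packs base, limit and access bits into a different layout:

    #include <assert.h>
    #include <stdint.h>

    /* Simplified descriptor table entry; the CPU fetches this from the
     * GDT (or the per-process table) entry named by the selector. */
    struct descriptor {
        uint32_t base;
        uint32_t limit;
    };

    static uint32_t
    logical_address(const struct descriptor *d, uint32_t ptr)
    {
        assert(ptr <= d->limit);    /* the CPU raises a fault instead */
        return d->base + ptr;       /* logical address = base + pointer */
    }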
This scheme changes dramatically when Long Mode is activated. Most segment registers are plainly ignored; the only exceptions are FS and GS. In many operating systems they are used to point to either Thread Local Storage (userland) or CPU-local storage (kernel). To provide an alternative, two MSRs were introduced to hold the 64bit bases, and no limit checks are performed. This has some issues. First of all, some applications actually depended on the limit check. More importantly, it creates some fun issues for 32bit compatibility. There is no way besides writing to the segment register to set the limit. Setting the segment register also modifies the corresponding MSRs. It is not possible to load 64bit base addresses using descriptors. In short: the correct way to load FS/GS depends on whether the target code is running in 32bit compatibility mode or not. For 64bit mode, fs/gs has to be loaded first, if the value changed; next, the MSRs are updated with the value of interest. For 32bit mode, the code doesn't have to bother with the MSRs and can just load the fs/gs register directly.
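A hedged sketch of that ordering in kernel-style C follows; the MSR number is the architectural IA32_FS_BASE, but the helper and its surrounding logic are made up for illustration and are not the actual NetBSD code:

    #include <stdint.h>

    #define MSR_FSBASE  0xc0000100  /* architectural IA32_FS_BASE */

    /* Runs at CPL 0 only: wrmsr is a privileged instruction. */
    static inline void
    wrmsr(uint32_t msr, uint64_t val)
    {
        __asm volatile("wrmsr" : : "c"(msr),
            "a"((uint32_t)val), "d"((uint32_t)(val >> 32)));
    }

    /* Hypothetical helper: install the FS base for the next thread. */
    static void
    load_fs(int thread_is_64bit, uint16_t sel, uint64_t base)
    {
        /* Loading the selector resets the hidden base register. */
        __asm volatile("movw %0, %%fs" : : "r"(sel));
        if (thread_is_64bit)
            /* 64bit bases don't fit in a descriptor; use the MSR. */
            wrmsr(MSR_FSBASE, base);
        /* 32bit threads get base and limit from the descriptor table,
         * so the selector load above is all that is needed. */
    }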
The initial patch to support sysarch on amd64 and TLS for Linux binaries does the MSR manipulations in cpu_switchto, i.e. on context switch, and in the manipulation functions on change. This means that the wrmsr instruction is not part of the normal system call path, as opposed to doing it on every interrupt return or system call exit as some other operating systems do. It is left for future work to
Division is one of the less often used integer operations and by a huge margin the slowest. Often the divisor is semi-constant, which allows using much cheaper operations.
The computation of a / b or a % b is one of the less common operations in many programs. Nevertheless, it is part of many time critical operations. Many hash tables, for example, have a prime number as size and require the computation of the remainder. The division instructions are generally also the slowest integer operations CPUs offer, by a wide margin.
The price of the division makes it attractive to check for cheaper alternatives. For constants, GCC and other compilers will replace unsigned divisions with simpler code based on an old paper (Granlund and Montgomery's "Division by Invariant Integers using Multiplication"). This doesn't help with numbers that are variable at compile time, but invariant at run time. For resizable hash tables, resizing is a rare operation and the hot path (hashing) could benefit greatly from this optimisation.
This leads to the new functions in sys/bitops.h: fast_divide32, fast_remainder32 and fast_divide32_prepare. The third function computes the constants needed for the other two. Benchmarking with the services(5) lookup code using CDB shows a 10% improvement for getservbyport. This is partially a result of better scheduling, as the multiplications can run in parallel with other instructions and utilize the pipelines better.
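As an illustration of the intended use, here is a sketch of a hash table hot path on top of these functions. The interface matches sys/bitops.h as described above; the hash table itself is made up:

    #include <sys/bitops.h>
    #include <stdint.h>

    struct hashtab {
        uint32_t size;      /* bucket count, invariant between resizes */
        uint32_t magic;     /* precomputed multiplier */
        uint8_t s1, s2;     /* precomputed shift counts */
        /* ... buckets ... */
    };

    static void
    hashtab_set_size(struct hashtab *h, uint32_t size)
    {
        /* Rare path, run once per resize: derive the constants. */
        h->size = size;
        fast_divide32_prepare(size, &h->magic, &h->s1, &h->s2);
    }

    static uint32_t
    hashtab_bucket(const struct hashtab *h, uint32_t hash)
    {
        /* Hot path: remainder via multiply and shifts, no div. */
        return fast_remainder32(hash, h->size, h->magic, h->s1, h->s2);
    }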
A space-efficient alternative to db and cdb
Many applications require a key/value database, but the data itself is (mostly) constant. General purpose hash databases are inefficient for this purpose, as they waste a lot of space and often also require more complex queries. The most well-known constant database is from DJB, but it has a number of limitations. The biggest issues of DJB's implementation concern the large overhead (24 bytes per record) and the inefficient hash function. Bob Jenkins has a comparison.
To work around this, I started to design a new constant database format. As a building block I used the CHM algorithm for a minimal perfect hash function. An implementation for that already existed in nbperf(1). I decided to use a separate offset table as it provides a natural encoding of the entry size and makes it easier to allow multiple keys for the same value.
To minimize the storage, I decided against handling key matching in the library. The payload for many of the intended uses already contains the key, so storing it separately would waste space. The perfect hashing ensures that only a single entry has to be matched, so the overhead is generally small, too.
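To show what that means for a consumer, here is a sketch of a lookup using the cdbr(3) reader. The record layout (NUL-terminated key followed by the value) is an assumption for illustration; real consumers define their own layout and do the matching themselves:

    #include <cdbr.h>
    #include <string.h>

    static const void *
    lookup(struct cdbr *db, const char *key, size_t *valuelen)
    {
        const void *data;
        size_t datalen, keylen = strlen(key) + 1;

        /* The library only resolves the perfect hash to one record. */
        if (cdbr_find(db, key, keylen, &data, &datalen) != 0)
            return NULL;            /* no candidate record */
        /* Keys are not stored separately, so verify the match here. */
        if (datalen < keylen || memcmp(data, key, keylen) != 0)
            return NULL;            /* hash hit, but wrong key */
        *valuelen = datalen - keylen;
        return (const char *)data + keylen;
    }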
The result is quite impressive. For the /etc/services database, the cdb version is 304KB, the original db(3) database 2.1MB (4.2MB file size, but sparse). The run time of services_mkdb dropped from 2.2s to 0.24s. For terminfo, the cdb version is 1.3MB compared to 2.1MB for db(3) (2.8MB file size, but sparse). The run time of tic -x dropped from 1.5s to 0.4s.
Code size is quite small as well: the reader itself is 5.2KB of code and the writer 15KB. Compiled for AMD64, the reader is 1.2KB and the writer 3.6KB.
What is left before USE_DESTDIR=yes can be the default for everyone?
DESTDIR support is one of the biggest internal changes since the introduction of the pkgsrc barrier, and it has some user-visible consequences.
What is left to do before it can be made the default to keep the impact as small as possible?
A new name for a build phase, a new behavior for DESTDIR users and some black magic.
When DESTDIR support was originally introduced 3.5 years ago, I tried to keep it as non-intrusive as possible in the infrastructure.
The reason was simply that it is very easy to introduce bugs in the complex make rules, and DESTDIR support would need quite a bit of work before being ready for mainstream consumption. My initial goal for 2006Q4 was 50%, or so I wrote in my EuroBSDCon slides. I was waaaay too optimistic.
Fast forward three years to the current time. DESTDIR is now supported by almost all packages; 384 package locations without DESTDIR support remain (compared to 8302 with). To avoid unnecessary regressions from developers not testing it, I made USE_DESTDIR=yes the default for PKG_DEVELOPER a while ago. That's when the complaints and insults started.
Due to the change, "make install" suddenly no longer modifies /usr/pkg. The other direct consequence is that "DEPENDS_TARGET=package" breaks as well.
The former is reasonable to fix by finally pushing the originally planned intrusive changes. The latter is more difficult to do without second guessing and is more a case of "update your config, please". So, what was needed to change the install target to behave the same from a user perspective?
First of all, a bit of search and replace. Second, dealing with the surprises. Just renaming install to stage-install -- fails. The install-vars target has to be renamed as well, as that is where the necessary files are created. More fun was the addition of the new target. Just adding
.PHONY: install
.if ${_USE_DESTDIR} == "no"
install: stage-install
.else
install: package-install
.endif
...doesn't work. It surprisingly does nothing. What happened?
pkgsrc computes a number of variables based on the dependencies. For example, BUILDLINK_PREFIX depends on whether the package is builtin and, if not, where it is found. This got quite expensive when pkgviews was introduced. A long time ago, pkgsrc therefore ran a new make instance for every major build phase (depends, extract, build, install). This was changed later with the introduction of the pkgsrc barrier. Essentially, the build phases are separated into pre-barrier and post-barrier phases, with a separate make invocation in between.
How does this affect the install target above? install has to be a post-barrier operation, otherwise it ends up effectively invoking build, but not the desired install rule.
Black magic. How do I know? The same problem occurred with package-install a long time ago...
pbulk includes logic to avoid rebuilding packages that haven't changed based on the RCS ID ($NetBSD$).
A look at some problem cases.
pbulk contains a script to determine whether a package has to be rebuilt or not. There are three checks performed by default:
The third condition is automatically valid after a bulk build; otherwise the system has timekeeping or file system issues. The other conditions are more interesting.
The first condition can trigger a permanent rebuild if files are formed in a way that makes the +BUILD_INFO processor extract incomplete or additional RCS IDs. The rdigest package, for example, had a patch starting with
@@ -41,8 +41,17 @@ __RCSID("$NetBSD: digest.c,v 1.15 2007/0
As the RCS IDs are extracted with a simple grep expression, this ended up overwriting the original RCS ID at the top.
Another example was url2pg, which contained the following statement in a local file:
print PLIST ("\@comment \$NetBSD\$\n");
This is picked up too and should be escaped, in this case by splitting the string into two parts in the middle of NetBSD.
Interestingly, the second condition is violated in one case as well. p5-DBIx-Class-EncodedColumn depends on p5-Digest-SHA. The version of p5-Digest-SHA is higher than the version of the Perl package, so the dependency resolver picks that up. During the build, pkgsrc decides that perl is already present and good enough, so it skips the dependency. I'm not sure how to best address this yet.
After the DESTDIR changes, emulators/handy_sdl started to fail in the bulk builds, even though it works fine when built by hand. A failure analysis.
The bulk builds are a great way to find build issues. Sometimes they are pretty puzzling though, because they show issues that don't appear otherwise.
I recently added full DESTDIR support for emulators/handy_sdl, after making sure that the do-install rules are fine. The package compiled fine during my tests, but failed in the next run of pbulk with a dependency error from gmake. This was very puzzling, as it was still working when starting the build manually.
The first idea was to look for the same problem that broke some Emacs packages. The Emacs packages had been failing in the bulk build for no obvious reason, with a rather impossible error message (-batch missing when it was clearly specified). After some digging it turned out that the Makefile used the variable BATCH, which is already set in bulk builds to prevent interactions, e.g. when patches don't apply. This turned out to be a dead end.
After some head scratching, I tried to fully replicate the build and started it as an unprivileged user. As pkgsrc allows building packages with PKG_DESTDIR_SUPPORT=user-destdir without privileges, pbulk naturally uses that too. That was an instant hit -- the build fails if done unprivileged. Looking at the tarball of the work area from the bulk build, it became obvious why: the src directory is not executable and therefore gmake couldn't scan it. Turns out, a single chmod -R a+x fixes the problem.
The hard-coded limit of 1024 mbuf clusters in the x86 GENERIC kernel is gone. The new limit is based on the available memory.
The NetBSD kernel uses special memory buffers (mbufs) for network operations. Storage for large packets is allocated as clusters, typically 2KB in size.
Since forever, the limit on the number of clusters was static. Depending on the architecture and the presence of the GATEWAY option, the kernel used at most 4MB for the various forms of network IO. This limit was easy to exhaust, e.g. by starting a BitTorrent client with a file descriptor limit of 1024.
This limit exists to avoid exhausting the system memory by a remote attack. It couldn't be raised on most architectures, because the kernel reserved a fixed amount of virtual address space at boot time.
On some architectures, this is completely unnecessary, because memory pools, as used for mbuf clusters but also for other common data structures, use a special direct mapping. This means that any given physical memory address can be easily converted into a virtual address -- without having to modify the page tables.
Other architectures can just grow the kernel memory map or have a huge reservation for the normal kernel memory allocator. AMD64 is such an architecture. By default, the kernel reserves up to 1GB for internal allocations, so it can just allocate the address space for mbuf clusters from the same range.
Overall, removing the special kernel submap was an easy exercise; only two architectures needed special care. Both i386 and the ARM family lack direct mappings and share address space between kernel and userland. On i386, the kernel is allowed to use only 512MB, including mappings for device memory and the like. An additional limit was therefore needed on these architectures.
As a result, kern.mbuf.nmbclusters can now be increased at run time with sysctl(8). Some limits are enforced to prevent resource starvation. Basically, at most 1/4 of all memory can be used for mbuf clusters. Performance concerns were raised for architectures without direct mapping due to higher lock contention on the kernel memory map, but the normal pool cache makes that a rare event.
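For example, a program could raise the limit with sysctlbyname(3); the target value here is arbitrary:

    #include <sys/param.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int
    main(void)
    {
        int clusters = 65536;   /* arbitrary example value */

        /* Raise the cluster limit at run time; the kernel rejects
         * values beyond its memory-based cap. */
        if (sysctlbyname("kern.mbuf.nmbclusters", NULL, NULL,
            &clusters, sizeof(clusters)) == -1) {
            perror("sysctlbyname");
            return 1;
        }
        return 0;
    }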
20 pages of changes on http://pkgsrc.se:
To welcome the new year, I wanted to get rid of some ancient cruft. pkgsrc, like many other systems, tends to accumulate unmaintained software. In many cases, this software tends to be broken in the bulk builds, has special restrictions on redistribution or is a security nightmare.
For the post-2009Q1 cleanup, the basic criteria were:
Sadly, removing PHP completely is not an option, but it has been over a year since the final EOL of PHP 4, and therefore enough time to migrate the remaining code to a supported code base.
Java has similar issues. The Linux emulation for the older Sun JDK/JRE releases was often problematic. The licensing doesn't really help either, e.g. it is virtually impossible to build and distribute a JDK 1.5 as native code. With the advent of GPLed OpenJDK 7 the situation has improved dramatically.
PKG_DEVELOPER now implies USE_DESTDIR=yes by default.
This helps to catch bugs and simplifies updates.
After a long time working in the background, I have started the process of making USE_DESTDIR=yes the default behavior.
The first step was making developers use it by default.
This has a number of implications:
After a small round of changes with the help of tnn, "make replace" now fully works with USE_DESTDIR as well.
Before, it couldn't replace identical versions. The new pkg_add -U option explicitly handles that case.
What next? There are still slightly over 600 packages that don't support USE_DESTDIR.
I hope to further reduce this to 400 or so, at which point the current warning will turn into an error.
I plan to make USE_DESTDIR the default for all users soon; "make replace" was the last major blocker.
Hopefully, in the second half of 2010 it will be required and support for USE_DESTDIR=no can be removed.