Unbreaking CVSup/amd64
When I recently upgraded my FreeBSD6/amd64 “-STABLE” machine, the CVSup binary on the system stopped working. CVSup would dump core consistently, shortly after connecting to the remote server. This rather unwelcome development took away my ability to keep my CVS trees up to date, so fixing the bug became the top priority. The bug also turned out to be an interesting one.
The bug
Running CVSup after the upgrade to 6.3-PRERELEASE would result in a core dump shortly after connection establishment. The faulting instruction was trying to save SSE registers to memory, which was odd since there was no reason for this particular code path to be using SSE registers in the first place.
Rebuilding Modula-3 and CVSup from source did not fix the core dump, though the builds of these tools themselves completed without error. A search through the PR database revealed that other FreeBSD users had also been tripped up by the bug: PR bin/124353.
A peek at the solution
Modula-3’s runtime needed to be patched in the following way to fix this fault.
- First, in $M3SRC/libs/m3core/src/unix/freebsd-4.amd64/Unix.i3, we declare the Modula-3 function Unix.fcntl() as being implemented externally by the C function ufcntl().
    ... snip ...
    <*EXTERNAL "ufcntl"*> PROCEDURE fcntl (fd, request: int; arg: long): int;
    ... snip ...
- Matching this declaration, an implementation of ufcntl() was provided in $M3SRC/libs/m3core/src/runtime/FBSD_AMD64/RTHeapDepC.c:
    #include <fcntl.h>
    ... snip ...
    int ufcntl(int fd, int cmd, long arg)
    {
        return (fcntl(fd, cmd, arg));
    }
On the surface, this “fix” does not seem to be doing anything. The ufcntl() entry point takes three arguments and passes them down to fcntl() unchanged, in the same order. Yet, despite the apparent “no-op” nature of the change, the core dumps were gone.
Why this works
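To understand why this fix works, we have to delve into the ABI, specifically the C calling conventions used for AMD64 code.
For normal function calls, the AMD64 calling convention passes up to 6 integer arguments in registers. Thus register %rdi would hold the first argument (fd in our case), register %rsi the second (cmd), register %rdx the third, and so on. However, the C prototype for fcntl() is int fcntl(int fd, int cmd, ...); i.e., fcntl is a varargs function. Varargs functions use a slightly different calling convention on the AMD64: register %rax is a “hidden” input parameter for these functions. The caller sets its low byte, %al, to an upper bound on the number of vector (SSE) registers used to pass arguments, and the callee’s prologue can use that value to decide which SSE registers to save to memory. A caller that does not know the callee is a varargs function leaves whatever value happens to be in %rax, which would explain a fault in code that saves SSE registers.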
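The difference between the two kinds of call can be sketched in portable C. This is only an illustration under stated assumptions: fake_fcntl(), fake_ufcntl() and FAKE_CMD_WITH_ARG are hypothetical names standing in for fcntl() and a fixed-arity wrapper, and none of them come from libc or the Modula-3 sources. Because the variadic prototype is visible where the wrapper is compiled, the compiler is the one that emits the varargs call sequence.

```c
#include <stdarg.h>

/* Hypothetical command that takes a third argument (not a real fcntl command). */
#define FAKE_CMD_WITH_ARG 1

/*
 * Hypothetical stand-in for fcntl(): variadic, like int fcntl(int, int, ...).
 * va_start() relies on state that the compiler-generated prologue prepares
 * for variadic functions.
 */
long fake_fcntl(int fd, int cmd, ...)
{
    va_list ap;
    long arg = 0;

    if (cmd == FAKE_CMD_WITH_ARG) {
        va_start(ap, cmd);
        arg = va_arg(ap, long);   /* fetch the optional third argument */
        va_end(ap);
    }
    return fd + arg;
}

/*
 * Fixed-arity entry point: a caller only needs the ordinary convention to
 * reach it, and the compiler handles the variadic call to fake_fcntl().
 */
long fake_ufcntl(int fd, int cmd, long arg)
{
    return fake_fcntl(fd, cmd, arg);
}
```

Calling fake_ufcntl(3, FAKE_CMD_WITH_ARG, 40L) goes through both conventions: a plain three-argument call into the wrapper, then a compiler-generated variadic call into fake_fcntl().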
So, prior to the fix, the Modula-3 runtime was invoking fcntl() directly, but with registers set up for a non-varargs function call.
Now, as it turns out, in FreeBSD 6.2 and earlier, fcntl() in libc was not a C language function; rather, it was implemented as an assembly language stub that invoked the SYS_fcntl system call. On the AMD64, FreeBSD’s argument passing convention for system calls is close enough to the non-varargs C calling convention that the processor’s registers happened to be correctly set up for a direct system call.
When fcntl() in libc was changed in FreeBSD™ 6-STABLE on 24 Apr 2008 to be a C function instead of a system call stub, things broke.
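Though not obvious from just looking at the C code, the no-op-like fix above works by using the C compiler to translate between the two calling conventions: with the prototype from <fcntl.h> in scope, the compiler emits the varargs call sequence when ufcntl() calls fcntl().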
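As a rough usage sketch, the wrapper can be exercised from C with real fcntl() commands. The ufcntl() body mirrors the one quoted earlier; cloexec_roundtrip() is a hypothetical helper added only for illustration. Passing a long for every command follows the wrapper's signature and assumes an LP64 platform such as amd64.

```c
#include <fcntl.h>
#include <unistd.h>

/* Same shape as the wrapper quoted in the article. */
int ufcntl(int fd, int cmd, long arg)
{
    return (fcntl(fd, cmd, arg));
}

/*
 * Hypothetical helper: sets close-on-exec on a fresh pipe descriptor through
 * ufcntl() and reads the flag back.
 * Returns 1 if the flag is set, 0 if it is not, and -1 on error.
 */
int cloexec_roundtrip(void)
{
    int fds[2];
    int flags;
    int result;

    if (pipe(fds) == -1)
        return -1;

    if (ufcntl(fds[0], F_SETFD, FD_CLOEXEC) == -1) {
        result = -1;
    } else {
        flags = ufcntl(fds[0], F_GETFD, 0);
        result = (flags == -1) ? -1 : ((flags & FD_CLOEXEC) != 0);
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A caller that only understands fixed-arity calls, as the Modula-3 runtime does here, needs nothing beyond the three-argument signature of ufcntl().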
What’s worrying
The relevant change to libc was in CVS/SVN -HEAD for about 20 days before it was merged to -stable. CVSup is also a critical tool for the FreeBSD project. However, this bug was only detected in -stable, and not in -current.