Start day: 09/07/2022
- First level hook:
app-admin/keepassxc-2.7.1-r1
Password manager.
Dependencies:
Lots and lots of dependencies in the tree. Let's choose the first one: gnuconfig-20220508
- Second level hook:
sys-devel/gnuconfig-20220508
Updated config.sub and config.guess files from GNU.
Dependencies:
None.
ebuild #
Let's start by reviewing the ebuild.
SRC_URI="https://dev.gentoo.org/~sam/distfiles/${CATEGORY}/${PN}/${P}.tar.xz"
HOMEPAGE="https://savannah.gnu.org/projects/config"
LICENSE="GPL-3+-with-autoconf-exception"
Nothing else is remarkable here: it's just packaging, source preparation, testing and installation instructions. No patches are available for the stable version (the one I use).
Lines of code: ~3655
Source code preparation #
Nothing unusual here either:
wget https://dev.gentoo.org/~sam/distfiles/sys-devel/gnuconfig/gnuconfig-20220508.tar.xz
tar xf gnuconfig-20220508.tar.xz --one-top-level
rm gnuconfig-20220508.tar.xz
ChangeLog #
gitlog-to-changelog is a Perl script, distributed under the GPL3+ license. As the name suggests, it's a converter between two log formats. ChangeLog and ChangeLog-old files are also provided in the source tree.
[Break]: Read GPL3 License.
config.sub #
The rest of the package consists of two parts: config.sub and config.guess. Starting with config.sub: it's a script for validating and canonicalizing a configuration triplet:
# The goal of this file is to map all the various variations of a given
# machine specification into a single specification in the form:
# CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or in some cases, the newer four-part form:
# CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
# It is wrong to echo any other type of specification.
It is distributed under the GPL 3.0 licence, or any later version.
Validation #
The string is split into components in the following way:
- *-*-*-*-*: invalid configuration, more than four components.
- *-*-*-*: basic_machine=$field1-$field2, basic_os=$field3-$field4.
- *-*-*: ambiguous whether COMPANY is present, or skipped and KERNEL-OS is two parts.
- *-*: the second component is usually, but not always, the OS. basic_machine=$field1, or basic_machine=$field1-unknown for field2=zephyr*.
- *: some configurations are single-component shorthands, for example for 386bsd, basic_machine=i386-pc and basic_os=bsd.
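The four-component and five-component cases above can be sketched in a few lines of shell. This is only an illustration of the pattern-matching idea, not the code from config.sub: the function name split_config is hypothetical, and the shorter forms are deliberately left out because they need the heuristics described in the list.

```shell
# Hypothetical helper (not part of config.sub): classify a configuration
# string by its dash-separated components and split the four-part form.
split_config() {
    case $1 in
        *-*-*-*-*)
            echo "invalid: more than four components" >&2
            return 1
            ;;
        *-*-*-*)
            # Split on dashes into exactly four fields.
            IFS=- read -r field1 field2 field3 field4 <<EOF
$1
EOF
            echo "basic_machine=$field1-$field2 basic_os=$field3-$field4"
            ;;
        *)
            echo "fewer than four components: needs extra heuristics"
            ;;
    esac
}

split_config x86_64-pc-linux-gnu
```

The order of the case branches matters: the five-component pattern has to be tested first, because a string with four dashes also matches the four-component pattern.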
Remark: It's so interesting to see so many machine types, having been used to x86-64 all my life. One day, I'll try to use GNU/Gentoo/Linux on another platform as well. :D
Then, some substitutions are made, e.g.
pdp11-unknown)
vendor=dec
;;
i370-ibm*)
vendor=ibm
;;
xps-unknown | xps100-unknown)
cpu=xps100
vendor=honeywell
;;
x64 | amd64)
cpu=x86_64
vendor=pc
;;
for CPU and vendor. Similarly, further validation rules are applied to the operating system (including ABI and libc checks), and the kernel-OS combination is validated. For example, -dietlibc* without a kernel is not valid, because it's just a libc implementation and requires a kernel.
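The kernel-OS check can be pictured roughly like this. It is a heavily simplified sketch, assuming only a handful of libc names; the function name check_kernel_os and the error text are hypothetical, and the real script knows many more combinations.

```shell
# Hypothetical, simplified kernel-OS validation (not the config.sub code).
check_kernel_os() {
    kernel=$1 os=$2
    case $kernel-$os in
        linux-gnu* | linux-dietlibc* | linux-musl* | linux-uclibc*)
            # A known libc on top of the Linux kernel: accepted.
            ;;
        -dietlibc* | -musl* | -uclibc*)
            # An empty kernel followed by a bare libc name: rejected.
            echo "libc '$os' needs an explicit kernel" >&2
            return 1
            ;;
        *)
            # Everything else is accepted in this sketch.
            ;;
    esac
}

check_kernel_os linux dietlibc && echo "linux-dietlibc is fine"
check_kernel_os "" dietlibc || echo "dietlibc alone is rejected"
```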
Remark: Wow, never actually heard about diet libc. Some meta-information about this minimal libc implementation mostly for embedded devices: Stable release date: September 24, 2018. Supported platforms: Alpha, ARM, PA-RISC, ia64, i386, MIPS, s390, sparc, PowerPC. Actually, there is also DietLinux. It is a boot floppy based on the diet libc. Two alternatives are dnetc linux, and Pauls Boot CD. I think I'll test it one day, sounds pretty interesting.
If the CPU and OS are known, but not the manufacturer, the logical manufacturer is picked, for example:
*-beos*)
vendor=be
;;
*-genix*)
vendor=ns
;;
s390-* | s390x-*)
vendor=ibm
;;
In the end, the canonicalized configuration is echoed, and the program terminates:
echo "$cpu-$vendor-${kernel:+$kernel-}$os"
exit
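The ${kernel:+$kernel-} expansion is what makes the same echo produce both the three-part and the four-part form: it expands to "$kernel-" only when kernel is set and non-empty. A minimal sketch, where the wrapper function format_config is hypothetical:

```shell
# Hypothetical wrapper around the final echo of config.sub.
format_config() {
    cpu=$1 vendor=$2 kernel=$3 os=$4
    # The kernel part (with its trailing dash) is emitted only if non-empty.
    echo "$cpu-$vendor-${kernel:+$kernel-}$os"
}

format_config x86_64 pc linux gnu   # prints x86_64-pc-linux-gnu
format_config i386 pc "" bsd        # prints i386-pc-bsd
```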
If the configuration fails validation at some step N during the script execution, the script prints an error and terminates instead of proceeding to step N+1.
config.guess #
Another part of gnuconfig is config.guess. Originally written by Per Bothner; maintained since 2000 by Ben Elliston. Similarly, the license of config.guess is GPL 3+. It is used for system detection.
Some shellcheck warnings are disabled for the whole script by the directive # shellcheck disable=SC2006,SC2268, because the script also has to run on systems with a pre-POSIX /bin/sh. SC2006 checks whether the $(...) notation is used instead of legacy backticks `...`. SC2268 warns if the x-prefix (x-hack) is used, e.g. [ "x$var" = "xval" ]. Throughout the code, a few other shellcheck warnings are disabled.
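To make the two warnings concrete, here is a small side-by-side comparison of the legacy and the modern spelling. The variable names are made up for the illustration; on a POSIX shell both variants behave the same.

```shell
var=val

# Legacy command substitution (flagged by SC2006).
legacy_kernel=`uname -s`
# POSIX command substitution.
modern_kernel=$(uname -s)

# Legacy x-prefix comparison (flagged by SC2268).
if [ "x$var" = "xval" ]; then legacy_match=yes; else legacy_match=no; fi
# Plain quoted comparison, sufficient on any POSIX shell.
if [ "$var" = "val" ]; then modern_match=yes; else modern_match=no; fi

echo "$legacy_kernel $modern_kernel $legacy_match $modern_match"
```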
System Detection #
One thing that aids system detection is the compiler used by this script: HOST_CC (deprecated) / CC_FOR_BUILD.
The detection starts with executing uname:
UNAME_MACHINE=`(uname -m) 2>/dev/null` || UNAME_MACHINE=unknown
UNAME_RELEASE=`(uname -r) 2>/dev/null` || UNAME_RELEASE=unknown
UNAME_SYSTEM=`(uname -s) 2>/dev/null` || UNAME_SYSTEM=unknown
UNAME_VERSION=`(uname -v) 2>/dev/null` || UNAME_VERSION=unknown
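The pattern in these four lines is worth a closer look: the command runs in a subshell so that even a "command not found" error goes to /dev/null, and the assignment fails with the command's exit status, which triggers the fallback. A minimal sketch, where the helper name probe_or_unknown and the missing command are hypothetical:

```shell
# Hypothetical helper: run a command, print its output,
# or print "unknown" if the command fails or does not exist.
probe_or_unknown() {
    # The subshell keeps any error message away from the terminal.
    result=`("$@") 2>/dev/null` || result=unknown
    echo "$result"
}

probe_or_unknown uname -s
probe_or_unknown definitely_missing_command_xyz   # prints unknown
```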
If the system name includes Linux or GNU, LIBC is detected next:
case $UNAME_SYSTEM in
Linux|GNU|GNU/*)
LIBC=unknown
set_cc_for_build
cat <<-EOF > "$dummy.c"
#include <features.h>
#if defined(__UCLIBC__)
LIBC=uclibc
#elif defined(__dietlibc__)
LIBC=dietlibc
#elif defined(__GLIBC__)
LIBC=gnu
#else
#include <stdarg.h>
/* First heuristic to detect musl libc. */
#ifdef __DEFINED_va_list
LIBC=musl
#endif
#endif
EOF
cc_set_libc=`$CC_FOR_BUILD -E "$dummy.c" 2>/dev/null | grep '^LIBC' | sed 's, ,,g'`
eval "$cc_set_libc"
# Second heuristic to detect musl libc.
if [ "$LIBC" = unknown ] &&
command -v ldd >/dev/null &&
ldd --version 2>&1 | grep -q ^musl; then
LIBC=musl
fi
# If the system lacks a compiler, then just pick glibc.
# We could probably try harder.
if [ "$LIBC" = unknown ]; then
LIBC=gnu
fi
;;
esac
Remark: And once again, the musl preprocessor debate shows up. Just one macro could avoid the hacks above.
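The trick of evaluating preprocessor output as shell code can be tried without a compiler. In the sketch below, cpp_output is a made-up stand-in for what "$CC_FOR_BUILD -E" might print on a musl system; only the grep, sed and eval steps mirror the original.

```shell
# Hypothetical preprocessor output; a real compiler prints more lines.
cpp_output='# 1 "dummy.c"
LIBC = musl'

LIBC=unknown
# Keep the line starting with LIBC and strip all spaces,
# so that it becomes a valid shell assignment.
cc_set_libc=$(printf '%s\n' "$cpp_output" | grep '^LIBC' | sed 's, ,,g')
eval "$cc_set_libc"

echo "LIBC=$LIBC"   # prints LIBC=musl
```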
Case Branches #
Depending on the configuration reported by uname, different case branches (not exclusive) are executed. Here are a few examples of the preconditions:
case $UNAME_MACHINE:$UNAME_SYSTEM:$UNAME_RELEASE:$UNAME_VERSION in
*:NetBSD:*:*)
# ...
;;
*:Redox:*:*)
;;
alpha:OSF1:*:*)
;;
Tek43[0-9][0-9]:UTek:*:*) # Tektronix 4300 system running UTek (BSD)
;;
*:NetBSD:*:* #
For the resulting guess, the CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM form is used: GUESS=$machine-${os}${release}${abi-}. Depending on whether the system supports the ELF object format, os might be set to netbsd or netbsdelf.
Vendor can't be deduced, so it's always set to unknown, e.g.:
case $UNAME_MACHINE_ARCH in
aarch64eb) machine=aarch64_be-unknown ;;
armeb) machine=armeb-unknown ;;
sh3el) machine=shl-unknown ;;
For Debian GNU/NetBSD machines, release is set to -gnu. Otherwise, it is set to the output of echo "$UNAME_RELEASE" | sed -e 's/[-_].*//' | cut -d. -f1,2.
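The pipeline drops everything from the first dash or underscore onwards and then keeps only the first two dot-separated numbers. A quick way to try it out, with a hypothetical wrapper function and made-up release strings:

```shell
# Hypothetical wrapper around the release pipeline quoted above.
netbsd_release() {
    echo "$1" | sed -e 's/[-_].*//' | cut -d. -f1,2
}

netbsd_release 9.3_STABLE   # prints 9.3
netbsd_release 10.0-RC1     # prints 10.0
netbsd_release 9.99.82      # prints 9.99
```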
*:SecBSD:*:* and other uncommon systems... #
For them, GUESS is set to the following:
UNAME_MACHINE_ARCH=`arch | sed 's/SecBSD.//'`
GUESS=$UNAME_MACHINE_ARCH-unknown-secbsd$UNAME_RELEASE
secbsd is replaced by libertybsd for LibertyBSD, sortix for Sortix, twizzler for Twizzler and so on.
Special cases #
Some special cases are treated separately, for example, for mips:OSF1:*.*, GUESS is set to mips-dec-osf1.
x86_64:Linux:*:* #
This is the case for my machine, and for most other machines that use gnuconfig. The full block of code is the following:
x86_64:Linux:*:*)
set_cc_for_build
CPU=$UNAME_MACHINE
LIBCABI=$LIBC
if test "$CC_FOR_BUILD" != no_compiler_found; then
ABI=64
sed 's/^ //' << EOF > "$dummy.c"
#ifdef __i386__
ABI=x86
#else
#ifdef __ILP32__
ABI=x32
#endif
#endif
EOF
cc_set_abi=`$CC_FOR_BUILD -E "$dummy.c" 2>/dev/null | grep '^ABI' | sed 's, ,,g'`
eval "$cc_set_abi"
case $ABI in
x86) CPU=i686 ;;
x32) LIBCABI=${LIBC}x32 ;;
esac
fi
GUESS=$CPU-pc-linux-$LIBCABI
;;
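CPU is initially set to $UNAME_MACHINE and LIBCABI to $LIBC. If a compiler is available, the ABI is determined with the preprocessor: for a 32-bit x86 ABI, CPU becomes i686, and for x32, LIBCABI becomes ${LIBC}x32. Finally, GUESS is set to $CPU-pc-linux-$LIBCABI.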
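The mapping from the detected ABI to the final guess can be isolated from the compiler call. The sketch below assumes the ABI has already been determined; the function name guess_x86_64_linux is hypothetical and takes the libc and the ABI as arguments.

```shell
# Hypothetical helper: build the guess for x86_64 Linux
# from an already detected libc and ABI.
guess_x86_64_linux() {
    libc=$1 abi=$2
    cpu=x86_64
    libcabi=$libc
    case $abi in
        x86) cpu=i686 ;;               # 32-bit compiler on a 64-bit kernel
        x32) libcabi=${libc}x32 ;;     # x32 ABI gets a libc suffix
    esac
    echo "$cpu-pc-linux-$libcabi"
}

guess_x86_64_linux gnu 64    # prints x86_64-pc-linux-gnu
guess_x86_64_linux gnu x32   # prints x86_64-pc-linux-gnux32
guess_x86_64_linux musl x86  # prints i686-pc-linux-musl
```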
Output #
After all cases are processed, if GUESS is not an empty string, it is printed and the script terminates.
if test "x$GUESS" != x; then
echo "$GUESS"
exit
fi
Otherwise, the script resorts to the compiler. A few extra heuristics are performed to detect some systems. A few examples:
main ()
{
#if defined (MULTIMAX) || defined (n16)
#if defined (UMAXV)
printf ("ns32k-encore-sysv\n"); exit (0);
#else
#if defined (CMU)
printf ("ns32k-encore-mach\n"); exit (0);
#else
printf ("ns32k-encore-bsd\n"); exit (0);
#endif
#endif
#endif
#if defined (__386BSD__)
printf ("i386-pc-bsd\n"); exit (0);
#endif
#if defined (sequent)
#if defined (i386)
printf ("i386-sequent-dynix\n"); exit (0);
#endif
#if defined (ns32000)
printf ("ns32k-sequent-dynix\n"); exit (0);
#endif
#endif
/*...*/
exit (1);
}
Failure to Recognize the System #
If config.guess fails to recognize the system, the following text is printed:
cat >&2 <<EOF
This script (version $timestamp), has failed to recognize the
operating system you are using. If your script is old, overwrite *all*
copies of config.guess and config.sub with the latest versions from:
https://git.savannah.gnu.org/cgit/config.git/plain/config.guess
and
https://git.savannah.gnu.org/cgit/config.git/plain/config.sub
EOF
our_year=`echo $timestamp | sed 's,-.*,,'`
thisyear=`date +%Y`
# shellcheck disable=SC2003
script_age=`expr "$thisyear" - "$our_year"`
if test "$script_age" -lt 3 ; then
cat >&2 <<EOF
If $0 has already been updated, send the following data and any
information you think might be pertinent to config-patches@gnu.org to
provide the necessary information to handle your system.
config.guess timestamp = $timestamp
uname -m = `(uname -m) 2>/dev/null || echo unknown`
uname -r = `(uname -r) 2>/dev/null || echo unknown`
uname -s = `(uname -s) 2>/dev/null || echo unknown`
uname -v = `(uname -v) 2>/dev/null || echo unknown`
/usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null`
/bin/uname -X = `(/bin/uname -X) 2>/dev/null`
hostinfo = `(hostinfo) 2>/dev/null`
/bin/universe = `(/bin/universe) 2>/dev/null`
/usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null`
/bin/arch = `(/bin/arch) 2>/dev/null`
/usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null`
/usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null`
UNAME_MACHINE = "$UNAME_MACHINE"
UNAME_RELEASE = "$UNAME_RELEASE"
UNAME_SYSTEM = "$UNAME_SYSTEM"
UNAME_VERSION = "$UNAME_VERSION"
EOF
fi
exit 1
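The age check decides whether the user is asked to send a report: only scripts younger than three years get the second message. The year arithmetic can be sketched as follows. The function script_age is hypothetical, it takes the current year as an argument to stay deterministic, and it uses POSIX arithmetic expansion where the original uses expr for the sake of pre-POSIX shells.

```shell
# Hypothetical helper: age of the script in years,
# given its timestamp (YYYY-MM-DD) and the current year.
script_age() {
    timestamp=$1 thisyear=$2
    # Cut everything from the first dash, leaving only the year.
    our_year=$(echo "$timestamp" | sed 's,-.*,,')
    echo $(( $thisyear - $our_year ))
}

script_age 2022-05-08 2024   # prints 2
```

In real use the second argument would come from date +%Y, as in the original code.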
2022-09-13
7 days passed since start.