Linux From Scratch: Fun, Rewarding, Occasionally Confusing. (Parts 1-2)

The authors of Linux From Scratch (LFS) recently celebrated the release of version 8.4 of their book on constructing a working Linux system entirely from source code. I’ve been meaning to dive into this project for a while and figured this would be a good chance to provide a first-hand report of my experiences.

Linux From Scratch is an excellent educational opportunity at a time when desktop Linux is continually becoming more user-friendly and abstracted through a never-ending stream of window managers and desktop environments disguising its internals. Gone are the days of Ncurses (or text mode) installers and arcane error messages; most mainstream distributions are as quick and simple to install as Windows or OS X. Though this is welcome in a practical sense, we lose the customizability and insight we once had in configuring a system from its component parts.

Conversely, LFS deals in the business of compiling, configuring, and text-file editing, stripping away the creature comforts of a desktop-oriented distro. Even the source packages themselves are manually downloaded using wget – we’re provided only with a package list and instructions to gather them from the appropriate locations.

The familiarity the reader gains with the Linux command-line ecosystem as a result of this manual configuration process is one of LFS’s major strengths. Even veteran Linux desktop users will find many components here with which most of us never interact on a day-to-day basis. Non-developers don’t need to care about GCC, correct? What even IS binutils? No need to worry, this book will answer these questions in more-than-sufficient detail.

There’s also a large amount of insight to be gained on the structural challenges of the build process. The reader will learn of concepts such as circular dependencies (“to compile a compiler, you need a compiler”) and testing package compilation to ensure the preceding configuration was completed correctly. Additionally, the book discusses multiple “passes” or iterations of package builds as the associated toolchain and libraries change. Our first passes pull from the host system entirely, while later passes incorporate libraries and tools built to comprise part of the eventual LFS system. As a significant portion of the packages installed are themselves an essential part of the build process, we get to see how they function and rely on each other along the way.

LFS is also a great tool for separating the common core of Linux from the GNU userspace utilities and the extra functionality that differs between the myriad desktop distributions. It makes it easier to understand the absolute essentials of a Linux system, as well as the niceties and redundancies that can be dispensed with depending on the user’s needs. Like the vast majority of distributions, LFS follows the POSIX standard (for system design), FHS (for a unified filesystem hierarchy), and LSB (for core Linux components – Core, Desktop, Runtime Languages, and Imaging).

As a foreword, it’ll quickly become apparent that this isn’t a good project for an absolute Linux beginner. Some working knowledge of basic file management commands (cd, ls, rm, mv, cp, and tar, for instance) will be required, as well as a bit of intuitive knowledge of why errors can occur. The project is well-supported through mailing lists and IRC (irc.freenode.net #LFS-support), and I was able to easily get help for any issues I saw, but there were also a few minor problems that I was able to solve on my own. I’ve left most of these out of these examples, as they usually related to typos (my own – the book is impeccably edited), forgetting to set environment variables, or executing commands from an improper directory for the task.

Additionally, due to the length of this process and the number of steps involved, it’s difficult to write about this project without it seeming like a log. I’ll try to avoid this where possible, and won’t be including much in the way of commands/output. Please don’t think of these articles as an instructional guide – the LFS book is the go-to resource, and its processes should be followed verbatim.

The book is issued in versions for both the sysvinit and systemd init systems, but I will be working with the latter due to its widespread adoption within most modern distributions. I should also note that there are PDF and HTML versions of the text. Having experience with both, I’d recommend the HTML route, as the commands can be a little more difficult to copy-and-paste from the PDF version.

I will be building x86_64 “pure” with no 32-bit support, though there are also instructions available for PowerPC and ARM. My host system (or VM, more accurately) is running Ubuntu Server 18.04.

Polishing Up The Host

LFS is cross-compiled from an existing Linux distribution. As the build process progresses, the new Linux system is gradually separated from its host environment until it reaches the point of being bootable on its own.

With this in mind, ensuring the host’s software packages are up to date is an important first step to building an LFS system. Fortunately, the author includes a quick version-checking script at the beginning of the book’s second chapter to confirm the host machine is up to speed.

I ran this script on my host and received the following results:

bash, version 4.4.19(1)-release

/bin/sh -> /bin/dash

ERROR: /bin/sh does not point to bash

Binutils: (GNU Binutils for Ubuntu) 2.30

bison (GNU Bison) 3.0.4

/usr/bin/yacc -> /usr/bin/bison.yacc

bzip2, Version 1.0.6, 6-Sept-2010.

Coreutils: 8.28

diff (GNU diffutils) 3.6

find (GNU findutils) 4.7.0-git

GNU Awk 4.1.4, API: 1.1 (GNU MPFR 4.0.1, GNU MP 6.1.2)

/usr/bin/awk -> /usr/bin/gawk

gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0

./version-check.sh: line 31: g++: command not found

(Ubuntu GLIBC 2.27-3ubuntu1) 2.27

grep (GNU grep) 3.1

gzip 1.6

Linux version 4.15.0-45-generic (buildd@lgw01-amd64-031) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019

m4 (GNU M4) 1.4.18

GNU Make 4.1

GNU patch 2.7.6

Perl version='5.26.1';

Python 3.6.7

sed (GNU sed) 4.4

tar (GNU tar) 1.29

./version-check.sh: line 43: makeinfo: command not found

xz (XZ Utils) 5.2.2

./version-check.sh: line 45: g++: command not found

g++ compilation failed

The fun begins! Most of our software is up to date version-wise, but a few important dependencies are missing.
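Checking a reported version against a required minimum is easy to script. What follows is only a rough sketch of the idea and not the book's script: the version_ge helper is my own invention, it assumes a sort that supports the -V (version sort) option, as GNU sort does, and the minimum shown is purely illustrative.

```shell
# version_ge A B: succeed when version A is at least version B.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Illustrative minimum only; take the real requirements from the book.
if version_ge "4.4.19" "3.2"; then
    echo "bash: OK"
else
    echo "bash: too old"
fi
```

Sorting the two strings and checking which one comes first avoids splitting the version numbers into fields by hand.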

Believe it or not, /bin/sh pointing to /bin/dash isn’t a typo – dash is a lightweight shell that modern Debian/Ubuntu derivatives use as a faster /bin/sh than bash. However, we’re required to repoint the symbolic link to /bin/bash for the purpose of building LFS.

root@menikmati:~# rm /bin/sh

root@menikmati:~# ln -s /bin/bash /bin/sh
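To confirm the change took effect without touching anything else, the link can be resolved and inspected. This is a small sketch of my own; shell_behind is a hypothetical helper name, and it assumes a readlink that supports -f, as the GNU coreutils version does.

```shell
# shell_behind LINK: print the name of the program a symlink finally resolves to.
shell_behind() {
    basename "$(readlink -f "$1")"
}

echo "/bin/sh resolves to: $(shell_behind /bin/sh)"
```

Comparing only the final file name sidesteps hosts where /bin is itself a symlink to /usr/bin.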

We’re also missing g++ and makeinfo. These dependencies were resolved with the apt install g++ and apt install texinfo commands, respectively.

A bare minimum install of LFS is around 6GB, though it’ll be helpful to have some extra space handy when building our system. Though the book goes into creating multiple partitions and mount points (root, /home, /boot, etc.), I settled for a single root partition of 15GB and a 2GB swap partition, as I don’t require the system for day-to-day use and wanted to keep the partition layout as simple as possible. I then formatted and mounted both partitions.

Next, the LFS environment variable is set. This variable points to the folder in which LFS will be built on the host system. In my case, I had previously mounted my new root partition to /mnt/LFS, and set the environment variable to reflect this location.
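For illustration, this is roughly what setting and checking the variable looks like. The path is the mount point from my setup, and the guard on the last line is my own addition and not something the book prescribes.

```shell
# /mnt/LFS is the mount point used in this article; adjust to match yours.
export LFS=/mnt/LFS

# ${LFS:?} aborts with an error if the variable is unset or empty,
# which guards against commands accidentally targeting the host's root.
echo "Building in: ${LFS:?LFS is not set}"
```

Forgetting to set this variable in a new shell was one of the mistakes I mentioned earlier, so a guard like this can save some grief.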

With all that out of the way, it’s time to download packages! LFS recommends specific version numbers and provides a handy wget-list for the purpose. This took about 3 minutes on my test VM over Wi-Fi.
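As a sketch of what this step involves, the two helpers below download everything in a URL list and then verify the results against a checksum file. The function names are mine, and I am assuming that a file named md5sums sits alongside the downloaded packages; follow the book for the exact commands.

```shell
# fetch_sources LIST DEST: download every URL in LIST into DEST,
# resuming any partially downloaded files.
fetch_sources() {
    wget --input-file="$1" --continue --directory-prefix="$2"
}

# verify_sources DEST: check the downloads against the md5sums file stored in DEST.
verify_sources() {
    ( cd "$1" && md5sum -c md5sums )
}

# Example (not run here):
#   fetch_sources wget-list "$LFS/sources" && verify_sources "$LFS/sources"
```

Running the verification in a subshell keeps the caller's working directory unchanged.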

Once the packages are downloaded, we finalize our preparation by creating a tools directory within our LFS directory and symlinking it to /tools. To prevent accidental changes to the host system, an lfs user is created and given ownership of the LFS directories. Environment variables are added to the lfs user’s .bashrc and .bash_profile to support the build process.
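To give a feel for what those environment variables do, here is an approximation of such a .bashrc, written to a scratch directory and not to a real home directory. It is reconstructed from my notes and is not a substitute for the book's version; write_lfs_bashrc is a hypothetical helper, and /mnt/LFS is my own mount point.

```shell
# write_lfs_bashrc DIR: write a build-oriented .bashrc into DIR.
write_lfs_bashrc() {
    cat > "$1/.bashrc" << "EOF"
set +h                              # disable bash's command hashing so new tools are found immediately
umask 022                           # new files are writable by their owner only
LFS=/mnt/LFS                        # build location used in this article
LC_ALL=POSIX                        # predictable tool output regardless of host locale
LFS_TGT=$(uname -m)-lfs-linux-gnu   # target triplet for the cross toolchain
PATH=/tools/bin:/bin:/usr/bin       # prefer freshly built tools over the host's
export LFS LC_ALL LFS_TGT PATH
EOF
}

# Demonstration against a scratch directory.
demo_home=$(mktemp -d)
write_lfs_bashrc "$demo_home"
cat "$demo_home/.bashrc"
```

Putting /tools/bin first in PATH means each newly built tool is picked up as soon as it is installed.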

First Builds

As build time can vary drastically depending on the host system, LFS uses a term called SBU (Standard Build Unit) to express how long a package takes to compile and install. binutils (the first package to be installed) takes 1 SBU to complete. The binutils build is timed by the user, then all other build times in the book are listed relative to the build time for binutils.
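The arithmetic is simple enough to script. In this sketch, sbu_to_minutes is a helper of my own, and the numbers are hypothetical: a binutils build that took 120 seconds, and a package rated at 4.5 SBU.

```shell
# sbu_to_minutes BASE_SECONDS SBU: estimated minutes for a package rated at SBU,
# given how many seconds the binutils build (1 SBU) took on this machine.
sbu_to_minutes() {
    LC_ALL=C awk -v base="$1" -v sbu="$2" 'BEGIN { printf "%.1f\n", base * sbu / 60 }'
}

# Hypothetical figures: binutils took 120 seconds; the package is rated at 4.5 SBU.
sbu_to_minutes 120 4.5
```

The result is only an estimate, since build times also depend on factors such as how many cores the build is allowed to use.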

The instructions also touch upon “test suites” – sets of checks to determine whether a package was built and installed correctly. These are largely skipped in this section, as dependencies are not yet in place for most of the tests.

As mentioned in the introduction, we start by building a minimal system and toolchain to construct a final LFS system. The toolchain includes a compiler, assembler, linker, libraries, and utilities that are used to build the other tools. This toolchain is isolated from the eventual “final” system and stored in $LFS/tools.

Before we begin building packages, we need to identify our “Target Triplet” or name of the working platform. We’ll also need the name of the platform’s dynamic linker/loader, not to be confused with the standard linker (ld) that is part of binutils. For my 64-bit system, this should be ld-linux-x86-64.so.2, though you can check with readelf -l <name of binary> | grep interpreter.
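Both values can be pulled together in a few lines. This is a sketch and not the book's procedure: program_interpreter is a hypothetical helper, and the "lfs" vendor field in the triplet is what marks the target as distinct from the host, so that the first builds are treated as cross-compilation.

```shell
# Build a target triplet with a custom vendor field ("lfs"), so the build system
# treats the target as different from the host and cross-compiles.
LFS_TGT="$(uname -m)-lfs-linux-gnu"
echo "Target triplet: $LFS_TGT"

# program_interpreter FILE: print the dynamic loader an ELF binary requests.
program_interpreter() {
    LC_ALL=C readelf -l "$1" 2>/dev/null |
        sed -n 's/.*Requesting program interpreter: \(.*\)\]/\1/p'
}

# Only attempt the lookup when readelf is installed on the host.
if command -v readelf >/dev/null 2>&1; then
    program_interpreter /bin/sh
fi
```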

Our first packages are binutils (which provides a cross-linker) and GCC (a cross-compiler). Temporary libraries are then cross-compiled with them, and the GCC sources are adjusted to tell the compiler which target dynamic linker to use.

It was at this point that I hit my first challenge of the project, as GCC failed to build properly due to a missed option in the pre-build configuration process. After racking my brain with no success, I reached out to the #lfs-support channel on Freenode and promptly got the help I needed.

Next, the sanitized Linux API headers are installed, allowing Glibc to interface with the features the kernel will provide.

Glibc is then installed, referencing the compiler, binary tools, and kernel headers already present.

This is followed by a second pass of binutils referencing the proper library search path to be used by ld. I hit another snag at this point, as I had failed to create the /tools/lib64 symlink pointing to /tools/lib. In this case, I was able to identify and resolve the issue by searching the linuxquestions.org message board.
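The link in question is tiny, which is probably why it is easy to miss. Below is a sketch of the idea run against a scratch directory and not the real /tools; link_lib64 and its optional architecture argument are my own constructs for demonstration purposes.

```shell
# link_lib64 PREFIX [ARCH]: create PREFIX/lib and, on x86_64, a lib64 symlink to it,
# so that anything installed under lib64 ends up in the same directory.
link_lib64() {
    prefix=$1
    arch=${2:-$(uname -m)}
    mkdir -p "$prefix/lib"
    case "$arch" in
        x86_64) ln -sfn lib "$prefix/lib64" ;;
    esac
}

# Demonstration in a scratch directory.
demo_tools=$(mktemp -d)
link_lib64 "$demo_tools" x86_64
ls -l "$demo_tools"
```

Using a relative link target ("lib") keeps the link valid even if the parent directory is later reached through a different path.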

GCC then sees a second pass with its sources modified to reference the newly-installed dynamic linker rather than the one present on the host system. Once this is complete, the core toolchain is self-contained and self-hosted and is no longer reliant on the host.
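A quick way to sanity-check a toolchain at this stage is to compile a trivial program and look at which dynamic linker the result requests. The following is a sketch of that idea using whichever cc is first in PATH; toolchain_interpreter is a hypothetical name, and on a host system the path printed will naturally be the host's own loader.

```shell
# toolchain_interpreter: compile an empty program with the cc found in PATH
# and print the line naming the dynamic loader the resulting binary requests.
toolchain_interpreter() {
    workdir=$(mktemp -d) || return 1
    echo 'int main(void) { return 0; }' > "$workdir/dummy.c"
    if cc "$workdir/dummy.c" -o "$workdir/dummy"; then
        LC_ALL=C readelf -l "$workdir/dummy" | grep 'program interpreter'
        status=$?
    else
        status=1
    fi
    rm -rf "$workdir"
    return "$status"
}

# Only run when both tools are present. With the temporary toolchain in use,
# the path printed should begin with /tools and not with the host's /lib or /lib64.
if command -v cc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
    toolchain_interpreter || echo "sanity check failed"
fi
```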

From here, we build a series of utilities – Tcl, Expect, DejaGNU, Ncurses, bash, bison, bzip2, coreutils, diffutils, file, findutils, gawk, gettext, grep, gzip, make, patch, perl, python, sed, tar, and xz. These all installed smoothly without much in the way of additional configuration.

Once the packages are installed, we then strip unneeded debugging symbols, documentation, and temporary build directories to reclaim disk space.
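The cleanup amounts to stripping binaries and deleting directories that are not needed for the rest of the build. Here is a sketch of the idea, run against a scratch tree and not the real /tools; trim_tree is a hypothetical helper, and the book's own commands differ in detail.

```shell
# trim_tree PREFIX: strip debug symbols from libraries and delete documentation.
trim_tree() {
    prefix=$1
    # strip complains about files it does not recognise; ignore those errors.
    if command -v strip >/dev/null 2>&1; then
        find "$prefix/lib" -type f -exec strip --strip-debug {} + 2>/dev/null || true
    fi
    for sub in info man doc; do
        rm -rf "$prefix/$sub" "$prefix/share/$sub"
    done
}

# Demonstration on a scratch tree.
demo=$(mktemp -d)
mkdir -p "$demo/lib" "$demo/share/doc" "$demo/share/man"
trim_tree "$demo"
ls "$demo/share"
```

Only debug symbols are removed from libraries here, as stripping more aggressively can break them.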

Finally, ownership of the $LFS/tools directory is transferred to the root user, as our work with the lfs user is complete. My next article will cover Part III of LFS, in which we chroot into the new system and use our toolchain to continue building the LFS system.