Dorian 6cfb8082c5 fix: xorriso append_partition for real USB boot + grub-mkstandalone
Root cause of USB boot failure: our xorriso used -e boot/grub/efi.img
to embed the EFI image inside the ISO. This works for CD-ROM and QEMU
but NOT for USB on real UEFI hardware.

Fix: use the Will Haley / Debian live-build approach:
- -append_partition 2 (GPT type EFI) appends efi.img AFTER ISO data
- -e --interval:appended_partition_2:all:: references the appended partition
- --mbr-force-bootable forces MBR active flag
- grub-mkstandalone with embedded bootstrap config (searches for grub.cfg)
- grub.cfg placed in both /boot/grub/ AND /EFI/BOOT/ on ISO
- grub.cfg uses search --label ARCHIPELAGO to find the ISO root

This is the exact approach used by StartOS, TAILS, and every production
custom Debian live ISO that boots from USB.

Also: iso-debug, iso-branding skills + reference docs, dev-start.sh
option 0 for branding dev, improved dev-branding.sh and test-iso-qemu.sh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:34:29 +00:00

6.9 KiB

name, description, allowed-tools
name description allowed-tools
iso-debug Diagnose and fix Archipelago ISO boot failures. Covers hybrid MBR/GPT, UEFI/BIOS boot chains, live-boot initramfs, GRUB/ISOLINUX configuration, xorriso packaging, and USB boot compatibility. Use when ISO doesn't boot, installer doesn't start, kernel panics, or USB isn't recognized by BIOS/UEFI. Bash, Read, Grep, Glob, Agent, Edit

ISO Boot Debugging — Archipelago Custom Base

Systematic diagnosis of ISO boot failures for the Archipelago debootstrap-based installer.

Architecture

The ISO boot chain has 5 stages. Failure at any stage has distinct symptoms:

Stage Component Symptom if broken
1. BIOS/UEFI recognition Hybrid MBR + GPT USB not in boot menu at all
2. Bootloader ISOLINUX (BIOS) or GRUB EFI (UEFI) Black screen after selecting USB
3. Kernel + initramfs vmlinuz + initrd.img with live-boot Kernel panic or initramfs shell
4. Root filesystem live-boot mounts filesystem.squashfs "No root device" or blank screen
5. Installer systemd service + auto-install.sh Boots to shell but no installer prompt

Stage 1: USB Not Recognized

Most common cause: Wrong MBR code in the ISO hybrid boot sector.

Diagnosis

# Compare first 16 bytes of working vs broken ISO
xxd -l 16 working.iso
xxd -l 16 broken.iso

# Check for valid boot signature at offset 510
xxd -s 510 -l 2 broken.iso
# Must show: 55aa

Known MBR codes

  • 4552 — Debian Live MBR (extracted from Debian Live ISO). Works on all tested hardware.
  • 33ed — ISOLINUX package generic isohdpfx.bin. Does NOT work on some UEFI hardware.

Fix

The project ships the proven MBR at image-recipe/branding/isohdpfx.bin (432 bytes, starts with 4552). Build script uses it via: -isohybrid-mbr "$SCRIPT_DIR/branding/isohdpfx.bin"

xorriso flags that matter

  • -isohybrid-mbr <file> — Embeds MBR code for USB hybrid boot
  • -isohybrid-gpt-basdat — Adds GPT partition entry for EFI (REQUIRED for UEFI USB boot)
  • -partition_offset 16 — Reserves space for GPT table (REQUIRED — without this some UEFI firmware won't see the USB)
  • -eltorito-alt-boot -e boot/grub/efi.img -no-emul-boot — EFI boot catalog entry

Balena Etcher

Writes raw ISO to USB — no special formatting. If the ISO boots in QEMU but not on hardware, the MBR code is the issue, not Etcher.

Stage 2: Bootloader Failure

BIOS path: ISOLINUX

Required files in ISO: isolinux/isolinux.bin, isolinux/ldlinux.c32, isolinux/boot.cat Config: isolinux/isolinux.cfg

UEFI path: GRUB

Required files: EFI/BOOT/BOOTX64.EFI, boot/grub/efi.img, boot/grub/grub.cfg The EFI image is a FAT32 filesystem containing the GRUB binary, built with:

grub-mkimage -O x86_64-efi -o BOOTX64.EFI -p /boot/grub \
    part_gpt part_msdos fat iso9660 udf normal boot linux search \
    search_fs_uuid search_fs_file search_label configfile echo cat \
    ls test true loopback gfxterm gfxmenu font png all_video video \
    video_bochs video_cirrus efi_gop efi_uga

Critical: all_video, efi_gop, efi_uga needed for display on real hardware.

Diagnosis

# Mount ISO and verify files
sudo mount -o loop,ro broken.iso /mnt
ls -la /mnt/isolinux/
ls -la /mnt/EFI/BOOT/
cat /mnt/boot/grub/grub.cfg
cat /mnt/isolinux/isolinux.cfg
sudo umount /mnt

Stage 3: Kernel / Initramfs

live-boot

The initramfs must contain live-boot hooks. Without them, the kernel boots but can't find root.

Kernel params required: boot=live components

  • boot=live — triggers live-boot's initramfs scripts
  • components — tells live-boot to scan live/ for squashfs files

Verify initramfs has live-boot

TMPDIR=$(mktemp -d)
unmkinitramfs /path/to/initrd.img $TMPDIR
# live-boot installs scripts/live as a FILE (not directory)
ls -la $TMPDIR/scripts/live    # or $TMPDIR/main/scripts/live
file $TMPDIR/scripts/live      # Should say "ASCII text"

Common initramfs failures

  1. live-boot not installed: debootstrap --include can't resolve its deps. Must install via chroot apt-get after debootstrap.
  2. Broken initramfs from container build: update-initramfs needs /proc, /sys, /dev mounted in the chroot.
  3. scripts/live is a FILE not directory: Verification code must use [ -e ] not [ -d ].

Stage 4: Root Filesystem

live-boot searches for squashfs files in live/ on the boot media.

  • Mounts boot media (USB/CDROM) at /run/live/medium
  • Finds live/filesystem.squashfs
  • Mounts it read-only, creates tmpfs overlay
  • pivot_root into the combined root

Diagnosis

If you get an initramfs shell prompt (initramfs):

# Inside initramfs shell:
ls /run/live/medium/          # Is boot media mounted?
ls /run/live/medium/live/     # Is squashfs there?
cat /proc/cmdline             # Does it have boot=live?

Stage 5: Installer Not Starting

The installer auto-starts via:

  1. Getty auto-login on tty1 (root, no password)
  2. systemd service archipelago-installer.service
  3. Wrapper script searches for boot media at: /run/live/medium, /run/archiso, /cdrom

Diagnosis

If you get a shell but no installer prompt:

systemctl status archipelago-installer.service
cat /usr/local/bin/archipelago-start-installer
ls /run/live/medium/archipelago/auto-install.sh

Quick Verification Checklist

Run against any ISO before flashing:

ISO=path/to/iso
MNT=$(mktemp -d)
sudo mount -o loop,ro $ISO $MNT

echo "=== MBR ===" && xxd -l 4 $ISO
echo "=== Boot sig ===" && xxd -s 510 -l 2 $ISO
echo "=== Files ===" && for f in live/vmlinuz live/initrd.img live/filesystem.squashfs isolinux/isolinux.bin EFI/BOOT/BOOTX64.EFI boot/grub/grub.cfg archipelago/auto-install.sh; do [ -e $MNT/$f ] && echo "OK: $f" || echo "MISSING: $f"; done
echo "=== Kernel params ===" && grep "boot=live" $MNT/boot/grub/grub.cfg && echo OK || echo MISSING
echo "=== live-boot ===" && INITRD=$(mktemp -d) && unmkinitramfs $MNT/live/initrd.img $INITRD 2>/dev/null && ([ -e $INITRD/scripts/live ] && echo "OK" || echo "MISSING")

sudo umount $MNT

Key Files

File Purpose
image-recipe/build-auto-installer-iso.sh Main build script (~2600 lines)
image-recipe/branding/isohdpfx.bin Proven MBR code (432 bytes)
image-recipe/branding/grub-theme/ GRUB theme (theme.txt + background.png)
image-recipe/branding/plymouth-theme/ Plymouth boot splash
.gitea/workflows/build-iso-dev.yml CI workflow with smoke test
image-recipe/test-iso-qemu.sh QEMU testing script
image-recipe/dev-branding.sh Quick branding iteration (patch + repackage)

Infrastructure

What Where
CI runner gitea-runner.service on 192.168.1.228
ISO builds FileBrowser at http://192.168.1.228:8083 → Builds/
Dev branch dev-iso (separate CI: build-iso-dev.yml)
Main branch main (CI: build-iso.yml) — DO NOT break