Root cause of USB boot failure: our xorriso used -e boot/grub/efi.img to embed the EFI image inside the ISO. This works for CD-ROM and QEMU but NOT for USB on real UEFI hardware. Fix: use the Will Haley / Debian live-build approach: - -append_partition 2 (GPT type EFI) appends efi.img AFTER ISO data - -e --interval:appended_partition_2:all:: references the appended partition - --mbr-force-bootable forces MBR active flag - grub-mkstandalone with embedded bootstrap config (searches for grub.cfg) - grub.cfg placed in both /boot/grub/ AND /EFI/BOOT/ on ISO - grub.cfg uses search --label ARCHIPELAGO to find the ISO root This is the exact approach used by StartOS, TAILS, and every production custom Debian live ISO that boots from USB. Also: iso-debug, iso-branding skills + reference docs, dev-start.sh option 0 for branding dev, improved dev-branding.sh and test-iso-qemu.sh. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
176 lines
6.9 KiB
Markdown
176 lines
6.9 KiB
Markdown
---
|
|
name: iso-debug
|
|
description: Diagnose and fix Archipelago ISO boot failures. Covers hybrid MBR/GPT, UEFI/BIOS boot chains, live-boot initramfs, GRUB/ISOLINUX configuration, xorriso packaging, and USB boot compatibility. Use when ISO doesn't boot, installer doesn't start, kernel panics, or USB isn't recognized by BIOS/UEFI.
|
|
allowed-tools: Bash, Read, Grep, Glob, Agent, Edit
|
|
---
|
|
|
|
# ISO Boot Debugging — Archipelago Custom Base
|
|
|
|
Systematic diagnosis of ISO boot failures for the Archipelago debootstrap-based installer.
|
|
|
|
## Architecture
|
|
|
|
The ISO boot chain has 5 stages. Failure at any stage has distinct symptoms:
|
|
|
|
| Stage | Component | Symptom if broken |
|
|
|-------|-----------|-------------------|
|
|
| 1. BIOS/UEFI recognition | Hybrid MBR + GPT | USB not in boot menu at all |
|
|
| 2. Bootloader | ISOLINUX (BIOS) or GRUB EFI (UEFI) | Black screen after selecting USB |
|
|
| 3. Kernel + initramfs | vmlinuz + initrd.img with live-boot | Kernel panic or initramfs shell |
|
|
| 4. Root filesystem | live-boot mounts filesystem.squashfs | "No root device" or blank screen |
|
|
| 5. Installer | systemd service + auto-install.sh | Boots to shell but no installer prompt |
|
|
|
|
## Stage 1: USB Not Recognized
|
|
|
|
**Most common cause**: Wrong MBR code in the ISO hybrid boot sector.
|
|
|
|
### Diagnosis
|
|
```bash
|
|
# Compare first 16 bytes of working vs broken ISO
|
|
xxd -l 16 working.iso
|
|
xxd -l 16 broken.iso
|
|
|
|
# Check for valid boot signature at offset 510
|
|
xxd -s 510 -l 2 broken.iso
|
|
# Must show: 55aa
|
|
```
|
|
|
|
### Known MBR codes
|
|
- `4552` — Debian Live MBR (extracted from Debian Live ISO). **Works on all tested hardware.**
|
|
- `33ed` — ISOLINUX package generic isohdpfx.bin. **Does NOT work on some UEFI hardware.**
|
|
|
|
### Fix
|
|
The project ships the proven MBR at `image-recipe/branding/isohdpfx.bin` (432 bytes, starts with `4552`).
|
|
Build script uses it via: `-isohybrid-mbr "$SCRIPT_DIR/branding/isohdpfx.bin"`
|
|
|
|
### xorriso flags that matter
|
|
- `-isohybrid-mbr <file>` — Embeds MBR code for USB hybrid boot
|
|
- `-isohybrid-gpt-basdat` — Adds GPT partition entry for EFI (REQUIRED for UEFI USB boot)
|
|
- `-partition_offset 16` — Reserves space for GPT table (REQUIRED — without this some UEFI firmware won't see the USB)
|
|
- `-eltorito-alt-boot -e boot/grub/efi.img -no-emul-boot` — EFI boot catalog entry
|
|
|
|
### Balena Etcher
|
|
Writes raw ISO to USB — no special formatting. If the ISO boots in QEMU but not on hardware, the MBR code is the issue, not Etcher.
|
|
|
|
## Stage 2: Bootloader Failure
|
|
|
|
### BIOS path: ISOLINUX
|
|
Required files in ISO: `isolinux/isolinux.bin`, `isolinux/ldlinux.c32`, `isolinux/boot.cat`
|
|
Config: `isolinux/isolinux.cfg`
|
|
|
|
### UEFI path: GRUB
|
|
Required files: `EFI/BOOT/BOOTX64.EFI`, `boot/grub/efi.img`, `boot/grub/grub.cfg`
|
|
The EFI image is a FAT32 filesystem containing the GRUB binary, built with:
|
|
```bash
|
|
grub-mkimage -O x86_64-efi -o BOOTX64.EFI -p /boot/grub \
|
|
part_gpt part_msdos fat iso9660 udf normal boot linux search \
|
|
search_fs_uuid search_fs_file search_label configfile echo cat \
|
|
ls test true loopback gfxterm gfxmenu font png all_video video \
|
|
video_bochs video_cirrus efi_gop efi_uga
|
|
```
|
|
**Critical**: `all_video`, `efi_gop`, `efi_uga` needed for display on real hardware.
|
|
|
|
### Diagnosis
|
|
```bash
|
|
# Mount ISO and verify files
|
|
sudo mount -o loop,ro broken.iso /mnt
|
|
ls -la /mnt/isolinux/
|
|
ls -la /mnt/EFI/BOOT/
|
|
cat /mnt/boot/grub/grub.cfg
|
|
cat /mnt/isolinux/isolinux.cfg
|
|
sudo umount /mnt
|
|
```
|
|
|
|
## Stage 3: Kernel / Initramfs
|
|
|
|
### live-boot
|
|
The initramfs must contain live-boot hooks. Without them, the kernel boots but can't find root.
|
|
|
|
**Kernel params required**: `boot=live components`
|
|
- `boot=live` — triggers live-boot's initramfs scripts
|
|
- `components` — tells live-boot to scan live/ for squashfs files
|
|
|
|
### Verify initramfs has live-boot
|
|
```bash
|
|
TMPDIR=$(mktemp -d)
|
|
unmkinitramfs /path/to/initrd.img $TMPDIR
|
|
# live-boot installs scripts/live as a FILE (not directory)
|
|
ls -la $TMPDIR/scripts/live # or $TMPDIR/main/scripts/live
|
|
file $TMPDIR/scripts/live # Should say "ASCII text"
|
|
```
|
|
|
|
### Common initramfs failures
|
|
1. **live-boot not installed**: debootstrap `--include` can't resolve its deps. Must install via `chroot apt-get` after debootstrap.
|
|
2. **Broken initramfs from container build**: `update-initramfs` needs `/proc`, `/sys`, `/dev` mounted in the chroot.
|
|
3. **scripts/live is a FILE not directory**: Verification code must use `[ -e ]` not `[ -d ]`.
|
|
|
|
## Stage 4: Root Filesystem
|
|
|
|
live-boot searches for squashfs files in `live/` on the boot media.
|
|
- Mounts boot media (USB/CDROM) at `/run/live/medium`
|
|
- Finds `live/filesystem.squashfs`
|
|
- Mounts it read-only, creates tmpfs overlay
|
|
- pivot_root into the combined root
|
|
|
|
### Diagnosis
|
|
If you get an initramfs shell prompt `(initramfs)`:
|
|
```bash
|
|
# Inside initramfs shell:
|
|
ls /run/live/medium/ # Is boot media mounted?
|
|
ls /run/live/medium/live/ # Is squashfs there?
|
|
cat /proc/cmdline # Does it have boot=live?
|
|
```
|
|
|
|
## Stage 5: Installer Not Starting
|
|
|
|
The installer auto-starts via:
|
|
1. Getty auto-login on tty1 (root, no password)
|
|
2. systemd service `archipelago-installer.service`
|
|
3. Wrapper script searches for boot media at: `/run/live/medium`, `/run/archiso`, `/cdrom`
|
|
|
|
### Diagnosis
|
|
If you get a shell but no installer prompt:
|
|
```bash
|
|
systemctl status archipelago-installer.service
|
|
cat /usr/local/bin/archipelago-start-installer
|
|
ls /run/live/medium/archipelago/auto-install.sh
|
|
```
|
|
|
|
## Quick Verification Checklist
|
|
|
|
Run against any ISO before flashing:
|
|
```bash
|
|
ISO=path/to/iso
|
|
MNT=$(mktemp -d)
|
|
sudo mount -o loop,ro $ISO $MNT
|
|
|
|
echo "=== MBR ===" && xxd -l 4 $ISO
|
|
echo "=== Boot sig ===" && xxd -s 510 -l 2 $ISO
|
|
echo "=== Files ===" && for f in live/vmlinuz live/initrd.img live/filesystem.squashfs isolinux/isolinux.bin EFI/BOOT/BOOTX64.EFI boot/grub/grub.cfg archipelago/auto-install.sh; do [ -e $MNT/$f ] && echo "OK: $f" || echo "MISSING: $f"; done
|
|
echo "=== Kernel params ===" && grep "boot=live" $MNT/boot/grub/grub.cfg && echo OK || echo MISSING
|
|
echo "=== live-boot ===" && INITRD=$(mktemp -d) && unmkinitramfs $MNT/live/initrd.img $INITRD 2>/dev/null && ([ -e $INITRD/scripts/live ] && echo "OK" || echo "MISSING")
|
|
|
|
sudo umount $MNT
|
|
```
|
|
|
|
## Key Files
|
|
|
|
| File | Purpose |
|
|
|------|---------|
|
|
| `image-recipe/build-auto-installer-iso.sh` | Main build script (~2600 lines) |
|
|
| `image-recipe/branding/isohdpfx.bin` | Proven MBR code (432 bytes) |
|
|
| `image-recipe/branding/grub-theme/` | GRUB theme (theme.txt + background.png) |
|
|
| `image-recipe/branding/plymouth-theme/` | Plymouth boot splash |
|
|
| `.gitea/workflows/build-iso-dev.yml` | CI workflow with smoke test |
|
|
| `image-recipe/test-iso-qemu.sh` | QEMU testing script |
|
|
| `image-recipe/dev-branding.sh` | Quick branding iteration (patch + repackage) |
|
|
|
|
## Infrastructure
|
|
|
|
| What | Where |
|
|
|------|-------|
|
|
| CI runner | gitea-runner.service on 192.168.1.228 |
|
|
| ISO builds | FileBrowser at http://192.168.1.228:8083 → Builds/ |
|
|
| Dev branch | dev-iso (separate CI: build-iso-dev.yml) |
|
|
| Main branch | main (CI: build-iso.yml) — DO NOT break |
|