Saturday, 28 September 2013

Why is it so difficult to boot .ISO files from a USB drive?

I was asked this question recently - Good question!  There is no 1 minute answer, but here was my reply (which may also help to explain how grub4dos works)...

Booting from an ISO is all about disk (Hard disk or Compact Disk) access. We read sectors of data from a disk into memory and then this data (code) is then run by the CPU.

The BIOS provides a Real Mode interface to read and write to sectors of a disk - the interface mechanism is via something called Interrupt 13h. For instance, this code is used to ask the BIOS to read 3 sectors (512 bytes per sector) into a specific memory address:

mov ah,2             ;move the value of 2 into register AH in the CPU, 2 means 'read sectors' in this case
mov al,3              ; 3=3 sectors
mov ch,10          ; track 10
mov cl,1             ; sector 1
mov dh,2            ; head 2
mov dl,80h         ;drive number 80h
mov bx,2000h     ;memory area to put data = 2000h
Int 13h               ; call the BIOS by generating an Interrupt 13h - the BIOS will read the sectors and put the data into memory at address 2000h

Note that Int13h originally (in IBM PC days!) only supported floppy disks (device 00h, 01h, 02h, 03h). Later, when hard disks (Winchesters!) were invented, it supported those (80h=hdd0, 81h=hdd1, etc. up to 9fh). Then later still, CDs were invented and people wanted to boot from them, so special boot code was added to BIOSes to allow device A0h and above to access CDs (but only if you asked the BIOS to boot from them - that is why if you boot to DOS from a floppy disk, it cannot access a CD without a CD driver being loaded).
Later still, USB drives were invented. If we boot from a USB hard disk, a BIOS will map a USB drive as device 80h = hdd0. The internal hard disk will be moved by the BIOS to respond to device 81h requests instead of the usual 80h id.

The most important thing to understand about booting Windows XP/Vista/7/8/WinPE and linux is that it boots in two (or more) stages:

1. Real mode - access to the source files on a disk is done using BIOS calls (Int 13h). DOS uses only this mode and so does Win95/98.
2. Protected mode - the operating system loads it's own driver code which accesses the disk controllers directly - the BIOS is not used because the BIOS only works in Real Mode and not protected mode. So linux/WindowsXP will load a 'kernel' and some mass storage drivers in Real mode (using BIOS calls) and then switch to protected mode. From this point onwards, the protected-mode drivers inside the OS access the hard disk and CD drives, etc. directly - if the wrong driver was loaded then the OS will not be able to access the hard disk that it booted from and you get a 'panic' or BSOD (usually 0x0000007b).

So now let's consider a linux OS booting from a 'flat' filesystem (i.e. not an ISO) - e.g. this could be a hard disk boot

1. Stage 1 - BIOS tells OS boot code that boot device is 80h - boot code loads the OS kernel code using BIOS calls to Int 13h (disk read/write interface to BIOS - hard disk 0 is device 80h)
2. Stage 2 - kernel switches to protected mode  - mount volumes
3. Stage 3 - directly access disk to load rest of the filesystem - more drivers, etc.

Stage 3 needs to have the correct drivers to access the hard disk (SATA/RAID/SCSI/IDE, etc.) and understand the drive filesystem (FAT16/FAT32/NTFS/ext2/ext3/ext4 etc.).

Now let's consider booting the same linux OS, but this time from a CD - this time we have some pre-boot things going on which the BIOS does for us to trick anything using Int 13h into thinking we are booting from a funny sort of hard disk:

pre-boot 1 - BIOS is asked to boot from the CD by user (e.g. F12 pressed or BIOS boot order preset)
pre-boot 2 - BIOS configures Int 13h so that any request made to device A0h reads sectors from the CD in the CD drive
1. Stage 1 - BIOS tells OS boot code that boot device is A0h - then boot code loads the kernel code using BIOS calls to Int 13h (disk read/write interface to BIOS requesting device=A0h)
2. Stage 2 - kernel switches to protected mode, accesses storage devices via drivers, mounts volumes, etc.
3. Stage 3 - kernel  loads rest of the files from the CD - more drivers, etc.

So you see that in Stage 3, the OS need to have drivers which can access CD controllers directly (which could be IDE or SATA devices) AND needs to understand how to read the filesystem on that CD/DVD.

Now consider what would happen if we have a USB drive that contains an ISO file.

BIOSes cannot 'mount' an ISO file or access an ISO file or know anything about ISO files, so we have to use a special boot loader which can 'mount' the ISO and load the files inside it into memory and then run the code in memory.
In this case we will use grub4dos.
grub4dos is a Real Mode boot loader. One thing it can do is 'map' the ISO file and add some code into the Int 13h BIOS code (patch the BIOS interrupt vector). For instance, we can tell grub4dos to map an ISO file as device A0h (any device between a0h and ffh is a 'CD/DVD' device). This means that if we then tell some OS boot code that it booted from device A0h, then the OS boot code will read from device A0h and boot just as if it was booting from a CD. We can do this in grub4dos with these commands  (comments in brackets):

title Boot from linux ISO     (this is the menu text)
map /linux.iso (0xA0)        (treat any access to device A0h as an access to this file - e.g. if boot code asks for sector 1, send back the first 512 bytes of the linux.iso file)
map --hook                      (patch the Int13h BIOS code so that this 'map' will take affect)
chainloader (0xa0h)         (load the first sectors of device A0h (i.e. the linux.iso file) into memory, set the boot device number to A0h and jump to that boot code)

As far as the linux boot code is aware, it is booting from device A0h  (i.e. a CD) and it will make calls to the BIOS Int13h to access device A0h and read any of the files it wants to from the 'CD'.

However, once the Stage 1 code of linux loads, it loads the linux kernel and drivers, etc. and then switches to protected mode. Now the protected mode drivers look at all the disk controllers and CD controllers in the system directly by accessing their I/O registers. It will look for all the storage devices it can find and 'mount' them as volumes (e.g. sda1, sda2, etc.). So now we have a problem, because we did not actually boot from a CD - we tricked the linux boot code - we actually booted from an ISO file on a USB drive called linux.iso - but there is no way that the linux boot code knows that - it needs to access more files from a mounted volume, but the only devices it has been able to mount as volumes are a USB drive and an internal HDD. Neither of these have the files on them it is looking for. So now the kernel 'panics' and it stops booting (typically we get a 'cannot find squashfs' message).

I hope this is clear enough for you so far because now it gets even more complicated! As you see, we can start to boot protected-mode OS's from an ISO using grub4dos, but as soon as it switches to protected mode we are LOST because it cannot continue booting as it cannot find the rest of the boot files on any volume that it can access! What we need to do is tell the OS the name of the ISO, so that it can mount the ISO as a volume and then load the rest of the OS files inside the ISO. With linux this is often done using specific 'cheat codes' or 'kernel parameters'.

title LinuxMint 8
map /boot/LinuxMint-8.iso (0xa0)                 - map ISO as device A0h
map --hook                                                - hook the Int 13h interrupt so it takes affect
root (0a0h)                                                 - set root device as A0h - now if we say /casper we are referring to (0xa0)/casper
kernel /casper/vmlinuz iso-scan/filename=/boot/LinuxMint-8.iso file=/cdrom/preseed/mint.seed boot=casper                                                                     - load vmlinuz into memory - add cheatcodes into memory
initrd /casper/initrd.lz                                   - load the initrd.lz file as the initial ramdrive
boot                                                           - jump to the loaded code to start the linux OS  (note 'boot' is not needed as grub4dos automatically adds boot as the last command anyway)

In this way, when linux switches to protected-mode, it can look for the named ISO file on any mounted volume, mounts it as a CD filesystem and then can access the files on the newly mounted volume (the files 'inside' the ISO).

This 'mechanism' is non-standard though and different linux's have different cheat codes (or simply don't support this mechanism at all). Some use the UUID of the drive as a 'marker' so they can find the correct device that has the ISO file. That is the downside of having 100's of separately developed linux distributions - they all  use different cheat codes (kernel parameters) and many do not even support booting via an ISO file!

In Easy2Boot, I use a special feature in grub4dos (the partnew command) which actually creates a new partition table entry and maps the start of that partition to the start of the ISO file - so when linux switches to protected mode, it sees a valid disk partition table entry, mounts it as a cdfs filesystem volume and then finds the files on the volume - so there is no need for any special cheat codes. This works for 99% of linux ISOs really well! For more details see here.

With Windows Install ISOs it gets more complicated...

For XP-based OS's, we typically load a protected-mode mass storage driver like WinVBlock, ImDisk or FiraDisk in Stage 1 text-mode setup and then pass on to the driver, the name of the ISO. Then in Stage 2 (protected-mode) the driver runs and finds the ISO file (either on the USB drive or in memory) and mounts it as a virtual CD/DVD.

For Vista+ OS's, Easy2Boot uses the fact that Windows PE will look for an AutoUnattend.xml file on a 'Removable' device, such as a USB Flash drive, when it boots - so E2B adds a command inside the xml file to cause Windows PE to load the Windows ISO as a virtual DVD drive. In this way, Windows Setup 'sees' a virtual DVD drive and does not complain about a 'missing device driver for a CD'.

If you read some of my tutorials in my site, you will get an idea of the various tricks that can be used to get over this problem of getting an OS to mount an ISO as a volume once it gets to protected mode.

If it was easy, everyone would have done it years ago!

linux isoboot cheat codes

Here is a list of the many different cheat codes used by different versions of linux (where %ISOSCAN% is the path of the ISO file and %UUID% is the UUID of the partition):

findiso=%ISOSCAN%   (may also need bootid=<volname>  boot=live and live-media-path=/xxx/yyy)

If you found this blog post useful, please tick one of the Reactions tick boxes below to let me know.