I am pretty good about keeping backups. (I learned the hard way.) Yet, due to a variety of problems, I ended up not being backed up for about two weeks at the end of November 2002.
I was sitting in front of my PowerBook G4. The new backup media was due to arrive the next day. I casually unmounted my external FireWire disks, and moved to the living room to watch a DVD.
An hour and a half later, I plugged the external disks back into the laptop.
What do you mean, “Mac OS X doesn’t recognize volumes”?
The dawning comprehension
A trip to Mac OS 9, and I knew that both my external disks had an “error Disk First Aid cannot repair”, that “The directory was too severely damaged for DiskWarrior to repair it”, and that “Norton Utilities encountered an error #-39 and cannot continue”.
Not good.
The program that had the most useful error message was DiskWarrior. Not because I knew what a directory structure was, nor because I particularly wanted to hear “too severely damaged” that day. It was because it told me to email their support and they might be able to help me.
Alsoft Tech Support
So I did, and they told me to download a copy of Sedit.
Then we got on the phone, and they walked me through looking at various sections of my disk, trying to figure out where the missing bits of the data were.
They weren’t there. It was clear that Mike Rodgers, the Alsoft tech support guy, was doing all he could, but that my disk was beyond his ability to resurrect.
The optimism
I thanked Mike, hung up, and started thinking about my options. I could wipe the disks and restore from two-week-old backups. I would lose a few days’ worth of code (everything else was in CVS). A few things here and there. Nothing I couldn’t bear to part with.
However, there was one thing that really kept me thinking. When I was on the phone with Alsoft and we looked at the damaged part of the disk, a lot of things in that area seemed valid. It’s just that a whole lot of really important bits were completely bogus.
Hope springs eternal; I figured there was a slim chance that I could work out which few bits were trashed, and recover them.
The happy ending
I spent two days reading the HFS+ format technote. I analyzed the extent and the nature of the damage, and repaired it by hand. I got all my data back without any problems.
I emerged victorious, and came to two conclusions:
- Blame most likely falls on Mac OS X. It either caused the damage, or failed to prevent someone else from causing the damage; either way, the operating system should not do this to my disk.
- This, therefore, may happen to other people.
And hence I present you with…
The super-scary instructions for fixing your trashed volume headers
I cannot emphasize this enough: if you do something wrong while following these instructions, you can easily trash your data beyond repair. Even beyond my ability to repair it. If you don’t know what you are doing, stop here.
The tools
You will need the following tools:
- Any hexadecimal calculator. I used MacsBug, but that’s really heavy-weight. Find one that you like.
- Sedit
All numbers below here are hexadecimal, unless noted otherwise.
Finding your HFS+ volume headers
The volume header is the part of your disk which tells Mac OS all sorts of valuable information about the location of the data on the disk. The first step is to find the volume headers to determine whether they are damaged in a way that can be repaired with what I learned.
Start up your computer in Mac OS 9 with the damaged FireWire disk unplugged. Launch Sedit and plug in the disk. From the File menu in Sedit, choose Open Disk Thru Driver. Sedit will present you with a list of disks, one of which will be listed with either the name of your damaged disk or “Macintosh HD”. It is likely to be the last disk in the list. Choose that disk from the list and click OK.
An Sedit window will open showing you a dump of the data on your disk.
From the Block menu, choose Read Block Number, enter the number 2 in the dialog that appears, and click OK.
From the Template menu, choose HFS MDB. The data dump in the window will be labelled and rearranged. Write down the values of the following fields: Sig, ABlkSz, AlBlSt, VCSize, VCBM, CtlCSz.
If Sig has a value other than 4244, or VCSize has a value other than 482B, check that “Displaying Block” in the top left corner of the window claims you are looking at block number 2. If it does, stop; these instructions are no good to you. If it doesn’t, start over; you are looking at the wrong block.
Next, calculate the following values:
- HFS+ extent start: extentStart = VCBM
- Number of allocation blocks: allocationBlocks = CtlCSz
- Allocation block size: allocationSize = ABlkSz
- Allocation start: wrapperOffset = AlBlSt
- HFS+ block ratio: blockRatio = allocationSize / 200
- HFS+ volume offset: volumeOffset = blockRatio * extentStart + wrapperOffset
- HFS+ volume header: volumeHeader = volumeOffset + 2
- HFS+ backup header: backupHeader = allocationBlocks * blockRatio + volumeHeader - 4
For example, I got the following values for one of my damaged volumes:
extentStart = 5
allocationBlocks = FD36
allocationSize = 71000
wrapperOffset = 18
blockRatio = 71000 / 200 = 388
volumeOffset = 388 * 5 + 18 = 11C0
volumeHeader = 11C0 + 2 = 11C2
backupHeader = FD36 * 388 + 11C2 - 4 = 37E386E
and these for the other:
extentStart = 5
allocationBlocks = FF83
allocationSize = 1CD000
wrapperOffset = 18
blockRatio = 1CD000 / 200 = E68
volumeOffset = E68 * 5 + 18 = 4820
volumeHeader = 4820 + 2 = 4822
backupHeader = FF83 * E68 + 4822 - 4 = E613F56
The last two values you calculated give you the location of your volume header and backup volume header. Next we will look at them and try to figure out whether they can be fixed using this procedure.
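If you would rather not do the hexadecimal arithmetic by hand, the same calculation can be sketched in a few lines of Python. This is only an illustration of the formulas above, not a tool I used; the function name and argument order are my own invention, and the inputs are the values you read from the HFS MDB template.

```python
SECTOR_SIZE = 0x200  # 512-byte disk blocks; this is the "200" in blockRatio


def locate_volume_headers(VCBM, CtlCSz, ABlkSz, AlBlSt):
    """Return (volumeHeader, backupHeader) as 512-byte block numbers.

    Hypothetical helper: arguments are the MDB fields read in Sedit.
    """
    extent_start = VCBM
    allocation_blocks = CtlCSz
    allocation_size = ABlkSz
    wrapper_offset = AlBlSt

    block_ratio = allocation_size // SECTOR_SIZE
    volume_offset = block_ratio * extent_start + wrapper_offset
    volume_header = volume_offset + 2
    backup_header = allocation_blocks * block_ratio + volume_header - 4
    return volume_header, backup_header


# Values from the first damaged disk in the example above.
vh, bh = locate_volume_headers(VCBM=0x5, CtlCSz=0xFD36, ABlkSz=0x71000, AlBlSt=0x18)
print(f"{vh:X} {bh:X}")  # 11C2 37E386E
```

Writing the inputs as 0x... literals and printing with the X format keeps everything in hexadecimal, so the results can be typed straight into Sedit.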
Sanity-checking the HFS+ volume headers
Go to the volume header, using the Read Block Number command in Sedit, and giving it volumeHeader as calculated above. Select HFS+ Volume Header from the Template menu and write down the following values: Sig, Vers, Attrib, LastVer, BlkSiz, TotBlk, BitBlk, Bit1.
Go to the backup volume header, using the Read Block Number command in Sedit, and giving it backupHeader as calculated above. Write down the following values: Sig, Vers, Attrib, BlkSiz, TotBlk, BitBlk, Bit1.
Compare the corresponding values from your volume header and the backup volume header. If they are not the same, stop now. These instructions do not apply to you. If they are the same, that’s a good sign.
Next, look at the Bit1 value you have for the volume header and the backup volume header. Its first half should be 00000001. Its second half should be equal to the value of BitBlk. If either of these is not true, stop now. These instructions do not apply to you.
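Finally, look at the value of LstVer (written down as LastVer above), which records the version of the system that last mounted the volume. It should be 31302E30, which is ASCII for “10.0”. If it is not, stop. These instructions do not apply to you.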
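The three checks in this section can be summarized as a short Python sketch. It assumes you have typed the values from Sedit into two dictionaries keyed by field name; the function name and the dictionary layout are hypothetical, and the damaged field values in the usage example are placeholders, not real data from my disks.

```python
def headers_look_repairable(primary, backup):
    """Apply the sanity checks above to field values copied from Sedit.

    primary, backup: dicts mapping a field name to an integer value.
    Bit1 is treated as a single 64-bit value (two 32-bit halves).
    """
    # 1. The primary and backup headers must agree on these fields.
    shared = ["Sig", "Vers", "Attrib", "BlkSiz", "TotBlk", "BitBlk", "Bit1"]
    if any(primary[name] != backup[name] for name in shared):
        return False

    # 2. Bit1: first half must be 00000001, second half must equal BitBlk.
    first_half = primary["Bit1"] >> 32
    second_half = primary["Bit1"] & 0xFFFFFFFF
    if first_half != 0x00000001 or second_half != primary["BitBlk"]:
        return False

    # 3. The last-mounted version must be 31302E30 ("10.0").
    return primary["LstVer"] == 0x31302E30


# Placeholder values for the damaged fields; BitBlk and Bit1 follow the rule above.
primary = {"Sig": 0, "Vers": 0, "Attrib": 0x1234, "LstVer": 0x31302E30,
           "BlkSiz": 0, "TotBlk": 0, "BitBlk": 0xE0, "Bit1": 0x00000001000000E0}
backup = dict(primary)
print(headers_look_repairable(primary, backup))  # True
```

A result of False means the same thing as "stop" in the text: the damage is of a different kind, and the rest of the procedure should not be applied.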
Calculating the correct HFS+ volume size and block size
If you are still with me, then you are ready to calculate the correct values for the damaged fields. First, some intermediate values:
blockSizeSquare = allocationBlocks * blockRatio * 40 / BitBlk
Now, round blockSizeSquare up to a power of 4: if the first non-zero digit of blockSizeSquare is 1, 2, or 3, replace it with 4; if it’s 4 or higher, replace it with 10; replace all other digits with zero.
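For those who prefer to check the rounding mechanically, here is a minimal Python sketch that applies the digit rule exactly as stated, working on the hexadecimal digits of the number. The function name is made up for illustration. The usage example first computes the blockSizeSquare estimate for my first disk (BitBlk = E0) and then rounds it.

```python
def round_up_by_digit_rule(value):
    """Round a positive integer up to a power of 4 using the hex-digit rule."""
    digits = format(value, "X")          # hex digits, no leading zeros
    first = int(digits[0], 16)           # first non-zero digit
    zeros = "0" * (len(digits) - 1)      # every other digit becomes zero
    lead = "4" if first <= 3 else "10"   # 1, 2, 3 -> 4; 4 or higher -> 10
    return int(lead + zeros, 16)


# blockSizeSquare = allocationBlocks * blockRatio * 40 / BitBlk (first disk)
estimate = 0xFD36 * 0x388 * 0x40 // 0xE0
rounded = round_up_by_digit_rule(estimate)
print(f"{estimate:X} {rounded:X}")  # FF78C4 1000000
```

Note that the rule always rounds upward, so it is meant for an estimate that falls short of a power of 4, as the one here does.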
Next, take the square root of blockSizeSquare to calculate the HFS+ block size. If your hexadecimal calculator has a square root function, use it. Otherwise, follow this algorithm:
- If blockSizeSquare has an even number of zeros after the first non-zero digit, replace the non-zero digit with its square root, and remove half of the zeros that come after it.
- If it has an odd number of zeros and starts with 10, replace the 10 with 4, and remove one half of the remaining zeros.
- If it has an odd number of zeros and starts with 40, replace the 40 with 8, and remove one half of the remaining zeros.
The number you thus get is the blockSize. Finally, calculate totalBlocks = allocationSize / blockSize * allocationBlocks.
Again, here are the examples of my two disks. The first:
- BitBlk = E0
- blockSizeSquare = FD36 * 388 * 40 / E0 = FF78C4
- blockSizeSquare rounded up to a power of 4 = 1000000
- blockSize = 1000
- totalBlocks = 6FC4D6
And the second:
- BitBlk = 399
- blockSizeSquare = FF83 * E68 * 40 / 399 = FFCA05
- blockSizeSquare rounded up to a power of 4 = 1000000
- blockSize = 1000
- totalBlocks = 1CC1EE7
Note that I got the same block size for both; this is expected. You should expect to get 1000 as well. If you don’t, it is much more likely that you did some math wrong than anything else.
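The last two steps, the square root and the totalBlocks formula, can be double-checked with a small Python sketch. It assumes you already have blockSizeSquare rounded up to a power of 4; the function name is hypothetical, and math.isqrt requires Python 3.8 or later.

```python
import math


def block_size_and_total(rounded_square, allocation_size, allocation_blocks):
    """Return (blockSize, totalBlocks) from the rounded blockSizeSquare."""
    block_size = math.isqrt(rounded_square)
    # A correctly rounded value is a power of 4, hence a perfect square.
    if block_size * block_size != rounded_square:
        raise ValueError("not a perfect square; recheck the rounding step")
    total_blocks = allocation_size // block_size * allocation_blocks
    return block_size, total_blocks


# First example disk: allocationSize = 71000, allocationBlocks = FD36.
size, total = block_size_and_total(0x1000000, 0x71000, 0xFD36)
print(f"{size:X} {total:X}")  # 1000 6FC4D6
```

The explicit perfect-square check is there because a slip in the rounding step is the most likely source of a wrong block size.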
Repairing the volume headers
That’s it. Now we need to assemble all the repaired values and write them to disk. They are:
Sig = 482B
Vers = 0004
Attrib = 00000000
BlkSiz = blockSize, as calculated
TotBlk = totalBlocks, as calculated
To be extra safe, you should save the old (bad) values in your volume headers before you overwrite them. Use the following procedure:
- Choose Save Blocks to File, and save 1 block starting at volumeHeader
- Choose Save Blocks to File, and save 1 block starting at backupHeader
- Go to the volume header, using Read Block Number and volumeHeader
- Edit the five values you calculated
- Choose Write Block to write the volume header
- Choose Write to Block Number using backupHeader, to write the backup volume header
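To make it concrete what editing those five values changes, here is a hedged Python sketch that patches an in-memory copy of a 512-byte header block, such as one saved with Save Blocks to File. It is not part of the procedure and does not touch the disk. The byte offsets (0 for Sig, 2 for Vers, 4 for Attrib, 40 for BlkSiz, 44 for TotBlk) are my reading of the volume header layout in the HFS+ format technote, so treat them as an assumption; in Sedit the template handles this for you. The function name is made up.

```python
import struct


def patch_volume_header(block, block_size, total_blocks):
    """Return a copy of a 512-byte header block with the five fields repaired.

    Offsets are assumed from the HFS+ volume header layout (big-endian).
    """
    if len(block) != 512:
        raise ValueError("expected exactly one 512-byte block")
    patched = bytearray(block)
    # Sig (2 bytes), Vers (2 bytes), Attrib (4 bytes) at the start of the block.
    struct.pack_into(">HHI", patched, 0, 0x482B, 0x0004, 0x00000000)
    # BlkSiz and TotBlk (4 bytes each) at offset 40.
    struct.pack_into(">II", patched, 40, block_size, total_blocks)
    return bytes(patched)


# A blank block stands in for a saved header; values are from my first disk.
fixed = patch_volume_header(bytes(512), 0x1000, 0x6FC4D6)
print(fixed[:8].hex().upper())  # 482B000400000000
```

Every other byte of the block is left exactly as it was, which matters because the remaining fields were the ones that passed the sanity checks.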
You are almost done. Your disk is now in the state where DiskWarrior can repair it. You must run DiskWarrior now to complete the repair. Enjoy.