Question

Skip to the question if you are not interested in the story.

After the unfortunate loss of 2 disks in a 4-disk RAID 5 array, I got into some voodoo to get back as much data as I can.

The first drive to show signs of weakness was replaced, and during the rebuild (~80% through) the second drive failed on some dead sectors.

Long story short, I now have two drives (2 and 4) in a consistent state. One (3) is synced but has bad sectors in the middle. The last (1) is only partly synced because the rebuild did not finish. The disks are 1.5 TB each, for a 4.1 TB array in total.

After trying all the read-only voodoo on the 234, 124 and 1234 configurations, I have been able to recover a good portion of the important data (which is only 100 GB of the 4.1 TB total).

The next step is to try a file system (reiserfs) rebuild to see whether I can recover more data. These operations are destructive, and I have three working disk setups in different states to try.

So to the question

Is there any way to make a snapshot of any kind of the md block device and work on this without altering the md device?

Some points:

  • Performance does not matter; very slow is acceptable.

  • I have 2 * 2 TB of storage on temporary external drives that can hold the "changelog" of the snapshot.

  • I do not have enough storage to copy the whole md device somewhere else and make an LVM volume from it (nor the space to image each drive separately).

  • The snapshot does not need to last; it will be deleted after data recovery (if any).

  • Well, I think it is clear: I only need to do stuff on the read-only md and then throw the changes away.

Any ideas?

Thanks!


Solution

Use the device-mapper snapshot target. Just be warned that it will not mask I/O errors from the underlying bad disk(s) so this is best suited to good disks with corrupted filesystems.

tl;dr - Skip the following three paragraphs of my backstory.

The most recent incident I dealt with also involved a RAID5 with 4 disks but in a USB-enclosure. It was formatted with NTFS and ironically held a 640GB disk image that was recovered from a failing laptop disk using gddrescue, during which the box reported a disk failure 300GB through. I did not perform the ddrescue, so the bad laptop disk was sent in for a replacement before I was asked to help.

I arrived and had to find a way to retrieve as much of the image file as possible in the limited time that I had access to the RAID box. (It was borrowed and I was visiting from out of town.) The enclosure had a flaw where upon a power cycle it forgot about the disk failure, so the RAID probably operated out of sync for days silently corrupting the NTFS, thus ntfs-3g refused to mount it. I managed to recover 300GB and no more, however that was sufficient to recover many otherwise lost files contained in the image. (I ran testdisk, scrounge-ntfs, and ntfsundelete but I chose not to use photorec.) I ended up using testdisk to read the image file out of the NTFS, but I also tried things like using testdisk to repair the NTFS enough to make ntfs-3g cooperate, and even running chkdsk in VirtualBox which only managed to truncate the image to zero bytes.

I found it extremely valuable to try several mutually exclusive destructive methods in order to find the best solution.

The device-mapper snapshot target makes use of the dm-snapshot kernel module which performs copy-on-write on a block level. In my steps, I will operate on the failing disk /dev/failing. You will need to supply a block device large enough to store your changes that I will call /dev/cow. It is important that you do not reuse the snapshot exception store for other copy-on-write devices that you create.

 # Make it much harder to accidentally overwrite anything
 # Run on all partition sub-devices as well, if applicable
1. blockdev --setro /dev/failing

 # Create /dev/mapper/top
2. echo 0 `blockdev --getsz /dev/failing` snapshot /dev/failing /dev/cow p 4 | dmsetup create top

 # Manipulate /dev/mapper/top as you wish

 # Tear-down
3. dmsetup remove top
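
Applied to the question, the origin would be the md device rather than a single disk. The following is a minimal sketch, not a tested recovery procedure: /dev/md0 and /dev/loop0 are assumed names, and both helper functions are made up for illustration. The first builds the same table line as step 2. The second reads the fill level of the exception store from a "dmsetup status" line, which is worth watching because a snapshot whose exception store fills up becomes invalid.

```shell
#!/bin/sh
# Sketch only: /dev/md0 and /dev/loop0 in the usage notes are assumed names.

# Print a dm-snapshot table line with a persistent ("p") exception store.
# The chunk size is given in 512-byte sectors (4 matches step 2 above).
snapshot_table() {
    # $1 = origin size in sectors, $2 = origin, $3 = COW device, $4 = chunk size
    printf '0 %s snapshot %s %s p %s\n' "$1" "$2" "$3" "$4"
}

# Given one line of "dmsetup status" output for a snapshot target, print
# how full the exception store is in percent, or "unknown" if the fourth
# field is not of the form allocated/total (for example "Invalid").
cow_usage_percent() {
    echo "$1" | awk '{
        if (split($4, a, "/") != 2 || a[2] == 0) { print "unknown"; exit }
        printf "%d\n", a[1] * 100 / a[2]
    }'
}

# Usage (as root):
#   blockdev --setro /dev/md0
#   snapshot_table "$(blockdev --getsz /dev/md0)" /dev/md0 /dev/loop0 4 \
#       | dmsetup create top
#   cow_usage_percent "$(dmsetup status top)"
```

Run the destructive filesystem tools against /dev/mapper/top only; after "dmsetup remove top", a fresh exception store gives a clean start for the next attempt.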

I provide two alternatives to creating /dev/cow:

A. Using a sparse file

 # Create a sparse file
1. dd if=/dev/zero bs=1048576 count=0 seek=size_in_MB of=tempfile

 # Print name of next unused loop device
2. losetup -f

 # Associate the file with a loop device
3. losetup -f tempfile

 # Use as /dev/cow

 # Use the name from #2 here
4. losetup -d /dev/loopX

5. rm tempfile
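
To see that the dd call in step 1 really produces a sparse file, you can compare its apparent size with the space it occupies. This is a small sketch with placeholder values: the 16 MB size is arbitrary, and mktemp puts the file in the default temporary directory, whereas in practice it would live on the drive that has the free space.

```shell
#!/bin/sh
# Sketch only: the 16 MB size and the temporary location are placeholders.
size_in_MB=16
tempfile=$(mktemp)

# Same dd invocation as step 1: copies no data, only sets the file length.
dd if=/dev/zero bs=1048576 count=0 seek="$size_in_MB" of="$tempfile" 2>/dev/null

# Apparent size: what the loop device will report as its capacity.
apparent=$(($(wc -c <"$tempfile")))
# Allocated size: what the file occupies on disk right now.
allocated_kb=$(du -k "$tempfile" | cut -f1)
echo "apparent bytes: $apparent, allocated KiB: $allocated_kb"

# With util-linux losetup, steps 2 and 3 can be combined (as root):
#   loopdev=$(losetup --find --show "$tempfile")
```

The allocated size grows only as the snapshot records changed chunks, so the file can be declared as large as the free space on the external drive.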

B. Using the zram kernel module (see documentation if adapting to ramzswap or compcache!)

 # Create 4 of them - zram0-3 (you may run into a need for more than one)
1. modprobe zram num_devices=4

 # Set size
2. echo $((1048576*size_in_MB)) > /sys/block/zram0/disksize

 # Associate with a loop device (dmsetup will fail with zramX but not loopX!)
3. losetup -f
4. losetup -f /dev/zram0

 # Use as /dev/cow

 # Use the name from #3 here
5. losetup -d /dev/loopX

6. echo 1 > /sys/block/zram0/reset

In my time-limited situation I needed to copy the 300GB image somewhere but I did not have the space for it, so I compressed it (to 25GB).

If you ever need to store a compressed read-only copy of a block device for later use without creating intermediate files, I suggest using squashfs. Break up the device into 4GB chunks using (un)chunkfs (requires FUSE) and run mksquashfs on each chunk individually. That way it can be stored on FAT32 volumes, or on NTFS without high CPU usage from ntfs-3g creating large files. I recommend checksumming the resulting files, and maybe try par2 if you want to add redundancy.
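
The checksumming step could look like the sketch below. The *.sqsh naming and the helper function names are assumptions for illustration; only sha256sum from GNU coreutils is used.

```shell
#!/bin/sh
# Sketch only: the directory layout and the *.sqsh naming are assumptions.

# Write a SHA-256 manifest covering every chunk image in directory $1.
write_manifest() {
    ( cd "$1" && sha256sum -- *.sqsh > SHA256SUMS )
}

# Re-check the chunk images in directory $1 against the manifest.
# Exits non-zero if any file is missing or does not match.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c SHA256SUMS )
}

# Usage:
#   write_manifest /mnt/backup     # right after running mksquashfs
#   verify_manifest /mnt/backup    # before reassembling the device
```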

In order to reassemble the device content, you will most likely need more than the default 8 loop devices. To do this, modprobe loop max_loop=2048 or if it's compiled into your kernel then add max_loop=2048 to your kernel command line. Mount each squashfs and associate the files within to loop devices. Finally, use dmsetup to concatenate them using the linear target. (Read man dmsetup and preferably remember the -r switch, otherwise writes will be dropped instead of failing immediately.)
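
The concatenation amounts to a table with one linear line per chunk, each starting where the previous one ended. A minimal sketch, assuming each chunk file is already attached to a loop device; the function name and the device names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: device names in the usage note are placeholders.

# Read "device size_in_sectors" pairs on stdin, one per line, and print a
# device-mapper table that concatenates them in order (linear target).
linear_table() {
    start=0
    while read -r dev sectors; do
        # <start> <length> linear <device> <offset within device>
        echo "$start $sectors linear $dev 0"
        start=$((start + sectors))
    done
}

# Usage (as root):
#   for dev in /dev/loop0 /dev/loop1; do
#       echo "$dev $(blockdev --getsz "$dev")"
#   done | linear_table | dmsetup create -r joined
```

The chunks have to be listed in their original order; the result appears as /dev/mapper/joined, read-only because of -r.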

OTHER TIPS

If you have enough space on some other storage, I'd just image the drives with dd.

Licensed under: CC-BY-SA with attribution