I was getting these errors from one of my new hard disks:
Feb 3 00:16:07 orac kernel: [78407.504324] ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Feb 3 00:16:07 orac kernel: [78407.504610] ata3.01: BMDMA stat 0x64
Feb 3 00:16:07 orac kernel: [78407.504881] ata3.01: failed command: READ DMA
Feb 3 00:16:07 orac kernel: [78407.505162] ata3.01: cmd c8/00:08:98:0f:c1/00:00:00:00:00/f0 tag 0 dma 4096 in
Feb 3 00:16:07 orac kernel: [78407.505163] res 51/40:08:98:0f:c1/00:00:00:00:00/f0 Emask 0x9 (media error)
Feb 3 00:16:07 orac kernel: [78407.505722] ata3.01: status: { DRDY ERR }
Feb 3 00:16:07 orac kernel: [78407.506002] ata3.01: error: { UNC }
Feb 3 00:16:08 orac kernel: [78407.781740] ata3.00: configured for UDMA/133
Feb 3 00:16:08 orac kernel: [78407.801565] ata3.01: configured for UDMA/133
Feb 3 00:16:08 orac kernel: [78407.801578] ata3: EH complete
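The cmd and res fields in those messages can be decoded by hand. What follows is only a rough sketch: it assumes the libata field order command/feature:nsect:lbal:lbam:lbah, then five "hob" bytes, then the device byte, and it assumes a 28-bit LBA command such as READ DMA (0xc8). The function and table names are made up for illustration and the bit tables are deliberately incomplete.

```python
# Sketch: decode a libata "cmd"/"res" taskfile string (assumed field layout).
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def parse_taskfile(text):
    """Split 'c8/00:08:98:0f:c1/00:00:00:00:00/f0' into named byte values."""
    first, low, _hob, device = text.split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    return {
        "command": int(first, 16),  # holds the status register in a "res" line
        "feature": feature,         # holds the error register in a "res" line
        "nsect": nsect, "lbal": lbal, "lbam": lbam, "lbah": lbah,
        "device": int(device, 16),
    }


def lba28(tf):
    """28-bit LBA: the low nibble of the device byte supplies bits 27:24."""
    return ((tf["device"] & 0x0F) << 24) | (tf["lbah"] << 16) | (tf["lbam"] << 8) | tf["lbal"]


def flags(value, names):
    """List the names of the bits that are set in value."""
    return [name for bit, name in names.items() if value & bit]


cmd = parse_taskfile("c8/00:08:98:0f:c1/00:00:00:00:00/f0")
res = parse_taskfile("51/40:08:98:0f:c1/00:00:00:00:00/f0")
print("LBA:", lba28(res))
print("sectors requested:", cmd["nsect"], "=", cmd["nsect"] * 512, "bytes")
print("status:", flags(res["command"], STATUS_BITS))
print("error:", flags(res["feature"], ERROR_BITS))
```

With the values from the log this reports LBA 12652440, an 8-sector (4096-byte) read, status DRDY and ERR, and error UNC, which agrees with the "dma 4096 in", "{ DRDY ERR }" and "{ UNC }" text the kernel printed.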
So I searched for a solution. I found the thread “[ubuntu] Hard Drive Error : ata3.00: status: { DRDY ERR }”, and in there hobong says:
It’s Kernel Bug on ata ACPI. I put “options libata noacpi=1” on /etc/modprobe.d/options and the ERROR is gone.
This is supplemented by a later comment from thatmattbone:
I think in 9.10, any file ending in “.conf” in /etc/modprobe.d is parsed. I created a new file, /etc/modprobe.d/options.conf and put the “options libata noacpi=1” in there.
So I created /etc/modprobe.d/options.conf with the content “options libata noacpi=1” and then I rebooted.
Upon reboot the disk was recognised as containing errors and fsck was forced. I had the opportunity to cancel, but I let it run. While it was running, a whole heap of the same errors came through. I’m not sure whether that was because the /etc/modprobe.d/options.conf file hadn’t done the trick, or because it was too early in the boot process and /etc/modprobe.d/options.conf hadn’t been processed yet.
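One way to narrow that down is to ask the running kernel what value the option ended up with. The sketch below assumes the parameter is exposed at /sys/module/libata/parameters/noacpi, which may not be the case on every kernel. The function name is hypothetical.

```python
from pathlib import Path


def libata_noacpi(param_path="/sys/module/libata/parameters/noacpi"):
    """Return the current noacpi value as a string, or None if it cannot be read."""
    try:
        return Path(param_path).read_text().strip()
    except OSError:
        # Missing file, unreadable file, or no sysfs at all (assumed path).
        return None


if __name__ == "__main__":
    value = libata_noacpi()
    if value is None:
        print("libata noacpi parameter not readable on this system")
    else:
        print("libata noacpi =", value)
```

If this still shows 0 after the reboot, the options line was not applied. One possible reason is that libata is built into the kernel rather than loaded as a module. In that case files in /etc/modprobe.d are not consulted for it, and the setting would have to go on the kernel command line as libata.noacpi=1.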
Anyway, I needed to try and fix this problem, so I ran lshw -C disk to see what I could see and found the following:
root@orac:~# lshw -C disk
*-disk:0
description: ATA Disk
product: ST32000644NS
vendor: Seagate
physical id: 0.0.0
bus info: scsi@2:0.0.0
logical name: /dev/sda
version: SN12
serial: 9WM67R7A
size: 1863GiB (2TB)
capabilities: gpt-1.00 partitioned partitioned:gpt
configuration: ansiversion=5 guid=9302d195-5ffc-41f2-949f-2899017a4dc0
*-disk:1
description: ATA Disk
product: SAMSUNG HD204UI
physical id: 0.1.0
bus info: scsi@2:0.1.0
logical name: /dev/sdb
version: 1AQ1
serial: S2K4J1CBA13712
size: 1863GiB (2TB)
capabilities: partitioned partitioned:dos
configuration: ansiversion=5 signature=91cd6331
As you can see, my new disk, sdb, was reported with different capabilities (partitioned:dos) than my old disk (gpt-1.00, partitioned:gpt), and my old disk seemed to be working fine, so I figured I’d have a look into that.
Turns out that fdisk creates MBR partition tables (the “dos” type that lshw reported), but there’s a newer scheme known as GUID Partition Table or just GPT.
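The difference between the two schemes is visible in the first two sectors of the disk. The sketch below is a much-simplified take on a partition table scan, assuming 512-byte logical sectors: it looks for the 0x55AA boot signature at the end of sector 0, a type-0xEE (protective) entry in the MBR, and the "EFI PART" signature at the start of sector 1. The function name is hypothetical, and a real tool such as gdisk also validates checksums and the backup header, which this does not.

```python
import io

SECTOR = 512  # assumed logical sector size


def scan_partition_scheme(dev):
    """Classify the first two sectors of a binary file object."""
    mbr = dev.read(SECTOR)
    gpt_header = dev.read(SECTOR)
    has_mbr = len(mbr) == SECTOR and mbr[510:512] == b"\x55\xaa"
    # The four 16-byte MBR entries start at offset 446; byte 4 of each is the type.
    types = [mbr[446 + 16 * i + 4] for i in range(4)] if has_mbr else []
    has_gpt = gpt_header[:8] == b"EFI PART"
    if has_gpt and 0xEE in types:
        return "GPT with protective MBR"
    if has_gpt:
        return "GPT without protective MBR"
    if has_mbr:
        return "MBR only"
    return "no partition table found"


# Demo on an in-memory image: an MBR with one type-0x83 entry and no GPT header.
image = bytearray(2 * SECTOR)
image[446 + 4] = 0x83
image[510:512] = b"\x55\xaa"
print(scan_partition_scheme(io.BytesIO(bytes(image))))
```

To try it on a real disk, pass a file opened with open("/dev/sdb", "rb"), which needs root. The function only reads.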
There are tools for working with GPT partition tables on Linux, notably GPT fdisk which comes with the command-line tool gdisk. The gdisk utility wasn’t available on my system, but I was able to install it with apt-get:
root@orac:~# apt-get install gdisk
Then I ran gdisk on my broken disk and it reported MBR only:
root@orac:~# gdisk /dev/sdb
GPT fdisk (gdisk) version 0.5.1
Partition table scan:
MBR: MBR only
BSD: not present
APM: not present
GPT: not present
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format.
THIS OPERATON IS POTENTIALLY DESTRUCTIVE! Exit by typing 'q' if
you don't want to convert your MBR partitions to GPT format!
***************************************************************
Warning! Secondary partition table overlaps the last partition by 33 blocks
You will need to delete this partition or resize it in another utility.
Command (? for help): q
You will also notice that last warning, about the secondary partition table overlapping the last partition by 33 blocks. Maybe that was related to the errors I was getting? I doubt it, but who knows.
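The figure of 33 blocks is what you would expect if the old MBR partition ran right up to the last sector of the disk. GPT keeps a backup copy of its table at the end of the disk, and with 128 entries of 128 bytes each that backup needs 32 sectors plus one for the header. The sketch below works that out, assuming 512-byte sectors and using the sector count that gdisk reports for this disk further down. That the old partition ended on the very last sector is an assumption.

```python
SECTOR_BYTES = 512   # assumed logical sector size
ENTRIES = 128        # "Partition table holds up to 128 entries"
ENTRY_BYTES = 128    # standard GPT entry size


def gpt_usable_range(total_sectors):
    """Return (first_usable, last_usable) sector numbers for a standard GPT."""
    table_sectors = ENTRIES * ENTRY_BYTES // SECTOR_BYTES   # 32 sectors
    first_usable = 1 + 1 + table_sectors    # protective MBR + header + entries
    # The backup entries and backup header occupy the final 33 sectors.
    last_usable = (total_sectors - 1) - (table_sectors + 1)
    return first_usable, last_usable


total = 3907029168   # sectors on /dev/sdb according to gdisk
first, last = gpt_usable_range(total)
print("usable sectors:", first, "to", last)

# Assumption: the old MBR partition ended on the very last sector (total - 1).
print("overlap:", (total - 1) - last, "sectors")
```

This gives a usable range of 34 to 3907029134, matching what gdisk prints, and an overlap of 33 sectors.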
Anyway, I decided to put a new GPT partition table on my new disk and reformat the whole thing in the hope that I could get it to work.
I ran gdisk on my good disk to see what types of partitions it had:
root@orac:~# gdisk /dev/sda
GPT fdisk (gdisk) version 0.5.1
Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present
Found valid GPT with protective MBR; using GPT.
Command (? for help): ?
b back up GPT data to a file
c change a partition's name
d delete a partition
i show detailed information on a partition
l list known partition types
n add a new partition
o create a new empty GUID partition table (GPT)
p print the partition table
q quit without saving changes
r recovery and transformation options (experts only)
s sort partitions
t change a partition's type code
v verify disk
w write table to disk and exit
x extra functionality (experts only)
? print this menu
Command (? for help): p
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Disk identifier (GUID): 9302D195-5FFC-41F2-949F-2899017A4DC0
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Total free space is 1756 sectors (878.0 KiB)
Number  Start (sector)  End (sector)  Size     Code  Name
1       34              3891402377    1.8 TiB  EF00
2       3891402378      3907027378    7.5 GiB  8200
Command (? for help): q
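The sizes and the free space in that listing can be checked from the sector numbers alone. This is just arithmetic on the figures gdisk printed, assuming 512-byte sectors and treating both the start and end sector as inclusive.

```python
SECTOR_BYTES = 512  # assumed logical sector size

# (start, end) sectors copied from the gdisk listing for /dev/sda
partitions = {1: (34, 3891402377), 2: (3891402378, 3907027378)}
last_usable = 3907029134


def size_bytes(start, end):
    """Size of a partition whose start and end sectors are both inclusive."""
    return (end - start + 1) * SECTOR_BYTES


for number, (start, end) in partitions.items():
    print(number, round(size_bytes(start, end) / 2**30, 2), "GiB")

# The partitions are contiguous from sector 34, so the only gap is at the end.
free_sectors = last_usable - partitions[2][1]
print("free:", free_sectors, "sectors =", free_sectors * SECTOR_BYTES / 1024, "KiB")
```

The free space comes out at 1756 sectors, or 878.0 KiB, the same as gdisk reports.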
Note that the first partition was using code EF00. The following table of type codes (the output of gdisk’s l command) shows that EF00 is “EFI System”, but I’m not sure what that means.
0700 Linux/Windows data    0c01 Microsoft Reserved    2700 Windows RE
4200 Windows LDM data      4201 Windows LDM metadat   8200 Linux swap
8301 Linux Reserved        8e00 Linux LVM             a500 FreeBSD disklabel
a501 FreeBSD boot          a502 FreeBSD swap          a503 FreeBSD UFS
a504 FreeBSD ZFS           a505 FreeBSD Vinum/RAID    a800 Apple UFS
a901 NetBSD swap           a902 NetBSD FFS            a903 NetBSD LFS
a903 NetBSD RAID           a904 NetBSD concatenated   a905 NetBSD encrypted
ab00 Apple boot            af00 Apple HFS/HFS+        af01 Apple RAID
af02 Apple RAID offline    af03 Apple label           af04 AppleTV recovery
be00 Solaris boot          bf00 Solaris root          bf01 Solaris /usr & Mac
bf02 Solaris swap          bf03 Solaris backup        bf04 Solaris /var
bf05 Solaris /home         bf05 Solaris EFI_ALTSCTR   bf06 Solaris Reserved 1
bf07 Solaris Reserved 2    bf08 Solaris Reserved 3    bf09 Solaris Reserved 4
bf0a Solaris Reserved 5    c001 HP-UX data            c002 HP-UX service
ef00 EFI System            ef01 MBR partition schem   ef02 BIOS boot partition
fd00 Linux RAID
In any event I decided that I would create my new partition as an EFI System too. So I did that:
root@orac:~# gdisk /dev/sdb
GPT fdisk (gdisk) version 0.5.1
Partition table scan:
MBR: MBR only
BSD: not present
APM: not present
GPT: not present
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format.
THIS OPERATON IS POTENTIALLY DESTRUCTIVE! Exit by typing 'q' if
you don't want to convert your MBR partitions to GPT format!
***************************************************************
Warning! Secondary partition table overlaps the last partition by 33 blocks
You will need to delete this partition or resize it in another utility.
Command (? for help): ?
b back up GPT data to a file
c change a partition's name
d delete a partition
i show detailed information on a partition
l list known partition types
n add a new partition
o create a new empty GUID partition table (GPT)
p print the partition table
q quit without saving changes
r recovery and transformation options (experts only)
s sort partitions
t change a partition's type code
v verify disk
w write table to disk and exit
x extra functionality (experts only)
? print this menu
Command (? for help): o
This option deletes all partitions and creates a new protective MBR.
Proceed? (Y/N): y
Command (? for help): n
Partition number (1-128, default 1):
First sector (34-3907029134, default = 34) or {+-}size{KMGT}:
Last sector (34-3907029134, default = 3907029134) or {+-}size{KMGT}:
Current type is 'Unused entry'
Hex code (L to show codes, 0 to enter raw code): EF00
Changed system type of partition to 'EFI System'
Command (? for help): l
0700 Linux/Windows data    0c01 Microsoft Reserved    2700 Windows RE
4200 Windows LDM data      4201 Windows LDM metadat   8200 Linux swap
8301 Linux Reserved        8e00 Linux LVM             a500 FreeBSD disklabel
a501 FreeBSD boot          a502 FreeBSD swap          a503 FreeBSD UFS
a504 FreeBSD ZFS           a505 FreeBSD Vinum/RAID    a800 Apple UFS
a901 NetBSD swap           a902 NetBSD FFS            a903 NetBSD LFS
a903 NetBSD RAID           a904 NetBSD concatenated   a905 NetBSD encrypted
ab00 Apple boot            af00 Apple HFS/HFS+        af01 Apple RAID
af02 Apple RAID offline    af03 Apple label           af04 AppleTV recovery
be00 Solaris boot          bf00 Solaris root          bf01 Solaris /usr & Mac
bf02 Solaris swap          bf03 Solaris backup        bf04 Solaris /var
bf05 Solaris /home         bf05 Solaris EFI_ALTSCTR   bf06 Solaris Reserved 1
bf07 Solaris Reserved 2    bf08 Solaris Reserved 3    bf09 Solaris Reserved 4
bf0a Solaris Reserved 5    c001 HP-UX data            c002 HP-UX service
ef00 EFI System            ef01 MBR partition schem   ef02 BIOS boot partition
fd00 Linux RAID
Command (? for help): p
Disk /dev/sdb: 3907029168 sectors, 1.8 TiB
Disk identifier (GUID): 71584326-3AD4-0BD9-A98A-9173A1FCF308
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Total free space is 0 sectors (0 bytes)
Number  Start (sector)  End (sector)  Size     Code  Name
1       34              3907029134    1.8 TiB  EF00  EFI System
Command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
MBR PARTITIONS!! THIS PROGRAM IS BETA QUALITY AT BEST. IF YOU LOSE ALL YOUR
DATA, YOU HAVE ONLY YOURSELF TO BLAME IF YOU ANSWER 'Y' BELOW!
Do you want to proceed, possibly destroying your data? (Y/N) y
OK; writing new GPT partition table.
The operation has completed successfully.
Then I created my new ext4 file system on my new GPT partition:
root@orac:~# mkfs -t ext4 /dev/sdb1
mke2fs 1.41.11 (14-Mar-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
122101760 inodes, 488378637 blocks
24418931 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
14905 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
I also reduced the percentage of blocks reserved for root from the default 5% to 1%:
root@orac:~# tune2fs -m 1 /dev/sdb1
tune2fs 1.41.11 (14-Mar-2010)
Setting reserved blocks percentage to 1% (4883786 blocks)
I would have liked to have set it to 0%, but that’s what I did last time and I decided to avoid doing that just in case that had in some way contributed to the errors I was getting (I doubt it, but better safe than sorry).
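To put numbers on that, the figures mke2fs and tune2fs printed can be reproduced from the partition boundaries. This is a back-of-the-envelope sketch, not how mke2fs actually computes its layout. It assumes 512-byte sectors and simply reuses the block size, blocks per group and inodes per group from the output above.

```python
SECTOR_BYTES = 512        # assumed logical sector size
BLOCK_BYTES = 4096        # "Block size=4096"
BLOCKS_PER_GROUP = 32768  # "32768 blocks per group"
INODES_PER_GROUP = 8192   # "8192 inodes per group"

first_sector, last_sector = 34, 3907029134   # partition 1 on /dev/sdb
sectors = last_sector - first_sector + 1
blocks = sectors * SECTOR_BYTES // BLOCK_BYTES
groups = -(-blocks // BLOCKS_PER_GROUP)      # integer ceiling division
inodes = groups * INODES_PER_GROUP


def reserved_blocks(total_blocks, percent):
    """Blocks reserved for root at a whole-number percentage, rounded down."""
    return total_blocks * percent // 100


saved_blocks = reserved_blocks(blocks, 5) - reserved_blocks(blocks, 1)
print("blocks:", blocks, "groups:", groups, "inodes:", inodes)
print("reserved at 5%:", reserved_blocks(blocks, 5))
print("reserved at 1%:", reserved_blocks(blocks, 1))
print("space freed:", round(saved_blocks * BLOCK_BYTES / 2**30, 1), "GiB")
```

The block, group, inode and reserved counts all match the tool output, and dropping from 5% to 1% frees roughly 74.5 GiB for ordinary use on a disk this size.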
So then I put the following line in my /etc/fstab file:
/dev/sdb1 /mnt/airgap ext4 defaults 0 2
And then I was good to mount my new file system:
root@orac:~# mount /mnt/airgap
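For reference, the six whitespace-separated fields of that fstab line are the device, the mount point, the filesystem type, the mount options, the dump flag and the fsck pass number. A small sketch that splits a line into those fields follows. The FstabEntry name and its field names are made up for illustration.

```python
from collections import namedtuple

FstabEntry = namedtuple("FstabEntry", "spec mountpoint fstype options dump passno")


def parse_fstab_line(line):
    """Return an FstabEntry, or None for blank lines and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 fields: %r" % line)
    # The last two fields are optional and default to 0.
    fields += ["0"] * (6 - len(fields))
    spec, mountpoint, fstype, options, dump, passno = fields[:6]
    return FstabEntry(spec, mountpoint, fstype, options.split(","), int(dump), int(passno))


entry = parse_fstab_line("/dev/sdb1 /mnt/airgap ext4 defaults 0 2")
print(entry)
```

Here the pass number of 2 means the filesystem is checked at boot after the root filesystem. Because the line names the disk as /dev/sdb1, it depends on the disks being detected in the same order every boot. An entry keyed on the filesystem UUID, which blkid reports, would not have that dependency.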
I’m in the process of copying about 1.6TB of data onto my newly minted disk, and it seems to be running OK at the moment. I guess it will be about a day or so before I know for sure if any of the above has helped.