DEBRIS.COM
good for a laugh, or possibly an aneurysm

Wednesday, March 22nd, 2006

seriously bad RAM

Turns out the source of my hardware problems was a bad stick of RAM — a ~$100 memory module cost me a couple days’ worth of time. Argh.

When the symptoms first started appearing, I had the idea to run memtest, but that requires burning a CD and booting from it — and I don’t have physical access to the machine. Sucks.

This was the beginning of the end:

Message from syslogd@nsb at Sun Mar 19 03:39:23 2006 ...
nsb kernel: journal commit I/O error

After that, the filesystem became read-only:

[root@nsb tmp]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 ro 0 0

That meant logging failed, inbound mail was lost or rejected, and all sorts of other badness followed.
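
If you'd rather catch this state automatically than stumble onto it, here's a minimal sketch that scans mount-table text in the /proc/mounts format shown above and reports anything mounted read-only. The function name read_only_mounts is my own invention for illustration, and the sample text is just the two lines from the output above. Note that on a healthy box it would also flag mounts that are read-only on purpose.

```python
def read_only_mounts(mounts_text):
    """Return (device, mount_point) pairs whose mount options include 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Options are comma-separated; match 'ro' exactly, not as a substring.
        if "ro" in options.split(","):
            found.append((device, mount_point))
    return found


# The two lines from the cat /proc/mounts output above.
sample = "rootfs / rootfs rw 0 0\n/dev/root / ext3 ro 0 0\n"
print(read_only_mounts(sample))  # [('/dev/root', '/')]
```

On a live Linux machine you would pass in the contents of /proc/mounts instead of the sample string, say from a cron job that sends an alert through some channel that doesn't depend on the local disk.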

There’s a fix for the read-only problem, but it didn’t work:

[root@nsb tmp]# mount -o remount,rw /
mount: block device /dev/md1 is write-protected, mounting read-only

The good news is that the Ops guys at the hosting facility transplanted the disk drives into a new host, allowing me to grab the files I didn’t have good backups of.

Anyway, if you read this after having searched Google for one of the error messages above, my advice is to make backups immediately, but be aware that they’ll probably be corrupt. Some component of your hardware is about to make an ugly exit, and it may take your data along for the ride.


Tags: hardware, failure, server, ram
posted to channel: Personal
updated: 2006-03-22 08:11:19
