[MLB-WIRELESS] *tap tap* Is this thing on?

(GalaxyMaster) gm.outside+wireless at gmail.com
Thu Sep 17 11:01:09 EST 2009


Steven,

On Thu, 17 Sep 2009 10:19:39 +1000
Steven Haigh <netwiz at crc.id.au> wrote:

> 1) The system (for reasons yet unknown) decided to remount the /  
> partition as read only - and would not change to read-write.

/ is usually remounted read-only if there are serious consistency
errors with the file-system (you can tune this behaviour with
tune2fs, but I think remounting read-only on errors is the best
option).
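
To make that concrete, here is a rough sketch for checking and changing
the setting.  The helper name and /dev/sda1 are placeholders of mine,
not something taken from your box:

```shell
# Hypothetical helper: extract the "Errors behavior" field from the
# output of `tune2fs -l` supplied on stdin.
errors_behavior() {
    sed -n 's/^Errors behavior:[[:space:]]*//p'
}

# Typical use as root (substitute your real root device for /dev/sda1):
#   tune2fs -l /dev/sda1 | errors_behavior    # show the current setting
#   tune2fs -e remount-ro /dev/sda1           # remount read-only on errors
```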

> Problems with this means that I couldn't restart services, or really
> do anything. A standard reboot failed due to things not being able to
> be written.

The best thing you can do on a read-only filesystem is to run e2fsck
manually to prepare the filesystem for the next reboot/re-mount.
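
After a manual run it is worth looking at the exit status, which is a
sum of bit flags.  A small sketch that decodes it, with the meanings as
I remember them from the man page (the function name is made up):

```shell
# Hypothetical helper: explain an e2fsck exit status (a sum of bit flags).
explain_e2fsck_status() {
    rc=$1
    [ "$rc" -eq 0 ] && { echo "no errors"; return 0; }
    [ $((rc & 1)) -ne 0 ]   && echo "errors corrected"
    [ $((rc & 2)) -ne 0 ]   && echo "system should be rebooted"
    [ $((rc & 4)) -ne 0 ]   && echo "errors left uncorrected"
    [ $((rc & 8)) -ne 0 ]   && echo "operational error"
    [ $((rc & 16)) -ne 0 ]  && echo "usage or syntax error"
    [ $((rc & 32)) -ne 0 ]  && echo "cancelled by user request"
    [ $((rc & 128)) -ne 0 ] && echo "shared library error"
    return 0
}

# Typical use as root (/dev/sda1 is a placeholder device):
#   e2fsck -f /dev/sda1; explain_e2fsck_status $?
```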

Also, you should have been able to reboot by using 'reboot -nf', but
that is equivalent to a power cycle, since no disk buffers are written
back to disk.  I'd recommend running 'sync' before rebooting the system
with 'reboot -nf'.
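
A cautious sketch of that sequence.  The function name and the
'--really' guard are my own invention, added so that nobody reboots a
box just by pasting this:

```shell
# Hypothetical wrapper: flush buffers, then force an immediate reboot.
# Without the --really argument it only prints what it would do.
emergency_reboot() {
    if [ "$1" != "--really" ]; then
        echo "would run: sync; reboot -nf"
        return 0
    fi
    sync          # write dirty buffers back to disk while we still can
    reboot -nf    # -n: do not sync again, -f: force, do not call shutdown
}

# Typical use as root:
#   emergency_reboot --really
```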

> 2) The system was power cycled from a remote power switch that is  
> onsite but never came back up.

This was to be expected, since your root filesystem was inconsistent.
To avoid it in the future you may want to add the '-y' flag to e2fsck in
your startup scripts (usually /etc/rc.d/rc.sysinit or /etc/rc.d/rc.S).
That way it will fix most errors by itself.  For an unattended system
this option is quite good, since most people don't know filesystem
internals at the level e2fsck requires to answer its questions
correctly, and most people just reply 'yes' to every question anyway.
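
Roughly, the boot-time check could look like the sketch below.  This is
not what your distribution's rc script actually contains; the function
name, its return codes and the device are assumptions of mine:

```shell
# Hypothetical boot-time helper: check a device unattended and report
# whether the boot can continue.
#   returns 0: clean, or all errors fixed
#   returns 2: fixed, but a reboot is recommended
#   returns 1: anything worse, manual repair needed
fsck_unattended() {
    dev=$1
    rc=0
    e2fsck -y "$dev" || rc=$?    # -y: assume "yes" to every question
    case $rc in
        0|1) return 0 ;;
        2|3) echo "$dev repaired, reboot recommended" >&2; return 2 ;;
        *)   echo "$dev needs manual repair (status $rc)" >&2; return 1 ;;
    esac
}

# Typical use from a startup script (/dev/sda1 is a placeholder):
#   fsck_unattended /dev/sda1
```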

> To help with this in the future, I have secured and configured a  
> serial console on the box that is active from the moment GRUB loads

Yep, it's a very nice feature to have.

> I'm also going to check through the system to see if there is any  
> indication why / was rendered read-only with no apparent trigger.

I hope I have shed some light on why it was remounted read-only.  If
your /var/log resides on a different partition (which is worth setting
up if it doesn't) then you might want to check /var/log/messages for
filesystem corruption messages.
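
Something along these lines should narrow the log down.  The patterns
are my guess at the usual kernel wording for ext2/3/4 and may need
adjusting for your kernel; the function name is made up:

```shell
# Hypothetical helper: print log lines that look like ext2/3/4 error
# reports.  Reads the named files, or stdin when no file is given.
fs_error_lines() {
    grep -E 'EXT[234]-fs error|Remounting filesystem read-only' "$@"
}

# Typical use:
#   fs_error_lines /var/log/messages
```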

> The RAID arrays held together fine - so whatever happened seemed to
> be on the filesystem level.

Yes, it's likely that this was the problem.

-- 
Dmitry D. Khlebnikov
Openwall Pty Ltd
+61 428 425291


