[NTLUG:Discuss] ext3 vs. XFS vs. ....

Robert Pearson e2eiod at gmail.com
Tue May 2 10:45:58 CDT 2006


On 5/2/06, Mike Hart <just_mike_y at yahoo.com> wrote:
> Speed, disk space, and cpu usage don't mean much when
> you get multiple power outages spaced about 30-40
> seconds apart, just long enough for the drive to be in
> the process of a forced filesystem check during boot.
>
> Where's the benchmark on reliability? Specifically on
> a kill switch during an fsck after a kill. I'd be
> really interested to see how the different filesystems
> stand up to our power grid in spring :-)

I have read Burton's reply and agree with it.
The filesystem can never be the first line of defense against power
outages, and the worst case is the one you mention: whipsawing.
I have seen filesystem corruption on systems that were fully up
and had been running for some time when a "whipsaw" condition
hit in which the power never went completely off and the disks
never spun down. Talk about a mess. Hardware failures
above the filesystem can totally destroy the filesystem through
corruption.
About the best a filesystem can do in an unstable power situation
is either shut down and refuse to come up until the power
is stable, or come up in less than 5 seconds and
"checkpoint" every 0.1 seconds. That way it could recover to
within 0.1 seconds of wherever the boot or recovery was interrupted.
To "checkpoint" at any time, the rest of the hardware the
filesystem relies on must be stable. All bets on this are off in
a power whipsaw without line power conditioning and a UPS.
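
The "checkpoint" idea can be sketched at the application level. This is
only a minimal Python sketch, not how any particular filesystem does its
journaling or checkpointing. The names write_checkpoint, read_checkpoint
and run, the JSON state format, and the 0.1 second default interval are
all assumptions made up for illustration.

```python
import json
import os
import tempfile
import time


def write_checkpoint(path, state):
    """Replace `path` with a JSON snapshot of `state` in one atomic step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename
    # below never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to disk
        # A reader sees either the old checkpoint or the new one, never
        # a half-written file. (A stricter version would also fsync the
        # directory; that is left out of this sketch.)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_checkpoint(path, default=None):
    """Return the last checkpoint, or `default` if none is usable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default


def run(path, steps, interval=0.1):
    """Hypothetical work loop that checkpoints every `interval` seconds."""
    state = read_checkpoint(path, {"step": 0})  # resume where we left off
    last = time.monotonic()
    while state["step"] < steps:
        state["step"] += 1  # stand-in for one unit of real work
        now = time.monotonic()
        if now - last >= interval:
            write_checkpoint(path, state)
            last = now
    write_checkpoint(path, state)  # final checkpoint when the work is done
    return state
```

After a kill, calling run() again picks up from the last checkpoint that
made it to disk, so at most about one interval of work is lost. Even
this depends on the disk and controller actually honoring the flush,
which is exactly what cannot be counted on in a whipsaw without a UPS
and line conditioning.
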
A UPS may not be sufficient to handle power whipsaws. In areas with
frequent whipsawing, which could be caused by nearby industry,
bad wiring, or frequent lightning strikes, a top quality UPS coupled
with a line power conditioning transformer would be a "must have"
configuration.
The best overall fault tolerant box I ever saw was a Tandem.
We used to try everything in the world to make them stop.
The best fault tolerant filesystem was on the Sequoia platform.
It was a beauty. It had so many wonderful features. I loved that
machine.
Stratus was (is?) a good price/performance box. When I used them,
it was best at failing over on hardware failures, such as a CPU.
A power whipsaw could still put a Stratus out of business through
its filesystem. We ran all of ours on a UPS.
All of these machines were (are?) really expensive.
Today, fault tolerance superior to these machines can be
provided, for much less cost, with redundancy.
Caveat! If you cheapen the hardware platforms too much, then
the redundancy doesn't buy you anything when hardware fails
and you need to fail over.
For example, don't use a desktop PC mobo for cluster servers.
Use a good quality server PC mobo. It will be well worth the
extra money, about 2x the cost of the desktop mobo, if the
information you are trying to protect is worth anything.
Use at least SATA disks for the primary mirrors; SAS is preferred,
with at least SATA in the secondary.
IDE is fine in the tertiary and quaternary mirrors. Three "nines"
(99.9%) of information availability is good enough in the third and
fourth mirrors, provided you do regular, "good" backups for
disaster recovery.
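
To put rough numbers on "nines" and on what redundancy buys, here is a
small Python sketch. The function names are made up for illustration,
and the mirror calculation assumes the mirrors fail independently of
each other, which is a big assumption: a power whipsaw hits every disk
in the box at once, so it is precisely the kind of failure this
arithmetic does not cover.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def mirrored_availability(availabilities):
    """Availability of a set of mirrors where any one copy is enough.

    ASSUMPTION: failures are independent. The data is unavailable only
    when every mirror is down at the same time.
    """
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down
```

Three nines (0.999) works out to about 8.76 hours of downtime a year.
Two independent three-nines mirrors, mirrored_availability([0.999, 0.999]),
come out at about six nines, which is why cheap redundancy can beat an
expensive single box as long as the failures really are independent.
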


