[NTLUG:Discuss] Which filesystem & partitioning for a 9Terabyte volume?

Fri Jan 30 02:37:29 CST 2009

Howdy,
  I bet there is a nice article out there on RAID which can explain this
better than I can, but I'll give it a shot.  I did not find one that
satisfied me in 10 minutes of looking.

  I don't know whether RAID 5 uses even or odd parity.  Maybe it varies
depending on user choice.  I am going to assume even parity, but it
would be pretty much the same if it were odd parity.  The second thing I
don't know is how the error correction works.  All the references I read
refer to it as parity, but that would just tell you that an error was
found, but not give you enough information to recover the original data.
This is the problem you refer to.  I'll explain what I think I know and
hypothesize a bit.

  RAID 5 uses a simple parity bit to detect errors.  So, it is good for
single bit errors and will miss the error if 2 drives report the same
bit wrong in the same stripe.  In an array, there is a chunk of sectors
on every drive that are combined into the same stripe.  This number of
sectors in a chunk can vary depending on how you format the drive.  When
the system reads a stripe, it adds up the corresponding bits from every
drive to see if they total an even number of bits on.  It does not
add the data from the parity drive to this total.  If the total number
of bits on in each position is even, then the data is considered good
and the data is returned to the user.  If the computed parity was odd,
then the information from the parity block for this stripe is read and
correction is done and the data is returned to the user.  With this
scheme, it does not matter how many drives are in the set, because the
bits are just added, whether for 3 drives or 50.  My guess is that the
parity block contains computed Reed-Solomon values or something similar to
recover the data.  That would mean the parity block might not be the
same size as the other blocks, but it would be a predictable size and so
the system would know where all the blocks are. 
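
  Here is a minimal Python sketch of the even-parity arithmetic above.
It assumes the parity block is simply the bitwise XOR of the data chunks
in a stripe, which is how RAID 5 parity is usually described; I have not
checked any particular implementation.  Under that assumption, plain
parity is enough to rebuild a chunk as long as the array knows which
drive is missing, whether there are 3 drives or 101.  The chunk contents
and the names xor_blocks and parity_overhead are made up for
illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings (even parity per bit position)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_overhead(n_drives):
    """Parity space as a fraction of data space: 1 / (N - 1)."""
    return 1 / (n_drives - 1)


# Hypothetical stripe: one 3-byte chunk per data drive.
data = [b"\x0f\xa5\x3c", b"\xf0\x5a\xc3", b"\x11\x22\x33"]

# Assumption: the parity chunk is the XOR of all data chunks in the stripe.
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed.  Because we know WHICH
# chunk is missing, XORing the survivors with the parity chunk yields it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
print([round(parity_overhead(n) * 100) for n in (3, 4, 11, 101)])
```

  The key point in the sketch is that the position of the lost chunk is
known, so each missing bit is the one value that makes its column come
out even again.  If two chunks in the same stripe were lost, a single
XOR parity chunk would not be enough.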

 I hope someone will jump in and correct any errors and omissions, and
they probably will, with this group.  I mean that as a compliment.
Good day,
Ralph

On Fri, 2009-01-30 at 01:04 -0600, Neil Aggarwal wrote:
> Ralph:
> 
> One thing has never made sense to me.
> 
> The storage capacity of RAID 5 is stated as
> N-1 where you have N drives.  That means
> the redundancy information is stored in the equivalent
> of one drive's worth of space.
> 
> How can that scale to a large number of drives?
> 
> Let me state it this way:  The size of the
> redundancy data expressed as a percentage
> of the data storage is 1/(N-1).
> 
> For 3 drives, the redundancy is 50% as
> large as the data.
> 
> For 4 drives, it is 33%
> 
> For 11 drives, it is 10%
> 
> For 101 drives, it is 1%
> 
> As the number of drives goes up, the amount of
> redundancy data goes down as a percentage of the
> data storage.
> 
> In the case with 101 drives, how can one bit of
> redundancy recover 100 bits worth of lost data?
> 
> Does the redundancy algorithm get better as the
> number of drives goes up?
> 
> Thanks,
> 	Neil
> 
> 
> 
> > > My understanding is that given N drives (where N must be at least 3)
> > > you "consume" the equivalent of one drive spreading the redundancy
> > > data across all drives.  Assuming all drives are equal size, how is
> > > losing a small drive recoverable but losing a big drive not?
> 
> 