[NTLUG:Discuss] SCSI performance question

David Stanaway david at stanaway.net
Mon Oct 27 20:02:11 CDT 2008


If protection is one of their concerns, why have everything strung 
together on one SCSI bus?

You want to have 2 SCSI paths to each piece of data. What if you get a 
bad connector?

Most SAN storage arrays do this pretty well. There are also EqualLogic and 
a host of other storage appliance options. DIY with DAS is fine, but do any 
of the SCSI disk enclosures you have been looking at provide multipath access?
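
For reference, a rough sketch of what dual pathing looks like on the Linux
side, assuming two HBAs cabled to both controller ports on the enclosure and
the distro's device-mapper-multipath (multipath-tools) package installed;
the device names here are placeholders:

  # load the multipath layer and build maps from the duplicate LUN sightings
  modprobe dm-multipath
  multipath -v2
  multipath -ll            # each LUN should now list two paths

  # build LVM on the multipath device, not on the raw /dev/sdX paths
  pvcreate /dev/mapper/mpath0    # mpath0 is a placeholder map name

With that in place, a bad connector or cable means I/O fails over to the
surviving path instead of taking the filesystem down.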


Robert Parkhurst wrote:
> Ah..  They don't need faster storage, they just need something that grows
> since they don't like to delete files (and they have upper-management on
> their side ;-)).
>
> We looked into iSCSI, but management wanted so much "protection" that in
> the end it just got a bad rap for having too much overhead.  Looking at a
> new solution based on a different tech (SCSI vs. iSCSI) could shed new
> light on it and allow it to get reconfigured so it doesn't have all the
> overhead that the other one had.
>
> The biggest issue is that it needs to scale to a reasonable amount without
> killing performance.
>
>
>
> On Sun, Oct 26, 2008 at 11:36 PM, Robert Pearson <e2eiod at gmail.com> wrote:
>
>   
>> On Sat, Oct 25, 2008 at 4:20 PM, Robert Parkhurst
>> <robert.parkhurst at gmail.com> wrote:
>>     
>>> Ah!  Sorry for not including that information earlier!  I'll try to
>>> answer as best I can..
>>>
>>> The Specs / RAID layout / Drives (etc.) are as follows:
>>>
>>> If I were to go with JetStor, it's got a PCI-X (133MHz) LSI RAID
>>> controller with 256MB cache on it.  The drives are SATA-II (with a U320
>>> interface on the backplane that would be used to connect back to the
>>> Linux "head node").  The enclosure has 16 drives and would be configured
>>> for RAID-5 or RAID-6 for protection...  (Although you could also
>>> configure it for other RAID levels like RAID-0.)  The unit comes with
>>> two U320 LVD connections on the back side.
>>>
>>> The URL is:  http://acnc.com/02_01_jetstor_sata_416s.html
>>>
>>>
>>> The Application / Usage:
>>>
>>> This storage system would be used in a litigation support company.  They
>>> use (Windows) software that "cracks" files--the software opens up the
>>> files, extracts the data and metadata from them, and creates new files
>>> from that.  Each file gets a TIFF image made along with a metadata
>>> file..  So for example a Word document becomes three documents: the
>>> original Word document, the TIFF image, and the metadata file for the
>>> TIFF image.
>>>
>>> There's LOTS of I/O on the system, both reads and writes..  The process
>>> is this: you get the client data that needs to be cracked for use in the
>>> case, save that data to the server, then run the "cracking" software
>>> against it.  The cracking software reads that original source data,
>>> cracks it, then saves the new files out to a destination (also on the
>>> server).
>>>
>>>
>>> Right now they've got a single Linux box with an internal RAID array
>>> using an Areca RAID card to pass through all the drives, with Linux
>>> doing software RAID-5 (I asked if they wanted me to make a hardware RAID
>>> out of it, but my boss at the time didn't want to have to rebuild the
>>> RAID....).  Anyway, that's shared out to Windows via SMB, and the server
>>> performs SO well that ALL work is now done on that server alone, versus
>>> before, when they had to distribute the load between multiple Windows
>>> servers because no single Windows server could handle all the reads and
>>> writes.  So it got me thinking that if we went to direct SCSI-attached
>>> storage, using LVM + XFS and such, we could expand the RAID and still
>>> keep the performance that the Linux box gives without having to throw
>>> $$$ at some place like Dell or NetApp.
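
For what it's worth, the LVM + XFS growth path is mostly mechanical once a
new enclosure presents a LUN. A rough sketch, assuming the new JetStor shows
up as a single RAID-6 LUN at /dev/sdc and the existing volume group, logical
volume, and mount point are vg_data, lv_data, and /srv/data (all placeholder
names):

  pvcreate /dev/sdc                             # label the new LUN for LVM
  vgextend vg_data /dev/sdc                     # add it to the existing VG
  lvextend -l +100%FREE /dev/vg_data/lv_data    # grow the logical volume
  xfs_growfs /srv/data                          # XFS grows online, mounted
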
>>>
>>> My concern, though, would be the limits of SCSI in a configuration like
>>> this -- putting 31 JetStors (RAID-6) on a single PCI-e (x8) LSI SCSI
>>> card.  Would it still be possible to get the high performance, or would
>>> it be best to limit it to something like 8 JetStor enclosures?
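
(Rough numbers, for whatever they're worth: one Ultra-320 channel tops out
around 320 MB/s of raw bandwidth, and a single 16-drive SATA-II enclosure can
stream several hundred MB/s on its own, so the bus becomes the ceiling with
only a couple of enclosures per channel; 31 enclosures would split that same
320 MB/s 31 ways. A multi-channel card raises the total, but only per
channel, so how the enclosures are spread across channels matters more than
the raw device count the card supports.)
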
>>>
>>> Hope this helps and again, thanks for any input!
>>>
>>> Robert
>>>
>>>       
>>> On Sun, Jul 5, 2009 at 9:10 PM, Leroy Tennison
>>> <leroy_tennison at prodigy.net> wrote:
>>>
>>>       
>>>> Robert Parkhurst wrote:
>>>>         
>>>>> I've used SCSI a lot, but mostly in the "lower-end" area (external
>>>>> CD/DVD SCSI drives, an external SCSI hard drive, or even a small
>>>>> external SCSI disk pack).  I'm curious, though, how SCSI would perform
>>>>> if you had a lot of large disk arrays attached to a single SCSI
>>>>> bus/adapter?
>>>>>
>>>>> Specifically, say I had an LSI SCSI adapter (Ultra-320, PCI-e (x8))
>>>>> that could have 32 SCSI devices attached to it, and I attached 31 x
>>>>> 16TB external SCSI enclosures (like the JetStor SCSI unit or something)
>>>>> off that one adapter and then used Linux to make an LVM striped volume
>>>>> over all of them and formatted it with something like ReiserFS or XFS.
>>>>> Would I see a (noticeable) performance hit on it?
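
(For the striped-LVM idea, a rough sketch, assuming each enclosure presents
one big LUN and /dev/sdb through /dev/sde are placeholders for four of them:

  pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
  vgcreate vg_big /dev/sdb /dev/sdc /dev/sdd /dev/sde
  # stripe across all four LUNs with a 256 KB stripe size
  lvcreate -i 4 -I 256 -l 100%FREE -n lv_big vg_big
  mkfs.xfs /dev/vg_big/lv_big

Keep in mind that striping at the LV level is effectively RAID-0 across
enclosures, so losing any one enclosure takes out the whole volume even
though each enclosure is RAID-6 internally.)
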
>>>>>
>>>>> And if so, what's a good "recommended" max for attaching storage like
>>>>> that to a single SCSI controller?
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> Robert
>>>> This is almost impossible to answer because performance is affected by
>>>> too many variables: what are the capabilities of the SCSI controller
>>>> (caching and how much? is the firmware efficient? how many channels?
>>>> other performance characteristics), the SCSI drives (all the common hard
>>>> drive performance characteristics), how is it configured (RAID?  If so,
>>>> which level?) and what your application is (mainly read-only, lots of
>>>> writing, lots of deletions after writing, a database).  Further
>>>> complicating matters is the fact that it's not just individual issues
>>>> but combinations of them (matching the configuration and the hardware to
>>>> the application).
>>>>
>>>>         
>> "the (Linux) server performs SO well that ALL work is now done on that
>> server alone"
>> Why is faster Storage needed?
>> "single Linux box with an internal RAID array using an Areca RAID card
>> to pass through all the drives, then Linux does software RAID-5"
>> This configuration might be reconfigured for better performance and
>> Information Integrity, but it would be a lot of work for little gain...
>> "shared out to Windows via SMB"
>> SMB served from direct-attached storage on the same server that is doing
>> the file cracking could be a major bottleneck.
>> Perhaps the SMB Storage could be on the network? NAS or SAN? SATA
>> drives would be the choice here.
>> Crack on the existing server and Storage and de-duplicate the
>> "cracked" files to the network Storage.
>> The goal is to keep the "cracked" files stored on the "cracking"
>> server to a minimum.
>> Typically, "cracking" servers cannot have too much memory. Faster
>> processors make a huge difference.
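
A rough sketch of the "sweep the cracked output off to network storage" idea,
assuming /data/cracked and /mnt/nas/cracked are placeholder paths and the NAS
share is already mounted; something like this could run from cron:

  # move finished output to the NAS, deleting local copies as they land
  rsync -a --remove-source-files /data/cracked/ /mnt/nas/cracked/
  # prune the empty directory tree left behind on the cracking server
  find /data/cracked -mindepth 1 -type d -empty -delete
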
>>




More information about the Discuss mailing list