[NTLUG:Discuss] db vs flat file

brian@pongonova.net brian at pongonova.net
Fri Dec 14 15:52:10 CST 2001


At the risk of stating the obvious, the ideal way to determine the answer would be
to write a few small Perl scripts (I'm assuming this is Perl, based on the diamond
operator you used) on the target machine, generate random tables of several
thousand entries, and determine empirically how long each search takes.  I imagine
there would be several factors involved which would affect search speed: HD access
times, latency across network connections, amount of RAM available, CPU speed,
memory cache parameters, etc.  
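
For what it's worth, a test script along those lines might look like the sketch below. This is only a rough starting point under my own assumptions: the record count, the "key|field1|field2" layout, the match condition, and the choice of SDBM_File (picked only because it ships with Perl) are all made up for illustration and should be swapped for whatever the real data looks like.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;                    # DBM flavour bundled with Perl (assumption)
use File::Temp qw(tempdir);
use Time::HiRes qw(time);

my $n   = 5000;                   # hypothetical record count
my $dir = tempdir(CLEANUP => 1);
my ($flat, $dbm) = ("$dir/input.txt", "$dir/input");

# Write the same random records to a flat file and to a DBM file.
tie(my %D, 'SDBM_File', $dbm, O_RDWR | O_CREAT, 0666) or die "tie: $!";
open(my $out, '>', $flat) or die "open: $!";
for my $i (1 .. $n) {
    my $rec = join '|', int(rand(1000)), int(rand(1000));
    print {$out} "$i|$rec\n";
    $D{$i} = $rec;
}
close($out) or die "close: $!";

# Full scan of the flat file, counting records that meet a made-up condition.
sub scan_flat {
    my $hits = 0;
    open(my $in, '<', $flat) or die "open: $!";
    while (my $line = <$in>) {
        chomp $line;
        my (undef, $f1, $f2) = split /\|/, $line;
        $hits++ if $f1 > 500 && $f2 < 500;
    }
    close($in);
    return $hits;
}

# The same scan over the tied DBM hash.
sub scan_dbm {
    my $hits = 0;
    while (my ($k, $v) = each %D) {
        my ($f1, $f2) = split /\|/, $v;
        $hits++ if $f1 > 500 && $f2 < 500;
    }
    return $hits;
}

# Time 20 passes of each approach and remember the hit counts.
my %hits;
for my $case ([flat => \&scan_flat], [dbm => \&scan_dbm]) {
    my ($name, $code) = @$case;
    my $t0 = time();
    $hits{$name} = $code->() for 1 .. 20;
    printf "%-4s %d hits, %.4f s for 20 scans\n",
        $name, $hits{$name}, time() - $t0;
}
untie %D;
```

Run it on the target machine with a few different values of $n; the absolute numbers matter less than how the two timings grow relative to each other.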

With that said, I am going to go out on a limb here and say that anything you can
do entirely in memory will generally be faster than mixed memory/file I/O operations.
However, xDBM files have indexing that is optimized for keyed lookups on file
media, which may even out whatever gains you get by slurping all the records
into an in-memory hash and searching through them one by one.  Also,
depending on the type of data you're trying to access, storing it in a binary tree
would probably give you much better performance than doing a linear search every
time.
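
To illustrate the difference between a linear search and an indexed one, here is a small sketch. Note that it uses a plain Perl hash as the index rather than a binary tree, since that is what Perl gives you out of the box; the record layout ([id, city, amount]) and the field being searched are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical records: [id, city, amount]
my @records = map { [ $_, "city" . ($_ % 100), $_ * 3 ] } 1 .. 5000;

# Linear search: every query examines every record.
sub find_linear {
    my ($city) = @_;
    return grep { $_->[1] eq $city } @records;
}

# Index built once: field value => list of records with that value.
my %by_city;
push @{ $by_city{ $_->[1] } }, $_ for @records;

# Indexed search: one hash lookup per query.
sub find_indexed {
    my ($city) = @_;
    return @{ $by_city{$city} || [] };
}

my @slow = find_linear('city42');
my @fast = find_indexed('city42');
printf "linear: %d matches, indexed: %d matches\n",
    scalar @slow, scalar @fast;
```

The index costs one extra pass over the data up front, so it only pays off if you run more than a handful of searches on the same field.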

Contact me off-line if you're interested in some more info, or want some
suggestions for some possible testing scripts.

  --Brian


On Fri, Dec 14, 2001 at 03:05:25PM -0600, David Camm wrote:
> i'm about to write some code to do a search based on conditions among
> several fields in records in a file.
> 
> i can use either a flat file or a hash version of the file.
> 
> as the file gets larger (multiple thousands of records) will it be
> faster to: 
> 
> open(I, 'inputfile'); while (<I>) { .......} close (I); 
> 
> or:
> 
> tie (%D .... input.db...) while (($k,$v) = each %D){ .....} untie %D;
> 
> my ***GUESS*** is that dbm support contains more code than sequential
> file support and hence might be slower, but that's just a guess.
> 
> any opinions (or hard facts) would be greatly appreciated.
> 
> david camm
> advanced web systems



