[NTLUG:Discuss] OT Perl vs C question
Fred James
fredjame at fredjame.cnc.net
Wed Apr 27 01:01:03 CDT 2005
The "query" in question (the one that runs for 24 hours to get a 30 day
range of data)
(1) for each employee, for each day within the range
(1.1) calls 16 different functions one after another, each of which
(1.1.1) traverses about 42,000 records in one table (cross checking 7
other tables as it goes) to calculate and total one of 16 specific
categories of elapsed time.
(1.1.1.1) A one year date range would make that > 500,000 records in the
central table, currently.
Is that a coordinated or non coordinated query, or neither? I plead
ignorance on the subject, but would like to know.
In any case, 1 simple query against each of the 8 tables for a date
range of one year (8 queries total) will complete in about 10 minutes.
From there even AWK could finish it up fairly quick, parsing it out as
substrings of fixed width data.
I didn't mention, and I doubt it makes any difference, but the final
output is a CSV file so the numbers people can put it in Excel - that
isn't a problem to do.
The data set has to be run for the full year each time because the start
and stop times are continually audited and corrected.
I do want to say thank you for all the thoughts you people have been
willing to share - I very much appreciate any help you may be able to offer.
Regards
Fred James
Chris J Albertson wrote:
>Yup. Wow is the word I came up with as well.
>I don't know your structure and don't want to get into a tiff over
>what-ifs. Your original email made this sound like it was a small, once in
>a while type thing. I'm sorry if I oversimplified your architecture. It
>really sounds to me like you've got much larger problems than what
>language to write your reports in.
>If your systems design indeed requires you to process 24 hours to
>summarize your data, then you may want to think about storing this
>resultant summary information in a database structure for many people in
>your org to have access. Those flat files cost a lot of $ to produce each
>month. If you're willing to spend 24 hours producing the data, it must be
>important. Make that data usable.
>Your company may benefit from looking into the possibility of implementing
>a data cube (OLAP) design in order to present massive amounts of data into
>a high-perfoming, trending analysis engine.
>Just trying to help.
>Good luck
>
>Chris
>
>
>
>>WOW! You could defiantly benefit from some optimization, or it is time to
>>invest in new hardware. I've run a database with millions of monthly
>>financial transactions running queries with multiple joins, calculating:
>>reversals, chargebacks, splits, interchange, surcharge and totals for all
>>the transactions took less that 30 seconds on an AMD xp 1800 with 512MB
>>RAM.
>>That was running yearly totals too. The monthly settlements where much
>>quicker.
>>
>>BTW, sometimes it isn't the query that needs optimization make sure you
>>are
>>indexing the fields used in the queries and it should improve your life
>>drastically. Think about it, if google can index the entire "visible
>>internet" and return results to your queries quicker than your browser can
>>render the page you should be able to run your queries in much less than
>>24
>>hours.
>>
>>-----Original Message-----
>>From: discuss-bounces at ntlug.org [mailto:discuss-bounces at ntlug.org]On
>>Behalf Of Fred James
>>Sent: Tuesday, April 26, 2005 7:44 PM
>>To: alb at chrisalbertson.com; NTLUG Discussion List
>>Subject: Re: [NTLUG:Discuss] OT Perl vs C question
>>
>>Chris Albertson
>>Admitting that I should not be considered a prize SQL programmer, I have
>>relied upon our SQL developers to produce as efficient a query as
>>possible. That said ...
>>
>>Evil flat files not withstanding, the SQL has been tested and it runs
>>for over 24 hours to acquire a 30 days range of data (that test run
>>being on a weekend, which is non-peak hours for us). Our final run must
>>collect a 365 days range of data - estimated running time for the SQL
>>would therefore be about 304 hours (12.67 days) We were kind of hoping
>>to run this query at least once a week, if not daily.
>>
>>All sad, but true.
>>Regards
>>Fred James
>>
>>--
>>Compassion alone stands apart from the continuous traffic between good and
>>evil proceeding within us. "Om Mani Padme Hum"
>>
>>
>>
>>_______________________________________________
>>https://ntlug.org/mailman/listinfo/discuss
>>
>>
>>
>
>
>
>_______________________________________________
>https://ntlug.org/mailman/listinfo/discuss
>
>
>
>
--
Compassion alone stands apart from the continuous traffic between good and evil proceeding within us. "Om Mani Padme Hum"
More information about the Discuss
mailing list