[NTLUG:Discuss] Distributed processing

Wed Jul 25 13:43:00 CDT 2001

Greg,

I work a lot with failover clusters, which I know is far from what you are looking at, but I have had some of the same ideas as you.  I know that there is not much out there in the open source world, but there are a lot of projects that might help piece together some of the complicated parts.

The way I have put my idea is that at some point failover clusters and parallel processing clusters will come together and meet in the middle.  Reading this thread has got me thinking in a little more detail about what you are suggesting.

Some of the OSS projects I watch are FailSafe (SGI) and linux-ha.org.  Neither comes close to your requirements, but I thought I would mention them.

Please let me know what you find.  I have been interested in investing in such a project.

cya
greg

-----Original Message-----
From: discuss-admin at ntlug.org [mailto:discuss-admin at ntlug.org]On Behalf
Of Greg Edwards
Sent: Tuesday, July 17, 2001 8:07 PM
To: ntlug discuss
Subject: [NTLUG:Discuss] Distributed processing

I sent this out last week and was wondering why I never saw it posted. 
Well I guess I sent it to admin at ntlug.org, sorry:)

-------- Original Message --------
Subject: Distributed processing
Date: Sun, 08 Jul 2001 17:54:36 -0500
From: Greg Edwards <greg at nas-inet.com>
Organization: New Age Software, Inc.
To: ntlug Admin <discuss-admin at ntlug.org>

I've been searching for tools that will do distributed processing at the
function level and haven't had much luck.  There are plenty of
distributed load managers and parallel processing managers (such as
Beowulf).  The load managers work at the program level and the parallel
processing managers at the calculation level.  Neither of these
solutions answers my needs, and multi-threading has too many drawbacks
to be a solution here.  What I need is a distribution manager that will
pass the load at the procedural level.

What I'm trying to do is run an application farm for interactive web
applications, though the solution would be usable well beyond web
applications.  The idea is that during the processing of an application,
rather than a single program using a set of libraries on a single box,
an API would allow the function request to be distributed among N boxes
that support that function.  Not every box would support every function
available throughout the farm, but every box would have knowledge of
every function in the farm.
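
A minimal sketch in C of the "every box knows every function" idea, to
make it concrete.  This is not an existing tool: the names farm_func,
farm_lookup, farm_node_supports, the sample function names and the
bitmask layout are all assumptions made up for illustration.  Each box
would hold the same directory, which records which boxes can serve each
function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 32   /* hypothetical limit on boxes in the farm */

/* One directory entry: a function name and the boxes that serve it. */
struct farm_func {
    const char *name;         /* hypothetical function name */
    unsigned long node_mask;  /* bit i set => box i supports the function */
};

/* Every box holds the full directory, even for functions it cannot run. */
static const struct farm_func directory[] = {
    { "db_search",    0x3UL },  /* boxes 0 and 1 */
    { "sort_rows",    0x6UL },  /* boxes 1 and 2 */
    { "render_chart", 0x4UL },  /* box 2 only */
};

/* Find a function in the directory; returns NULL if the farm lacks it. */
const struct farm_func *farm_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof directory / sizeof directory[0]; i++)
        if (strcmp(directory[i].name, name) == 0)
            return &directory[i];
    return NULL;
}

/* Return 1 if the given box can run the named function, else 0. */
int farm_node_supports(const char *name, int node)
{
    const struct farm_func *f = farm_lookup(name);

    if (f == NULL || node < 0 || node >= MAX_NODES)
        return 0;
    return (int)((f->node_mask >> node) & 1UL);
}
```

With this table, farm_node_supports("db_search", 0) reports that box 0
can run the search, while box 2 would have to hand it to another box.
A real farm would need to keep the directory in sync as boxes come and
go, which this sketch does not attempt.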

For example, say the application needs to search a database for all
employees who have 20 years of service and then return that list sorted
by age of the employee.  The entry application would reach a point of
needing the data and call the API, which in turn would determine the
best machine in the farm to process the request based on current load
and data availability.  The request would then be passed to that machine
and the entry application would go on about its business until the
results were returned.  During the processing of that request, the
search for employees and the sort may be split among multiple machines
as well.
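
One way the "best machine" choice might look in C, as a rough sketch
only.  The struct farm_node fields, the farm_pick_node function and the
REMOTE_DATA_PENALTY value are assumptions for illustration, not part of
any real library.  It scores each box that supports the function by its
current load, adds a fixed penalty when the data is not already on that
box, and picks the lowest score.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-request view of one box in the farm. */
struct farm_node {
    int supports;   /* nonzero if this box can run the requested function */
    int has_data;   /* nonzero if the needed data is local or cached there */
    double load;    /* current load: 0.0 is idle, higher is busier */
};

/* Assumed extra cost of running where the data is not already present. */
#define REMOTE_DATA_PENALTY 0.5

/*
 * Return the index of the cheapest box that supports the function,
 * or -1 if no box in the list can run it.
 */
int farm_pick_node(const struct farm_node *nodes, size_t n)
{
    int best = -1;
    double best_cost = 0.0;
    size_t i;

    assert(nodes != NULL || n == 0);
    for (i = 0; i < n; i++) {
        double cost;

        if (!nodes[i].supports)
            continue;  /* this box knows the function but cannot run it */
        cost = nodes[i].load
             + (nodes[i].has_data ? 0.0 : REMOTE_DATA_PENALTY);
        if (best < 0 || cost < best_cost) {
            best = (int)i;
            best_cost = cost;
        }
    }
    return best;
}
```

A lightly loaded box without the data can still lose to a busier box
that already has it cached, which is the trade-off between load and data
availability described above.  How the load figures are gathered and how
the request and results travel between boxes are left open here.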

I want to eliminate the issues of connection counts, task counts, user
counts, etc. that a high number of concurrent users can cause.  This
will also maximize process heuristics such as cache usage, repeated
dataset processing, heavy math processing, graphic generators, database
access, etc.

The basic topology of the farm would be a web server that handles the
web connections and static pages.  The farm would handle the processing
and pass dynamic pages back to the web server for delivery as static
pages.  The web server would determine which entry point in the farm to
send the initial request to.

I hope this makes sense.  Has anyone seen anything along these lines in
the Linux world?  My target language (initially) is C for performance
reasons.

-- 
Greg Edwards
New Age Software, Inc.
http://www.nas-inet.com
_______________________________________________
http://www.ntlug.org/mailman/listinfo/discuss