[NCLUG] Re: parallel processing users?

Matt rosing at peakfive.com
Mon Oct 17 10:46:11 MDT 2005


Several people write, but Evelyn writes first:

 > I'd be interested in a presentation (perhaps a panel discussion) on
 > clustering (compute, grid, high availability).
 > 
 > Would anyone else be interested? 
 > 
 > Matt/Bob: would you be available to present?

I could talk about parallelizing large scientific/engineering codes
from the user's point of view: the tools used, libraries, compilers,
debugging, profiling, and writing the software.





John writes:

 > Hi Matt,
 > 
 > What do you have in mind? A specific application, or just
 > interested in general?

Interested in general.  I'd like to see a group do for parallel
processing in Northern Colorado what NCLUG does for Linux.  I've been
doing parallel processing since the mid 80's, so I'm more interested in
keeping up to date and learning something outside my rather narrow niche.

 > For the last couple years I've been doing extreme parallel R&D
 > on FPGA based systems from The Dini Group (www.dinigroup.com)
 > using a pair of 64bit PCI hosted DN2000K10's with XCV2000E's,
 > as well as my own boards. I'm personally interested in
 > reconfigurable computing approaches for applications with
 > high degrees of parallelism, as this approach is 2-4 orders of
 > magnitude faster than traditional CPUs, and much more cost
 > effective for applications that are a good fit.

FPGAs are a good example.  I did a small project on SRC's SRC-6 system
and would like to compare notes with someone.  It's the "good fit"
that seems to be the crux of the problem.  I don't see taking a
100,000-line double precision floating point program and putting it on
an FPGA system, but I'd like to be proven wrong.




Mike writes:
 >  I support a cluster of machines running Linux on 3 different architectures
 > as well as Sparc Solaris. The work being done on this cluster uses EDA tools
 > which include quite a bit of simulation and regression type work. In
 > general, none of the tools in use here (100+ tools from 35+ different
 > vendors) are capable of traditional parallel or threaded operation as much
 > of what they are doing is too linear in nature or based upon timing.
 > 
 > As such, we require a large amount of horsepower but not so much like what
 > large monolithic systems or traditional clusters use. Instead, we use a
 > batch queueing system called LSF (Load Sharing Facility) from a
 > company called 'Platform Computing'. At its simplest config, it's just a batch
 > system, but it's very good at handling thousands upon thousands of jobs at a
 > time. It lets us treat any individual member of the cluster as a simple node
 > which can be removed at any point and only minimally affect performance.
 > 
 > Some newer EDA tools are doing some interesting things like splitting up
 > large jobs into groups of smaller jobs and submitting those to the queueing
 > system, some even going so far as to work directly with LSF.
 > The submitting process waits for the spawned child jobs to finish and then
 > assembles the finished product, whatever it may be. In this same cluster, we
 > have jobs that run for weeks on end as well as some that run in bunches of
 > several hundred for seconds each at a time. For what we do, it's a great
 > system.
 > 
 > So from this perspective, individual chips can't ever be 'fast enough', but
 > more and more our tools and infrastructure are supporting methods to take
 > advantage of more CPUs.

I'd like to hear more about this and compare it to what I do.  The
codes I work with could be considered one big job, but a batch queueing
model like the one you've described could be used to handle load
imbalance.  It depends on the amount of data to move, bandwidth,
latencies, etc.
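For what it's worth, the fan-out/fan-in pattern Mike describes (a
submitting process spawns child jobs, waits for them to finish, then
assembles the finished product) can be sketched without LSF at all.
Here's a toy Python version, with made-up function names, using a local
process pool as a stand-in for a real queueing system:

```python
# Toy illustration (not LSF itself) of the fan-out/fan-in pattern:
# a parent splits one big job into small independent jobs, submits
# them to a pool of workers, waits for the children to finish, and
# assembles the final result.
from multiprocessing import Pool

def child_job(chunk):
    # Stand-in for one small submitted job, e.g. a single simulation run.
    return sum(x * x for x in chunk)

def parent_job(data, n_chunks=4, n_workers=2):
    # Split the big job into smaller jobs ("fan-out").
    size = (len(data) + n_chunks - 1) // n_chunks
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(n_workers) as pool:
        # Submit the children and block until all finish ("fan-in"),
        # much as a submitting process waits on its spawned batch jobs.
        partials = pool.map(child_job, chunks)
    # Assemble the finished product from the partial results.
    return sum(partials)

if __name__ == "__main__":
    data = list(range(1000))
    print(parent_job(data))  # prints 332833500, same as the serial sum
```

The load-imbalance point applies here too: if one chunk takes much
longer than the others, the parent still waits for the slowest child,
so splitting into more, smaller jobs than workers helps the queue keep
everyone busy.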


Anyway, Evelyn, if you set up a parallel processing presentation I'll
show up and hope to see some other people.



