[NCLUG] Re: parallel processing users?

Mon Oct 17 16:08:46 MDT 2005

	John writes:

	 > Hi Matt,
	 > For the last couple years I've been doing extreme parallel R&D
	 > on FPGA based systems from The Dini Group (www.dinigroup.com)
	 > using a pair of 64bit PCI hosted DN2000K10's with XCV2000E's,
	 > as well as my own boards. I'm personally interested in
	 > reconfigurable computing approaches for applications with
	 > high degrees of parallelism, as this approach is 2-4 orders of
	 > magnitude faster than traditional CPU's, and much more cost
	 > effective for applications that are a good fit.

	Matt writes:
	FPGAs are a good example.  I did a small project on SRC's SRC-6 system
	and would like to compare notes with someone.  It's the "good fit"
	that seems to be the crux of the problem.  I don't see taking a
	100,000 line double precision floating point program and putting it on
	a FPGA system, but I'd like to be proven wrong.

Actually, it's quite rational to do so, but it takes developing a compiler
that will reduce C/Fortran into several different types of netlists to fit
the problems into FPGAs. The key is that it takes a pretty stiff relearning
of the basic principles of architecture and machine design, because what
"everybody knows" is the right or only way to build machines quickly leads
you down the wrong path for this class of machines.

I actually proposed doing so as a $30-50M project to Sandia earlier this year
with the intent to produce a 1-10 petaflop FPGA/Memory machine specifically
targeting their large simulation applications. Both the nature of the compiler,
and the architecture of the machine, are critical aspects to realizing usable
solutions, along with a critically strong dose of keep it simple.

Sandia took the proposal as a bit crazy, and treated it as something of a
straw man proposal, but the feedback in the process was a critical part of
cleaning up the proposal to try again this winter with the details ironed out.
In fact it was outright ignored at first as way too off the wall, since
everybody knows the fastest machines that can be built are only 50-100
teraflops. I had to be more than a bit pushy to force them to respond at all,
even if only to shoot the proposal down.

Language/tool support, cosmic radiation, and "nobody has done it before" are
the three primary problems. The first and last are really the hard ones,
as there are several "cool" ways of dealing with the radiation-induced SEUs
(single-event upsets).

I see a clear keep-it-simple path to actually being able to build a machine
of this class, and it would be "more than fun" to find someone who would fund
its development and build a company around the project. It would be outright
revolutionary for the whole industry.

VC dollars are always tight after a tech crash, so it might take another
year to find external funding to start the project without an institutional
partner like Sandia.

So in the meantime, I'm still investing my own money into a several thousand
FPGA proof-of-concept prototype built out of Xilinx Virtex, Virtex-2, and
Virtex-Pro parts. It's taken the last year of being patient on eBay to find
most of the rest of the parts, but I'm very close to bringing the entire BOM
in-house so I can finally actually fab such a machine as "my home supercomputer".
It's been fun, but what else does an oldie moldie UNIX guy do these days :)

The next project will be to bring together a team that can self-fund the
remainder of the development without a formal Angel/VC seed, and form the core
of a next generation team to build reconfigurable supercomputers, bypassing the
VC market. The market is getting close to ripe for this technology, and I'm
reaching the limits of what I can do as a one-man project (actually I've been
way over my head for several years, but that has been the fun part of this
learning/development curve).

John