Harnessing PC power to push database limits

Grid computing may be the business buzz phrase of the moment but for researchers at the famed Swiss physics laboratory CERN, who are busy putting together the biggest grid computer in the world, a grid is more like business-as-usual.

"It would be simply impossible for us, with the budget constraints we have, to go any other way. We believe the grid will be, for the user, significantly easier to use than what we've had in the past," says Mr Jamie Shiers, database group leader at CERN.

"We've built the most ambitious system we've ever had, with half of the number of staff, and tripled the number of users."

That's high praise indeed from a laboratory synonymous with innovative computing research. CERN was where the World Wide Web was invented, after British-born researcher Mr Tim Berners-Lee thought a graphic interface - one based on what we now call Web pages - would be a better way to communicate information than using the text-based format of the nascent internet.

It's also where some of the most cutting-edge physics research is being done. At the moment, that work includes the construction of the world's largest atomic particle accelerator, called the Large Hadron Collider (LHC).

The massive underground structure, which has tunnels more than a mile long and, in parts, rises as high as a six-storey building, will be able to smash together atomic particles in an attempt to prove or disprove the "big bang" theory of the origin of the universe.

When the collider is operational in 2007, those collisions will generate phenomenal amounts of data - some 12 to 14 petabytes of data, equal to 20 million CDs - "and that, I expect, will go up", says Mr Shiers.
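The CD comparison can be checked with a quick back-of-envelope calculation. The sketch below is illustrative only and rests on two assumptions that are not in the article: decimal units (one petabyte taken as 10^15 bytes) and a nominal CD capacity of 700 megabytes. The function name `cds_needed` is made up for this example.

```python
# Back-of-envelope check of the article's data-volume comparison.
# Assumptions (not from the article): 1 PB = 10**15 bytes,
# and a nominal CD holds 700 MB.
PETABYTE = 10**15
CD_CAPACITY_BYTES = 700 * 10**6


def cds_needed(petabytes: float) -> float:
    """Return how many CDs of the assumed capacity would hold the given volume."""
    return petabytes * PETABYTE / CD_CAPACITY_BYTES


# Print the equivalent CD count for both ends of the quoted range.
for volume in (12, 14):
    print(f"{volume} PB is about {cds_needed(volume) / 1e6:.1f} million CDs")
```

Under those assumptions, the upper end of the range, 14 petabytes, works out to roughly 20 million CDs, which matches the figure quoted.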

Analysing it will take the equivalent of 70,000 of today's PCs, which gives a sense of the scope of CERN's grid computing project.

Any concern that all those PCs will sit idle? Mr Shiers grins.

"Physicists will soak up any computing power you give them."

Grid computers, which link together armies of desktop computers to produce a bargain-basement supercomputer, are a relatively new idea in the academic and research worlds, just as in the business realm. The idea is that all the individual processors of those PCs can be roped together into a network, with each PC handling its own subset of processing tasks.
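The "each PC handles its own subset" idea can be pictured with a toy example. This is a minimal sketch, not CERN's software: threads on one machine stand in for networked PCs, the "analysis" is a placeholder sum of squares, and every function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subsets(items, n_workers):
    """Deal items out round-robin so each worker gets its own subset."""
    return [items[i::n_workers] for i in range(n_workers)]


def process_subset(subset):
    """Placeholder for real analysis work: sum the squares of the subset."""
    return sum(x * x for x in subset)


def run_on_grid(items, n_workers=4):
    """Split the work, process each subset separately, then combine the results."""
    subsets = split_into_subsets(items, n_workers)
    # Threads stand in for separate PCs; a real grid would ship each
    # subset across the network to a different machine.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_subset, subsets))
    return sum(partial_results)


data = list(range(1000))
# The combined result should equal what a single machine would compute.
print(run_on_grid(data) == sum(x * x for x in data))
```

The hard parts of a real grid, such as moving data between machines, coping with failures and securing shared desktops, are exactly what this toy leaves out.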

Until recently, grids were more experimental than practical, since harnessing together dozens - much less hundreds or thousands - of PCs and having them share the processing load meant thinking up new ways to manage the data across multiple machines.

Programmers also had to determine how to tweak the hardware for efficiency and grapple with the security issues arising out of having lots of desktop machines - some used simultaneously for other purposes - sharing processing information.

In the case of the CERN grid, the computers won't even all be based at the Swiss laboratory. Modern computer networks and the internet allow computers scattered across the globe to come together and work on data from the LHC.

That suits the nature of CERN, according to Mr Shiers, because its researchers come from all the lab's member countries and often work within their home institution while doing CERN research.

He notes that the Republic remains one of the few European countries not to be a CERN member and he doesn't really understand why.

Although government representatives have said in the past that they feel the State still benefits from CERN research without being a full member, Mr Shiers feels it cuts the Republic out of active research and participation in the large international collaborative projects that will increasingly be part of CERN's remit as the LHC kicks into action.

He also points out that CERN, which spends millions of euro every year on everything from hardware and software to office equipment, can only in rare cases buy from a state that is not a member.

"Ireland has all those hardware and software companies. But at the moment, for us to buy from Ireland is next to impossible," he says.

A few years ago, CERN looked into buying PCs from Digital in Ireland, he says. "For Digital, the profit of selling computers to CERN would have more than paid for CERN membership," he says.

In addition, CERN has a commitment to helping its member-states build research infrastructure in the early years of membership, which includes helping to attract top-level researchers to develop research team expertise within member countries.

Mr Shiers's job is right at the heart of the grid project - figuring out the best possible database structure and management techniques for a grid supercomputer that will be manipulating more data at a faster pace than ever before in computing history. One hundred megabytes of data per second will be written to disk, and that must eventually be managed across tens of thousands of individual computers.

For the task, Mr Shiers has been working with database giant Oracle, whose database software was used in CERN's original, now-dismantled particle collider. Mr Shiers provided plenty of feedback to the company as it developed its 10g database for business grid computing systems, and he notes that some of the features CERN asked for have made it into the final product that will run in the corporate environment.

He points out that a research database product and the grid concept aren't as far removed from the business world as they might seem at first. Businesses increasingly have data-heavy processes that need the kind of computing power a grid can supply very cheaply.

CERN also has to operate on tight equipment budgets and he points out that the lab routinely uses older computers, not the latest on the market. Such machines - which typically are peppered through corporate networks - are perfectly fine for a grid network, according to Mr Shiers.

Working on databases at CERN must be one of the more interesting database jobs in the world.

"It's not like running the database for someplace like Gateway," Mr Shiers agrees. "It's the fact that it's CERN. We do make exciting discoveries. We do some very cool stuff. And we're very much at the limits of database technology."

Karlin Lillington

Karlin Lillington, a contributor to The Irish Times, writes about technology