Sophia's choice

UNIVERSITY OF ULSTER: Technology developed between an Irish and a Russian university could have great internet search potential…

A new technology developed in Northern Ireland could make everyone's life a little easier when searching for information on computer networks or the internet. Anyone who has ever had the joyless task of trying to find a document on their computer's hard drive, or trawling the web for information on all but the most specific of subjects, will have found themselves cursing the limitations of the search engines available to them.

Take the example of a subject such as heart disease - probably quite a popular search, and obviously an equally popular subject for websites, considering that Google returns more than 27 million hits for it. Not very helpful.

But there is hope out there for frustrated searchers in the form of unique new search technology devised jointly by researchers at the University of Ulster and St Petersburg State University, Russia, which promises to revolutionise the way we search for information within organisations.

The problem up until now has been that conventional search engines throw up thousands of results in response to a keyword query. It is then up to the user to determine which, if any, of these are relevant to what they are looking for. This problem is exacerbated if the user does not express their information needs adequately in the query, because the result list is then heavily polluted with information that is not focused on the topic they are interested in.

But the new Sophia intelligent search technology is a much more discriminating tool. It collates information according to key themes which it automatically discovers within the information it searches through. Users can then determine which theme is closest in meaning to their information needs, and browse the pages collated under that theme.

A simple example is the query "Java". Using a conventional search engine, most of the responses could be documents related to web programming. Users then have to scroll through countless pages to discover other meanings, such as an Indonesian island or a type of coffee. The Sophia technology instantly tells the user that there are at least three different meanings associated with "Java" within their organisation, thereby alerting the user to these facts and allowing them to choose the context that best meets their current information needs for further exploration.
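The idea of grouping the hits for an ambiguous query such as "Java" can be illustrated with a short Python sketch. This is not Sophia's algorithm, which the article does not describe; it is a deliberately simple greedy grouping by word overlap, and the sample documents, the stopword list, the function names and the 0.2 threshold are all assumptions made up for illustration.

```python
import re

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "for", "to", "with"}

# Hypothetical document collection.
DOCS = [
    "Java programming tutorial for web developers",
    "Web programming with Java servlets",
    "Java is an island in Indonesia",
    "Travel guide to the island of Java, Indonesia",
    "Java coffee beans and roast",
    "Brewing Java coffee: roast and grind",
    "Python scripting basics",
]

def tokens(text, query):
    """Lower-case word set of a document, minus stopwords and the query term."""
    found = set(re.findall(r"[a-z]+", text.lower()))
    return found - STOPWORDS - {query.lower()}

def group_by_theme(docs, query, threshold=0.2):
    """Greedy single-pass grouping of the documents that mention the query.

    Each matching document joins the first group whose pooled vocabulary it
    overlaps with (Jaccard similarity >= threshold); otherwise it starts a
    new group.
    """
    groups = []  # each group is {"vocab": set of words, "docs": list of texts}
    for doc in docs:
        if query.lower() not in doc.lower():
            continue  # plain keyword filter first
        doc_words = tokens(doc, query)
        for group in groups:
            union = doc_words | group["vocab"]
            overlap = len(doc_words & group["vocab"]) / len(union) if union else 0.0
            if overlap >= threshold:
                group["docs"].append(doc)
                group["vocab"] |= doc_words
                break
        else:
            groups.append({"vocab": set(doc_words), "docs": [doc]})
    return groups

if __name__ == "__main__":
    for number, group in enumerate(group_by_theme(DOCS, "Java"), start=1):
        print(f"Theme {number}:")
        for doc in group["docs"]:
            print("   ", doc)
```

On this toy collection the six documents that mention "Java" fall into three groups (programming, the island, coffee), so a user could pick a context before reading any individual result. A production system would need far more robust language handling than word overlap.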

At present the new technology can be applied by individual companies to search their own data collections and other sources of information to which they have access.

"Sophia is suited to companies in sectors like the media, life and health sciences, pharmaceutical research and production, research bodies, legal firms, patent offices or financial institutions," says Dr David Patterson, one of the technology's creators and chief executive of Sophia Search, the company established in August 2007 to commercialise it. "All such companies have large amounts of unstructured information and can benefit from Sophia's unique approach. Our search engine would enable a user to access that information in an intelligent manner."

Patterson argues that conventional search engines make two basic assumptions - that the user knows exactly what they are looking for, and that they can prompt the search engine with sufficiently detailed keywords to express those information needs.

"We question the validity of those assumptions. Research shows that users reformulate their queries five times on average before finding the information they are seeking. Often it is only after they scroll through pages and pages of information that they can clarify exactly what they are looking for, using conventional approaches. We enable them to focus on the information they need more effectively," he explains.

"Sophia automatically searches all the information in the data collections open to it, and then organises that information according to key themes which it then presents to the user. The user can determine which theme is closest in meaning to what they are looking for, and then browse all the documents within that theme," Patterson continues. "Sophia searches by meaning, not by keywords. It brings automatic structure to unstructured information."

The technology effectively understands meaning in information, according to Niall Rooney, a member of the Sophia Search development team. "It enables users to search based on the meaning of what they are looking for, presenting more useful information as a result," he says. "You don't get information back simply because it contains the keyword you used in your query; you get information back because it is relevant to the topic and meaning of what you are looking for. This is what we call a thematic clustering approach. By organising the results according to themes which are relevant to your needs, you will get information which may not explicitly contain your query terms."
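How a theme can surface a document that never mentions the query term can be shown with a minimal Python sketch. It is only an illustration under stated assumptions, not the thematic clustering method Rooney refers to: here a "theme" is simply the set of words shared by at least two keyword-matching documents, and the sample documents, the stopword list, the function names and the `min_shared` parameter are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "is", "for", "to", "with"}

# Hypothetical document collection.
DOCS = [
    "Heart disease risk rises with high cholesterol and blood pressure",
    "Heart attack patients often have raised cholesterol and blood pressure",
    "Statins lower cholesterol and blood pressure",
    "Quarterly sales report for the northern region",
]

def words(text):
    """Lower-case word set of a document, minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def thematic_search(docs, query, min_shared=2):
    """Return (keyword hits, theme-related documents lacking the keyword)."""
    q = query.lower()
    # Step 1: ordinary keyword matching.
    hits = [d for d in docs if q in words(d)]
    # Step 2: words found in at least two keyword hits stand in for the theme.
    counts = Counter(w for d in hits for w in words(d) if w != q)
    theme = {w for w, n in counts.items() if n >= 2}
    # Step 3: other documents that share enough theme words are related,
    # even though they never contain the query term.
    related = [
        d for d in docs
        if q not in words(d) and len(words(d) & theme) >= min_shared
    ]
    return hits, related

if __name__ == "__main__":
    hits, related = thematic_search(DOCS, "heart")
    print("Keyword hits:", hits)
    print("Related by theme:", related)
```

With the query "heart", the statins document is returned as related because it shares "cholesterol", "blood" and "pressure" with the two keyword hits, while the sales report is not.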

Searches can be split into two types of task.

The first is recovery, where you know a piece of information is in your repository, and you are trying to create a query that will retrieve it. The second is discovery, where you are unaware of the existence of an important piece of information but it is vital that it is found, based on your query.

"Conventional search can only really address the issue of recovery but not discovery," notes Rooney. "If you don't know it's there, how can you form a query with the right terms to find it? Because Sophia understands meaning, it can address both the recovery and discovery aspects of search."

This offers significant benefits to companies and other organisations, according to David Patterson. "You don't just save time and money because you can search more effectively within your data collection using Sophia, but because you improve your decision-making processes as a result of understanding what that information means," he says.

The new technology arose out of a joint research project between an Ulster team headed by Dr Patterson and a team at St Petersburg State University headed by Dr Vladimir Dobrynin, who is now chief technology officer with Sophia Search. The teams spent four years bringing the technology to the stage where it had commercial potential.

In 2005 they were given a £120,000 (€156,000) award from Invest Northern Ireland to commercialise the technology, and they established Sophia Search as a result. In October 2007 the team won the 25K Entrepreneur Award, organised by the NI Science Park, which provides a platform for talented students and researchers in Northern Ireland's universities to commercialise their research ventures.

The technology is currently being trialled in a number of organisations and the company anticipates a finished product being ready for market by the autumn of this year.