Specialize your searching
By Pita Enriquez Harris
From: Information World Review
Back in 1994, when I was working as a postdoctoral researcher in Oxford, I remember a conversation I had over the Internet with an Australian scientist who was in the process of downloading the newly minted Netscape Navigator beta.
"Take it from me," he told me, "These new graphic browsers are going to change the world. In ten years Big Business will own the Internet. We university researchers won’t get a look in."
I wasn’t sure that I believed the latter sentiment, but I too was sure that the Web browser would alter things forever. Anyway, I reasoned, it would create all sorts of business opportunities, maybe even some for me…
Two months later the same guy told me that he was packing his desk away into a box.
"I’m leaving the University. Setting up a Web design agency with a friend."
He wasn’t the only person at the time muttering dire warnings about the commercialisation of the Web. At least he took advantage of it, but many other people seemed to positively resent the worldwide adoption of the ‘Net by non-techies which was by then becoming commonplace.
Much of the resentment was simple jealous protectionism of a previously small community. Once in a while, however, people did point out the very obvious anxiety – if the Web becomes swamped with non-academic information, how will researchers find the information they need?
Then, in April 1998 and again in July 1999, papers were published in Science and Nature that confirmed the suspicions of people who regularly searched the Web for scientific information. The major commercial search engines covered only a smallish fraction of the Web (the latest figures show Northern Light as the largest search engine, indexing 16% of the total estimated Web). Moreover, Web sites containing scientific information are much less likely to be indexed than those containing commercial information.
So what is a poor scientist to do? On the one hand, the explosion in popularity of the Web has resulted in most journals being available online, complete with freely searchable contents tables, archives and content-relevant email alerts. Engineers, chemists and bioscientists now have their own dedicated Web communities (Engineering Information Village, ChemWeb and BioMedNet). But on the other, like everyone else, scientists are more and more frustrated that Web searches yield more and more ‘results’, haystacks in which the useful needles are effectively buried. Maybe they are a little more upset than most professional communities when they find out that the information of interest to them may never even register in those search engines they use so trustingly – after all, the Web was created by scientists for scientists.
These types of setbacks are likely, however, to prove transient. People have proposed universal, Internet-based citation indices linking every piece of academic research. People are working to improve standards in metadata such as the Dublin Core. The Internet 2 project is working to improve the possibilities of the Internet for the higher education and research communities.
Big projects, big money, big time delay.
The cost of storing and indexing information will decrease, just as the cost of computer memory continues to decrease. One day, search engines may easily cope with indexing the entire Web, every last page. Until then, we need an interim solution: technology that exists right now to make things a good deal better. That solution lies in the growing use of specialist search engines in parallel with better meta-searching technology.
Society organises information according to how it can most easily use that information, just as it has organised food into supermarkets, books into bookshops and movies into cinemas. Even the mall obeys these rules, and retail category managers see to it that within a supermarket, so do groceries.
Projects such as the InvisibleWeb and Internets lead the way in the development of a searchable interface for locating specialist search engines. Unlike meta-search engines, these engines do not search other search engines but do help you to locate a suitable resource for your search. The largest collection is to be found in the InvisibleWeb, which boasts 10,000 search engines, organised by category.
It is not a big deal to create a specialist search engine. For universities, there are cheap or free software licenses and the cheap labour of their IT staff. For information providers with the information already in database format, giving search access from the Web need not be a huge task. Once the task is done, however, like all acts of Web publishing, the key is to get the word out and tell the search engines you exist. What better than to tell a search engine of search engines about your new search engine?
The success is in the planning; the tower must be built on the proper foundations. A search engine for each academic department; each one of these search engines grouped into collections of similar subjects; a meta-search engine to search custom-built groups of engines. The Web-based medical community already boasts one excellent example of this approach – the Mednets site allows you to browse a list of searchable databases in each medical specialty or to opt for searching the entire collection in one go.
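The layered arrangement described above (one engine per department, engines grouped by subject, and a meta-search over a chosen group) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the engines are toy in-memory functions, and every name and URL here (make_engine, meta_search, the example.org addresses) is hypothetical rather than taken from Mednets or any real product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Result:
    url: str
    title: str
    source: str  # name of the specialist engine that returned the hit


# A specialist engine is modelled as a function from a query to a list of results.
Engine = Callable[[str], List[Result]]


def make_engine(name: str, documents: Dict[str, str]) -> Engine:
    """Build a toy engine: case-insensitive substring match on document titles."""
    def search(query: str) -> List[Result]:
        q = query.lower()
        return [Result(url, title, name)
                for url, title in documents.items()
                if q in title.lower()]
    return search


def meta_search(query: str, groups: Dict[str, List[Engine]], subject: str) -> List[Result]:
    """Send the query to every engine in one subject group and merge the hits.

    A URL returned by several engines is kept only once (first engine wins).
    """
    seen = set()
    merged: List[Result] = []
    for engine in groups[subject]:
        for result in engine(query):
            if result.url not in seen:
                seen.add(result.url)
                merged.append(result)
    return merged


# Hypothetical departmental engines, grouped by subject.
cardiology = make_engine("cardiology", {
    "http://example.org/c1": "Heart valve imaging",
    "http://example.org/shared": "Imaging in clinical trials",
})
radiology = make_engine("radiology", {
    "http://example.org/r1": "MRI imaging protocols",
    "http://example.org/shared": "Imaging in clinical trials",
})
physics = make_engine("physics", {
    "http://example.org/p1": "Neutron imaging",
})
GROUPS = {"medicine": [cardiology, radiology], "physical sciences": [physics]}

for hit in meta_search("imaging", GROUPS, "medicine"):
    print(hit.source, hit.url, hit.title)
```

Restricting the search to one subject group is the point of the design: a query for "imaging" in the medicine group never touches the physics engine, so the haystack stays small.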
Meta-search offers more than the bonus of hitting several search engines in one go; the better meta-search software has client-side capabilities which can delve further into the context of the search results. This means that you can see your results organised according to content. Dataware’s Knowledge Query server has this feature, as does Intelliseek’s BullsEye.
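As a rough idea of what organising results according to content might look like on the client side, here is a minimal Python sketch. It assumes a simple keyword-overlap rule and a hand-made topic vocabulary; both are illustrative assumptions and do not describe how Knowledge Query or BullsEye actually work. The names TOPICS and organise_by_content are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical topic vocabulary: each topic is a set of lower-case keywords.
TOPICS: Dict[str, Set[str]] = {
    "genetics": {"gene", "genome", "dna"},
    "imaging": {"mri", "scan", "imaging"},
}


def organise_by_content(results: List[Tuple[str, str]],
                        topics: Dict[str, Set[str]] = TOPICS) -> Dict[str, List[str]]:
    """Group (url, snippet) pairs under the topic whose keywords overlap most.

    Snippets sharing no keyword with any topic are filed under "other".
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for url, snippet in results:
        words = set(snippet.lower().split())
        # Choose the topic with the largest keyword overlap.
        best = max(topics, key=lambda t: len(words & topics[t]))
        label = best if words & topics[best] else "other"
        grouped[label].append(url)
    return dict(grouped)


merged_results = [
    ("http://example.org/a", "Mapping the human genome"),
    ("http://example.org/b", "MRI scan protocols"),
    ("http://example.org/c", "Conference travel grants"),
]
for topic, urls in organise_by_content(merged_results).items():
    print(topic, urls)
```

Real products would need something far more robust than whitespace tokenising and fixed keyword lists, but the shape is the same: the merged result list goes in, and a set of content-based folders comes out.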
Scientists rely on the Web to circulate research information. As it has grown there can be little doubt that the usefulness of the Web as an environment has increased, even if the Web search engines have become less reliable. Investing in building good quality, specialised search engines and the meta-search technology to interrogate them will benefit all academics and researchers. Then maybe, who knows, when the search engines can finally boast all-singing, all-dancing coverage of the entire Web and real relevancy of results, we won’t need them…



