Of needles and haystacks: successful Internet searching
Search engines don't work well on the almost unlimited, often unstructured resources of the Internet, says J. Pemberton: "Think of a search engine as a dog whistle. Blow it in a kennel and you'll just attract dogs. Blow it in a zoo, and you'll get a few dogs, plus many other creatures with good high frequency hearing: maybe some lions or tigers, hyenas, coyotes, timber wolves, perhaps a moose. . . . The point is this: the Internet is a zoo." [Jeffery K. Pemberton's column in Online User Magazine, May/June 1996.]
Nothing much has changed to improve this assessment of a typical Internet search in the intervening years: some search engines now interpose human intellect to arrange sites into directories by (very) broad subject; some are starting to use metatags (keywords) in more consistent ways; but basically the Internet is still a zoo, only with five times as many animals as it had in 1996.
Attracting just the animal species you want means appreciating four broad areas:
- How to construct a question
- How search engines work
- How to evaluate what you find
- When to call it quits (or not even bother starting!) and look for an alternative source for your information.
In this two-part series on Internet searching, we will look first at how to construct a question and how search engines work. Our second article will then pick up on how to evaluate your findings and when to look for alternative information sources.
How To Construct A Question
It is often said the answer is only as good as the question, and this is never truer than in Internet Searching.
FORMULATING THE SEARCH QUERY
- Make a list of RELEVANT questions
- Extract KEY PHRASES and KEYWORDS
START BIG AND BROAD (THEN NARROW CAUTIOUSLY)
Effective searching requires a balance between a broad reach and a careful aim. The searcher must cast a net far enough to capture the most important information and then, once it is safely contained, cull the results so that only the best information remains. This is quite the opposite of using a library catalog, where all the information is already structured within a preset group of subject headings and all you have to do is identify the appropriate subject.
Too many searchers narrow their search prematurely, thereby condemning themselves to the boundaries and ideas of their prior knowledge.
After you have conducted several browsing searches, you may begin to focus your search more sharply by adding key words to your search in order to limit hits to pages distinctly relevant to your inquiry.
Careful selection and addition of key words which are discriminating, distinguishing and distinctive puts the spotlight on just those discrete pages which match your interests. Your key words differentiate, separate, and reserve only the best pages.
How different would your results be with each of the following words?
- internal audit
The more you particularize your search, the better your results. Adding particulars and specifics excludes all pages which do not contain those items. The advantage is sharp focus. The danger is bypassing, missing or overlooking key data.
Sometimes it pays to alternate between narrowing and broadening. After zeroing in with some particulars, zoom back out and try some different particulars.
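The narrowing effect of adding particulars can be illustrated with a minimal Python sketch. The page texts and the helper name `pages_matching` are invented for illustration only; real search engines match and rank pages in far more elaborate ways.

```python
# Hypothetical mini-collection of page texts (invented for illustration).
PAGES = [
    "internal audit checklist for small firms",
    "internal audit software vendors compared",
    "external audit fees survey",
    "internal communications policy template",
]

def pages_matching(pages, keywords):
    """Return the pages that contain every keyword (case-insensitive substring match)."""
    wanted = [k.lower() for k in keywords]
    return [p for p in pages if all(k in p.lower() for k in wanted)]

# Each added keyword excludes every page that lacks it.
broad = pages_matching(PAGES, ["audit"])
narrow = pages_matching(PAGES, ["audit", "internal"])
narrower = pages_matching(PAGES, ["audit", "internal", "checklist"])

print(len(broad), len(narrow), len(narrower))  # prints: 3 2 1
```

Note that the narrowest search drops the software comparison page, which might have been useful. That is the danger of narrowing too soon, and the reason for zooming back out and trying different particulars.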
BROWSE BEFORE GRAZING
Early search efforts are meant to provide an overview of the information landscape relevant to the investigation at hand, much like petroleum prospectors flying over a region and noting the terrain, seeking convergence (a combination of geological elements in one location which hints at the presence of oil).
While it is tempting to start right off opening pages and looking for information, it is more effective to wait until you have scanned the brief descriptions most search engines provide for the hits. Scanning the top 100 hits provides a basis for revising the original search to accomplish two goals:
1. Exclude whole categories of irrelevant sites
2. Target more directly those pages and sites most likely to deliver a great return.
How Search Engines Work: the questioning features of search engines
The more powerful the search engine, the more important the syntax - the rules governing how you enter your search query. Because few people stop to read and learn these rules, they end up with crude and clumsy searches.
For example, some search engines care about CAPITAL LETTERS and punctuation.
Others ignore them both. If your search is for information on
Another example: when you want an exact phrase such as ‘HIGH NET WORTH INDIVIDUALS’, some search engines require quotation marks around the words which belong together, while others do not care.
Search logic is also important. For example, some search engines support Boolean searches (the use of the words AND, OR and NOT) to target just certain pages.
You can usually find the syntax and the instructions for logical searching for any search engine in its Help pages. Sites that list and compare the syntax and logic of the search engines, such as Search Engine Watch (http://searchenginewatch.com/facts/ataglance.html), are extremely useful.
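As a rough sketch of what exact-phrase and Boolean matching mean, the Python below applies AND, OR and NOT conditions to a few sample page titles. The function `matches`, its parameter names and the sample titles are assumptions made for illustration; no particular search engine works exactly this way, and each engine has its own query syntax.

```python
def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean match on one page.

    all_of  -> every term must appear (AND)
    any_of  -> at least one term must appear, if any are given (OR)
    none_of -> no term may appear (NOT)
    A multi-word term behaves like a quoted exact phrase.
    """
    lowered = text.lower()

    def has(term):
        return term.lower() in lowered

    return (
        all(has(term) for term in all_of)
        and (not any_of or any(has(term) for term in any_of))
        and not any(has(term) for term in none_of)
    )

# Invented sample titles.
pages = [
    "Tax planning for high net worth individuals",
    "Estate planning: advertisement for high net worth individuals",
    "High net worth individuals and art collecting",
    "Tax tips for individuals with high debt and low net worth",
]

# "high net worth individuals" AND (tax OR estate) NOT advertisement
hits = [
    p for p in pages
    if matches(
        p,
        all_of=["high net worth individuals"],
        any_of=["tax", "estate"],
        none_of=["advertisement"],
    )
]
print(hits)  # prints: ['Tax planning for high net worth individuals']
```

The last sample title contains all four words of the phrase but not in sequence, so the exact-phrase condition rejects it.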
As Info-Glut has grown to be more and more of a problem, the search engines have competed fiercely to offer the best tools to support you in your sorting and sifting, and yet many folks ignore these powerful extra tools and features.
HotBot's Super Search allows you to request particular domains, particular dates, particular levels of a Web site, particular countries of origin, particular types of files and as many as 100 results at a time. These can be very helpful search features. Google allows you to search for just images and to view those forever-downloading PDF files as instantly loading HTML.
If you want to eliminate most commercial sites, you might limit your search to .gov or .org, for example.
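The same idea can be applied to a list of results you have already collected. The sketch below keeps only URLs whose host name ends in a chosen domain suffix. The helper name `in_domains` and the example URLs are hypothetical, and the search engines' own domain filters may behave differently (for instance with country-specific domains).

```python
from urllib.parse import urlparse

def in_domains(url, suffixes=(".gov", ".org")):
    """Return True if the URL's host name ends with one of the given suffixes."""
    host = urlparse(url).hostname or ""
    return host.endswith(tuple(suffixes))

# Invented example URLs.
results = [
    "http://www.example.gov/report",
    "http://www.example.org/guide",
    "http://shop.example.com/deals",
]

non_commercial = [u for u in results if in_domains(u)]
print(non_commercial)
# prints: ['http://www.example.gov/report', 'http://www.example.org/guide']
```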
Power search techniques explore all of the features of a chosen search engine in advance of real searching in order to apply these extra tools with skill when they are needed.
With experience you’ll get a feel for which search engine delivers consistently for your type of enquiries. Learn it. Bookmark it. Stay with that search engine until someone invents a faster and better information trap (which may not take long).
Of needles and haystacks: successful Internet searching - Part II
In Part I we considered how to frame effective questions, and looked at the way search engines work. Next we need to consider how to evaluate your findings, and when it pays to consider seeking out alternative information resources.
Evaluating Your Findings
Evaluating your findings includes making sure you have selected appropriate sources in the first place, considering the costs, and then checking for currency, validity and reliability.
SELECTING THE APPROPRIATE SOURCE
Some of the best sites on the Internet are not indexed by the search engines. It pays to go to such Web sites and search them directly if they would be the leading source of information on a particular topic.
Many public agencies and news media Web sites do not permit access (perhaps for security reasons) to the spiders of search engines. As a result their contents often elude the search engine's efforts as well as our own search attempts. This section of unindexed resources is referred to as the ‘invisible web’.
For example, if you are looking for news of current events such as Enron, the best source is often a newspaper site such as the New York Times or the Wall Street Journal, where you can search for the topic directly rather than wasting time with a global search engine which would overlook their offerings.
CONSIDER THE COST
In a billable environment time really is money. Your firm wants you to spend it wisely, so follow this simple rule of thumb: if you look for something on the Internet (or in any other resource for that matter) and don’t find it within ten minutes, stop searching. Your Web use may be free, but if you spend 45 minutes searching fruitlessly through irrelevant information, there is definitely an expense to the firm, and it's not one the powers-that-be will be happy to pay. What you need may not be available or may be squirreled away somewhere you haven’t heard about.
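To put a number on the ten-minute rule, here is a small worked calculation in Python. The hourly rate of 200 is a purely hypothetical figure chosen for illustration; substitute your own firm's rate.

```python
HOURLY_RATE = 200.0  # hypothetical billing rate per hour; not a figure from this article

def search_cost(minutes, hourly_rate=HOURLY_RATE):
    """Cost of time spent searching, rounded to two decimal places."""
    return round(minutes / 60 * hourly_rate, 2)

print(search_cost(10))  # prints: 33.33
print(search_cost(45))  # prints: 150.0
```

At that assumed rate, a fruitless 45-minute search costs the firm about four and a half times what stopping at the ten-minute mark would have.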
Also, there are some types of information that are so costly to collate that only large consulting firms have the resources to pull all the information together, analyze it and repackage it. This is generally the case with market research reports and benchmarking materials.
It can be far more cost effective to purchase these off-the-shelf, either direct from the firms that produce such reports or from the various online services and aggregators (e.g. FirstResearch, Dialog, Profound, MarketResearch.com, MindBranch, ECNext etc.) who have purchased the right to list and sell them on behalf of the original compiler. In the latter case, prices can vary quite a bit, as can whether you can purchase the entire report or just a page.
EVALUATE YOUR INFORMATION
There is no intellectual review required for posting to a website. Anyone can post information on the Internet and almost none of it goes through any kind of screening. You could be looking at a page of complex financial information posted by a financial expert with years of experience and a list of degrees, or it might have been posted by a failing graduate student. How do you know? Much of what's available on the Internet is unreliable, and it's often difficult to tell for sure. Frequently, inaccuracies aren’t through any malicious intent, either. Most often they result because some generous soul who is trying to be helpful posts something and then simply forgets to keep it current. Four years later, can you tell when it was last revised? Sometimes, however, we users trip ourselves up by assuming that information from a reliable source, especially a government entity, must be completely current because it’s in an electronic format. In fact, most government information, the Code of Federal Regulations for instance, is only as current as its print equivalent.
If you are ever tempted to rely on information from the Internet without evaluating it or verifying it against a reputable print resource, imagine passing it to a partner who gives it to a client who comes back later to let you know that it was incorrect. Always check to see how current a page is. Always look for contact information for the page author. If there’s no one you can contact with a complaint, it’s a very bad sign.
Consider the following questions:
- Who wrote the pages?
- What does the author have to say about the subject?
- Does the author have the authority to present this information?
- Does the author/publishing organization have anything to gain by presenting this information?
- When was the site created and updated?
- Where does the site's information come from?
- Is the information consistent with other published material on the topic?
- Why is the site useful or important?
- Can the information be verified in book, periodical or other sources?
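Two of these checks, currency and author contact details, are mechanical enough to sketch in Python. The function name `red_flags`, its parameters and the one-year threshold are assumptions made for illustration; the remaining questions call for human judgment and are not modeled here.

```python
from datetime import date

def red_flags(last_updated, has_contact, checked_on, max_age_days=365):
    """List simple warning signs for a page.

    last_updated -> date of last revision, or None if the page does not say
    has_contact  -> True if the page gives a way to contact its author
    checked_on   -> the date on which you are evaluating the page
    max_age_days -> hypothetical staleness threshold (an assumption, not a rule)
    """
    flags = []
    if last_updated is None:
        flags.append("no revision date")
    elif (checked_on - last_updated).days > max_age_days:
        flags.append("not revised recently")
    if not has_contact:
        flags.append("no author contact")
    return flags

# A page last revised four years before the day it is checked, with no contact details.
flags = red_flags(date(1998, 3, 1), has_contact=False, checked_on=date(2002, 3, 1))
print(flags)  # prints: ['not revised recently', 'no author contact']
```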
When You Should Consider Looking Elsewhere
Even expert searchers fall into the lazy habit of using the Internet exclusively when trying to find answers to their research questions. It's so easy! It's right there on your desktop! You almost always find something. But the Internet is just one more tool in an already extensive array.
A good public or college library often provides information in more depth and may have resources which are easier and faster to use, not to mention more current and reputable. You might also be able to access information databases not available without huge subscription costs. Libraries often provide limited forms of access to these resources. And you can always call on highly trained research professionals for assistance.
Sometimes the Internet gets you part of the way, but ultimately a direct phone call is a better option. Remember that electronic searching, especially Internet searching, is not always the best way to find what you seek. Take advantage of all the tools at your disposal.
Business Roadmap provides practical business advice for small to medium sized businesses that are trying to overcome problems, or are trying to achieve their vision and full potential. Visit the Business Roadmap website for further information.