The good news is that the World Wide Web consists of a large number of sites
containing a vast and diverse body of information. That's also the bad news. The
whole of the WWW has been likened to a flea market: lots of good stuff scattered
around in haphazard fashion.
So the natural question comes up: how do you find the particular piece of
information you are looking for? One way would be to use a directory much
like the telephone yellow pages. Such directories exist, but by the time
they are printed, they are out of date. The preferred way, therefore, is
to use the web itself and find information by electronic means. Search engines
provide the means.
Indexes
The original web resource listings started out as "favorite sites"
pages and built from there as people submitted their pages to the index
site for inclusion in the list. As you might imagine, these lists quickly
became quite large. You can still find index sites where you start with
high-level topics and drill down through listings that are progressively
more detailed. Most, however, now use search routines: you provide key words
relating to the information you are looking for, and the search engine finds
sites in its database that match that information. You are then presented
with a list of these sites, with links for instant jumps to sites of interest.
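As a rough illustration of that key word lookup, here is a minimal sketch in Python. It assumes the index site's database is just a small in-memory table; the site names, descriptions, and function names are invented for the example and do not describe any real engine.

```python
from collections import defaultdict

# Hypothetical listings: site address -> short description (invented data).
LISTINGS = {
    "golf.example": "golf courses and golf holidays in scotland",
    "fish.example": "fishing trips in scotland",
    "pc.example": "computer knowledge tutorials",
}

def build_index(listings):
    """Map each word to the set of sites whose description contains it."""
    index = defaultdict(set)
    for site, description in listings.items():
        for word in description.lower().split():
            index[word].add(site)
    return index

def lookup(index, keywords):
    """Return sites matching any key word, best match (most words) first."""
    counts = defaultdict(int)
    for word in keywords.lower().split():
        for site in index.get(word, ()):
            counts[site] += 1
    # Sort by number of matched words (descending), then by name for stability.
    return sorted(counts, key=lambda site: (-counts[site], site))

INDEX = build_index(LISTINGS)
```

Calling lookup(INDEX, "scotland golf") lists the golf site ahead of the fishing site, because it matches both words rather than one.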
Search Engines
How do search engines get the information in their database? It depends
on the engine. Some rely on inputs from web creators who submit requests.
There are even central sites that collect various information from web page
authors and then submit that information to the various search engines they
serve.
One of the more common methods of collecting information now, however, is
via an electronic search by the engine itself. Viewed from a distance, the
WWW would look much like a web: a mesh of interconnected information nodes.
To search this vast information web, search engine sites use "spiders"
that move from node to node, visiting each one and cataloging what's
found there. (Some collection engines are also referred to as "bots,"
short for robots.) But as the web grows, this method of collecting information
is starting to bog down. It now takes up to a month just to make a single
pass through the web. New methods will have to be developed.
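The node-to-node movement of a spider can be sketched as a breadth-first walk over links. This is only an illustration: the link table below is invented and stands in for the web, and nothing is actually fetched from a network.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> pages it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}

def crawl(start, links):
    """Visit every page reachable from start, breadth-first, each exactly once."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)  # a real spider would fetch and catalog the page here
        for target in links.get(page, []):
            if target not in seen:  # skip pages already queued or visited
                seen.add(target)
                queue.append(target)
    return order
```

The "seen" set is what keeps the spider from circling forever when pages link back to one another. It also hints at why a full pass takes so long: every page has to be visited once before the pass is complete.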
Search engines are a valuable resource, but if you feed them inappropriate
key words you can be left as much in the dark as when you started. Getting
50,000 hits for a particular search strategy is hardly effective (although,
even with that many responses they are usually ranked so you often don't
have to wade through all of them). That's where knowing how to search can
come in handy.
Search Strategy
There is no "standard" search engine language, but there are a
few common concepts that can help you with almost every search site you might
use.

First, understand that most engines consider each word unique. If you want
to search on a phrase, you have to tell the engine, usually by
enclosing the phrase in quotes. To find Computer Knowledge, therefore,
you would enter "Computer Knowledge" instead of the phrase
with no quotes (which would find pages with either the word computer
or the word knowledge in them, a considerable number).

If there is any doubt, use lower case throughout your search. Search
engines understand this to mean you will accept any capitalization
in response. Also, don't forget that "*" will often mean
"anything," so "search*" would match search,
searching, and other endings.

You can often combine words in a variety of ways using the Boolean operators
AND, OR, and NOT. If you just place a list of words in the search
dialog, most search engines will assume an OR between each one. This
results in the maximum number of hits for the topic(s) you are interested
in but often returns much chaff with the wheat. The AND and NOT operators
help here: AND requires both words to be present, and NOT excludes pages
containing the word that follows it. Your searches can get fairly
complicated if you wish.

Frequently a site will also allow you to specify that a certain term be a mandatory
part of the search instead of optional. Some, in advanced searches,
also allow you to specify a weight for a specific term (e.g., consider
one word as being three times as important as another).

For example: With AltaVista, placing a plus sign "+" before
a word means that word is to be considered a mandatory part of the
search, and a minus sign "-" means that word is to be specifically
excluded from the search. E.g., "+scotland +golf -fishing"
would find all pages that mention both Scotland and golf but not
fishing.

Another helpful way to cut down on responses is to use any proximity commands
the search engine might have. Some will allow you to specify that
your search words must be within "X" number of words of
one another in order to count as a hit. See the search engine help
to find out whether the engine you use has this feature.

If you are not multi-lingual and the search engine you use has the option
of displaying a single language or all languages, pick the single
language you understand best. That will eliminate many pages you would
otherwise have to skip over because you don't understand them.

Finally, some search engines have reserved terms for specific searches. Use
them if, for example, you are searching for those pages on the web
that link to your page(s). Refer to the help pages of the search engine
to find the terms for that engine.

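To make several of these concepts concrete, here is a small Python sketch of how a query using quoted phrases, a trailing "*", and the "+" and "-" signs might be checked against one page of text. It is an illustration under stated assumptions, not how AltaVista or any other engine actually works: the function names are invented, and the rule that unmarked words are joined by an implicit OR simply follows the description above.

```python
import re

def parse_query(query):
    """Split a query into required (+), excluded (-), and optional terms.

    A quoted phrase stays together as one term; everything is lowercased so
    that any capitalization is accepted.
    """
    required, excluded, optional = [], [], []
    for token in re.findall(r'[+-]?"[^"]+"|\S+', query.lower()):
        sign = token[0] if token[0] in "+-" else ""
        term = token[len(sign):].strip('"')
        if not term:
            continue  # ignore a stray "+" or "-" with nothing after it
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def term_matches(term, text):
    """True if term (a word, a phrase, or a prefix ending in *) occurs in text."""
    pattern = r"\s+".join(
        re.escape(word[:-1]) + r"\w*" if word.endswith("*") else re.escape(word)
        for word in term.split()
    )
    return re.search(r"\b" + pattern + r"\b", text.lower()) is not None

def matches(query, text):
    """Apply the mandatory, excluded, and optional terms of query to text."""
    required, excluded, optional = parse_query(query)
    if any(term_matches(term, text) for term in excluded):
        return False
    if not all(term_matches(term, text) for term in required):
        return False
    # With no mandatory terms, fall back to the implicit OR between words.
    if required or not optional:
        return True
    return any(term_matches(term, text) for term in optional)
```

With this sketch, matches("+scotland +golf -fishing", page) is true only for a page that mentions Scotland and golf and never mentions fishing, while matches('"computer knowledge"', page) requires the two words to appear side by side in that order.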
Over time and with experimentation you will find the
one or two search engines you prefer and find most useful. Keep them in
your bookmark file as you will use them often.
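The proximity commands mentioned above are easy to picture in code. The following Python sketch assumes a hypothetical function, within, that counts how many words apart two search words are; real engines each have their own syntax for this, so check the help pages.

```python
import re

def within(text, first, second, max_gap):
    """True if first and second occur within max_gap words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_at = [i for i, word in enumerate(words) if word == first.lower()]
    second_at = [i for i, word in enumerate(words) if word == second.lower()]
    # Compare every position of one word with every position of the other.
    return any(abs(i - j) <= max_gap for i in first_at for j in second_at)
```

For the text "golf courses in the north of scotland", the words golf and scotland are six words apart, so a limit of 3 rejects the page and a limit of 6 accepts it.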
The Future of Searching
As indicated earlier, the simple index search
and construction of Boolean search requests are fast becoming impractical
because of the growth of the web. So, what's up and coming? Here are
a few hints based on technology in development or in testing.

Collaborative Search. The theory here is that what the majority want, you will
want. So, the results you see from your request will be the results
that produced the maximum number of click-throughs from others who
have searched using the same or similar search terms. In a similar
vein, there are now programs that track where users go from particular
pages and present those locations as options when you go to that page.

Natural-Language Search. This is just a front-end modification that allows users
to submit their queries in the form of regular questions instead of
having to form a Boolean search query. (Some search engines are starting
to implement this type of search.)

Media Search. Searches now present results based on searches of web
page text. But what if you want to find a particular picture? For now,
you have to depend largely upon the webmaster putting a description
on the page along with the picture. In the future, you'll be able
to search on picture characteristics. Other media (e.g., sound) will
also be included as time goes on. (Some search engines are starting
to implement this type of search.)

XML Search. The next generation web language, XML, has extensive capability
for categories built into it. This allows web page designers to insert
keywords into a category structure so that searches based on XML can
yield more accurate results. If you are looking for "Word"
as a product instead of a concept, XML searching can give this to
you.

Context Search. It's likely that a search hit found in the midst of related
terms will be a much more meaningful hit than one isolated amidst
non-related text. New search engines are being developed to perform
just these kinds of analyses.

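The collaborative idea described above can be sketched in a few lines of Python. The click log, the page names, and the rule that two queries are "the same" when they contain the same words in any order are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical click log: (search terms, page the searcher clicked).
CLICK_LOG = [
    ("scotland golf", "golf.example"),
    ("scotland golf", "tours.example"),
    ("scotland golf", "golf.example"),
    ("golf scotland", "golf.example"),
    ("computer knowledge", "pc.example"),
]

def normalize(query):
    """Treat queries with the same words, in any order or case, as one query."""
    return frozenset(query.lower().split())

def rank_by_clicks(query, click_log):
    """Order pages by how often earlier searchers with the same terms clicked them."""
    wanted = normalize(query)
    clicks = Counter(page for terms, page in click_log if normalize(terms) == wanted)
    return [page for page, _ in clicks.most_common()]
```

Here rank_by_clicks("Golf Scotland", CLICK_LOG) puts golf.example first because three earlier searchers chose it, against one for tours.example. A query nobody has tried before returns nothing, which is one reason such a scheme would supplement ordinary searching rather than replace it.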
Of course, there is still
the problem of the search engine actually combing the web and finding things
to search. That is still not an easy problem to solve.