News and Views Internet Technology
AN INTRODUCTION TO SEARCH ENGINES
|
Searching for information, usually accomplished with one or more search engines, is second only to e-mail among Internet activities for most of us. An informal survey of Division 42 listserve members revealed no particular pattern of search engine usage. It appears that most of us try one or more search engines, settle on those that fit our perceived needs, and only occasionally try something different or new. What matters is finding what one was looking for, and that seems to be happening for those surveyed.

This brief tutorial is designed for the beginner, the novice to the Internet. It is not designed to be comprehensive, nor necessarily up-to-date. It will get you started and point to the vast and expanding literature on the Internet about search engines. It covers, in order, some basic background information on search engines, how they work and why they yield different results, detailed information on how to use one that gives results very rapidly, and suggestions for further exploration of this rapidly changing technology.

BACKGROUND. There is a constant tug-of-war between website designers, who want to maximize visitors to their sites, possibly for financial gain among other goals, and search engine designers, who make their money through advertisers. Those advertisers in turn want their ads to be seen by as many Internet users as possible. So how does a website get a top ranking? As the Internet has evolved, several ways have emerged. Many website designers constantly test and refine their descriptive words and phrases, also known as "metatags," to catch the attention of the search engines they solicit for listings. These metatags, usually not seen by the website visitor, prove crucial to being classified optimally when the search engines' software robots (also called "bots" or "spiders") scan a website. Metatags are like the title, author, and subject cards librarians prepare for library card files or computerized databases to classify books.
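For readers curious about what a spider actually sees, here is a minimal sketch in Python of reading the metatags out of a page. It is only an illustration, not a description of how any particular search engine works. The sample page, the class name MetaTagReader, and the function name read_metatags are all invented for this example.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects the name/content pairs of <meta> tags, roughly as a spider might."""

    def __init__(self):
        super().__init__()
        self.metatags = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are of interest; everything else is skipped.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.metatags[name.lower()] = content


# Hypothetical page source, invented purely for illustration.
SAMPLE_PAGE = """
<html><head>
<title>Anna Freud: Life and Work</title>
<meta name="keywords" content="Anna Freud, child psychoanalysis, ego psychology">
<meta name="description" content="A short biography of Anna Freud.">
</head><body><p>Visible text that the visitor actually reads.</p></body></html>
"""


def read_metatags(page_source):
    """Return a dictionary of metatag names and their contents."""
    reader = MetaTagReader()
    reader.feed(page_source)
    return reader.metatags


if __name__ == "__main__":
    for name, content in read_metatags(SAMPLE_PAGE).items():
        print(f"{name}: {content}")
```

The visitor sees only the sentence in the body of the page, while the keywords and description are available to the spider for classifying the site.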
However, while librarians limit their categories according to well-defined rules, and generally are not influenced by either authors or publishers, website creators devise their own classifications, making the web largely self-defining and ever evolving. Some search engines extract the metatags only, while some go further and analyze the frequency of words and phrases in the first part of a site or even the whole site, possibly overriding the metatags. Their approach is to analyze the actual content of the site rather than depend on the website developer's descriptors. Still other search engines determine rankings by popularity (i.e., the number of daily or cumulative visitors and/or the quality and quantity of sites to which a website links). A few search engines include only those sites examined by their staff to ensure quality, while an increasing number are merely capitalistic: for a price your website gets a top ranking. Do not be surprised if the list of sites you get for a given word or phrase on one day has changed by the next. The strategies of website designers and search engines are simply at work.

HOW SEARCH ENGINES WORK. After a website designer submits a website to one or more search engines, and the engines scan the site, key information including metatags and possibly a summary of site content is stored in a large database, which is accessed during any subsequent search. Simple search engines such as www.altavista. or www.yahoo rely on their own databases. Metasearch engines, on the other hand, will initiate searches on two or more other search engines with your word or phrase, and either present the results search engine by search engine or collate the results to eliminate the usual duplicate listings. While the metasearch approach is slower, it often catches particularly valuable websites that a single search engine might miss.

HIGHLIGHTING ONE SEARCH ENGINE.
Since the number of search engines is large and their features vary, only one will be reviewed, www.ussc.alltheweb. The main advantages of this search engine are its amazing speed and ease of use. Many searches fail at first, either because the word or phrase initially chosen does not produce the desired result or because it produces too many listings, which for practical purposes must be pared down. It is therefore useful to get search results quickly so that you can adjust the next attempt.

Here is how it works. In your browser, type in www.ussc.alltheweb. After the search engine appears, click on the Advanced Search phrase near the top right of your screen. Another, more complex page appears. Bookmark that page for future reference. Some prefer to go to the bottom of the page and adjust the number of results per page from the default value of 10 to the maximum of 100 before even using the search engine. This provides up to 100 matches per search and eliminates the tedium of viewing only ten websites at a time.

The www.ussc.alltheweb site also provides for modifying the lists that would otherwise result from a simple word or phrase search. For example, suppose you want to search for websites on Anna Freud, but not all sites that have the name Freud in them. After all, in addition to Sigmund Freud, there is a relatively well-known tool manufacturer with that name that makes router bits among other products. On the Advanced Search page, enter the name Anna Freud in the box labeled Search For.
Under Word Filters, specify that the results MUST INCLUDE Anna and MUST NOT INCLUDE router bits. (Sites that refer to either Sigmund or Anna Freud most likely would not include the phrase "router bits.") If for some reason you wanted to limit your search results to sites sponsored by organizations, you could use the DOMAIN FILTERS capability to include only those websites whose addresses end with the ORG designation. Finally, click on FAST search and the results will be displayed rapidly. Practice will quickly improve your search skills.

LEARN MORE. The amount of information on search engines is vast. There is a website, www.searchengineguide., that summarizes and compares search engines. There are also various tutorials on search engines, which are continually updated. The www.refdesk. site is a very useful reference on search engines, among a wide variety of other topics. It cites the thorough and helpful UC Berkeley library tutorial at www.lib.berkeley.edu/TeachingLib/Guides/index. The Spider's Apprentice at www.monash.com/spidap. is also interesting. A more comprehensive, if somewhat disorganized, list of resources is found at www.leidenuniv.nl/ub/biv/specials. Its range is impressive. For unique approaches to search engine design, look at www.askjeeves. for an engine that initiates searches in response to questions, at www.northernlight. for its ability to refine your search, and at www.copernic. which can store your search results for later reference. For those wanting to keep up with the latest search engine developments, sign up for a free email newsletter at http://searchenginewatch.com/. Alternatively, one can always conduct a search with any search engine using the phrase "search engine". Your results are nearly guaranteed to change as rapidly as the search engines themselves.
Finally, belonging to the Division 42 listserve is quite helpful, for its active members are continually trying new things, including search engines, and candidly reporting their successes among other experiences.
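For readers who like to see the logic spelled out, the MUST INCLUDE, MUST NOT INCLUDE, and domain filters from the Anna Freud example can be pictured as a simple screening step applied to a list of candidate sites. The Python sketch below is only an illustration under stated assumptions: the result list, the addresses, and the function names matches and filter_results are invented, and nothing here describes how any real search engine is built. It also matches words as plain substrings, which is cruder than a real engine would be.

```python
from urllib.parse import urlparse

# Hypothetical search results as (address, page text) pairs, invented for illustration.
RESULTS = [
    ("http://www.example.org/anna-freud", "Anna Freud and child psychoanalysis"),
    ("http://www.example.com/tools", "Freud router bits reviewed by Anna"),
    ("http://www.example.com/sigmund", "Sigmund Freud and his daughter Anna"),
    ("http://www.example.edu/history", "A history of psychoanalysis"),
]


def matches(url, text, must_include=(), must_not_include=(), domain_suffix=None):
    """Return True if a page passes the word filters and the optional domain filter."""
    lowered = text.lower()
    # MUST INCLUDE: every listed term has to appear in the page text.
    if any(term.lower() not in lowered for term in must_include):
        return False
    # MUST NOT INCLUDE: none of the listed terms may appear.
    if any(term.lower() in lowered for term in must_not_include):
        return False
    # Domain filter: keep only addresses ending in the given suffix, such as "org".
    if domain_suffix is not None:
        host = urlparse(url).hostname or ""
        if not host.endswith("." + domain_suffix.lower()):
            return False
    return True


def filter_results(results, **filters):
    """Return the addresses of all results that pass the filters."""
    return [url for url, text in results if matches(url, text, **filters)]


if __name__ == "__main__":
    print(filter_results(RESULTS, must_include=["Anna"],
                         must_not_include=["router bits"]))
    print(filter_results(RESULTS, must_include=["Anna"],
                         must_not_include=["router bits"], domain_suffix="org"))
```

In this made-up list, the word filters drop the tool page and the page that never mentions Anna, and adding the ORG domain filter leaves only the single organization site.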