Searching the Internet



Internet searching is the “art” of submitting a word or phrase to a web catalog or engine and receiving a series of URLs containing the word or phrase. Search engines have become an important method of locating data on the web.

A Uniform Resource Locator (URL) is a specification of the location of a resource. It specifies the protocol (http:// for a web page), site name, path and file name of the resource. Think of it as a networked extension of the standard filename concept: not only can you point to a file in a directory, but that file and directory can exist on any machine on the network, can be served via any of several different methods, and might not even be something as simple as a file: URLs can also point to … queries stored deep within databases… (…from The Webmaster’s Lexicon)

______________________________________________________________________________

The standard format of a URL is:

scheme://host/resource

1. Scheme: appears before the colon, and describes the protocol, or the way the browser should handle the resource.

http: = HyperText Transfer Protocol, the native transfer method on the Web.

ftp: = File Transfer Protocol, for downloading files from an FTP server.

file: = specifies a file on your computer as the resource

news: = specifies a newsserver and newsgroup as the host and resource

mailto: = starts the mail program associated with the browser, with a recipient as the resource.

2. Host: appears after two forward slashes (//), and references the host computer (site) on which the resource resides. The host segment includes the domain name of the host computer. Domain names end in a short zone name of two or more letters that indicates what type of site you are contacting:

.com = commercial organization

.edu = educational institutions

.gov = U.S. government and public sites

.mil = U.S. military sites

.net = networking organizations, communications service providers

.org = non-profit organizations and others not fitting existing categories

.fr, .uk, .us, .ca, … = international domains end in a two-letter country code

The Internet Corporation for Assigned Names and Numbers (ICANN) has added seven new domains:

.info = information services

.biz = trademarked businesses

.name = individual/personal sites

.pro = professionals

.aero = aviation

.coop = business cooperatives

.museum = museums

3. Resource: appears after a single forward slash (/), describing the full path to a file or document. On an http location, index.html is commonly the default resource when no file name is given.

Examples:

or or



(note changes to standard file notation)

mailto:doej0001@unf.edu
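The scheme, host and resource parts described above can be pulled apart programmatically. The following is a minimal sketch using Python's standard urllib.parse module. The host www.example.edu is a placeholder rather than a site from this handout, and the ZONES table is an illustrative subset of the zone names listed earlier.

```python
from urllib.parse import urlsplit

# Zone names taken from the list above; illustrative, not exhaustive.
ZONES = {
    "com": "commercial organization",
    "edu": "educational institution",
    "gov": "U.S. government",
    "mil": "U.S. military",
    "net": "networking organization",
    "org": "non-profit organization",
}

def describe_url(url):
    """Split a URL into scheme, host and resource, and look up the zone name."""
    parts = urlsplit(url)
    host = parts.hostname or ""          # empty for schemes such as mailto:
    zone = host.rsplit(".", 1)[-1] if "." in host else ""
    return {
        "scheme": parts.scheme,
        "host": host,
        "resource": parts.path,
        "zone": ZONES.get(zone, "unknown or country-code zone"),
    }

# www.example.edu is a placeholder host used only for this sketch.
info = describe_url("http://www.example.edu/library/index.html")
print(info)

# A mailto: URL has no host segment; the recipient is the resource.
print(describe_url("mailto:doej0001@unf.edu"))
```

Note that a mailto: URL has no // host segment, so the sketch reports an empty host and treats the address as the resource.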

Where do I start?

A good place to start Internet searching is through the UNF Library’s home page.

On you will find a link to Internet Search Engines.

This link, , describes and connects to several of the most popular and useful search tools available.

I. Search Engines:

Search Engines are tools that let you explore databases containing text from over a billion unclassified Web pages (documents). Most concentrate on providing powerful search capabilities, not organization of the data. Search engines index data; they do not provide a review process on the content or value of the data.

Most of the major search engines now also include additional services such as directories and meta-index searches, as discussed below.

The most comprehensive search engine is AltaVista.

Others are Fast, HotBot, Infoseek, and Excite.

Excite includes reviews, discussion groups and classified ads.
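To make the idea of "indexing data" concrete, here is a rough sketch of a word index over a handful of pages. It is not how any particular search engine named above works; the page names and texts are invented, and real engines handle far more (ranking, stemming, scale).

```python
import re
from collections import defaultdict

# Three made-up pages standing in for crawled Web documents.
PAGES = {
    "page1.html": "Duke Blue Devils basketball schedule",
    "page2.html": "Duke Power announces new rates",
    "page3.html": "Blue suede shoes for sale",
}

def build_index(pages):
    """Map each lowercased word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

index = build_index(PAGES)
print(sorted(index["duke"]))   # pages containing "duke"
print(sorted(index["blue"]))   # pages containing "blue"
```

The index records only which pages contain a word. Nothing in it judges whether a page is accurate or useful, which is the point made above about search engines not reviewing content.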

II. Internet Directories:

Internet Directory tools provide multi-level topic directories of a smaller database of documents, allowing you to browse for information on a given subject. Topic directories are established based on reviewing and classifying each Web site for content.

Since classification of Web sites requires human intervention, these directories are smaller in scope, but often lead to more precise results. The data is organized!

Yahoo arranges and reviews over a million sites.

LookSmart contains over 500,000.

Magellan reviews sites for value, allowing the user to screen out “content for mature audiences.”

Lycos includes abstracts for sites matching search results.
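A multi-level topic directory can be pictured as a tree that you walk one topic at a time. The sketch below assumes a tiny hand-built tree; the categories and site names are invented for illustration and do not come from any of the directories named above.

```python
# A tiny hand-built topic tree; every category and site name here is made up.
DIRECTORY = {
    "Recreation": {
        "Sports": {
            "Baseball": ["Example Baseball History Site"],
            "Basketball": ["Example College Basketball Site"],
        },
    },
    "Science": {
        "Astronomy": ["Example Planetarium Site"],
    },
}

def browse(directory, path):
    """Follow a list of topic names down the tree and return what is found there."""
    node = directory
    for topic in path:
        node = node[topic]   # raises KeyError if the topic does not exist
    return node

# Browsing one level shows subtopics; browsing to a leaf shows the classified sites.
print(sorted(browse(DIRECTORY, ["Recreation", "Sports"])))
print(browse(DIRECTORY, ["Recreation", "Sports", "Baseball"]))
```

Because each site has to be placed in the tree by a reviewer, the tree stays small compared with a search engine's index, but everything under a topic is relevant to it.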

III. Meta-Indexes:

Meta-indexes search other indexes. These tools translate your query into the format of several other search tools and return results categorized by the tool used.

Google, Dogpile and MetaCrawler query most of the major search engines.

Google is currently the largest index with access to over 1.3 billion pages.

One site, , claims to search using thirty-seven different engines.
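The "translate your query" step of a meta-index might look roughly like the sketch below. The engine names and URL templates are hypothetical (the example hosts do not exist); each real search tool has its own query format, which is exactly what a meta-index has to account for.

```python
from urllib.parse import quote_plus

# Hypothetical engines and URL templates, invented for this sketch.
ENGINES = {
    "EngineA": "http://engine-a.example/search?q={terms}",
    "EngineB": "http://engine-b.example/find?query={terms}&mode=all",
}

def translate_query(query, engines=ENGINES):
    """Return one request URL per engine, keyed by engine name."""
    terms = quote_plus(query)   # spaces become "+", special characters are escaped
    return {name: template.format(terms=terms) for name, template in engines.items()}

# The results stay grouped by the tool that would answer each request.
for name, url in translate_query("Duke Blue Devils").items():
    print(name, "->", url)
```

A full meta-index would then send each request and present the answers grouped by tool; the sketch stops at building the requests.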

Search Qualifiers / Boolean Operators: examples of commonly used operators

|Operator |Example |Result |
|AND + |Gore AND Bush; +Gore +Bush |returns documents with both Gore and Bush |
|OR |Gore OR Bush |returns documents with either Gore or Bush |
|NOT - |mickey NOT mouse; +mickey -mouse |returns mickey but not mouse. Mickey Mantle would be found, but not Mickey Mouse. |
|Capitalization |Mouse |returns proper name. Mickey Mouse would be found, but not field mouse. |
|“phrase in quotes” |“Duke Blue Devils” |returns exact phrase. Excludes pages about Duke Power, devil worship or blue suede shoes. |
|NEAR |Duke NEAR Blue NEAR Devils |similar to quotes except proximity of words determines results |
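The AND, OR, NOT and phrase operators correspond to simple set operations on the pages that contain each word. The sketch below illustrates this on four invented page texts. It does not model capitalization or NEAR, and actual search tools differ in syntax and behavior.

```python
import re

# Made-up page texts used only to illustrate the operators.
PAGES = {
    "a": "Mickey Mantle hit a home run",
    "b": "Mickey Mouse and friends",
    "c": "Duke Blue Devils win again",
    "d": "Duke Power and blue suede shoes",
}

def words(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def has(term):
    """Set of page ids containing the single word `term` (case-insensitive)."""
    return {pid for pid, text in PAGES.items() if term.lower() in words(text)}

def phrase(text_phrase):
    """Set of page ids containing the words of the phrase consecutively, in order."""
    target = words(text_phrase)
    n = len(target)
    hits = set()
    for pid, text in PAGES.items():
        w = words(text)
        if any(w[i:i + n] == target for i in range(len(w) - n + 1)):
            hits.add(pid)
    return hits

print(sorted(has("mickey") & has("mouse")))   # AND: both words
print(sorted(has("mantle") | has("mouse")))   # OR: either word
print(sorted(has("mickey") - has("mouse")))   # NOT: mickey without mouse
print(sorted(has("duke") & has("blue")))      # AND also matches the Duke Power page
print(sorted(phrase("Duke Blue Devils")))     # quoted phrase matches only the exact sequence
```

Comparing the last two lines shows why quoting a phrase narrows results: the AND query also finds the page about Duke Power and blue suede shoes, while the phrase query does not.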
