THE USE OF SEARCH ENGINES FOR ARCHITECTURAL RESEARCH IN NIGERIA.

BARUWA, MOHAMMAD ISSA

( ARC / 08 / 5539 )

A REPORT

SUBMITTED AS PART OF THE REQUIREMENT

FOR THE COURSE TITLED

RESEARCH METHODOLOGY (ARC 805)

AT THE

DEPARTMENT OF ARCHITECTURE,

SCHOOL OF ENVIRONMENTAL TECHNOLOGY,

FEDERAL UNIVERSITY OF TECHNOLOGY, AKURE, ONDO STATE

MENTOR.

PROF. O. O OGUNSOTE

APRIL, 2009.

Abstract:

The World Wide Web offers information and data from all over the world. Because so much information is available, and because that information can appear to be fairly “anonymous”, it is necessary to develop skills to evaluate what is found. When one uses a research or academic library, the books, journals and other resources have already been evaluated by scholars, publishers and librarians. Every resource one finds has been evaluated in one way or another before one ever sees it. When one is using the World Wide Web, none of this applies. There are no filters. Because anyone can write a Web page, documents of the widest range of quality, written by authors of the widest range of authority, are available on an even playing field. Excellent resources reside alongside the most dubious. The Internet epitomizes the concept of Caveat lector: let the reader beware. This report discusses the use of search engines for architectural research in Nigeria, and the criteria by which scholars can search for information on architecture in Nigeria or assess architecture-related information on the Internet.

Introduction:

Search engine databases are selected and built by computer robot programs called spiders. These "crawl" the web, finding pages for potential inclusion by following the links in the pages they already have in their database. They cannot use imagination or enter terms in search boxes that they find on the web.

After spiders find pages, they pass them on to another computer program for "indexing." This program identifies the text, links, and other content in the page and stores it in the search engine database's files so that the database can be searched by keyword and whatever more advanced approaches are offered, and the page will be found if your search matches its content.
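The indexing step described above can be pictured with a very small sketch. It is only an illustration under stated assumptions: the page names and texts are invented, and the functions `build_index` and `search` are hypothetical names, not part of any real search engine. Each word is mapped to the set of pages that contain it, and a keyword search keeps only the pages that contain every word typed.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical page texts, invented for illustration only.
pages = {
    "page_a": "Traditional architecture in Nigeria",
    "page_b": "Modern housing design in Lagos, Nigeria",
    "page_c": "Gothic architecture in Europe",
}
index = build_index(pages)
print(sorted(search(index, "architecture Nigeria")))  # ['page_a']
```

A page is found only if the words typed match words stored for it, which is why the choice of keywords matters so much.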

Search engines do not really search the World Wide Web directly. Each one searches a database of web pages that it has harvested and cached. When you use a search engine, you are always searching a somewhat stale copy of the real web page. When you click on links provided in a search engine's search results, you retrieve the current version of the page.

If a web page is never linked from any other page, search engine spiders cannot find it. The only way a brand new page can get into a search engine is for other pages to link to it, or for a human to submit its URL for inclusion. All major search engines offer ways to do this.
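The way a spider follows links, and why an unlinked page stays invisible, can be sketched as follows. This is a simplified model, not the program any real search engine runs: the link graph is invented, and `crawl` is a hypothetical function that only walks an in-memory dictionary instead of fetching pages from the web.

```python
from collections import deque


def crawl(links, seeds):
    """Breadth-first crawl: collect every page reachable from the seed pages."""
    found = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in found:
                found.add(target)
                queue.append(target)
    return found


# Hypothetical link graph: each page lists the pages it links to.
links = {
    "home": ["schools", "journals"],
    "schools": ["futa"],
    "journals": [],
    "futa": ["home"],
    "orphan": ["home"],  # links out, but no other page links to it
}

print(sorted(crawl(links, ["home"])))
# ['futa', 'home', 'journals', 'schools']

# Submitting the URL by hand corresponds to adding it to the seeds.
print(sorted(crawl(links, ["home", "orphan"])))
# ['futa', 'home', 'journals', 'orphan', 'schools']
```

In the first run the page called "orphan" is never reached, because no page the spider already knows links to it. In the second run it is found only because it was supplied directly.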

Many web pages are excluded from most search engines by policy. The contents of most of the searchable databases mounted on the web, such as library catalogs and article databases, are excluded because search engine spiders cannot access them. All this material is referred to as the "Invisible Web" -- what you don't see in search engine results.

The World Wide Web can be a great place to accomplish research on many topics. But putting documents or pages on the web is easy, cheap or free, unregulated, and unmonitored.

Therein lies the rationale for evaluating carefully whatever you find on the Web. The burden is on you - the reader - to establish the validity, authorship, timeliness, and integrity of what you find. Documents can easily be copied and falsified or copied with omissions and errors -- intentional or accidental. In the general World Wide Web there are no editors (unlike most print publications) to proofread and "send it back" or "reject it" until it meets the standards of a publishing house's reputation. Most pages found in general search engines for the web are self-published or published by businesses small and large with motives to get you to buy something or believe a point of view. Even within university and library web sites, there can be many pages that the institution does not try to oversee. The web needs to be free like that!! And you, if you want to use it for serious research, need to cultivate the habit of healthy skepticism, of questioning everything you find with critical thinking.

Uses of Search Engines: What to Consider

Authorship

Publishing body

Point of view or bias

Referral to other sources

Verifiability

Currency

How to distinguish propaganda, misinformation and disinformation

The mechanics of determining authorship, publishing body, and currency on the Internet

Authorship is perhaps the major criterion used in evaluating information. Who wrote this? When we look for information with some type of critical value, we want to know the basis of the authority with which the author speaks. Here are some possible filters:

• In your own field of study, the author is a well-known and well-regarded name you recognize.

• When you find an author you do not recognize:

o the author is mentioned in a positive fashion by another author or another person you trust as an authority;

o you found or linked to the author’s Web/Internet document from another document you trust;

o the Web/Internet document you are reading gives biographical information, including the author's position, institutional affiliation and address;

o biographical information is available by linking to another document; this enables you to judge whether the author’s credentials allow him/her to speak with authority on a given topic;

o if none of the above, there is an address and telephone number as well as an e-mail address for the author in order to request further information on his or her work and professional background. An e-mail address alone gives you no more information than you already have.

The publishing body also helps evaluate any kind of document you may be reading. In the print universe, this generally means that the author's manuscript has undergone screening in order to verify that it meets the standards or aims of the organization that serves as publisher. This may include peer review. On the Internet, ask the following questions to assess the role and authority of the "publisher", which in this case means the server (computer) where the document lives:

• Is the name of any organization given on the document you are reading? Are there headers, footers, or a distinctive watermark that show the document to be part of an official academic or scholarly Web site? Can you contact the site Webmaster from this document?

• If not, can you link to a page where such information is listed? Can you tell that it’s on the same server and in the same directory (by looking at the URL)?

• Is this organization recognized in the field in which you are studying?

• Is this organization suitable to address the topic at hand?

• Can you ascertain the relationship of the author and the publisher/server? Was the document that you are viewing prepared as part of the author’s professional duties (and, by extension, within his/her area of expertise)? Or is the relationship of a casual or for-fee nature, telling you nothing about the author’s credentials within an institution?

• Can you verify the identity of the server where the document resides? Internet programs such as dnslookup and whois will be of help.

• Does this Web page actually reside in an individual’s personal Internet account, rather than being part of an official Web site? This type of information resource should be approached with the greatest caution.
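Two of the checks above (whether two documents sit on the same server and in the same directory, and whether a page lives in a personal account) can be made by reading the URL. The sketch below uses Python's standard `urllib.parse` module. The function names and the example.edu addresses are invented for illustration, and treating a path that begins with "/~" as a personal account is only a common convention, assumed here, not a rule.

```python
import posixpath
from urllib.parse import urlsplit


def same_server_and_directory(url_a, url_b):
    """True if both URLs share a host name and a directory path."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    same_host = (a.hostname or "") == (b.hostname or "")
    same_dir = posixpath.dirname(a.path) == posixpath.dirname(b.path)
    return same_host and same_dir


def looks_personal(url):
    """Heuristic (an assumption): a path starting with /~ suggests a personal account."""
    return urlsplit(url).path.startswith("/~")


# Hypothetical addresses, used only to show the comparison.
report = "http://www.example.edu/arch/report.html"
about = "http://www.example.edu/arch/about.html"
personal = "http://www.example.edu/~user/report.html"

print(same_server_and_directory(report, about))     # True
print(same_server_and_directory(report, personal))  # False
print(looks_personal(personal))                     # True
```

A result of True from the first function says only that the two pages share a location. It does not by itself show that the institution stands behind the document.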

Point of view or bias reminds us that information is rarely neutral. Because data is used in selective ways to form information, it generally represents a point of view. Every writer wants to prove his point, and will use the data and information that assists him in doing so. When evaluating information found on the Internet, it is important to examine who is providing the "information" you are viewing, and what might be their point of view or bias. The popularity of the Internet makes it the perfect venue for commercial and sociopolitical publishing. These areas in particular are open to highly "interpretative" uses of data.

Steps for evaluating point of view are based on authorship or affiliation:

First, note the URL of the document. Does this document reside on the Web server of an organization that has a clear stake in the issue at hand?

o If you are looking at a corporate Web site, assume that the information on the corporation will present it in the most positive light.

o If you are looking at products produced and sold by that corporation, remember: you are looking at an advertisement.

o If you are reading about a political figure at the Web site of another political party, you are reading the opposition.

• Does this document reside on the Web server of an organization that has a political or philosophical agenda?

o If you are looking for scientific information on human genetics, would you trust a political organization to provide it?

o Never assume that extremist points of view are always easy to detect. Some sites promoting these views may look educational. To learn more, read "Rising Tide: Sites Born of Hate", New York Times, March 18, 1999.

Many areas of research and inquiry deal with controversial questions, and often the more controversial an issue is, the more interesting it is. When looking for information, it is always critical to remember that everyone has an opinion. Because the structure of the Internet allows for easy self publication, the variety of points of view and bias will be the widest possible.

Referral to and/or knowledge of the literature refers to the context in which the author situates his or her work. This reveals what the author knows about his or her discipline and its practices. This allows you to evaluate the author's scholarship or knowledge of trends in the area under discussion. The following criteria serve as a filter for all formats of information:

• The document includes a bibliography.

• The author alludes to or displays knowledge of related sources, with proper attribution.

• The author displays knowledge of theories, schools of thought, or techniques usually considered appropriate in the treatment of his or her subject.

• If the author is using a new theory or technique as a basis for research, he or she discusses the value and/or limitations of this new approach.

• If the author's treatment of the subject is controversial, he or she knows and acknowledges this.

Accuracy or verifiability of details is an important part of the evaluation process, especially when you are reading the work of an unfamiliar author presented by an unfamiliar organization, or presented in a non-traditional way. Criteria for evaluating accuracy include:

• For a research document, the data that was gathered and an explanation of the research method(s) used to gather and interpret it are included.

• The methodology outlined in the document is appropriate to the topic and allows the study to be duplicated for purposes of verification.

• The document relies on other sources that are listed in a bibliography or includes links to the documents themselves.

• The document names individuals and/or sources that provided non-published data used in the preparation of the study.

• The background information that was used can be verified for accuracy.

Currency refers to the timeliness of information. In printed documents, the date of publication is the first indicator of currency. For some types of information, currency is not an issue: authorship or place in the historical record is more important (e.g., T. S. Eliot's essays on tradition in literature). For many other types of data, however, currency is extremely important, as is the regularity with which the data is updated. Apply the following criteria to ascertain currency:

• The document includes the date(s) at which the information was gathered (e.g., US Census data).

• The document refers to clearly dated information (e.g., "Based on 1990 US Census data.").

• Where there is a need to add data or update it on a constant basis, the document includes information on the regularity of updates.

• The document includes a publication date or a "last updated" date.

• The document includes a date of copyright.

• If no date is given in an electronic document, you can view the directory in which it resides and read the date of latest modification.
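Where a date of latest modification can be obtained, its age is easy to work out. The sketch below is an assumption-laden illustration: it supposes the date comes in the form web servers commonly send in a "Last-Modified" header, which is a different mechanism from viewing the directory as described above, and the header value and the function `age_in_days` are invented for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def age_in_days(last_modified, now):
    """Whole days between a Last-Modified style date string and 'now'."""
    modified = parsedate_to_datetime(last_modified)
    return (now - modified).days


# Hypothetical header value and reference date.
header = "Wed, 01 Apr 2009 12:00:00 GMT"
now = datetime(2009, 4, 30, 12, 0, 0, tzinfo=timezone.utc)
print(age_in_days(header, now))  # 29
```

A recent modification date shows only that the file was touched recently. It does not show that the information in it was revised.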

If you found information using one of the search engines available on the Internet, such as AltaVista or InfoSeek, a directory of the Internet such as Yahoo, or any of the services that rate World Wide Web pages, you need to know:

• How the search engine decides the order in which it returns information requested. Some Internet search engines "sell" top space to advertisers who pay them to do so. Read Pay for Placement? from .

• That Internet search engines aren't like the databases found in libraries. Library databases include subject headings, abstracts, and other evaluative information created by information professionals to make searching more accurate. In addition, library databases index more permanent and reliable information.

• How that search engine looks for information, and how often its information is updated. An excellent source for search engine information is

All information, whether in print or by byte, needs to be evaluated by readers for authority, appropriateness, and other personal criteria for value. If you find information that is "too good to be true", it probably is. Never use information that you cannot verify. Establishing and learning criteria to filter information you find on the Internet is a good beginning for becoming a critical consumer of information in all forms. "Cast a cold eye" (as Yeats wrote) on everything you read. Question it. Look for other sources that can authenticate or corroborate what you find. Learn to be skeptical and then learn to trust your instincts.

© 1996 Elizabeth E. Kirk

Recommended Search Engines

Google has one of the largest databases of Web pages, including many other types of web documents (blog posts, wiki pages, group discussion threads) and document formats (e.g., PDFs, Word or Excel documents, PowerPoints). Despite the presence of all these formats, Google's popularity ranking often places worthwhile pages near the top of search results. Our web searching workshop reflects the fact that Google is currently the most used search engine.

Google alone is not always sufficient, however. Not everything on the Web is fully searchable in Google. Overlap studies show that more than 80% of the pages in a major search engine's database exist only in that database. For this reason, getting a "second opinion" can be worth your time. For this purpose, we recommend Yahoo! Search or Exalead.

Table of features

Some common techniques will work in any search engine. However, in this very competitive industry, search engines also strive to offer unique features. When in doubt, look for "help", "FAQ", or "about" links.

Table 1: Table of features. Source: lib.berkeley.edu

|Search Engine |Google |Yahoo! Search |Exalead |
|Links to help |Google help |Yahoo! help |Exalead FAQ and features |
|Size, type |IMMENSE. Size not disclosed in any way that allows comparison. Probably the biggest. |HUGE. Claims over 20 billion total "web objects." |LARGE. Claims to have over 8 billion searchable pages. |
|Noteworthy features |PageRank™ system includes hundreds of factors, emphasizing pages most heavily linked from other pages. Many additional databases including Book Search, Scholar (journal articles), Blog Search, Patents, Images, etc. |Shortcuts give quick access to dictionary, synonyms, patents, traffic, stocks, encyclopedia, and more. |Truncation lets you search by the first few letters of a word. Proximity search lets you find terms NEAR each other or NEXT to each other. Thumbnail page previews. Extensive options for refining and limiting your search. |
|Phrase searching |Enclose phrase in "double quotes". |Enclose phrase in "double quotes". |Enclose phrase in "double quotes". |
|Boolean logic |Partial. AND assumed between words. Capitalize OR. ( ) accepted but not required. In Advanced Search, partial Boolean available in boxes. |Accepts AND, OR, NOT or AND NOT. Must be capitalized. ( ) accepted but not required. |Partial. AND assumed between words. Capitalize OR. ( ) accepted. See features for more options. |
|+Requires / -Excludes |- excludes. + retrieves "stop words" (e.g., +in) |- excludes. + will allow you to search common words: "+in truth" |- excludes. + retrieves "stop words" (e.g., +in) |
|Sub-Searching |The search box at the top of the results page shows your current search. Modify this (e.g., add more terms at the end). |The search box at the top of the results page shows your current search. Modify this (e.g., add more terms at the end). |The search box at the top of the results page shows your current search. Modify this (e.g., add more terms at the end). |
|Results Ranking |Based on page popularity measured in links to it from other pages: high rank if a lot of other pages link to it. Fuzzy AND also invoked. Matching and ranking based on "cached" version of pages that may not be the most recent version. |Automatic Fuzzy AND. |Popularity ranking emphasizes pages most heavily linked from other pages. |
|Field limiting |link: site: intitle: inurl: Offers U.S. Gov't Search and other special searches. Patent search. |link: site: intitle: inurl: url: hostname: |intitle: inurl: site: after:[time period] before:[time period] (For details, click on "Advanced search") |
|Truncation, Stemming |No truncation. Stems some words. Search variant endings and synonyms separately, separating with OR (capitalized): airline OR airlines |Neither. Search with OR as in Google. |Use * (example: messag*) |
|Language |Yes. Major Romanized and non-Romanized languages in Advanced Search. |Yes. Major Romanized and non-Romanized languages. |Extensive language and geographic options. Use "Advanced Search". |
|Translation |Yes, in "Translate this page" link following some pages. To and sometimes from English and major European languages and Chinese, Japanese, Korean. Uses its own translation software with user feedback. |Available as a separate service. |Yes, in "Translate this page" link following some pages. |
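The popularity ranking mentioned in the table (a high rank if a lot of other pages link to a page) can be illustrated by simply counting inbound links. This sketch is deliberately crude and is not Google's PageRank, which according to the table includes hundreds of factors. The link graph and the function name `rank_by_inbound_links` are invented for the example.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages by how many other pages link to them (most linked first)."""
    inbound = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):   # count each linking page once
            if target != source:      # ignore a page linking to itself
                inbound[target] += 1
    # Ties are broken alphabetically so the order is predictable.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link graph: each page lists the pages it links to.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
print(rank_by_inbound_links(links))
# [('c', 3), ('a', 1), ('b', 1), ('d', 0)]
```

Page "c" comes first because three other pages link to it, whatever the quality of its content. This is one reason a high position in the results does not replace the evaluation criteria discussed earlier.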

What Makes a Search Engine Good?

All search engines consist of three parts:

(1) a database of web documents, (2) a search engine operating on that database, and (3) a series of programs that determine how search results are displayed. Because the search engine business is competitive, most search engines also offer additional features that are convenient or fun.

What can vary within each of the three basic parts in search engines is shown below:

Parts of Search Engines: variables, and their implications for searches

1. Database of web documents

Size of database:

How many documents does the search engine claim it has?

How much of the total web are you able to search?

Freshness ("up-to-dateness"):

Search engine databases consist of copies of web pages and other documents that were made when their crawlers or spiders last visited each site.

How often is the database refreshed to find new pages?

How often do their crawlers update the copies of the web pages you are searching?

Completeness of text:

Is the database really "full" text, or only parts of the pages?

Is every word indexed?

Types of documents offered:

All search engines offer web pages.

Do they also have extensive PDF, Word, Excel, PowerPoint, and other formats like WordPerfect?

Are they full-text searchable?

Speed and consistency:

How fast is it?

How consistent is it? Do you get different results at different times?

2. The search engine's capabilities

All search engines let you enter some keywords and search on them. What happens inside?

Can you limit in ways that will increase your chances of finding what you are looking for?

Basic Search options and limitations:

Automatic default of AND assumed between words?

Accepts " " to create phrases?

Is there an easy way to allow for synonyms and equivalent terms (OR searching)?

Can you OR phrases or just single words?

Advanced Search options and limitations:

Can you require your search terms in specific fields, such as the document title?

Can you require some words in certain fields and others anywhere?

Can you restrict to documents only from a certain domain (org, edu, gov, etc.)?

Limit to more than one or only one?

Can you limit by type of document (pdf or excel, etc.)? More than one?

Can you limit by language?

How reliably and easily can you limit to date last updated?
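Several of the options above can be combined in a single search line. The sketch below assembles such a line from its parts, using only operators that appear in the table of features (double quotes, capitalized OR, the minus sign, intitle: and site:). The function `build_query` and the example terms are hypothetical, and a given search engine may treat the resulting line differently, so its own help pages remain the authority.

```python
def quote_if_phrase(term):
    """Wrap a multi-word term in double quotes so it is searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(all_words=(), phrases=(), any_of=(), exclude=(),
                intitle=None, site=None):
    """Assemble one search line from keywords, phrases, OR terms and limits."""
    parts = list(all_words)
    parts += [f'"{phrase}"' for phrase in phrases]
    if any_of:
        parts.append("(" + " OR ".join(quote_if_phrase(t) for t in any_of) + ")")
    parts += ["-" + quote_if_phrase(t) for t in exclude]
    if intitle:
        parts.append(f"intitle:{intitle}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


# Example terms chosen only for illustration.
query = build_query(
    all_words=["architecture"],
    phrases=["vernacular housing"],
    any_of=["Nigeria", "Yoruba"],
    exclude=["hotel"],
    site="edu",
)
print(query)
# architecture "vernacular housing" (Nigeria OR Yoruba) -hotel site:edu
```

Building the line piece by piece makes it easy to drop one limit at a time and see which part of the search is responsible for too few or too many results.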

General limitations and features:

What do you have to do to make it search on common or stop words?

Maximum limit on search terms or on search complexity?

Ability to search within previous results?

Can you count on consistent results from search to search and from day to day?

Can you customize the search or display?

Is there a "family" filter? Does it work well? Is it easy to turn on or off?

3. Results display

All search engines return a list of results that they "think" are what you are looking for.

How well does the search engine "think" the way you expect it to?

Ranking:

Are they ranked by popularity or relevancy or both?

Do pages with your words juxtaposed (like a phrase) rank highest?

Do you get pages with only some of your words, perhaps in addition to pages with them all?

Display:

Are your keywords highlighted in context, showing excerpts from the web pages which caused the match?

Some other excerpt from the page?

Collapse pages from the same site:

If it shows only one or a few pages from a site, does it show the one(s) with your terms?

How easy is it to see all from the site?

Can this be changed and saved as your preferred search method?

4. Other features

Search engine designers try to come up with all kinds of features and services that they hope will lure you to their services.

Basic Search Tips and Advanced Boolean Explained

BASIC SEARCHING EXAMPLES

Quotation marks

“ ”

• Requires words to be searched as a phrase, in the exact order you type them.

"working mothers", "affirmative action"

Common Words Usually Ignored

+ or " " to search them

• Search: which versus that

Only versus is searched on. Which and that are ignored.

• To require common words to be searched:

+which versus +that, "which versus that"

Excluding

-word

-"phrase in quotes"

"acute pancreatitis" diet -cat -dog -"pancreatic cancer"

OR allows more than one term

OR

dogs OR cats

allows pages with at least one of the terms

• OR requires at least one of the terms joined by it to appear somewhere in the document, in any order.

"african americans" OR blacks

ear OR nose OR throat

• The more words you enter connected by OR, the more documents you get.

Broadens the search.

• USES:

o The OR operator is generally used to join similar, equivalent, or synonymous concepts.

"global warming" OR "greenhouse effect"

AND (default)

dogs AND cats is the small overlap where both terms occur

• AND is the default and only needs to be typed if you are using other Boolean operators with ().

infopeople training is logically the same as infopeople AND training

• The more words you enter connected by AND, the fewer documents you get. All your words will be searched on.

• USES:

o The AND operator is generally used to join different kinds of concepts, different aspects of the question.

o "global warming" AND "sea level rise" AND California
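The effect of OR, AND and the minus sign on the number of documents returned can be checked with ordinary set operations. This is a toy model, assuming four invented documents and a hypothetical helper `containing`. Real search engines apply the same logic to far larger indexes, with their own refinements.

```python
# Four invented documents, identified by number.
docs = {
    1: "dogs and cats living together",
    2: "caring for dogs",
    3: "cats as pets",
    4: "tropical fish",
}


def containing(word):
    """Return the numbers of the documents that contain the word."""
    return {doc_id for doc_id, text in docs.items() if word in text.split()}


dogs = containing("dogs")
cats = containing("cats")

print(sorted(dogs | cats))  # dogs OR cats  -> [1, 2, 3]
print(sorted(dogs & cats))  # dogs AND cats -> [1]
print(sorted(dogs - cats))  # dogs -cats    -> [2]
```

OR returns the most documents and AND the fewest, which matches the rule of thumb above: use OR between equivalent terms and AND between different aspects of the question.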

The Search Strategy

The choice of a search strategy is determined by the type of information we are looking for. For broad general information, start with a web directory. Use a search engine for narrow, specific information.

Schools of Architecture

Many schools of architecture around the world list their programmes, curriculum and faculty on the Web, making it easier for interested members of the public to get information from the school, whether they are seeking admission or looking for information about the courses on offer.

Architecture books

There are thousands of books that can be ordered online, usually at a discount. Popular web sites with architecture books include , , , , , , , and .

Architecture Magazines

The majority of international architecture magazines have web sites that give the profile of the magazine and contain subscription information.

Online (Electronic) Architecture Journals and Magazines

There are several architecture magazines that are available online. The University of California, Berkeley digital library (lib.berkeley.edu/ENVI), for example, has an environmental design library with a collection of electronic journals. While some of the journals are restricted to registered faculty members, most are free. The specific articles can be downloaded as PDF files. See Table 8.

Online books

There are several sources of digital books on the web. is an online community for architects, planners, urban designers, interior designers, landscape architects, and scholars, with a special focus on the Islamic world. It has a digital library with publications, images and a gallery. Publications can be downloaded from the site. You can search for publications by author, title, building type, country, language and specific keywords.

The Online Books Page (onlinebooks.library.upenn.edu) lists more than 20,000 online books in English. All the books are free for personal and non-commercial use. You can search by author and title, and browse by subject. The site also lists freely accessible archives of magazines, journals, newspapers and other periodicals.

Advanced Boolean Explained

OPERATOR WHAT IT DOES & WHEN TO USE IT

AND NOT

dogs AND NOT cats excludes pages that mention cats, even if they also mention dogs

• Excludes documents containing whatever follows it.

• The AND NOT operator is generally used after you have performed a search, looked at the results, and determined that you do not want to see pages containing some word or phrase.

• USES:

o The AND NOT operator should be used with extreme caution, because it eliminates the entire page, and some pages may be of value to you for other information they contain. I almost never use AND NOT for this reason.

o "global warming" AND "sea level rise" AND NOT california - The first two terms must be somewhere and any page containing california will be thrown out.

NEAR

dogs NEAR cats requires both terms, like AND, with the added requirement that they be within 16 words of each other Available only

• Requires the term following it to occur within a certain proximity of the preceding word in the search. In , NEAR requires the terms to be within 16 words of each other in either direction.

• Joining words by NEAR gives you fewer documents than AND, because it requires the words to be closer together.

• USES:

o The NEAR operator is used when you want to require that certain terms appear in the same sentence or paragraph of the document.

o "global warming" NEAR "sea level rise" - Requires the two phrases to occur within 16 words of each other, in either direction.
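The proximity test behind NEAR can be sketched in a few lines. The sketch makes several assumptions that are not in the text above: it handles single words only (not quoted phrases), it reads "within 16 words" as a difference of at most 16 between word positions, and the function name `near` is invented. A real search engine may count the distance differently.

```python
import re


def near(text, first, second, max_distance=16):
    """True if some occurrence of 'first' lies within max_distance words of 'second'."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance
               for i in pos_first for j in pos_second)


close = "dogs chase cats"
far = "dogs " + "filler " * 20 + "cats"   # 20 words separate the two terms

print(near(close, "dogs", "cats"))  # True
print(near(far, "dogs", "cats"))    # False
print(near(far, "cats", "dogs"))    # False (the test works in either direction)
```

Both example texts would satisfy dogs AND cats, but only the first satisfies dogs NEAR cats, which is why NEAR returns fewer documents than AND.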

( ) parentheses: "Nesting"

• Require the terms and operations that occur inside them to be searched first. This is called "nesting."

• Parentheses MUST BE USED to group terms joined by OR when there is any other Boolean operator in the search.

o "global warming" AND "sea level rise" AND (california OR "pacific coast*") - Requires first two terms somewhere in all documents, and either california or pacific coast.

• Parentheses also MUST BE USED with NEAR:

o ("global warming" NEAR "sea level rise") AND (california OR "pacific coast*") - Requires sea level rise to be within 16 words of global warming; the rest can be anywhere in the pages. The parentheses guarantee that the effect of NEAR stops with sea level rise.

You do not need or even want to get very complicated with Boolean searching in web searching.

Searching the web is free, and several simpler searches take less time than a humongous search.

Moreover, with complicated searches, you often don't know which parts of the search worked and which did not. Simpler searches can more easily be compared with one another, and you know what worked.

Conclusion

The use of search engines is common all over the world. They keep one supplied with relevant and necessary information that enhances the development of human capacity and knowledge, both in life generally and in one's field of study, such as architecture, engineering or the social sciences. The search engine is an important tool that one should get used to, because it helps in sourcing the latest information and current issues from all over the world in one's field of specialization, and it makes the exchange of ideas and information between lecturers, students and architects easier. Professionals should, however, also develop the habit of regularly updating and renewing their websites, so that those websites contain the information a searcher or user requires.

Glossary of Internet & Web Jargon

BACK / FORWARD:Buttons in most browsers' Tool Button Bar, upper left. BACK returns you to the document previously viewed. FORWARD goes to the next document, after you go BACK.

BLOG or WEB LOG:A blog (short for "web log") is a type of web page that offers a series of posted items (short articles, photos, diary entries, etc.). Blogs usually include a searchable archive of old postings. Blogs have become a common medium for communication in professional, political, news, trendy, and other specialized web communities.

BOOKMARKS/FAVORITES:All major web browsers include a way to store links to sites you wish to return to. Netscape, Mozilla, and Firefox use the term Bookmarks. The equivalent in Internet Explorer (IE) is called a "Favorite."

To create a bookmark, click on BOOKMARKS or FAVORITES, then ADD. Or left-click on and drag the little bookmark icon to the place you want a new bookmark filed. To visit a bookmarked site, click on BOOKMARKS and select the site from the list. Most browsers also include commands to Import and Export lists of bookmarks.

An alternative method is to store your bookmarks on a website, such as delicious or digg, that lets you access them from any computer on the Internet and see what others have bookmarked.

BOOLEAN LOGIC:A system of standardized words ("operators") used to connect search terms. These include AND, OR, NOT and sometimes NEAR. AND requires all terms appear in a record. OR retrieves records with either term. NOT excludes terms. Parentheses may be used to sequence operations and group words. Always enclose terms joined by OR with parentheses.

BROWSE:To browse through a page, exploring what's there and seeing where the links take you, is a bit like window shopping. When you browse, you have to guess which words and links on the page pertain to your interests. The opposite of browsing is searching.

BROWSERS:Software programs that enable you to view web pages and other documents on the Internet. They "translate" HTML-encoded files into the text, images, sounds, and other features you see. The most commonly used browsers are Microsoft Internet Explorer (often called IE), Firefox, Mozilla, Safari, Opera, and Chrome.

CACHE:In browsers, "cache" is used to identify a space where web pages you have visited are stored in your computer. A copy of documents you retrieve is stored in cache. When you use GO, BACK, or any other means to revisit a document, the browser first checks to see if it is in cache and will retrieve it from there because it is much faster than retrieving it from the server.
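
The "check the cache first" behaviour described above can be sketched as follows. The class `PageCache` and the function `fake_fetch` are hypothetical names introduced for this example; a real browser cache is far more elaborate (it also expires old copies, for instance).

```python
class PageCache:
    """Keep a local copy of every page retrieved, keyed by its URL."""

    def __init__(self, fetch):
        self._fetch = fetch   # function that retrieves a page from the server
        self._store = {}      # url -> stored copy of the page

    def get(self, url):
        # Go to the server only if no local copy exists yet.
        if url not in self._store:
            self._store[url] = self._fetch(url)
        return self._store[url]


server_hits = []


def fake_fetch(url):
    """Stand-in for a slow network request; records each call made."""
    server_hits.append(url)
    return f"<html>content of {url}</html>"


cache = PageCache(fake_fetch)
first = cache.get("http://example.org/a.html")    # retrieved from the "server"
second = cache.get("http://example.org/a.html")   # served from the cache
```

After both calls `server_hits` holds a single entry, because the second request never reached the server.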

CACHED LINK:In search results from Google, Yahoo! Search, and some other search engines, there is usually a Cached link which allows you to view the version of a page that the search engine has stored in its database. The live page on the web might differ from this cached copy, because the cached copy dates from whenever the search engine's spider (crawler) last visited the page and detected modified content. Use the cached link to see when a page was last crawled and, in Google, where your terms are and why you got a page when not all of your search terms appear in it.

CASE SENSITIVE:Capital letters (upper case) retrieve only upper case. Most search tools are not case sensitive or only respond to initial capitals, as in proper names. It is always safe to key all lower case (no capitals), because lower case will always retrieve upper case.
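
The rule stated above (lower case retrieves everything, capitals retrieve only capitals) can be sketched in a few lines. The function name `matches` is invented for this example, and actual search tools differ in how they treat capitals.

```python
def matches(query, text):
    """Apply the case rule: an all-lower-case query ignores case,
    while a query containing capitals must match exactly as keyed."""
    if query == query.lower():
        return query in text.lower()
    return query in text


lower_finds_upper = matches("lagos", "Housing in Lagos")   # True
upper_misses_lower = matches("Lagos", "housing in lagos")  # False
upper_finds_upper = matches("Lagos", "Housing in Lagos")   # True
```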

CGI:"Common Gateway Interface," the most common way Web programs interact dynamically with users. Many search boxes and other applications that result in a page with content tailored to the user's search terms rely on CGI to process the data once it's submitted, to pass it to a background program in JAVA, JAVASCRIPT, or another programming language, and then to integrate the response into a display using HTML.

DOMAIN, TOP LEVEL DOMAIN (TLD):A hierarchical scheme for indicating the logical, and sometimes geographical, location of a web page on the network. In the US, common domains are .edu (education), .gov (government agency), .net (network related), .com (commercial), and .org (nonprofit and research organizations). Outside the US, domains indicate the country: .ca (Canada), .uk (United Kingdom), .au (Australia), .jp (Japan), .fr (France), etc. Neither of these lists is exhaustive.
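
As a rough illustration, the top level domain can be read off as the last dot-separated part of a host name. The lookup table below contains only the domains listed above, and the function names are invented for this sketch.

```python
# Only the domains named in the glossary entry; the list is not exhaustive.
KNOWN_DOMAINS = {
    "edu": "education",
    "gov": "government agency",
    "net": "network related",
    "com": "commercial",
    "org": "nonprofit and research organizations",
    "ca": "Canada",
    "uk": "United Kingdom",
    "au": "Australia",
    "jp": "Japan",
    "fr": "France",
}


def top_level_domain(host):
    """Return the part of the host name after the last dot, in lower case."""
    return host.rstrip(".").rsplit(".", 1)[-1].lower()


def describe(host):
    """Say what the top level domain of a host name indicates, if listed."""
    return KNOWN_DOMAINS.get(top_level_domain(host), "not in this short list")


berkeley = describe("lib.berkeley.edu")   # "education"
bbc = describe("www.bbc.co.uk")           # "United Kingdom"
```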

DOMAIN NAME, DOMAIN NAME SERVER (DNS) ENTRY:Any of these terms refers to the initial part of a URL, down to the first /, where the domain and name of the host or SERVER computer are listed (most often in reversed order, name first, then domain). The domain name tells you who "published" a page, that is, who made it public by putting it on the Web.

A domain name is translated, using huge tables standardized across the Internet, into a numeric IP address unique to the host computer sought. These tables are maintained on computers called "Domain Name Servers." Whenever you ask the browser to find a URL, the browser must consult the table on the domain name server that your particular computer is networked to consult.
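
The table lookup described above can be pictured as a simple dictionary from names to numbers. The host names, the addresses (taken from the range reserved for documentation) and the function `resolve` are all made up for this sketch; in a real Python program the standard library function `socket.gethostbyname` asks the actual domain name server.

```python
# Hypothetical table; 192.0.2.x addresses are reserved for documentation.
DNS_TABLE = {
    "www.example.org": "192.0.2.10",
    "lib.example.edu": "192.0.2.20",
}


def resolve(host, table=DNS_TABLE):
    """Translate a domain name into its numeric IP address.

    Raises LookupError when the table holds no entry for the name.
    """
    try:
        return table[host.lower()]
    except KeyError:
        raise LookupError(f"no entry found for {host}") from None


address = resolve("WWW.Example.org")   # names are not case sensitive
```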

"Domain Name Server entry" frequently appears in a browser error message when you try to enter a URL. If the lookup fails for any reason, the "lacks DNS entry" error occurs. The most common remedy is simply to try the URL again when the domain name server is less busy, so that it can find the entry (the corresponding numeric IP address).

DOWNLOAD:To copy something from a primary source to a more peripheral one, as in saving something found on the Web (currently located on its server) to diskette or to a file on your local hard drive.

FIELD SEARCHING:The ability to limit a search by requiring a word or phrase to appear in a specific field of documents (e.g., title, url, link).
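
A minimal sketch of the idea, assuming each document is stored as a record with separate title, url and text fields. The three pages and the function `field_search` are invented for the example.

```python
# Invented sample pages, each split into fields.
PAGES = [
    {"title": "Mud Architecture of Hausaland",
     "url": "http://example.org/hausa.html",
     "text": "Earth building traditions in northern Nigeria."},
    {"title": "Lagos Housing Survey",
     "url": "http://example.org/lagos/architecture.html",
     "text": "Survey of housing estates."},
    {"title": "Road Design",
     "url": "http://example.com/roads.html",
     "text": "Notes on architecture students' field trips."},
]


def field_search(pages, field, term):
    """Return the pages whose given field contains the term (any case)."""
    term = term.lower()
    return [page for page in pages if term in page[field].lower()]


in_title = field_search(PAGES, "title", "architecture")
in_url = field_search(PAGES, "url", "architecture")
```

The same word retrieves a different page depending on the field searched: only the first page has it in its title, and only the second has it in its URL.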

FIND:A tool in most browsers that searches for the word(s) you key in, within the document currently on screen only. Useful for locating a term in a long document. Can be invoked by the keyboard command CTRL-F (CMD-F on a Macintosh).

FRESHNESS:How up-to-date a search engine database is, based primarily on how often its spiders recirculate around the Web and update their copies of the web pages they hold, and discover new ones. Also determined by how quickly they integrate new sites that web authors send to them. Two weeks is about as good as most search engines do, but some update certain selected web sites more frequently, even daily.

FTP:File Transfer Protocol. A means of rapidly transferring entire files from one computer to another, intact, for viewing or other purposes.

KEYWORD(S):A word searched for in a search command. Keywords are searched in any order. Use spaces to separate keywords in simple keyword searching. To search keywords exactly as keyed (in the same order), enclose them in quotation marks.

PDF or .pdf or pdf file:Abbreviation for Portable Document Format, a file format developed by Adobe Systems that is used to capture almost any kind of document with its original formatting preserved. Viewing a PDF file requires Acrobat Reader, which is built into most browsers and can be downloaded free from Adobe.

POPULARITY RANKING of search results:Some search engines rank the order in which search results appear primarily by how many other sites link to each page (a kind of popularity vote based on the assumption that other pages would create a link to the "best" pages). Google is the best example of this.
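
The "popularity vote" can be illustrated by simply counting how many other pages link to each page. This is a deliberately crude sketch, not Google's actual algorithm; the page names and the function `rank_by_inbound_links` are invented.

```python
from collections import Counter

# Invented link structure: each page maps to the pages it links to.
LINKS = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "a.html"],
}


def rank_by_inbound_links(links):
    """Order pages by the number of other pages linking to them."""
    counts = Counter()
    for source, targets in links.items():
        counts[source] += 0              # list pages that nobody links to
        for target in set(targets):      # count each linking page once
            if target != source:         # a page cannot vote for itself
                counts[target] += 1
    # Most linked-to first; ties are broken alphabetically.
    return sorted(counts, key=lambda page: (-counts[page], page))


ranking = rank_by_inbound_links(LINKS)
```

With these links, `c.html` comes first (three pages link to it) and `d.html` comes last (no page links to it).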

RELEVANCY RANKING of search results:The most common method for determining the order in which search results are displayed. Each search tool uses its own unique algorithm. Most use "fuzzy and" combined with factors such as how often your terms occur in documents, whether they occur together as a phrase, and whether they are in the title or how near they are to the top of the text. Popularity is another ranking system.
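
The factors listed above can be combined into a toy scoring function. Since each search tool keeps its own algorithm, the weights below (3 for a term in the title, 5 for the exact phrase) are arbitrary assumptions, and the documents and function names are invented. Pages need not contain every term; those matching more terms simply score higher.

```python
def score(query, title, body):
    """Score one document against a query using three simple factors."""
    terms = query.lower().split()
    title_words = title.lower().split()
    body_words = body.lower().split()
    total = 0
    for term in terms:
        total += body_words.count(term)   # how often the term occurs
        if term in title_words:
            total += 3                    # assumed bonus: term in the title
    phrase = " ".join(terms)
    if len(terms) > 1 and f" {phrase} " in f" {' '.join(body_words)} ":
        total += 5                        # assumed bonus: terms form a phrase
    return total


def rank(query, docs):
    """Return the titles of matching documents, best score first."""
    scored = [(score(query, title, body), title) for title, body in docs]
    ordered = sorted(scored, key=lambda pair: -pair[0])
    return [title for points, title in ordered if points > 0]


DOCS = [
    ("Housing report", "the architecture of housing estates"),
    ("Mud architecture in Kano", "mud architecture uses mud bricks and mud plaster"),
    ("Roads", "highway design"),
]

results = rank("mud architecture", DOCS)
```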

SEARCH:You can search any individual web page using the CTRL-F command (CMD-F on a Macintosh). Many websites also offer search boxes that let you search all the pages in the site, or records in its database. Searching is usually the most efficient way to find information, but sometimes you can find things by browsing that you might miss otherwise because you might not think of the "right" term to search by.

SERVER, WEB SERVER:A computer running web server software, assigned an IP address, and connected to the Internet so that it can provide documents via the World Wide Web. Also called a HOST computer. Web servers are the closest equivalent to what in the print world is called the "publisher" of a print document. An important difference is that most print publishers carefully edit the content and quality of their publications in an effort to market them and future publications. This convention is not required in the Web world, where anyone can be a publisher; careful evaluation of Web pages is therefore mandatory.

SITE or WEB-SITE:This term is often used to mean "web page," but there is supposed to be a difference. A web page is a single entity, one URL, one file that you might find on the Web. A "site," properly speaking, is a location or gathering or center for a bunch of related pages linked to from that site. For example, the site for the present tutorial is the top-level page "Internet Resources." All of the pages associated with it branch out from there -- the web searching tutorial and all its pages, and more. Together they make up a "site." When we estimate there are 5 billion web pages on the Web, we do not mean "sites." There would be far fewer sites.

SPIDERS :Computer robot programs, referred to sometimes as "crawlers" or "knowledge-bots" or "knowbots" that are used by search engines to roam the World Wide Web via the Internet, visit sites and databases, and keep the search engine database of web pages up to date. They obtain new pages, update known pages, and delete obsolete ones. Their findings are then integrated into the "home" database.

Most large search engines operate several robots all the time. Even so, the Web is so enormous that it can take six months for spiders to cover it, resulting in a certain degree of "out-of-datedness" (link rot) in all the search engines.

SPONSOR (of a Web page or site) :Many Web pages have organizations, businesses, institutions like universities or nonprofit foundations, or other interests which "sponsor" the page. Frequently you can find a link titled "Sponsors" or an "About us" link explaining who or what (if anyone) is sponsoring the page. Sometimes the advertisers on the page (banner ads, links, buttons to sites that sell or promote something) are "sponsors." WHY is this important? Sponsors and the funding they provide may, or may not, influence what can be said on the page or site -- can bias what you find, by excluding some opposing viewpoint or causing some other imbalanced information.

STEMMING:In keyword searching, word endings are automatically removed (lines becomes line); searches are performed on the stem + common endings (line or lines retrieves line, lines, line's, lines', lining, lined). Not very common as a practice, and not always disclosed. Can usually be avoided by placing a term in quotation marks " ".
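
A very crude sketch of stemming follows, assuming a fixed list of English endings. The function `crude_stem` is invented for this example and is much simpler than the stemmers search engines actually use; note that it reduces the words from the example above to the stem "lin" rather than "line".

```python
# Endings to strip, longest first; an assumed list chosen for illustration.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def crude_stem(word):
    """Strip a possessive and one common ending from a word."""
    word = word.lower()
    # Drop possessive endings such as line's or lines'.
    if word.endswith("'s"):
        word = word[:-2]
    elif word.endswith("'"):
        word = word[:-1]
    for suffix in SUFFIXES:
        # Keep at least three letters so that short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


forms = ["line", "lines", "line's", "lines'", "lining", "lined"]
stems = {crude_stem(form) for form in forms}
```

Because all six forms share one stem, a search on any of them would retrieve the others.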

STOP WORDS :In database searching, "stop words" are small and frequently occurring words like and, or, in, of that are often ignored when keyed as search terms. Sometimes putting them in quotes " " will allow you to search them.
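
The treatment of stop words can be sketched as below. The stop word list holds only the four words given above, and the function `search_terms` is invented; real search tools use longer lists and do not all honour quotation marks in this way.

```python
import re

# The four examples from the glossary entry; real lists are much longer.
STOP_WORDS = {"and", "or", "in", "of"}


def search_terms(query):
    """Return the terms actually searched: stop words are dropped
    unless they were keyed inside quotation marks."""
    terms = []
    # Each match is either a quoted group or a bare run of characters.
    for quoted, bare in re.findall(r'"([^"]+)"|(\S+)', query):
        if quoted:
            terms.append(quoted.lower())
        elif bare.lower() not in STOP_WORDS:
            terms.append(bare.lower())
    return terms


plain = search_terms("history of architecture in nigeria")
quoted = search_terms('"of" architecture')
```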

URL:Uniform Resource Locator. The unique address of any Web document. May be keyed in a browser's OPEN or LOCATION / GO TO box to retrieve a document. There is a logic to the layout of a URL:

Table 2: Anatomy of a URL (Source: lib.berkeley.edu)

|Type of file (could say ftp:// or telnet://) |Domain name (computer the file is on and its location on the Internet) |Path or directory on the computer to this file |Name of file, and its file extension (usually ending in .html or .htm) |

|http:// |lib.berkeley.edu/ |TeachingLib/Guides/Internet/ |FindInfo.html |
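
The four parts shown in Table 2 can also be separated programmatically. The sketch below uses `urlsplit` from Python's standard `urllib.parse` module; the function name `anatomy` and the labels of the returned parts are choices made for this example only.

```python
import posixpath
from urllib.parse import urlsplit


def anatomy(url):
    """Split a URL into the four parts shown in Table 2."""
    parts = urlsplit(url)
    # Separate the directory path from the file name at the last slash.
    directory, filename = posixpath.split(parts.path)
    return {
        "type": parts.scheme + "://",
        "domain": parts.netloc,
        "path": directory.lstrip("/") + "/",
        "file": filename,
    }


example = anatomy("http://lib.berkeley.edu/TeachingLib/Guides/Internet/FindInfo.html")
```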


References

1.

2. Search Engine Showdown, written by Greg R. Notess.

3.

4. Teaching Library, University of California, Berkeley by Joe Barker.

5. Prof. Olu Ola Ogunsote, Dr. (Mrs.) Bodga Prucnal-Ogunsote (2003). "The use of Search Engine, Web directories and indices on the world wide web for Architectural Research in Nigeria". Paper presented at the year 2003 Annual General Meeting and Conference of the (AARCHES), 24th – 27th September 2003, Ahmadu Bello University, Zaria, Nigeria. pp. 7-9.

TABLE OF CONTENTS

1.0 Abstract

1.1 Introduction

1.2 Uses of Search Engines: What to Consider

1.3 Recommended Search Engines

1.4 What Makes a Search Engine Good

1.5 Basic Search Tools

1.6 The Search Strategy

1.7 Advanced Boolean Explained

1.8 Conclusion

1.9 Glossary of Internet & Web Jargon

1.10 References
