

Linguistics Lab

This document is provided “as-is”. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it.

Some examples depicted herein are provided for illustration only and are fictitious.  No real association or connection is intended or should be inferred.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes.

© 2010 Microsoft Corporation. All rights reserved.



Cindy Staley, Senior Technical Instructor, Microsoft Learning

Microsoft Corporation

July 2010

Applies to: Microsoft FAST® Search Server 2010 for SharePoint

Summary: This is the lab for the Linguistics module.


Contents

Exercise 1: Work with Content in Multiple Languages

Scenario

Task 1: Feed sample documents into FAST Search Server for SharePoint

Task 2: Search for documents in multiple languages

Exercise 2: Manage Keywords

Scenario

Task 1: Define a keyword with a two-way synonym

Exercise 3: Stemming

Scenario

Conclusion

Exercise 1: Work with Content in Multiple Languages


Your search application needs to handle content in multiple languages. In this exercise, you will verify that FAST Search Server for SharePoint can process content in several languages and support searching for content in a specific language.

For this exercise, you have been provided with a set of documents to feed into FAST Search Server for SharePoint. These documents are located at \Labfiles\Linguistics\sampledocs on the C drive. They consist of HTML and text versions of the same files, translated into multiple languages. You will feed these documents into FAST Search Server for SharePoint, then use the advanced search page to run searches restricted to a specific language.

The main tasks for this exercise are as follows:

1. Feed sample documents into FAST Search Server for SharePoint

2. Search for documents in multiple languages

Task 1: Feed sample documents into FAST Search Server for SharePoint

1. Feed the sample documents into FAST Search Server for SharePoint using a file system crawler.

a. In Central Administration, select Manage service applications.

b. From the list of service applications, select FASTContent.

c. In the Crawling section of the left navigation area, select Content Sources.

d. Click New Content Source.

e. Enter a content source name: Training Content.

f. For content source type, select the File Shares radio button.

g. In the Start Addresses box, enter \\demo2010a\c$\labfiles\linguistics\sampledocs\.

h. Scroll down and check the Start full crawl of this content source checkbox.

i. Click the OK button.

2. Monitor the progress of the crawl in the Crawl Log area. The crawl for this content source will be complete when the log shows 16 successes.

3. Examine the .txt files in the sampledocs folder. Note that these files do not contain any metadata about the language of the text to be indexed. FAST Search Server for SharePoint therefore identifies the language of these documents by analyzing the text directly.
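To make the idea of identifying a language from the text alone more concrete, here is a minimal Python sketch that guesses a language by counting common function words. The word lists, the scoring rule, and the name guess_language are assumptions made for this illustration only; the lab does not describe how FAST Search Server for SharePoint performs language detection, and its method is likely more sophisticated.

```python
# Illustrative only: a tiny stopword-overlap language guesser.
# The word lists and the scoring rule are assumptions for this sketch;
# they do not describe the product's actual language detection.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is", "that", "for"},
    "fr": {"le", "la", "les", "et", "des", "est", "que", "pour"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "mit", "auf"},
}


def guess_language(text):
    """Return the language code whose stopwords occur most often in text.

    Returns None when no stopword from any list appears.
    """
    words = text.lower().split()
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("le chat est sur la table et le chien est dehors"))  # fr
print(guess_language("the cat is on the table and the dog is outside"))   # en
```

A guesser this small fails on short or mixed-language text, which is one reason to check the detected language of each sample document during the search task.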

Task 2: Search for documents in multiple languages

1. In a browser window, navigate to the search application:

2. Click the Advanced link to navigate to the advanced search page.

3. Search for “word”. The search results should include documents in multiple languages.

4. Navigate back to the advanced search page. Restrict your search to French documents and search for “word” again. Your search should now return only documents whose language was identified as French. Notice that the search query, as displayed on the results page, includes the language restriction.


|Results: After this exercise, you should have fed content in multiple languages and tested language identification by using the advanced search page to restrict searches to specific languages. |

Exercise 2: Manage Keywords


Keywords are typically created for frequent search terms with which you want to associate additional information: a definition, a (visual) best bet, a promotion, or, in the context of this module, a synonym. Synonyms expand a search to include additional terms. For example, a synonym can expand an acronym into its full phrase, or expand a general term into one or more product names.

In this exercise, you will define and test a two-way synonym. Specifically, you will create a keyword for page and associate it with a two-way synonym: document.
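The effect of a two-way synonym can be sketched in a few lines of Python: a search for either term is expanded to cover both. This is a conceptual illustration under stated assumptions, not the product's implementation; SYNONYM_PAIRS, build_two_way_map, and expand_query are hypothetical names that do not belong to any FAST Search Server or SharePoint API.

```python
# Illustrative only: expand a query term using two-way synonyms.
# All names here are hypothetical and exist only for this sketch.
SYNONYM_PAIRS = [("page", "document")]


def build_two_way_map(pairs):
    """Map each term to its synonyms in both directions."""
    mapping = {}
    for left, right in pairs:
        mapping.setdefault(left, set()).add(right)
        mapping.setdefault(right, set()).add(left)
    return mapping


def expand_query(term, mapping):
    """Return the term plus any two-way synonyms, sorted for stable output."""
    key = term.lower()
    return sorted({key} | mapping.get(key, set()))


synonyms = build_two_way_map(SYNONYM_PAIRS)
print(expand_query("page", synonyms))      # ['document', 'page']
print(expand_query("document", synonyms))  # ['document', 'page']
print(expand_query("report", synonyms))    # ['report']
```

Because the mapping is symmetric, both queries expand to the same set of terms, which is why the two searches in this exercise are expected to return the same number of results once the keyword is defined.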

The main tasks for this exercise are as follows:

1. Define a keyword with a two-way synonym

Task 1: Define a keyword with a two-way synonym

1. In a browser window, navigate to the search site:

2. Search for document. Record the number of results here:


3. Search for page. Record the number of results here:


4. Define a two-way synonym between page and document for the home site.

a. In a browser window, navigate to the home site:

b. From the Site Actions menu, select Site Settings, then FAST Search Keywords.

c. Click Add Keyword.

d. In the Add Keyword dialog, specify the following:

Keyword phrase: page

Two-way synonyms: document

Keyword definition: pages and documents are the same things

e. Click OK to save your definition. This will bring you back to the Manage Keywords page.

5. Search again for document and page. You should get the same number of search results for each. In addition, in both cases you should see the keyword definition above the search results.

|Results: After this exercise, you should have defined a keyword with a two-way synonym. |

Exercise 3: Stemming


In this exercise, you will compare search results with stemming turned on and turned off in the Search Core Results web part. There are no explicit instructions for this exercise. Here are some sample searches to try.

|Search query |# Results (stemming enabled) |# Results (stemming disabled) |

|Document | | |

|Documents | | |

|Report | | |

|Reports | | |

Do the results match your expectations? Note that your synonym definition from the previous exercise (page and document) will affect your results. Can you draw any conclusions about how synonyms and stemming interact?

When you are finished with your test searches, return the stemming setting of the Search Core Results web part to its default: enabled.
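As a rough mental model for the comparison above, the following Python sketch counts matching documents with and without a crude plural-stripping rule. The sample documents, the naive_stem rule, and count_matches are assumptions invented for this illustration; they are not taken from the lab files and do not represent how FAST Search Server for SharePoint implements stemming.

```python
# Illustrative only: compare match counts with and without a naive stemmer.
# The rule below just strips a trailing plural "s"; it is a stand-in, not
# the product's stemming logic.
def naive_stem(word):
    """Lowercase the word and strip a trailing plural 's' from longer words."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def count_matches(query, docs, stemming):
    """Count documents containing the query term, optionally after stemming."""
    normalize = naive_stem if stemming else str.lower
    target = normalize(query)
    return sum(
        1 for doc in docs
        if any(normalize(word) == target for word in doc.split())
    )


# Hypothetical sample documents for the sketch.
DOCS = [
    "Annual report",
    "Quarterly reports archive",
    "Reports and documents",
    "Document template",
]

for query in ("report", "reports"):
    print(query,
          count_matches(query, DOCS, stemming=True),
          count_matches(query, DOCS, stemming=False))
# report 3 1
# reports 3 2
```

In this toy model, singular and plural queries return the same count when stemming is on and different counts when it is off. Your actual results may differ, particularly because the synonym defined earlier also expands queries for page and document.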

|Results: After this exercise, you should have gained experience enabling and disabling stemming in your search front end application. |


This concludes the lab exercises for the Linguistics module.


