Natural Language Processing



ABSTRACT

The field of Natural Language Processing is emerging as a rich and useful field in Computer Science. With the explosion of the Internet and other large-scale digital document databases, the problem of doing useful analysis of natural language text has come to the forefront of Artificial Intelligence and Machine Learning. This paper presents an overview of the field of Natural Language Processing as the basis for a 10-week introductory class on the subject matter.

The concept for this semester’s independent work stemmed from a previous semester’s independent work. The earlier project entailed the creation of an Internet-based software package that allows users to highlight unknown foreign language words, thus automatically triggering the display of an English translation. Ideally, the concept would be expanded to translate sentences and paragraphs into English, but the technology necessary for the accurate execution of such a task did not exist at the time of this project’s completion.

My curiosity about the absence of this technology evolved into a semester-long research exploration of Natural Language Processing, the field of Computer Science that includes Machine Translation. As a testament to my learning, I decided to propose a class on this topic, for other students like myself would presumably be interested in this subject matter as well. The paper that follows is a summary of the key concepts in NLP that I suggest be covered in a ten-week class. The instructor of the class is encouraged to delve into more detail in certain sections, because there is a plethora of information and specifics that I was unable to include in this paper. Topics such as parsing techniques and statistical natural language processing were not expanded upon because of time/length restrictions; however, they are very rich and fundamental topics in NLP.

1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE

The field of Artificial Intelligence (AI) is the branch of Computer Science that is primarily concerned with the ability of machines to adapt and react to different situations as humans do. In order to achieve artificial intelligence, we must first understand the nature of human intelligence. Human intelligence is a behavior that incorporates a sense of purpose in actions and decisions. Intelligent behavior is not a static procedure. Learning, defined as behavioral change over time that better fulfills an intelligent being’s sense of purpose,[1] is a fundamental aspect of intelligence. An understanding of intelligent behavior will be realized either when intelligence is replicated using machines or, conversely, when we prove why human intelligence cannot be replicated.

In an attempt to gain insight into intelligence, researchers have identified three processes that comprise intelligence: searching, knowledge representation, and knowledge acquisition. The field of AI can be broken down into five smaller components, each of which relies on these three processes to be performed properly. They are: game playing, expert systems, neural networks, natural language processing, and robotics programming.[2]

Game playing is concerned with programming computers to play games, such as chess, against human or machine opponents. This sub-field of AI relies mainly on the speed and computational power of machines. Game playing is essentially a search problem because the machine is required to consider a multitude of possibilities. Yet even though machines can evaluate far more game positions per second than humans can, they are unable to solve search problems perfectly because the size of the search space grows exponentially with the depth of the search, making the problem intractable.

Expert systems are programmed systems that allow trained machines to make decisions within a very limited and specific domain. Expert systems rely on a huge database of information, guidelines, and rules that suggest the correct decision for the situation at hand. Although they mainly rely on their working memory and knowledge base, the systems must make some inferences. The vital importance of storing information in the database in a manner such that the computer can “understand” it creates a knowledge representation problem.

The field of neural networks, inspired by the human brain, attempts to define learning procedures precisely by simulating the brain’s physical neural connections. A unique aspect of this field is that the networks change by themselves, adapting to new inputs with respect to the learning procedures they have previously developed. The learning procedures can vary and incorporate many different forms of learning, including learning by recording cases, by analyzing differences, or by building identity trees (trees that represent a hierarchical classification of data).

Natural Language Processing (NLP) and robotics programming are two fields that simulate the way humans acquire information, an integral part of intelligence. The two are separate sub-fields because of the drastic difference in the nature of their inputs. Language, the input of NLP, is a more complex form of information to process than the visual and tactile input of robotics. Robotics typically transforms its input into motion, whereas NLP has no such associated transformation.

A perfection of each of the sub-fields is not necessary to replicate human intelligence because a fundamental characteristic of humans is to err. However, it is necessary to form a system that puts these components together in an interlocking manner, where the outputs of some of these fields should be inputs for others, to develop a high-level system of understanding. To date, this technology does not exist over a broad domain.

The fundamental nature of intelligence, and the possibility of its replication, is a complex philosophical debate better left for a more esoteric discussion. In order to learn more about the field of AI from the Computer Science perspective, we will entertain the notion that it is possible for machines to possess intelligence.

Before we explore this topic further, we must consider the social, moral and ethical implications of AI. Beginning with the most immediate and realistic consequences, this increase in machine capabilities could lead to a decrease in the amount of human-to-human contact. This could change the development of humans as a species, and is an important consideration in deciding the future of AI. Economic crisis could also occur as a result of artificial intelligence. Machines have already begun to replace workers who do repetitive, “mindless” jobs. If machines could perform intelligent tasks, they could take over entire fields of work and might cause mass unemployment and poverty.

The perfection of an artificially intelligent being poses more significant albeit distant consequences. With the advent of this technology we would be decreasing the value of the human species because we would no longer be distinguished or unique. If machines could do all that we can do, and at a fraction of the cost it takes to nurture a new human to its full capability, then many humans may not feel the compelling need to reproduce our species.

Further, the social implications pose a set of ethical dilemmas. If we can create these intelligent beings that have no conscience or moral obligations, it can be argued that we will be creating a new breed of mentally deranged criminals: individuals who will never feel responsibility or remorse for their actions. In the society in which we live it is necessary for someone to be held responsible for each reprehensible action. Who would be liable for any mistakes machines make, and who would deal with the consequences? Should the machines be held accountable for their decisions, or should the programmers of the machines? If it is appropriate for machines to be held responsible, how could we punish them? The conventional methods we use to punish humans (e.g. jail, community service) are not effective for machines. If the programmers should be held responsible, it is our duty to warn them now so as not to breach the ex post facto clause of the United States Constitution.

Many science-fiction movies and books give a glimpse into how much our world could change with these technological advances. In 1984, the horror/science-fiction movie The Terminator was released in theaters. The movie was set in the futuristic year 2029 and was based on the hypothetical situation that intelligent machines had become more sophisticated than their human creators. The machines ‘ran the world’ and had the power to obliterate the human race. There have been two sequels to this movie, each of which is based on the same premise of artificial intelligence superseding human intelligence. Although these situations seem very far-fetched, one cannot ignore their implications.

Upon embarking on significant AI research, one must recognize the potential consequences and strive to resolve them before artificial intelligence is fully realized.

We will end this introduction to Artificial Intelligence with one of the earliest observations on the field. In 1637, René Descartes articulated his doubts about the possibility of artificial intelligence in his “Discourse on the Method of Rightly Conducting the Reason and Seeking Truth in the Sciences”:

“…but if there were machines bearing the image of our bodies, and capable of imitating our actions as far as it is morally possible, there would still remain two most certain tests whereby to know that they were not therefore really men. Of these the first is that they could never use words or other signs arranged in such a manner as is competent to us in order to declare our thoughts to others: for we may easily conceive a machine to be so constructed that it emits vocables, and even that it emits some correspondent to the action upon it of external objects which cause a change in its organs; … but not that it should arrange them variously so as appositely to reply to what is said in its presence, as men of the lowest grade of intellect can do. The second test is, that although such machines might execute many things with equal or perhaps greater perfection than any of us, they would, without doubt, fail in certain others from which it could be discovered that they did not act from knowledge, but solely from the disposition of their organs: for while reason is an universal instrument that is alike available on every occasion, these organs, on the contrary, need a particular arrangement for each particular action; whence it must be morally impossible that there should exist in any machine a diversity of organs sufficient to enable it to act in all the occurrences of life, in the way in which our reason enables us to act.”

His basic statement is that machines cannot possess intelligence as humans do, for two reasons: first, because they do not have the capacity to use language in the dynamic form that we humans can, and second, because they lack the ability to reason.

The rest of this class will focus on dispelling the first half of Descartes’ statement. We will use the terms ‘use of language’ and ‘Natural Language Processing’ synonymously. The goal of NLP, and the goal of this class, is to determine a system of symbols, relations and conceptual information that can be used by a computer to communicate with humans.

2. INTRODUCTION TO NATURAL LANGUAGE PROCESSING (NLP)

One of the most widely researched applications of Artificial Intelligence is Natural Language Processing. NLP’s goal, as previously stated, is to determine a system of symbols, relations and conceptual information that can be used by computer logic to communicate with humans. This implementation requires the system to have the capacity to translate, analyze and synthesize language. With the goal of NLP well defined, one must clearly understand the problem of NLP. Natural language is any human “spoken or written language governed by sets of rules and conventions sufficiently complex and subtle enough for there to be frequent ambiguity in syntax and meaning.”[3] The processing of language entails the analysis of the relationship between the mental representation of language and its manifestation into spoken or written form.[4]

Humans can process a spoken command into its appropriate action. We can also translate different subsets of human language (e.g. French to English). If the results of these processes are accurate, then the processor (the human) has understood the input. The main tasks of artificial NLP are to replace the human processor with a machine processor and to get a machine to understand the natural language input and then transform it appropriately.

Currently, humans have learned computer languages (e.g. C, Perl, and Java) and can communicate with machines via these languages. Machine languages (MLs), as the term is used here, are sets of instructions that a computer can execute. These instructions are unambiguous and have their own syntax, semantics and morphology. The main advantage of machine languages, and the major difference between MLs and NLs, is the MLs’ unambiguous nature, which derives from their mathematical foundation. They are also easier to learn because their grammar and syntax are constrained to a finite set of symbols and signals. Developing a means of understanding (a compiler) for these languages is remarkably easy compared to the difficulty of developing a means of understanding for natural languages.

An understanding of natural languages would be much more difficult to develop because of the numerous ambiguities and levels of meaning in natural language. This ambiguity is essentially why NLP is so difficult. There are five main categories into which language ambiguities fall: syntactic, lexical, semantic, referential and pragmatic.[5]

The syntactic level of analysis is strictly concerned with the grammar of the language and the structure of any given sentence. A basic rule of the English language is that each sentence must have a noun phrase and a verb phrase. Each noun phrase may consist of a determiner and a noun, and each verb phrase may consist of a verb, a preposition and a noun phrase. There are many different valid syntactic structures; rules such as these make up the grammar of a language and, first, must be represented in a concrete manner for the computer. Second, there must exist a parser: a system that determines the grammatical structure of an input sentence by comparing it against the existing rules. A parser must break the input down into words and determine, by categorizing each word, whether the sentence is grammatically sound. Often there is more than one grammatically sound parse. (see figure 1)

Figure 1: A parsing example with more than one correct grammatical structure
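To make the idea concrete, the two parses of a classically ambiguous sentence can be written out as nested lists. This sketch is illustrative only and is not the content of the original figure; the sentence and the category labels (S, NP, VP, PP) are conventional assumptions.

(defparameter *parse-a*   ; reading 1: the telescope was used for the seeing
  '(S (NP (Pro I))
      (VP (V saw)
          (NP (Det the) (N man))
          (PP (P with) (NP (Det the) (N telescope))))))

(defparameter *parse-b*   ; reading 2: the man who was seen had the telescope
  '(S (NP (Pro I))
      (VP (V saw)
          (NP (Det the) (N man)
              (PP (P with) (NP (Det the) (N telescope)))))))

Both structures satisfy the grammar; choosing between them requires information beyond syntax.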

The lexical level of analysis concerns the meanings of the words that comprise each sentence. Ambiguity increases when a word has more than one meaning (homonyms). For example, “duck” could be either a type of bird or an action involving bending down. Since these two meanings have different grammatical categories (noun and verb), the issue can be resolved by syntactic analysis: the sentence’s structure will be grammatically sound with only one of these parts of speech in place. From this information, a machine can determine the definition that appropriately conveys the sense of the word within the sentence. However, this process does not resolve all lexical ambiguities. Many words have multiple meanings within the same part of speech, or a part of speech can have sub-categories that also need to be analyzed. The verb “can” can be considered an auxiliary verb or a primary verb. If it is considered a primary verb, it can convey different meanings: the primary verb “can” can mean either “to fire” or “to put something into a container”. In order to resolve these ambiguities we must resort to semantic analysis.

The semantic level of analysis addresses the contextual meanings of the words as they relate to word definitions. In the “can” example, if another verb follows the word, then it is most likely an auxiliary verb. Otherwise, if the other words in the sentence are related to jobs or work, then the former definition of the primary verb (“to fire”) should be taken; if the other words are related to preserves or jams, the latter definition is more suitable. The field of statistical analysis provides a methodology for resolving this kind of ambiguity. When it arises, we must rely on the meaning of the word being defined by the circumstances of its use. Statistical Natural Language Processing (SNLP) looks at language as a non-categorical phenomenon and can use the current domain and environment to determine the meanings of words.
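A naive sketch of this idea in LISP might associate each sense of “can” with a hand-listed set of context words and pick the sense whose list overlaps most with the rest of the sentence. The sense names and context words here are illustrative assumptions; a real statistical system would derive such associations from corpus counts rather than a hand-written list.

(defparameter *senses*
  '((fire     job boss work employee)       ; "can" meaning "to fire"
    (preserve jar jam fruit container)))    ; "can" meaning "to put in a container"

(defun choose-sense (sentence)
  ;; Return the sense whose context words overlap most with SENTENCE.
  (let ((best nil) (best-score -1))
    (dolist (sense *senses* best)
      (let ((score (length (intersection (rest sense) sentence))))
        (when (> score best-score)
          (setf best (first sense) best-score score))))))

;; (choose-sense '(grandma will can the fruit in a jar)) => PRESERVE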

SNLP can also be used to gather another type of contextual information: it can track the slow evolution of word meanings. For example, years ago the word “like” was used in comparisons, as a conjunction or as a verb; currently, it is often used as a colloquial filler. This is the type of contextual information that is necessary in order to resolve pragmatic ambiguities. Pragmatic ambiguities arise from cultural phrases or idioms that have not been developed according to any set rules. For example, in the English language, when a person asks, “Do you know what time it is?” he usually is not wondering whether you are aware of the hour, but more likely wants you to tell him the time.

Referential ambiguities deal with the way clauses of a sentence are linked together. For example, the sentence “John hit the man with the hammer” has referential ambiguity because it does not specify if John used a hammer to hit a man, or if John hit the man who had a hammer. Referential ambiguities in a sentence are very difficult to reduce because there may be no other clues in the sentence. In order to determine which clauses of the sentence refer to or describe each other (in the example, who the hammer belongs to), the processor would have to increase its scope of analysis and consider surrounding sentences to look for clarification.

There are many tasks that require an understanding of Natural Language. Database queries, fact retrieval, robot command, machine translation and automatic text summarization are just a small subset of the tasks. Although complete understanding has not yet been achieved, there are imperfect versions of NLP technologies on the market. We will look at some of the current NLP technologies and discuss their limitations later in the course.

3. NLP Programs

3.1. ELIZA

Joseph Weizenbaum developed Eliza in 1966. It is a program for the study of natural language communication between man and machine. The format of this program is dialogue via teletype between a computer ‘psychiatrist’ and a human ‘patient’. Eliza merely simulates an understanding of language; it is not an intelligent system because Eliza’s output is not reasoned feedback based on the input. Eliza determines its output by recognizing patterns in the input and transforming them according to a series of programmed scripts.

There are three main algorithmic steps. First, Eliza takes in an English sentence as input and searches it for a key word and key grammatical structure. Potential key words are articulated in Eliza’s programmed script. If more than one key word is identified in the input, the right-most key word before the first punctuation mark is the only word that will be considered. The key grammatical structure is identified by decomposing the sentence. For example, if the structure “I am ____” exists in the sentence, the sentence will be classified as an assertion. Based on these identifications, a transformation of the input is selected. These transformations typically entail word conversions (such as converting “I” in the input to “you” in the output). The transformation also includes a re-assembly rule, which is essentially a series of text manipulations. For instance, if the user inputs “I am feeling sad today”, Eliza will manipulate the assertion to output the “insightful” question “Why are you feeling sad today?”
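A minimal sketch of this keyword-and-reassembly idea in LISP follows; it handles only the single “I am ____” pattern and is an assumption-laden toy, not Weizenbaum’s actual script.

(defun eliza-respond (words)
  ;; WORDS is a list of strings, e.g. ("I" "am" "feeling" "sad" "today").
  (cond ((and (string-equal (first words) "I")
              (string-equal (second words) "am"))
         ;; Re-assembly rule for the assertion "I am X": ask "Why are you X?"
         (format nil "Why are you ~{~A~^ ~}?" (cddr words)))
        ;; No key word found: fall back to a non-specific prompt.
        (t "Tell me more.")))

;; (eliza-respond '("I" "am" "feeling" "sad" "today"))
;; => "Why are you feeling sad today?"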

The huge shortcoming of this program, which precludes it from ‘understanding’, is that it never analyzes the input or output on any level other than the syntactic level. Another problem with this system is its tightly bounded nature. If no key words are found in the input, Eliza has no alternative reasoning method to gain a sense of the input. In this case, Eliza resorts to either outputting non-specific, broad comments (e.g. “tell me more”) or reiterating a previous comment.

While Eliza falls short of any significant language processing, it stands as evidence of how difficult the problem of NLP is: creating more than an illusion of understanding requires far more than pattern-matched text manipulation. Eliza can be implemented in a programming language that makes no special allowances for the needs of natural language processing, such as Java. Other languages, however, provide better knowledge representation structures, whose capabilities can aid in the development of artificial understanding.

Despite Eliza’s shortcomings, it has been enormously popular as an amusing mock therapist because of its perceived understanding of users’ problems, and because of the anonymity its users maintain. One telling anecdote of the potential for machines to replace humans for companionship follows. After Weizenbaum had developed Eliza, he asked his secretary to test it for him. His secretary ‘conversed’ with Eliza for a few minutes and then asked Weizenbaum to leave the room so she could be alone with her computer.

3.2. SHRDLU

SHRDLU, developed in 1968 by Terry Winograd, is another program that studies natural language communication between man and computer. The program involves a robot that has the ability to manipulate toy blocks in its environment, based on human input. It was not developed as a tool to explore the field of robotics; hence those implementation details will be omitted from this section. Instead we will focus on the language processing components of the system, and what level of understanding they achieve.

SHRDLU is a usable language system for a very limited domain, which demonstrates its understanding of language by carrying out given commands in its domain. A user can type in any English/natural-language command related to a predefined environment of toy blocks. The program has a working knowledge of its environment: the block placement, colors, and so on. It accurately processes the command into an appropriate action. For example, the command “Pick up the red cone and put it atop the blue box” is a valid command and would be executed if the red cone and blue box existed in the environment.

Its creator argues that SHRDLU is a fully intelligent system because it can answer questions about its environment and actions. Although that argument is debatable, it is unquestionable that SHRDLU achieves a higher level of understanding than Eliza. SHRDLU acknowledges the syntactic, semantic and referential problems that could arise in its input, whereas Eliza only addresses syntactic issues. However, each of these ambiguities is simpler to resolve in this context than in general language because of the limited domain of the input. Lexical ambiguities are eliminated from the system by its vocabulary constraints. The three primary ambiguities are each addressed in different, cooperating modules within the SHRDLU algorithm.

The input first enters the syntactic module, where it is parsed, and is then passed along to the semantic module for further analysis. The semantic module works with the syntactic module to resolve any discrepancies, then attempts to “provide real referents for the objects.”[6] SHRDLU approaches this task as a proof, establishing the existence of a real referent by proving that its negation is impossible. Next, the pragmatic module is called, which takes into account the entire environmental context and attempts to situate the request within it. Deduction and an attempt to reason are incorporated in this module. If the command “Pick it up” is given to the program, this module will use deduction to determine what “it” refers to. Deduction is also used when the command “Pick up the block bigger than the green block” is given, in determining relative sizes. After passing through all three modules, if ambiguity still exists, the system will ask the user to explicitly clarify the uncertainty.
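As a rough illustration of the simplest case of such deduction (not SHRDLU’s actual implementation), a pragmatic module could resolve “it” to the most recently mentioned object in the dialogue:

(defparameter *history* '())      ; most recently mentioned objects, newest first

(defun mention (object)
  ;; Record an object each time a command refers to it explicitly.
  (push object *history*)
  object)

(defun resolve-it ()
  ;; Resolve "it" to the last-mentioned object, or ask the user to clarify.
  (or (first *history*)
      "Please tell me what \"it\" refers to."))

;; (mention 'red-cone)  then  (resolve-it) => RED-CONE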

One contributing factor to SHRDLU’s sense of understanding is its knowledge representation structure. SHRDLU is implemented in LISP, a programming language developed to address the needs of symbolic processing tasks such as Natural Language Processing. LISP’s central control structure is recursion, which plays a direct role in SHRDLU’s parsing algorithm. Additionally, LISP’s symbolic expressions enable the program to associate more than mathematical values with words. These and other LISP features that aid in the development of machine understanding of language will be discussed further in the next section.

3.3. LISP

LISP is the primary programming language for AI and NLP applications. It was invented in 1959 by John McCarthy. It is unlike many other artificial languages because it provides features useful for capturing the abstract sense of meaning in concrete data structures. The name LISP was derived from the language’s general capability for LISt Processing, but its power extends far beyond list processing.

LISP can create and manipulate complex objects and symbols that represent words or grammar structures. There are two main data structures in LISP: atoms and lists. Atoms are essentially identifiers, and may include numbers. Lists are collections of atoms and/or other lists. To introduce the functionality of LISP in NLP, we will consider simple sentence generation. The generation program has a syntactic categorization of every word it may use and a fixed grammatical structure in place. The structure for this example is basic: a sentence is comprised of a noun phrase and a verb phrase; a noun phrase consists of an article and a noun, and a verb phrase consists of a verb and a noun phrase. Each part of speech (article, noun and verb) has a certain subset of words (atoms) attached to it. In LISP, this is represented as a series of functions as follows:[7]

(defun sentence () (append (noun-phrase) (verb-phrase)))

(defun noun-phrase () (append (Article) (Noun)))

(defun verb-phrase () (append (Verb) (noun-phrase)))

(defun Article () (one-of '(the a)))

(defun Noun () (one-of '(man ball woman table)))

(defun Verb () (one-of '(hit took saw liked)))
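The helper one-of is assumed rather than defined in this fragment. A minimal definition that makes the example runnable, returning its choice wrapped in a list so that append can splice it into the sentence, would be:

(defun one-of (choices)
  ;; Pick a random element of CHOICES and return it as a one-element list.
  (list (elt choices (random (length choices)))))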

The LISP program will generate a sentence after a call to the sentence function. This algorithm can generate sentences such as “The man hit the table” or “The table liked the woman”. Obviously, this generator takes no measures to ensure the sentence is semantically correct. This feature is considerably harder to implement, though LISP offers the capabilities. Lists can also be created such that they are the linear representation of identity trees, which incorporate a sense of semantics into the language. An identity tree would help ensure that the sentences generated by the previous example were semantically correct by classifying the words with properties that are indicative of their use.
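As a crude sketch of how such classification might begin (using LISP property lists rather than a full identity tree, and with illustrative properties), words can be tagged so that the generator could later check, for instance, that the subject of “liked” is animate:

(setf (get 'man   'animate) t
      (get 'woman 'animate) t
      (get 'ball  'animate) nil
      (get 'table 'animate) nil)

(defun animate-p (noun)
  ;; True if NOUN has been tagged as animate.
  (get noun 'animate))

;; (animate-p 'woman) => T    (animate-p 'table) => NIL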

Another feature that makes LISP attractive for language processing is its recursive control structure. Recursion is particularly useful in parsing sentences. “To parse a sentence means to recover the constituent structure of the sentence, to discover what sequence of generation rules could have been applied to come up with the sentence.”[8] A grammar defined with recursive constructs supports both parsing and more intricate grammar structures. Expanding on the previous example, we could add a recursive Adjective function:

(defun Adjective () (if (= (random 2) 0) '() (append (adj) (Adjective))))

(defun adj () (one-of '(happy blue pretty)))

For the expanded grammar to take effect, an Adjective call should be added to the original noun-phrase definition: (defun noun-phrase () (append (Article) (Adjective) (Noun))).
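To illustrate how the same recursive grammar supports parsing as well as generation, the following sketch (a toy written for this example, not a general parser) recognizes noun phrases of the expanded grammar by consuming adjectives recursively:

(defun parse-adjectives (words)
  ;; Consume any leading adjectives recursively; return a cons of
  ;; (adjectives-found . remaining-words).
  (if (member (first words) '(happy blue pretty))
      (let ((tail (parse-adjectives (rest words))))
        (cons (cons (first words) (car tail)) (cdr tail)))
      (cons '() words)))

(defun parse-noun-phrase (words)
  ;; Expect an Article, zero or more Adjectives, then a Noun.
  (when (member (first words) '(the a))
    (let* ((parsed (parse-adjectives (rest words)))
           (remaining (cdr parsed)))
      (when (member (first remaining) '(man ball woman table))
        (list 'noun-phrase (first words) (car parsed) (first remaining))))))

;; (parse-noun-phrase '(the happy blue man))
;; => (NOUN-PHRASE THE (HAPPY BLUE) MAN)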

Two other features of LISP central to NLP are its dynamic memory allocation and its extensibility. The dynamic memory allocation in LISP implies that there are no artificial bounds or limits on the number or size of data structures. Functions can be created or destroyed while a program is running. Since we have not yet defined a finite set of grammar rules or structures that encompass natural language, this feature is necessary.

The extensibility of LISP is one of the factors that have enabled it to persevere as the top AI language since its conception in 1959. LISP is flexible enough to adapt to the new language design theories that develop.

LISP is a very powerful language; this section was not an attempt to summarize it, but rather to familiarize the student with the language. LISP is the predominant language in AI technology and is impossible to present in a few pages. Instead, we will present some of the current NLP applications that are in use today (and which are primarily implemented in LISP), as a means of further discussing LISP’s capabilities.

4. APPLICATIONS OF NLP

One important application of NLP is Machine Translation (MT): “the automatic translation of text…from one [natural] language to another.”[9] Existing MT systems are far from perfect; they usually output a rough translation, which requires human post-editing. These systems are useful only to people who are familiar enough with the output language to decipher the inaccurate translations. The inaccuracies are in part a result of imperfect NLP: without the capacity to understand a text, it is difficult to translate it. Many of the difficulties in realizing MT will be resolved when a system for resolving the pragmatic, lexical, semantic and syntactic ambiguities of natural languages is developed.

One further difficulty in Machine Translation is text alignment. Text alignment is not part of the language translation process itself, but a process that ensures the correct ordering of ideas within sentences and paragraphs in the output. The task is difficult because the correspondence between texts is not one-to-one: different languages may use entirely different phrases to convey the same message.

For example, the French sentences “Quant aux eaux minérales et aux limonades, elles rencontrent toujours plus d’adeptes. En effet notre sondage fait ressortir des ventes nettement supérieures à celles de 1987, pour les boissons à base de cola notamment” correspond word for word to the English “With regard to the mineral waters and the lemonades, they encounter still more users. Indeed our survey makes stand out the sales clearly superior to those in 1987 for cola based drinks especially.” While this one-to-one correspondence is grammatically accurate, the sense of the passage would be better conveyed if some phrases were rearranged to read “According to our survey, 1988 sales of mineral water and soft drinks were much higher than in 1987, reflecting the growing popularity of these products. Cola drink manufacturers in particular achieved above average growth rates.”[10]

There are currently three approaches to Machine Translation: direct, semantic transfer and interlingual. Direct translation entails a word-for-word translation plus syntactic analysis. The word-for-word translation is based on the results of a bilingual dictionary query, and the syntactic analysis parses the input and regenerates the sentences according to the output language’s syntax rules. For example, the phrase “Les oiseaux jaunes” could be accurately translated into “The yellow birds” using this technology. This kind of translation is the most common in commercial systems today, such as AltaVista. However, this approach to MT does not account for semantic ambiguities in translation.
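A toy sketch of the direct approach follows; the dictionary and the single reordering rule are illustrative assumptions chosen to handle just this example, not a general system:

(defparameter *fr->en*
  '((les . the) (oiseaux . birds) (jaunes . yellow)))

(defun translate-word (word)
  ;; Bilingual dictionary lookup; leave unknown words untranslated.
  (or (cdr (assoc word *fr->en*)) word))

(defun translate-direct (words)
  ;; Word-for-word lookup, then swap the final noun/adjective pair to
  ;; mimic the syntactic regeneration step for this example.
  (let ((lookup (mapcar #'translate-word words)))
    (append (butlast lookup 2) (reverse (last lookup 2)))))

;; (translate-direct '(les oiseaux jaunes)) => (THE YELLOW BIRDS)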

The semantic transfer approach is more advanced than the direct translation method because it involves representing the meaning of sentences and contexts, not just equivalent word substitutions. This approach consists of a set of templates to represent the meaning of words, and a set of correspondence rules that form an association between word meanings and possible syntax structures in the output language. Semantics, as well as syntax and morphology, are considered in this approach. This is useful because different languages use different words to convey the same meaning. In French, the phrase “Il fait chaud” corresponds to “It is hot”, not “It makes hot” as the literal translation would suggest. However, one limitation of this approach is that each system must be tailored for a particular pair of languages.
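The following sketch illustrates the flavor of the transfer idea (the templates and correspondence rules shown are invented for the example, not taken from a real system): a whole source-language pattern is mapped to a meaning template, which is then realized in English rather than translated word for word.

(defparameter *transfer-rules*
  '(((il fait chaud) . (weather hot))       ; meaning template for the idiom
    ((il fait froid) . (weather cold))))

(defun realize-english (meaning)
  ;; Correspondence rule: a WEATHER template becomes "it is <state>".
  (case (first meaning)
    (weather (list 'it 'is (second meaning)))))

(defun translate-transfer (phrase)
  (let ((meaning (cdr (assoc phrase *transfer-rules* :test #'equal))))
    (when meaning (realize-english meaning))))

;; (translate-transfer '(il fait chaud)) => (IT IS HOT)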

The third and closest to ideal (thus inherently most difficult) approach to MT is translation via interlingua. “An interlingua is a knowledge representation formalism that is independent of the way particular languages express meaning.”[11] This approach would form the intermediary step for translation between all languages and enable fluent communication across cultures. This technology, however, greatly depends on the development of a complete NLP system in which all levels of analysis and all ambiguities in natural language are resolved in a cohesive manner. This approach to MT is mainly confined to research labs because significant progress has not yet been made toward accurate translation software for commercial use.

Although MT over a large domain is yet unrealized, MT systems over some limited contexts have been almost perfected. This idea of closed context is essentially the same concept used when developing SHRDLU; one can develop a more perfect system by constraining the context of the input. This constraint resolves many ambiguities and difficulties by eliminating them. However, these closed contexts do not necessarily have to be about a certain subject matter (as SHRDLU was confined to the subject of toy blocks) but can also be in the form of controlled language. For example, “at Xerox technical authors are obliged to compose documents in what is called Multinational Customized English, where not only the use of specific terms is laid down, but also are the construction of sentences.”[12] Hence their MT systems can accurately deal with these texts and the documents can be automatically generated in different languages.

Again, before a perfection of MT over an unrestricted domain can be realized, further research and developments must be made in the field of NLP.

Another application that is enabled by NLP is text summarization, the generation of a condensed but comprehensive version of an original human-composed text. This task, like Machine Translation, is difficult because creating an accurate summary depends heavily on first understanding the original material. Text summarization technologies cannot be perfected until machines are able to accurately process language. However, this does not preclude parallel research on the two topics; text summarization systems are based on the existing NLP capabilities.

There are two predominant approaches to summarization: text extraction and text abstraction. Text extraction removes pieces from the original text and concatenates them to form the summary. The extracted pieces must be the topic, or most important, sentences of the text. These sentences can be identified by several different methods. Among the most popular are intuitions about general paper format (positional importance), identification of cue phrases (e.g. “in conclusion”), and identification of proper nouns. Some extraction systems assume that the words used most frequently represent the most important concepts of the text. These methods are generally successful at topic identification and are used in most commercial summarization software (e.g. Microsoft Word; see Appendix 1). However, these systems operate on the word level rather than the conceptual level, and so the summaries will not always be fluent or properly fused.[13]
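A minimal sketch of the word-frequency heuristic, assuming the document has already been split into sentences and words, could score each sentence by the document-wide frequency of its words and keep the top scorers:

(defun word-counts (words)
  ;; Count how often each word appears in the whole document.
  (let ((counts (make-hash-table :test #'equal)))
    (dolist (word words counts)
      (incf (gethash word counts 0)))))

(defun score-sentence (sentence counts)
  ;; A sentence scores the sum of its words' document frequencies.
  (reduce #'+ sentence
          :key (lambda (word) (gethash word counts 0))
          :initial-value 0))

;; Extract by keeping the few sentences with the highest scores; note that
;; this operates purely at the word level, so the extract may not read fluently.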

Text abstraction is a less contrived and much more complex approach to summarization. While extraction mainly entails topic identification, abstraction involves topic identification as well as interpretation and language generation. These two additional steps make the automated summary more coherent and cohesive. To date, this approach has not been successful because the interpretation stage of the process (the most difficult part of NLP) needs more development before it can aid in summarization.

APPENDIX 1

Microsoft Word’s machine-generated summary of Section 1 (pages 2-6)

1. INTRODUCTION TO ARTIFICIAL INTELLIGENCE

The field of Artificial Intelligence (AI) is the branch of Computer Science that is primarily concerned with the ability of machines to adapt and react to different situations like humans. In order to achieve artificial intelligence, we must first understand the nature of human intelligence. Human intelligence is a behavior that incorporates a sense of purpose in actions and decisions. An understanding of intelligent behavior will be realized when we are able to replicate intelligence using machines, or conversely when we are able to prove why human intelligence can not be replicated.

In an attempt to gain insight into intelligence, researchers have identified three processes that comprise intelligence: searching, knowledge representation and knowledge acquisition.

Game playing is concerned with programming computers to play games such as chess against human or other machine opponents. This sub-field relies mainly on the speed and computational power of machines. Expert systems are programmed systems that allow trained machines to make decisions within a very limited and specific domain. Neural networks is a field inspired by the human brain that attempts to accurately define learning procedures by simulating the physical neural connections of the human brain. Natural Language Processing (NLP) and robotics programming are two fields that simulate the way humans acquire information, which is integral to the formation of intelligence. A perfection of each of these sub-fields is not necessary to replicate human intelligence, because a fundamental characteristic of humans is to err.

In order to learn more about the field of AI from the Computer Science perspective, we will entertain the notion that it is possible for machines to possess intelligence.

Machines have already begun to replace humans in the workforce who do repetitive, “mindless” jobs. If machines could perform intelligent tasks, they could overtake work fields and might cause mass unemployment and poverty.

Should the machines be held accountable for their decisions or should the programmers of the machines? The conventional methods we use to punish humans (e.g. jail, community service) are not effective for machines.

The machines ‘ran the world’ and had the power to obliterate the human race. There have been two sequels to this movie, each of which is based on the same premise of artificial intelligence superseding human intelligence.

WORKS CITED

Automatic text generation and summarization. 29 Apr. 2001.

Dorr, Bonnie J. Bonnie Dorr - Large Scale Interlingual Machine Translation. 30 Apr. 2001.

ELIZA -- A Computer Program. 21 Apr. 2001.

Finlay, Janet, and Alan Dix. An Introduction to Artificial Intelligence. London: UCL Press, 1996.

Generation 5. An Introduction to Natural Language Theory. 24 Apr. 2001.

Ginsberg, Matt. Essentials of Artificial Intelligence. San Mateo: Morgan Kaufmann Publishers, 1993.

Hovy, Edward, Chin-Yew Lin, and Daniel Marcu. Automated Text Summarization (SUMMARIST). 29 Apr. 2001.

Hutchins, John. The Development and Use of Machine Translation Systems and Computer-Based Translation Tools. 28 Apr. 2001.

Jackson, Philip C. Introduction to Artificial Intelligence. 2nd ed. New York: Dover Publications, 1985.

Jones, Karen Sparck. Workshop in Intelligent Scalable Text Summarization. 29 Apr. 2001.

Kamin, Samuel N. Programming Languages: An Interpreter-Based Approach. New York: Addison-Wesley, 1990.

Kay, Martin. Machine Translation. 30 Mar. 2001.

Manning, Christopher D., and Hinrich Schutze. Foundations of Statistical Natural Language Processing. Cambridge: The MIT Press, 2000.

Norvig, Peter. Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. San Francisco: Morgan Kaufmann Publishers, 1992.

Patent, Dorothy Hinshaw. The Quest for Artificial Intelligence. New York: HBJ, 1986.

Shrdlu - Detailed Comments. 21 Apr. 2001.

Weizenbaum, Joseph. ELIZA. 21 Apr. 2001.

WINOGRAD’S SHRDLU. 22 Apr. 2001.

Winston, Patrick Henry. Artificial Intelligence. 3rd ed. New York: Addison-Wesley, 1993.

-----------------------

[1]

[2] 2 May 2001

[3]

[4] C. Manning and H. Schutze, Foundations of Statistical Natural Language Processing, (Cambridge: The MIT Press, 2000).

[5] Finlay, Janet, and Alan Dix. An introduction to Artificial Intelligence. London: UCL Press, 1996.

[6]

[7] code adapted from Manning and Schutze

[8]

[9]

[10] example taken from Manning and Schutze, 469.

[11] Ibid, Manning and Schutze, 465.

[12]

[13]
