Question
Lab Name: Flesch Readability Index.
Purpose: To work with complex algorithms.
Required Files:
Introduction: In this project you will implement a class that determines the Flesch Readability Index for a piece of text. This method of calculating the readability of a piece of text was devised by Rudolf Flesch, author of Why Johnny Can't Read and The Art of Readable Writing. When you check the spelling and grammar in a Word document you can have the readability statistics displayed, including the Flesch index.
The Flesch Readability Index is a number, generally between 0 and 100, that indicates how easy a piece of text should be to read. The lower the number, the harder the text is to read. A general breakdown of reading levels based on Flesch Index is:
Flesch Score    Approximate grade level
90 to 100       5th grade
80 to 90        6th grade
70 to 80        7th grade
60 to 70        8th to 9th grade
50 to 60        10th to 12th grade (high school)
30 to 50        13th to 16th grade (college level)
0 to 30         college graduate
The index is calculated by a fixed set of rules for counting the number of sentences, words, and syllables in a piece of text. This can be automated via a computer program. Here is an example. Consider the following sentence:
It was an extraordinarily windy day, and thus the riders were faced with several arduous climbs up the mountain, with the wind trying to push them back down the road.
The Readability Index for that sentence is 58. The following conveys almost the same idea,
It was a very windy day. The riders had many hard climbs up mountains. The wind kept pushing them back down the road.
but has a Readability Index of 92. This method does not do a linguistic analysis so the results can be misleading, but the method usually produces a good answer.
Problem Description: Complete the following method in class Flesch:
/* pre: text != null
post: return an integer array with 4 elements. The elements will represent the following information. [0] = Flesch reading score. [1] = Number of sentences. [2] = Number of words. [3] = number of syllables. If number of words equals 0 or the number of sentences equals zero the Flesch score is set to 1000.
*/
public int[] getReadabilityStats(String text)
1. The readability index itself is calculated by the following formula:
Index = 206.835 - ( 84.6 * ASW ) - ( 1.015 * AWS )
rounded to the nearest integer.
ASW is Average Syllables per Word = total number of syllables / total number of words
AWS is Average Words per Sentence = total number of words / total number of sentences
2. The program needs to count the number of words, number of syllables, and number of sentences. Certain assumptions are made about what is a word, syllable, and sentence in order to make it easier to write a program to do this analysis.
3. A word is a sequence of one or more characters delimited by white space or by a sentence terminator as listed in rule 5, whether or not it is an actual English word. White space is defined as a space, a tab ('\t'), a new line character ('\n'), and the end of the String itself.
4. To count the total number of syllables use the following rules. Note that these rules are a heuristic, which is defined as "a rule of thumb or guideline (as opposed to an invariant procedure). Heuristics may not always achieve the desired outcome, but they are extremely valuable to problem-solving processes." Heuristics are valuable because they simplify the problem-solving process. The following rules will sometimes give you the wrong answer for the number of syllables in a word, but they usually give the right answer and are much easier to implement than storing ALL the words that might be encountered and their syllable counts.
a. Each group of adjacent vowels counts as one syllable. Vowels consist of upper and lower case a, e, i, o, u and y. For example, the "ea" in "real" contributes one syllable, but the "e" and the "a" in "regal" count as two syllables. "Happy" has two syllables, because of the 'a' and 'y'.
b. Each word has at least one syllable, even if rule a gives it a count of 0. Thus the String "Shhhhhh Shhhhhhh" has 2 words and each word has 1 syllable.
5. Count all the sentences. Each occurrence of a period, colon, semicolon, question mark, or exclamation mark counts as a sentence. Thus the String "Gack!!!" has 1 word with 1 syllable, but 3 sentences. (Again, this set of rules is a heuristic: a set of rules that often gives a good answer, but occasionally gives bad or nonsensical answers. It is possible per these rules to have a sentence with no words.)
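Rules 4a, 4b and 5 can be sketched for a single word as follows. This is only an illustration under stated assumptions: the class name SyllableSketch and the helper names isVowel, isSentenceTerminator and countSyllables are hypothetical and are not part of the required Flesch class. The sketch assumes the caller has already isolated one word.

```java
// Sketch only: all names here are hypothetical, not part of the assignment.
final class SyllableSketch {

    // Rule 4a: vowels are a, e, i, o, u and y, in upper or lower case.
    static boolean isVowel(char c) {
        return "aeiouyAEIOUY".indexOf(c) >= 0;
    }

    // Rule 5: each of these characters counts as one sentence.
    static boolean isSentenceTerminator(char c) {
        return ".:;?!".indexOf(c) >= 0;
    }

    // Counts groups of adjacent vowels in one word (rule 4a),
    // then applies the minimum of one syllable per word (rule 4b).
    static int countSyllables(String word) {
        int groups = 0;
        boolean inVowelGroup = false;
        for (int i = 0; i < word.length(); i++) {
            boolean vowel = isVowel(word.charAt(i));
            if (vowel && !inVowelGroup) {
                groups++; // a new vowel group starts at this character
            }
            inVowelGroup = vowel;
        }
        return Math.max(groups, 1);
    }

    public static void main(String[] args) {
        System.out.println(countSyllables("real"));    // 1
        System.out.println(countSyllables("regal"));   // 2
        System.out.println(countSyllables("Happy"));   // 2
        System.out.println(countSyllables("Shhhhhh")); // 1
    }
}
```

Tracking whether the previous character was a vowel is what turns "ea" into one group while keeping the "e" and "a" of "regal" separate.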
Examples:
Test sentence 1: "This is a sentence. So is this!"
Number of sentences: 2
Number of words: 7
Number of syllables: 9
Flesch readability index: 95
Test sentence 2: "The following index was invented by Flesch as a simple tool to estimate the legibility of a document without linguistic analysis."
Number of sentences: 1
Number of words: 21
Number of syllables: 42
Flesch readability index: 16
Test sentence 3: "Wette. It 'reven hem, or was revenrage. With hey kince kin himply to justron' wer\", \"stere what willi?"
Number of sentences: 3
Number of words: 18
Number of syllables: 28
Flesch readability index: 69
This example is merely to show that the algorithm works regardless of whether the input is standard English. You could even run the algorithm on source code, although the answer would not be very helpful or meaningful.
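The index values in the three examples can be checked by feeding their sentence, word and syllable counts through the formula from rule 1. The following is a minimal sketch: the class name FleschFormulaSketch, the constant UNDEFINED_SCORE and the method index are hypothetical. Only the formula constants and the score of 1000 for zero words or zero sentences come from the problem description.

```java
// Sketch only: names are hypothetical, not part of the assignment.
final class FleschFormulaSketch {

    // Score the postcondition requires when there are no words or no sentences.
    static final int UNDEFINED_SCORE = 1000;

    // Applies Index = 206.835 - (84.6 * ASW) - (1.015 * AWS), rounded to the nearest integer.
    static int index(int sentences, int words, int syllables) {
        if (sentences == 0 || words == 0) {
            return UNDEFINED_SCORE;
        }
        // Cast to double first so the averages are not truncated by integer division.
        double asw = (double) syllables / words;  // average syllables per word
        double aws = (double) words / sentences;  // average words per sentence
        return (int) Math.round(206.835 - 84.6 * asw - 1.015 * aws);
    }

    public static void main(String[] args) {
        System.out.println(index(2, 7, 9));   // 95 (test sentence 1)
        System.out.println(index(1, 21, 42)); // 16 (test sentence 2)
        System.out.println(index(3, 18, 28)); // 69 (test sentence 3)
    }
}
```

The cast to double matters: with plain int division, 9 / 7 would evaluate to 1 and the first example would come out far too high.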
Limitations: Do not use the StringTokenizer class or the split method from the String class.
Hints:
1. Do not try to do this with one big method. Create helper methods, methods that each do some small part of the task.
2. Do not start coding first. You must design your algorithm first. Think carefully about how you would do the problem with a pencil and paper.
a. One algorithm is to step through the text one character at a time. You need to carefully consider all the possible cases and what actions must occur when the case changes. Possible cases include being in a word, being in a vowel cluster, and being at the end of a sentence. This is not a complete list.
b. Another algorithm would be to take the original String and parse it (break it up into its component parts) into sentences and count them, then parse each sentence into words and count them, and finally parse each word into syllables. This seems like an attractive and intuitive approach but turns out to be fairly complicated.
3. Each sentence terminator counts as a single sentence, so the text "RATS!!!!" would have 4 sentences. This is a simplification that makes the program easier to write. Do not try to cover special cases. Follow the rules as stated even though there are cases where they seem to fail. Remember that the Flesch Index is a simple heuristic for calculating readability.
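The character-at-a-time approach from hint 2a might look roughly like the sketch below. It is one possible design, not the required solution: the class name FleschScanSketch and the helper names are hypothetical, and the loop deliberately runs one position past the last character so that the end of the String is handled like white space, as rule 3 describes. It uses neither StringTokenizer nor split.

```java
// Sketch only: one possible single-pass design; names are hypothetical.
final class FleschScanSketch {

    static boolean isVowel(char c) { return "aeiouyAEIOUY".indexOf(c) >= 0; }

    static boolean isTerminator(char c) { return ".:;?!".indexOf(c) >= 0; }

    // Rule 3 lists exactly these characters as white space.
    static boolean isWhiteSpace(char c) { return c == ' ' || c == '\t' || c == '\n'; }

    // Returns {score, sentences, words, syllables} as in the postcondition.
    static int[] getReadabilityStats(String text) {
        int sentences = 0, words = 0, syllables = 0;
        boolean inWord = false;        // currently inside a word?
        boolean inVowelGroup = false;  // was the previous character a vowel?
        int groupsInWord = 0;          // vowel groups seen in the current word

        // i == text.length() stands for the end of the String.
        for (int i = 0; i <= text.length(); i++) {
            boolean atEnd = (i == text.length());
            char c = atEnd ? ' ' : text.charAt(i);
            boolean terminator = !atEnd && isTerminator(c);
            if (terminator) {
                sentences++; // every terminator counts, even "!!!"
            }
            if (atEnd || terminator || isWhiteSpace(c)) {
                if (inWord) { // a word just ended
                    words++;
                    syllables += Math.max(groupsInWord, 1); // rule 4b
                }
                inWord = false;
                inVowelGroup = false;
                groupsInWord = 0;
            } else {
                inWord = true;
                boolean vowel = isVowel(c);
                if (vowel && !inVowelGroup) {
                    groupsInWord++; // rule 4a: new vowel group
                }
                inVowelGroup = vowel;
            }
        }

        int score = 1000; // required value when words or sentences is zero
        if (words != 0 && sentences != 0) {
            double asw = (double) syllables / words;
            double aws = (double) words / sentences;
            score = (int) Math.round(206.835 - 84.6 * asw - 1.015 * aws);
        }
        return new int[] {score, sentences, words, syllables};
    }

    public static void main(String[] args) {
        int[] stats = getReadabilityStats("This is a sentence. So is this!");
        System.out.println(java.util.Arrays.toString(stats)); // [95, 2, 7, 9]
    }
}
```

Keeping all state in three small variables avoids building substrings, at the cost of mixing the word, syllable and sentence logic in one loop. Splitting the loop body into helper methods, as hint 1 suggests, would be a reasonable next step.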
Extras: Notice that more often than not an e at the end of the word should not count as a vowel.
Word        Actual Number of Syllables    Number of Syllables With Given Rules
role        1                             2
estimate    3                             4
sentence    2                             3
One way to deal with this is to not treat any e at the end of a word as a vowel. This complicates the problem. Implement this new set of rules for determining the number of syllables in a word:
a. Each group of adjacent vowels counts as one syllable. Vowels consist of upper and lower case a, e, i, o, u and y at the end of a word. For example, the "ea" in "real" contributes one syllable, but the "e" and the "a" in "regal" count as two syllables. "Happy" has two syllables, because of the 'a' and 'y' at the end.
b. A lone "e" at the end of a word that is not part of a larger vowel cluster is an exception to the previous rule. So "role" has a single syllable according to these rules.
c. Each word has at least one syllable, even if rules a and b give it a count of 0. Thus the String "Shhhhhh Shhhhhhh" has 2 words and each word has 1 syllable.
How does this set of rules affect the syllable count? Does it make the syllable count more or less accurate?
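To experiment with these questions, the revised rules can be sketched next to the original ones. The sketch below rests on two assumptions that the text does not settle: y is treated as a vowel anywhere in the word, as in the original rule 4a, and a "lone e at the end" means the last character is 'e' or 'E' and the character before it is not a vowel. The class and method names are hypothetical.

```java
// Sketch only: names and the reading of rule b are assumptions, see the lead-in.
final class SilentESketch {

    static boolean isVowel(char c) { return "aeiouyAEIOUY".indexOf(c) >= 0; }

    // Raw number of vowel groups in a word, before any minimum is applied.
    static int vowelGroups(String word) {
        int groups = 0;
        boolean inVowelGroup = false;
        for (int i = 0; i < word.length(); i++) {
            boolean vowel = isVowel(word.charAt(i));
            if (vowel && !inVowelGroup) {
                groups++;
            }
            inVowelGroup = vowel;
        }
        return groups;
    }

    // Original rules: vowel groups, at least one syllable per word.
    static int countOriginal(String word) {
        return Math.max(vowelGroups(word), 1);
    }

    // Revised rules a to c: a lone trailing 'e' does not count as a syllable.
    static int countWithSilentE(String word) {
        int groups = vowelGroups(word);
        int n = word.length();
        if (n > 0) {
            char last = word.charAt(n - 1);
            boolean endsInE = (last == 'e' || last == 'E');
            boolean lone = (n == 1) || !isVowel(word.charAt(n - 2));
            if (endsInE && lone) {
                groups--; // rule b: drop the trailing lone e
            }
        }
        return Math.max(groups, 1); // rule c
    }

    public static void main(String[] args) {
        String[] samples = {"role", "estimate", "sentence", "simple", "the"};
        for (String w : samples) {
            System.out.println(w + ": " + countOriginal(w) + " -> " + countWithSilentE(w));
        }
    }
}
```

Under these assumptions the three words in the table come out as 1, 3 and 2, matching the actual counts, while a word such as "simple" drops from 2 to 1, so the revised rules are not always an improvement.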