Www2.cs.arizona.edu



Project OrderedMap and Probabilistic Text Generation

Collaboration Complete this by yourself. You may get help from Section Leaders and Rick.

In this project you are asked to add these three methods to the OrderedMap class that uses a Binary Search Tree of MapNode objects:

• public V put(K key, V value) Add a mapping to the BST if key is not present and return null, otherwise return the previous value to indicate the old mapping is removed with the old value replacing the new value.

• public V get(K key) Get the value mapped to the given key or null if the key does not exist

• public boolean containsKey(K key) Return true if key exists in the OrderedMap object, or false if it doesn't

This collection class will then be used to generate probabilistic text that can look amazingly like the text used as the input sample text. If the input is Tom Sawyer, the probabilistic text can look a lot like Tom Sawyer.

First get these the start to this project by downloading this zip file and importing into Eclipse

OrderedMap.zip

Finish OrderedMap before moving onto the second part of this project.

2) Probabilistic Text Generation with OrderedMap and java.util.ArrayList

You are to implement class RandomWriterWithOrderedMap.java that provides a random writing application using a your OrderedMap class to store all possible seeds with its list of all possible following characters. This file must be turned into WebCat in the same project as OrderedMap. While testing your code, use any input file you want from Project Gutenberg. Use your OrderedMap class, java.util.ArrayList, and the following algorithm. 

1. Read all file input into one big string (already done in RandomWriterWithOrderedMap.java)

2. Create an OrderedMap object that has all possible seeds of the given length as the key and an ArrayList of followers as the value. (You do this)

3. Pick a random seed from the original text. (already done in RandomWriterWithOrderedMap.java)

4. For each character you need to print (You do this)

• Randomly select one of the characters in the list of followers

• Print that random character

• Change the seed so the first character is gone and the just printed random character is appended

Example text: "Alice likes icy olives"

Using "Alice likes icy olives", create a OrderedMap of seed/list mappings where the seed is a string of length 2 and the value mapped to each key is a list of ALL characters that follow that seed in the original text.

|Seed |list toString() |

|"Al" |[i] |

|"li" |[c, k ] // 'v' to be added |

|"ic" |[e, y] |

|"ce" |[ ] This ArrayList has one element (that you cannot see): a space ' ' |

|"e " |[l] |

|" l" |[i] |

|"li" |"li" is already a key, add follower 'k' to the list already mapped to the key "li" |

|"ik" |[e] |

|"ke" |[s] |

|"es" |[ ] This ArrayList has one element: a space ' '. Another ' ' should be added later |

|"s " |[i] |

|" i" |[c]  |

|"ic" |"ic" is already a key, add follower 'k' to the list already mapped to the key "ic" |

|... |... 8 more possible seeds to map |

Warmup Section Activity: Probabilistic Text Generation with OrderedMap

Here is a class that uses our familiar OrderedMap class. It is intended to provide a start to the second part of the Project: RandomWriterWithOrderedMap. Method makeTheText reads all of the text from an input file into one big StringBuilder object. The StringBuilder class has the methods of the String class with a very efficient append method that we recommend you use. It also has code to get a random seed initially setRandomSeed. Feel free to use this code as a start

// The beginning of the probabalistic text generation

public class RandomWriterWithOrderedMap {

public static void main(String[] args) {

// Assume there is a file named alice with the text: "Alice likes icy olives"

RandomWriterWithOrderedMap rw = new RandomWriterWithOrderedMap("alice", 2);

rw.printRandom(100);

}

private OrderedMap all;

private int seedLength;

private String fileName;

private StringBuilder theText;

private static Random generator;

private String seed;

public RandomWriterWithOrderedMap(String fileName, int seedLength) {

this.fileName = fileName;

this.seedLength = seedLength;

generator = new Random();

makeTheText();

setRandomSeed();

setUpMap(); // Algorithm to be considered during class

}

private void makeTheText() {

Scanner inFile = null;

try {

inFile = new Scanner(new File(fileName));

}

catch (FileNotFoundException e) {

e.printStackTrace();

}

theText = new StringBuilder();

while (inFile.hasNextLine()) {

theText = theText.append(inFile.nextLine().trim());

theText = theText.append(' ');

}

}

public void setRandomSeed() {

generator = new Random();

int start = generator.nextInt(theText.length() - seedLength);

seed = theText.substring(start, start + seedLength);

}

}

Consider algorithms for these two methods that use the put and get methods of OrderedMap

public void setUpMap()

Example text: "Alice likes icy olives"

Using "Alice likes icy olives", an OrderedMap of seed/list mappings would look like the following when viewed "sideways". In this example, the seed has length 2 and the value mapped to each key is an ArrayList of ALL characters that follow the many seeds in the input file. Here is the tree printed sideways with a height of 5 and size of 17:

| // From OrderedMap | 'y ' ->[o] |

| |'ve' ->[s] |

|public String sideways() { |'s ' ->[i] |

|return sideways(root, 0); |'ol' ->[i] |

|} |'li' ->[c, k, v] |

| |'ke' ->[s] |

|private String sideways(MapNode t, int level) { |'iv' ->[e] |

|if (t == null) |'ik' ->[e] |

|return ""; |'ic' ->[e, y] |

|else { |'es' ->[ ] |

|String pad = ""; |'e ' ->[l] |

|for (int i = 0; i < level; i++) |'cy' ->[ ] |

|pad += " "; |'ce' ->[ ] |

| |'Al' ->[i] |

|return sideways(t.right, level + 1) |' o' ->[l] |

|+ pad + "'" + t.key + "' ->" + t.value + "\n" |' l' ->[i] |

|+ sideways(t.left, level + 1); |' i' ->[c] |

|} | |

|} | |

Sample output (does it look like a bit like the original text?):

cy olikes ice lice lives ice likes icy olice lice

lives icy olives ice lice lives icy olives icy oli

Altogether now

1) Build an OrderedMap from the root down of the following text when the seedLength is 3. Do not print sideways, but keep the keys/value mappings in order by key in a BST. The first two seeds and their first followers are given. Remember, the Map is built first and only once to store the list of all followers (takes more memory, but runs faster).

the true try the truths

"the" [ ' '

"he " [ 't'

void printRandom(int n)

2) Walk through generating 8 probabilistic characters of code with an initial seed of "e t"

Grading Criteria

OrderedMap.java (50pts)

____/ +50 OrderedMap has the put, get, and containsKey methods and they all work perfectly according to WebCat problem and code coverage (you must have your own passing tests that executes all code in your class). 100% will mean 50 out of 100 for this project

RandomWriterWithMap.java (50pts)

___/ +50 Generates text that is gets closer to the original as the seed increases (subjective). For example,

when seed length = 2, a few words may appear; but when 12, some sentences appear close to the original text 

• -50 If no text is generated with a printRandom(400) message

• -40 If you did not use your OrderedMap and the algorithm presented in the project that uses a Map to set up all seeds and the list of followers for each

• -40 If text has no apparent difference with different seed lengths or file input

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download