CSCE 2014 – Programming Project 3
Midpoint Due Date – 02/24/2021 at 11:59pm
Final Due Date – 03/03/2021 at 11:59pm
1. Problem Statement:
The goal of this programming assignment is to gain experience with recursive
binary search, and also reading and processing text files. Your task is to write a
program that can take in an English book and create an abridged version of the book
written using only the 1000 most common English words, proper names, and basic
punctuation marks. All other words in the original book should be removed.
You will be given a data file "top1000.txt" with the top 1000 most commonly used
English words based on frequency analysis of a collection of novels. Each line in the
data file consists of an integer rank (1 = most common, 1000 = least common)
followed by the word. To simplify your analysis, all of the upper case letters have
been converted to lower case, and the file is sorted in alphabetical order.
To create the abridged version of an input document, you must read the words one
at a time, convert the word to lower case, and use binary search to look the word up
in your dictionary data structure to see if it falls in the top 1000. If it does, the
original word should be printed in the abridged book.
In order to determine if a word is a proper name, you should look at the first letter
in the word, and if it is a capital letter and the word is not the first word in a
sentence, then you can assume the word is a proper name, and you should print the
proper name in your abridged book. This approach may not be perfect, but it should
work most of the time.
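The proper-name heuristic can be sketched as a small helper. The function name is_proper_name and the first_in_sentence flag are assumptions made for illustration; the caller is responsible for tracking where sentences begin.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if `word` looks like a proper name under the
// heuristic above. The caller must track `first_in_sentence`, for example
// by setting it after a period, question mark, or exclamation point.
bool is_proper_name(const std::string& word, bool first_in_sentence) {
    if (word.empty()) {
        return false;
    }
    // Cast to unsigned char before calling isupper to avoid undefined
    // behavior on negative char values.
    unsigned char first = static_cast<unsigned char>(word[0]);
    return std::isupper(first) != 0 && !first_in_sentence;
}
```

For example, is_proper_name("Anne", false) is true, while is_proper_name("Anne", true) is false because a capital letter at the start of a sentence proves nothing under this heuristic.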
Once you have the top 1000 words and the proper names printed properly, you can
extend your program to print the five most common punctuation marks (period,
comma, semicolon, question mark, and exclamation point). You can ignore all other
ASCII characters, and print a space in their place in the abridged book.
To test your program, you will be given five samples of text taken from the
beginnings of five well-known public domain books. You can use these short
documents to create your abridged books. Hopefully we will be able to recognize
and understand the abridged book created by your program.
book1.txt from Anne of Green Gables.
book2.txt from David Copperfield.
book3.txt from Adventures of Huckleberry Finn.
book4.txt from The Time Machine.
book5.txt from The Jungle Book.
2. Design:
There are two essential problems you need to solve in order to complete your
program. First, you need some way to read and store all 1000 words and their
corresponding ranks, and some way to search this data structure to look up a word
and find its rank. The most natural way to do this would be to create a "Dictionary"
class that has a "read_file" method to load words and their ranks into private arrays,
and a "binary_search" method to recursively search these arrays to look up the rank
of a word. To test and debug this class, you can write a tiny main program that
prompts the user for words, and prints out their corresponding ranks. See the
programs "dictionary.cpp" and "numbers2.cpp" in the source directory for some
sample code.
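One possible shape for the search part of such a class is sketched below. It is not the course's "dictionary.cpp". As simplifying assumptions, it stores the data in std::vector members rather than arrays, takes the already-sorted words and ranks through a constructor instead of a "read_file" method, and uses a hypothetical public "lookup" wrapper that returns 0 for words that are not found.

```cpp
#include <string>
#include <utility>
#include <vector>

class Dictionary {
public:
    // Assumption: `words` is sorted alphabetically and ranks[i] is the
    // rank of words[i]. A real class would fill these in read_file.
    Dictionary(std::vector<std::string> words, std::vector<int> ranks)
        : words_(std::move(words)), ranks_(std::move(ranks)) {}

    // Return the rank of `target`, or 0 if it is not in the dictionary.
    int lookup(const std::string& target) const {
        return binary_search(target, 0, static_cast<int>(words_.size()) - 1);
    }

private:
    // Recursive binary search over words_[low..high], both ends inclusive.
    int binary_search(const std::string& target, int low, int high) const {
        if (low > high) {
            return 0;                      // empty range: word not found
        }
        int mid = low + (high - low) / 2;  // midpoint without overflow
        if (target == words_[mid]) {
            return ranks_[mid];
        }
        if (target < words_[mid]) {
            return binary_search(target, low, mid - 1);   // left half
        }
        return binary_search(target, mid + 1, high);      // right half
    }

    std::vector<std::string> words_;
    std::vector<int> ranks_;
};
```

Each recursive call discards half of the remaining range, so a lookup among 1000 words needs at most about 10 comparisons.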
The second problem you must address is the reading and processing of the input
document. At one level, this is relatively easy because you just need to read an input
file one word at a time until the end of file is reached, and look up each word using
your Dictionary class above to see if the word is in the top 1000 or not. The tricky
part of this process is dealing with upper case letters, numbers, and other characters
in the input file. You need to convert upper case letters [A..Z] to lower case letters
[a..z], and remove all other characters from the word before you look the word up in
the dictionary. If the character that was removed was one of the five punctuation
marks listed above, you should print this character after the word is processed.
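The character handling described above might look like the following sketch. The CleanedWord type and clean_word function are hypothetical names; the code assumes the default "C" locale, in which std::isalpha accepts only [A..Z] and [a..z].

```cpp
#include <cctype>
#include <string>

// Hypothetical result type: the lookup key plus any punctuation to print.
struct CleanedWord {
    std::string key;          // lower-case letters only, used for lookup
    std::string punctuation;  // kept marks, in the order they appeared
};

// Split one raw token from the book into a dictionary key and the
// punctuation marks that should be printed after the word.
CleanedWord clean_word(const std::string& raw) {
    CleanedWord out;
    for (char ch : raw) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c)) {
            out.key += static_cast<char>(std::tolower(c));
        } else if (ch == '.' || ch == ',' || ch == ';' ||
                   ch == '?' || ch == '!') {
            out.punctuation += ch;
        }
        // Every other character (digits, quotes, hyphens, ...) is dropped.
    }
    return out;
}
```

The original token should still be kept by the caller, since the assignment prints the original word rather than the lower-case key when the lookup succeeds.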
3. Implementation:
To implement your project, you should break your code down into multiple files
using techniques discussed in lab and in class. You are welcome to look at programs
on the class website for sample code to assist in the implementation of this project.
As always, it would be a good idea to start with "skeleton methods" to get something
to compile, and then add the desired code to each method incrementally: writing
comments, adding code, compiling, debugging, a little bit at a time. Once you have
the methods implemented, you can create a main program with a simple menu
interface that calls these methods to complete your project.
Remember to use good programming style when creating your program (good
names for variables and constants, proper indenting for loops and conditionals,
clear comments). Be sure to save backup copies of your program somewhere safe.
Otherwise, you may end up retyping your whole program if something goes wrong.
4. Testing:
Once your program is fully debugged, copy your source code and test files to turing
using FileZilla or a similar tool. Then log in to turing and do the following to
complete your program testing:
• Type "script" to start recording your testing session.
• Type "g++ -Wall *.cpp -o project3" to compile your project.
• Type "./project3" to run your program.
• Type in the name of the book file you want to process.
• Your program should print the abridged book to the screen.
• Run your program again to process a second book.
• Type "exit" to finish recording your testing session.
• Copy the file "typescript" from turing onto your local computer.
• Include "typescript" with your code and project report when you upload the
project into Blackboard.
5. Documentation:
When you have completed your C++ program, write a short report using the project
report template describing what the objectives were, what you did, and the status of
the program. Does it work properly for all test cases? Are there any known
problems? Save this report to be submitted electronically.
6. Project Submission:
In this class, we will be using electronic project submission to make sure that all
students hand in their programming projects and labs on time, and to perform
automatic plagiarism analysis of all programs that are submitted.
When you have completed the tasks above, copy all of your source code, your
"typescript" file, and your project documentation into a folder called "project3".
Compress this directory into a single ZIP file called "project3.zip", and upload this
ZIP file into Blackboard. The GTAs will download and unzip your ZIP file and
compile your code using "g++ -Wall *.cpp" and test it to verify correctness.
The dates on your electronic submission will be used to verify that you met the due
date above. All late projects will receive reduced credit:
• 10% off if less than 1 day late,
• 20% off if less than 2 days late,
• 30% off if less than 3 days late,
• no credit if more than 3 days late.
You will receive partial credit for all programs that compile even if they do not meet
all program requirements, so handing projects in on time is highly recommended.
7. Academic Honesty Statement:
Students are expected to submit their own work on all programming projects,
unless group projects have been explicitly assigned. Students are NOT allowed to
distribute code to each other, or copy code from another individual or website.
Students ARE allowed to use any materials on the class website, or in the textbook,
or ask the instructor and/or GTAs for assistance.
This course will be using highly effective program comparison software to calculate
the similarity of all programs to each other, and to homework assignments from
previous semesters. Please do not be tempted to plagiarize from another student.
Violations of the policies above will be reported to the Provost's office and may
result in a ZERO on the programming project, an F in the class, or suspension from
the university, depending on the severity of the violation and any history of prior
violations.