COMPUTER AND INFORMATION SYSTEMS DEPARTMENT




INFS 4220 – Data Mining Applications

|Section: B |

|Time: Thursday 6 – 10 p.m. |

|Room: Bus 118 (plus Virtual Rotation)[1] |

|Lab: Bus 118 |

INSTRUCTOR INFORMATION

INSTRUCTOR: Dr. G. Alan Davis OFFICE: Wheatley Center #222

E-MAIL: davis@rmu.edu PHONE: 412.397.6440

WEBSITE:

(or via rmu.edu - search for “davis”)

OFFICE HOURS: posted on

COURSE INFORMATION

COURSE MATERIAL:

1. Required Text: Data Mining for Business Analytics, 3rd Edition, by Shmueli, Patel, and Bruce (Wiley Publishing, Hoboken, NJ 2016)

COURSE DESCRIPTION:

INFS 4220 Data Mining offers the student an introduction to data mining and data warehousing concepts with emphasis placed on formulating and leveraging the de-normalized "star schema" data warehouse design. The statistical techniques behind the wide variety of data mining tools available are analyzed and applied. Students are introduced to the analytical principles and processes involved in data mining. Various software packages, such as XLMiner, MS-Excel, and/or SPSS may be used with case problems to give the student hands-on experience.

PRIMARY GOAL:

The primary goal of INFS 4220 Data Mining Applications is to provide the student with a thorough understanding of the tools, techniques, and best practices associated with modern data mining. Through instruction and practice drills, students will learn how to extract decision-making value from stockpiles of organizational data.

LEARNING OUTCOMES:

At the completion of the course, the student will be able to:

Topic 1: Introduction (Chapter 1)

• Define Data Mining (DM)

• Cite examples of where DM is used

• Utilize proper DM Terminology and Notation

Topic 2: Overview of the Data Mining Process (Chapter 2)

• Describe the core ideas and the overall data mining process

• Apply Supervised and Unsupervised Learning where appropriate

• Apply the Steps in DM

• Build a DM Model

• Use Excel for DM

Topic 3: k-Nearest Neighbor (Chapter 7)

• Describe the k-NN algorithm

• Differentiate the Advantages and Disadvantages of k-NN

Topic 4: Naive Bayes (Chapter 8)

• Predict Fraudulent Financial Reporting using Bayes

• Compare and contrast the Advantages and Disadvantages of Bayes Classifying

Topic 5: Social Network Analysis (SNA) / Big Data Analytics (Chapter 19)

• Describe Big Data and Social Network Analytics

• Compare and Contrast a Directed Network and an Undirected Network

• Describe a SNA network and network characteristics

• Discuss Node Metrics within a SNA network

Topic 6: Classification and Regression Trees (Chapter 9)

• Describe Classification Trees

• Discuss Measure of Impurity

• Avoid Overfitting in analyses

• Develop Classification Rules from Trees

• Create Classification Trees for more than 2 classes

• Develop Regression Trees

Topic 7: Neural Nets (Chapter 11)

• Describe the concept and structure of a Neural Network

• Fit a Network to data

• Compare and contrast Predictors and Responses

• Argue the advantages and weaknesses of Neural Networks

Topic 8: Association Rules (Chapter 14)

• Discover Association Rules in Transaction Databases

• Generate Candidate Rules

• Select Strong Rules

Topic 9: Cluster Analysis (Chapter 15)

• Calculate the distance between two records

• Calculate the distance between two clusters

• Perform Hierarchical Clustering

COVID-19 INFORMATION

REQUIREMENTS FOR MASKS AND CLASSROOM SEATING:

All students must wear appropriate masks (no shirts or bandannas) covering their mouth and nose while in the classroom. All students must sit in marked seats to allow for necessary physical spacing in the classrooms. Instructors will ask students who are in non-compliance with these requirements to immediately comply. If a student does not comply immediately, the campus police will be called. The student will be removed, and a Student Conduct report filed. The student will be marked absent from class.

COVID-19 ATTENDANCE POLICY

As a result of the Covid-19 pandemic, all students are encouraged to remain home or in their residence hall room when experiencing any signs of illness. Students who test positive for the virus or who must be quarantined after exposure to the virus will be excused from class attendance. Instructors will be notified by the Dean of Students Office if a student is in quarantine or has contracted the virus. A student who is absent due to observed symptoms of Covid-19, is in quarantine due to suspected exposure, or who has a confirmed case of Covid-19, is entitled to makeup work missed if the student fulfills the instructor notification requirements of the policy.

Students are not to be penalized for any missed assignments, projects, examinations, tests, etc. or to have their daily grades automatically reduced when covered by this policy. While the faculty member must allow the student to "make up" or complete any assignments, etc., that were missed due to officially sanctioned obligations, faculty members are under no obligation to tutor or otherwise provide missed instruction. Faculty will determine when make-up exams are scheduled and when missed assignments are due. Students must notify the Dean of Students Office at 412-397-6483 to be excused from class attendance and for this policy to be in effect. Instructors will be notified by the Dean of Students Office.

WHAT TO DO IF YOU HAVE SYMPTOMS OR MAY HAVE BEEN EXPOSED TO COVID-19

Students who have symptoms or think they have been exposed to COVID-19 should immediately leave the classroom and call UPMC MyHealth@School Center at 412-397-6220 for phone screening/triage during business hours. The student should not visit the clinic. If the Center is closed, the student should contact his/her own medical provider for assistance, or the UPMC Anywhere Care App - Virtual Urgent Care (please note: fees apply) and call the MyHealth@School Center on the next business day.

WHAT TO DO IF YOU TEST POSITIVE FOR COVID-19

Students who test positive for COVID-19 should immediately notify the Dean of Students Office at 412-397-6483, and not return to class until they have been cleared to return. The Office of Student Life will notify instructors when students are cleared to return to class. No student should return to class until they are cleared to return.

WHAT IF RMU CAMPUS CLOSES DUE TO COVID-19?

If the RMU campus must be closed due to COVID-19, this class will continue under its normal schedule as a fully-online class. All class materials will continue to be available within Blackboard. Unless otherwise noted, all assignment due dates and exam dates will remain as scheduled. Class sessions will continue as asynchronous (i.e., not live) and/or synchronous (i.e., live) during normal class meeting times.

COURSE STRUCTURE:

The methods used in INFS 4220 - Data Mining Applications include lecture and classroom discussion through examples and demonstration. At times, the instructor may make use of a computer projector and/or presentation software in a classroom lecture. The course will also include review and/or use of leading data mining software tools, currently utilized in the field of data mining/data warehousing.

STUDENT RESPONSIBILITIES

READING ASSIGNMENTS:

The student is responsible for doing all the respective reading assignments prior to the scheduled lectures.

WRITTEN ASSIGNMENTS:

The student is responsible for completing all assignments within the allotted periods of time as outlined by the instructor. Written assignment due dates will be established either in the syllabus or provided to the students when relevant lectures are completed.

Important notes:

1. The student is responsible for backing up his/her valuable diskette files appropriately.

2. The student must protect his/her assignments, files, diskettes, etc. from copying by other students and against viruses.

3. Significant time outside of class is necessary to work on the various components of the written assignments.

FOLLOW-UP:

If a student does not fully understand a lecture subject or assignment and would like further explanation, the student is responsible for raising the topic(s) for discussion in class. If further explanation is required on an individual basis, the student is encouraged to see the instructor during office hours or make an appointment.

ASSIGNMENT DUE DATES:

It is the student’s responsibility to complete assignments when they are due. Due dates are announced during class and clearly posted in the weekly schedule at the end of this syllabus. Assignments that are submitted after due dates will be PENALIZED 25% for each day the assignment is late. It is the responsibility of the student (not the instructor) to stay current on class assignments.
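As an illustration only, the late penalty can be sketched in a few lines of Python. The function name and the choice to apply the 25% per day to the score the assignment would otherwise have earned are assumptions made for this example; the syllabus states only the 25% per day rate and that an assignment is worth 0% on the 4th day late.

```python
def late_adjusted_score(earned_score, days_late):
    """Return the score after the 25%-per-day late penalty.

    Assumption (not stated in the syllabus): the penalty is taken as a
    fraction of the score the assignment would otherwise have earned,
    and days_late is a whole number of days.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    # 25% is lost per day; nothing remains from the 4th day onward.
    remaining_fraction = max(0.0, 1.0 - 0.25 * days_late)
    return earned_score * remaining_fraction


# Example: an assignment that would have earned 80 points.
for days in range(5):
    print(days, late_adjusted_score(80, days))
```

Under these assumptions, an 80-point assignment is worth 60 points one day late and nothing from the fourth day on.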

ATTENDANCE:

Attendance will be taken at the beginning of each class period. The CIS Department’s 25% Absence Policy will be enforced; that is, if a student misses 25% or more of the allotted semester classes, he/she will automatically receive a letter grade of F. The student is responsible for keeping a record of missed classes.

If a student is absent from a class session, that student is responsible for turning in (on time) any assignments that are due or completed/collected during that class session. It is the responsibility of the student (not the instructor) to stay current on class assignments.

MAKE-UP EXAMINATIONS:

This is an eight (8) week graduate course; therefore, the class follows a very tight schedule. Make-up examinations will ONLY be given in EMERGENCY situations. The instructor will make the final decision as to what constitutes an emergency situation and whether or not a make-up examination will be given. Also, you may be required to submit documentation from your medical doctor or from your employer if you have to miss a scheduled exam or a scheduled presentation.

EVALUATION CRITERIA:

Your final grade will be calculated using weighted percentages, with each of the following categories contributing, as listed:

Midterm Exam 20%

Final Exam 20%

Final Project 15%

Discussion Questions 10%

Homework Assignments 25%

Attendance / Participation 10%

100%

Your final grade will be calculated as follows:

GRADING SCALE:

92.51 – 100 % A

89.51 - 92.5 A-

86.51 - 89.5 B+

82.51 - 86.5 B

79.51 - 82.5 B-

76.51 - 79.5 C+

69.51 - 76.5 C

59.51 - 69.5 D

0.0 - 59.5 F
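For students who want to track their standing, the weighted calculation and the grading scale can be sketched in Python as follows. This is a rough illustration, not the instructor's gradebook: the function and category names are hypothetical, each category score is assumed to be a percentage from 0 to 100, and a score is assumed to earn a letter only when it is strictly above the top of the band below it (so 92.5 maps to A- and 92.51 maps to A).

```python
# Category weights as listed under EVALUATION CRITERIA (fractions of the final grade).
WEIGHTS = {
    "midterm_exam": 0.20,
    "final_exam": 0.20,
    "final_project": 0.15,
    "discussion_questions": 0.10,
    "homework_assignments": 0.25,
    "attendance_participation": 0.10,
}

# (top of the band below, letter). Assumption: a percentage strictly above
# the cutoff earns the letter; anything at or below 59.5 is an F.
SCALE = [
    (92.5, "A"),
    (89.5, "A-"),
    (86.5, "B+"),
    (82.5, "B"),
    (79.5, "B-"),
    (76.5, "C+"),
    (69.5, "C"),
    (59.5, "D"),
]


def weighted_percentage(scores):
    """Combine per-category percentages (0-100) into one final percentage."""
    return sum(WEIGHTS[category] * scores[category] for category in WEIGHTS)


def letter_grade(percentage):
    """Map a final percentage to a letter using the syllabus grading scale."""
    for cutoff, letter in SCALE:
        if percentage > cutoff:
            return letter
    return "F"


# Hypothetical category scores for one student.
example_scores = {
    "midterm_exam": 88,
    "final_exam": 91,
    "final_project": 95,
    "discussion_questions": 100,
    "homework_assignments": 90,
    "attendance_participation": 100,
}
final = weighted_percentage(example_scores)
print(round(final, 2), letter_grade(final))
```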

ACADEMIC INTEGRITY POLICY

Academic Integrity is valued at Robert Morris University. All students are expected to understand and adhere to the standards of Academic Integrity as stated in the RMU Academic Integrity Policy, which can be found on the RMU website at rmu.edu/academicintegrity. Any student who violates the Academic Integrity Policy is subject to possible judicial proceedings which may result in sanctions as outlined in the Policy. Depending upon the severity of the violation, sanctions may range from receiving a zero on an assignment to being dismissed from the university. If you have any questions about the policy, please consult your course instructor.

PLAGIARISM POLICY

Plagiarism, taking someone else's words or ideas and representing them as your own, is expressly prohibited by Robert Morris University.  Good academic work must be based on honesty.  The attempt of any student to present as his or her own work that which he or she has not produced is regarded by the faculty and administration as a serious offense.  Student academic dishonesty includes but is not limited to: 

• Copying the work of another during an examination or turning in a paper or an assignment written, in whole or in part, by someone else;

• Copying from books, magazines, or other sources, including Internet or other electronic databases like ProQuest and InfoTrac, or paraphrasing ideas from such sources without acknowledging them;

• Submitting an essay for one course to a second course without having sought prior permission from your instructor;

• Giving a speech and using information from books, magazines, or other sources or paraphrasing ideas from sources without acknowledging them;

• Knowingly assisting others in the dishonest use of course materials, such as papers, lab data, reports and/or electronic files to be used by another student as that student's work.

• NOTE on team or group assignments:  When you have an assignment that requires collaboration, it is expected that the work that results is credited to the team unless individual parts have been assigned.  However, the academic integrity policy applies to the team as well as to its members.  All outside sources must be credited as outlined.

ACCOMMODATIONS FOR STUDENTS WITH DISABILITIES

• Robert Morris University welcomes students with disabilities into all of the University's educational programs. If you have (or think you may have) a disability that would impact your educational experience in this class, please contact Services for Students with Disabilities (SSD) to schedule a meeting with the SSD Coordinator, Molly Hill. Ms. Hill will confidentially discuss your needs, review your documentation, and determine your eligibility for reasonable accommodations. To learn more about SSD and available supports, please visit the SSD Website at rmu.edu/ssd, email ssd@rmu.edu, call (412)-397-6884, or visit the SSD office, located in Nicholson Center, Room 280.

FINAL NOTE TO STUDENTS

The instructor reserves the right to modify any schedule or policy in this class syllabus at any time throughout the class. Modifications may be made as necessary to improve the learning experience or learning environment of the student. Any such modifications will be announced during regular class or exam meeting times.

Finally, any (anonymous) data extracted from the course may be used for research purposes.

GENERAL TOPIC OUTLINE

|DATE |DESCRIPTION |EST. TIME (based on an 8-week session) |REFERENCE TO TEXTBOOK MATERIALS, TUTORIALS, or READING SUPPLEMENTS |
|1 (10/29) |Intro / Statistics Refresher; Data Mining videos; The Data Mining Process |1 week |Read Chapter 1; Read Chapter 2 |
|2 (11/5) |k-Nearest Neighbor |1 week |Read Chapter 7; Chapter 2 Lab Due; Discussion #1 Due |
|3 (11/12) |Naïve Bayes; Big Data / Social Network Analytics |1 week |Read Chapter 8; Read Chapter 19; Chapter 7 Lab Due |
|4 (11/19) |Midterm Exam; Classification & Regression Trees |1 week |Midterm Exam (Chapters 1, 2, 7, 8 and Statistics Review); Read Chapter 9; Chapter 8 Lab Due |
|5 (11/26) |Neural Nets |1 week |Read Chapter 11; Chapter 9 Lab Due |
|6 (12/3) |Association Rules |1 week |Read Chapter 14; Chapter 11 Lab Due |
|7 (12/10) |Cluster Analysis |1 week |Read Chapter 15; Chapter 14 Lab Due |
|8 (12/17) |Final Project Due; Final Exam |1 week |Submit Data Mining Analysis & Recommendations; Final Exam (Chapters 9, 11, 14, 15, and 19); Chapter 15 Lab Due |

YOU CONTROL YOUR GRADE!!!

You are in complete control of your grade . . .

1. I do NOT “give” grades; I only report the grade that you earn in the course.

2. I do NOT allow “extra credit” assignments to raise your grade.

3. I do NOT allow “do overs” on assignments or exams.

4. I do NOT allow the use of cell phones during in-class examinations.

5. I DO penalize 25% for each day an assignment is late (0% for assignment on 4th day late); therefore, turn in assignments on time.

6. I am happy to meet with any student who does not understand the material or an assignment. I am available during regular office hours, or by appointment.

7. If you do not earn the grade that you wanted in the class, blame the person in the mirror!

Some of my favorite quotations related to education . . .

• Educators open the door, but you must enter by yourself. – Chinese Proverb

• I never teach my pupils, I only provide the conditions in which they can learn. – A. Einstein

• If you think education is expensive, try ignorance. – D. Bok

• You pay for your education . . . but you must earn your grade! – G. Davis

• I have never “failed” a student . . . students always fail on their own! – G. Davis

-----------------------

[1] Students in Group A attend the physical classroom during odd weeks; students in Group B attend the physical classroom during even weeks. When not attending the physical classroom, students are to connect to the live, virtual class.
