


INFO 634 Data Mining

Syllabus

College of Computing and Informatics

Drexel University

Instructor: Xiaohua Tony Hu, PhD

Office: 3401 Market Street, Suite 300, Room 325.

Telephone: 215-898-0551

Email:

URL:

Lecture Time & Location: Monday 2-4:50pm for face-to-face session

Office Hour: Monday 12:50-1:50pm or make an appointment

Course Description:

This course introduces the concepts and principles of knowledge discovery in databases (KDD), with a focus on the techniques of data mining and its function in business, governmental, medical or other information-intensive environments.

Course Pre-Requisites:

INFO 605

Objectives

This course introduces the concepts and principles of knowledge discovery in databases (KDD), with a focus on the techniques of data mining and its function in business, governmental, medical and other information-intensive environments.

This course provides students with advanced knowledge of data mining techniques, algorithms, and methods. Students learn data pre-processing and a range of data mining algorithms, including supervised, semi-supervised, and unsupervised learning.

COURSE MATERIALS

0. Textbook: Han, Jiawei and Kamber, Micheline: Data Mining: Concepts and Techniques. San Francisco, Morgan Kaufmann Publishers, 3rd Edition (or 2nd Edition)

1. Slides: check the course website

2. Other reading materials

3. Practice data sets

SOFTWARE

Weka data mining software suite:

• Go to the “Download” page. There are several versions you can download and install. I suggest you choose the stable version.

• If you use Windows and choose a version without the Java VM, you may need to download and install the Java Runtime Environment (JRE) before you install Weka:



• Download the Weka documentation/manual from the Weka site.

• Add this page () to your favorites/bookmarks. You may need it.

• If needed, you may download the Oracle Database Express Edition from Oracle and install it on your own PC, or request an Oracle ID at

Evaluation

Homework (25%)

• 2 graded homework assignments.

• Assignments will focus on using Weka to find patterns or build models.

• All assignments must be submitted on time. Late submission will not be accepted.

Final Exam (30%)

Course Project (45%)

• Term project: you can work as a team (up to 3 people per team) or individually.

• You can work on an implementation project or a review paper project.

• The review paper project is an individual project.

• The implementation project could be a team project or individual project

Review paper project

• The review paper is a full-scale research project, in the form of a review article. Please see any issue of the IEEE SURVEY or Annual Review of Information Science and Technology (ARIST) for examples of a scientific review article. Some important issues to keep in mind:

• You can select any domain you desire as the focus for your paper. This could be bioinformatics, genomics, fraud detection, marketing, Web mining, text mining, the theory of a single data mining method, social network analysis, or a comparison of two or more data mining methods. These are just examples!

• The review paper is not a group project. Each student must do his or her own work.

• There is no predefined page minimum or maximum. The length of your paper will be determined largely by the domain you have chosen. However, you should expect your paper to be in the range of 15-20 pages.

• You must demonstrate that you fully understand the material in your paper, by integrating the substance of the articles you use with a larger conceptual aim.

• In general, Web page and Web site references are not acceptable. This project requires that you perform searches of the standard (published and peer-reviewed) literature.

• You must adhere to a standard for citations and bibliography. The bibliographies for review articles can exceed 100 references, so it is extremely important to be well-organized in the mechanics of citations. Consult a writer’s manual, such as The Chicago Manual of Style or Turabian (A Manual for Writers of Term Papers, Theses, and Dissertations).

• You have nine weeks to complete the paper, and it is the cornerstone of the course. There is no excuse for sloppiness, poor grammar, or incomplete work.

Implementation Project

• Basic programming skills are necessary for data processing and possibly algorithm implementation. Therefore, if you do not have much experience in programming, I suggest you team up with someone who does.

• Each project requires an extensive amount of effort in problem formulation, data processing, programming, and model design and tweaking.

• Deliverables and important dates:

▪ Proposal:

➢ Due date: End of Week 4

➢ Proposal template (< 500 words):

• Title / Team members

• Introduction (e.g., what application or mining problem)

• Data Descriptions (e.g., descriptive stats and characteristics)

• Preliminary Plan (e.g., problem formulation, possible techniques/models)

• Timeline

▪ Final report:

➢ Due date: Final Exam Week

➢ Less than 20 pages, double-spaced, font size 12.

➢ Report template:

• Title / Team members

• Introduction (e.g., problem description)

• Data Descriptions (e.g., data collection, processing, descriptive stats, and characteristics)

• Method(s) (Justify your choice of techniques/design and describe them in detail)

• Analysis & Results (Describe your analytical process and results)

• Conclusions (Summarize your project and experiences)

• References

• Appendix (Contributions of each group member)

Academic Honesty

All students are expected to adhere to the highest standards of academic integrity. If you cheat or plagiarize in any way you will, at a minimum, fail the course. All work in this course is to be your own work. You are encouraged to confer with me if you are having trouble or have any questions about the assignments or other course work. The University’s statement on Academic Honesty is in Chapter 10 of the Student Handbook () and will be strictly adhered to.

Course Schedule (tentative)

|Week |Topic |Readings |

|1 |Introduction |Chapter 1 |

|2 |Data Preprocessing |Chapters 2 & 3 |

|3 |Data Warehouse and OLAP Technology |Chapter 4 |

|4 |Association Rule Mining |Chapter 6 |

|5 |Classification |Chapter 8 |

|6 |Classification (cont’d) |Chapter 9 |

|7 |Cluster analysis |Chapter 10 |

|8 |Mining Complex Data |Chapter 13 |

|9 |Graph Mining, Social Network Analysis |Chapter 13 |

|10 |Project discussion and review | |

|11 |Final Project Due and Final Exam | |

Grade Scale

>=97 A+

>=93 A

>=90 A-

>=87 B+

>=83 B

>=80 B-

>=77 C+

>=73 C

>=70 C-

>=67 D+

>=63 D

>=60 D

Otherwise F
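
The weights in the Evaluation section and the cutoffs above can be combined into a quick self-check of a final grade. The following is only a minimal sketch, not the instructor's official calculation. It assumes each component is scored on a 0-100 scale and that no rounding is applied before the cutoffs are compared; the names WEIGHTS, CUTOFFS, weighted_score, and letter_grade are hypothetical and do not come from the syllabus.

```python
# Weights from the Evaluation section, as fractions of the final grade.
WEIGHTS = {"homework": 0.25, "final_exam": 0.30, "project": 0.45}

# Cutoffs from the Grade Scale, highest first. The scale lists "D" for both
# >=63 and >=60, so the two rows are merged here into one >=60 cutoff.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def weighted_score(scores):
    """Combine component scores (assumed 0-100 each) using WEIGHTS."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(total):
    """Return the first letter whose cutoff the total meets, else F."""
    for cutoff, letter in CUTOFFS:
        if total >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    example = {"homework": 90, "final_exam": 80, "project": 95}
    total = weighted_score(example)
    print(round(total, 2), letter_grade(total))
```

For example, scores of 90 on homework, 80 on the final exam, and 95 on the project give 0.25 × 90 + 0.30 × 80 + 0.45 × 95 = 89.25, which falls in the B+ band under these assumptions.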

Academic Integrity & Plagiarism:

All submitted assignments must be your own work, with sources properly cited, and abide by the Academic Integrity Statement submitted at the start of the course certifying that the work done in this class is your own. Any incidence of plagiarism or other academic dishonesty will be discussed with the student. If warranted, a Drexel University Alleged Academic Misconduct Report () will be filed. Any incidence of plagiarism can result in an F for this course.

The Drexel University regulations governing plagiarism and other forms of academic misconduct can be found at:

Class Cancellation:

On rare occasions, instructors may be delayed or unable to attend a scheduled class due to unforeseen circumstances. In the event that an instructor does not appear in class and has not notified the class of his/her expected arrival time, class is cancelled 15 minutes after the scheduled start of class. More information about class cancellations can be found at the Office of the Provost website at:

Class Lecture Recording:

Lectures and class discussions may be audio-recorded and streamed or rebroadcast for educational purposes only.

Computer Requirements:

The College of Computing & Informatics requires students to have access to the computer hardware & software and competency in the computer skills listed here:

Course Add:

Students may add a course before the end of the second week of classes. If a student elects to add this course, he or she must contact the instructor to develop a plan and schedule a timetable for the completion of any missed coursework. Information about adding a course can be found at the Office of the Provost website at:

Course Changes:

It is the prerogative of the instructor to make changes or adjustments to this course during the term at his or her discretion. The instructor will announce these changes and/or adjustments via Blackboard or in class as soon as they are known.

Course Drop/Withdrawal:

Students may drop a course before the end of the second week of classes. Information about course drops can be found at:

Students may withdraw from a course before the end of the seventh week of classes. Information about course withdrawals can be found at:



Incompletes:

Incompletes will be assigned ONLY in cases of medical or family emergency, where evidence of the emergency is provided and the student has completed 80% of the course work at the time of the request for an Incomplete. The student must complete a Contract for Grade of Incomplete with the instructor in order to receive an incomplete. The Contract is available at:

Late assignments:

Late assignments are not accepted unless specific arrangements are made with the instructor at least 24 hours prior to their deadline and with the understanding that a lower grade may be given.

Student Conduct:

Drexel University has adopted a student conduct policy charging all students with the responsibility to be aware of, and abide by, the University’s policies, rules, regulations, and standards of conduct. This information is available in the Drexel University Official Student Handbook at:

The CCI Online Survival Skills and Professional Behavior Guidelines can be found at:

Support for Students with Disabilities:

Drexel University ensures people with disabilities will have an equal opportunity to participate in its programs and activities. Members and guests of the Drexel community who have a disability must register with the Office of Disability Resources (ODR) if requesting auxiliary aids, accommodations, and services to participate in Drexel University’s programs. Please contact the ODR at or at 3201 Arch St., Suite 210, Philadelphia, PA, 19104, for further information. Telephone numbers are: (215) 895-1401 or TTY (215) 895.2299.

Support for Equality and Diversity:

Drexel University strives to promote an environment of equality of opportunity and compliance with University policies and federal, state and local laws prohibiting discrimination based upon race, color, religion, gender (sex), marital status, pregnancy, national origin, age, disability and veteran status. Students, faculty, and staff with questions about or complaints concerning discrimination, harassment, and/or retaliation should contact the Office of Equality and Diversity at (215) 895-1403 or
