Spb.hse.ru



Course Syllabus

Title of the course: Machine Learning and Data Mining
Title of the Academic Programme: Advanced social analysis
Type of the course: Elective
Prerequisites: Foundations of calculus, probability theory and mathematical statistics
ECTS workload: 2

Total indicative study hours
Directed Study: 28
Self-directed Study: 48
Total: 76

Course Overview

Rapid development of social networking sites, online media and other sources of internet-generated data is making machine learning an essential analytical tool for social scientists and industrial analysts of social data. Nowadays, social researchers should not only be able to work with different types of data, such as textual or relational data, but should also have the skills to interpret results obtained with complex mathematical algorithms. In this course students will, first, get to know basic machine learning algorithms and their main advantages and limitations for social science goals. Second, they will acquire skills in working with machine learning software and code. Third, by the end of the course all students will produce a small-scale research project that may be used in their Master's theses.
Depending on the level of the student group, the course may be based on one of the following software tools:
Orange
R
Python

Intended Learning Outcomes (ILO)

By the end of the course, students will be able to:
collect data from the Internet for social science research;
analyze these data with machine learning tools;
visualize the results of the analysis;
present the resulting project.

Teaching and Learning Methods

Seminars, team work, project work

Content and Structure of the Course

№ | Topic / Course Chapter | Total | Directed Study: Lectures | Directed Study: Tutorials | Self-directed Study
1 | Introduction to machine learning | 1 | 1 | |
2 | Overview of the mathematical formalism necessary for understanding machine learning | 2 | 1 | 1 |
3 | Data collection and existing databases for machine learning | 4 | 1 | 1 | 2
4 | Data preprocessing | 5 | 1 | 2 | 2
5 | Cluster analysis (K-means, C-means, hierarchical clustering) | 9 | 1 | 2 | 6
6 | Linear models of classification and regression | 7 | 1 | 2 | 4
7 | KNN and SVM classification | 11 | 1 | 2 | 8
8 | Naïve Bayes classifier | 11 | 1 | 2 | 8
9 | Topic modeling | 15 | 1 | 4 | 10
10 | Decision trees | 11 | 1 | 2 | 8
Total study hours | | 76 | 10 | 18 | 48

Indicative Assessment Methods and Strategy

80% - class work
20% - presentation of project

Readings / Indicative Learning Resources

Mandatory
Murphy K. P., Machine Learning: A Probabilistic Perspective, The MIT Press, 2012, 004 M96, ISBN: 978-0-262-01802-9
Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani, An Introduction to Statistical Learning (with Applications in R), 1st ed. 2013, corr. 7th printing 2017
Orange Data Mining Library Documentation, Release 3, 2018
Video course:
Cheat Sheets:
Tools for Machine Learning:

Self-Study Strategies

Type | +/– | Hours
Reading for seminars / tutorials (lecture materials, mandatory and optional resources) | | 25
Assignments for seminars / tutorials / labs | |
E-learning / distance learning (MOOC / LMS) | |
Fieldwork | |
Project work | |
Other (please specify) | |
Preparation for the exam | |

Academic Support for the Course

Academic support for the course is provided via LMS, where students can find:
guidelines and recommendations for doing the course;
guidelines and recommendations for self-study;
samples of assessment materials.

Facilities, Equipment and Software

Orange - an open-source data visualization, machine learning and data mining toolkit
RStudio - a free and open-source integrated development environment (IDE) for R
Jupyter Notebook - open-source software for Python

Course Instructor

Sergei Koltcov