Amizone.net



Annexure ‘CD – 01’

L     T     P     SW/FW     TOTAL CREDIT UNITS
02    -     02    -         03

Course Title: R Studio and Python for Machine Learning
Credit Units: 03
Course Code:
Course Level: PG

Course Objectives:
The objective of the course is to learn applications of various machine learning concepts using the R and Python languages. The course enables students to understand and critically assess available data using machine learning methods. It aims to prepare postgraduates who can conduct data-driven investigations and perform visual and advanced analytics by acquiring and managing data of all types. Through this course, graduates will develop an in-depth understanding of data science and of the techniques for analysing quantitative and qualitative data to arrive at solutions. They will be able to identify patterns, predict trends, and analyse data from sectors such as Insurance, Actuarial Science, Banking and Finance, Retail, and Healthcare.

Pre-requisites:
Students should have knowledge of statistical methods and probability distributions, mathematical statistics, and statistical inference.

Course Contents/Syllabus:

Module I: Introduction to Data Science (Weightage: 25%)
Relevance in industry & need of the hour, Types of analytics (Marketing, Risk, Operations, etc.), Business & Technology drivers for analytics, Future of analytics & critical requirement, What is Big Data, Types of Data, SQL / NoSQL, Characteristics of Big Data, Need for understanding Big Data (Application of Big Data), Traditional Approaches and their limitations, Introduction to Hadoop and its ecosystem, Getting Started with Hadoop.

Module II: Introduction to R Studio Programming (Weightage: 20%)
Basic fundamentals, installation and use of software, data editing, use of R as a calculator, functions and assignments, functions and matrix operations, missing data and logical operators, Conditional executions and loops, The Workspace, Input/Output, Useful packages (Base & other packages in R), Graphical user interfaces (R Studio), Customizing startup, Batch processing.
Social Media Analytics using R: Characteristics of Social Media, Applications of Social Media Analytics, Metrics (Measures, Actions) in social media analytics, Examples & Actionable Insights using Social Media Analytics, Text Analytics: Sentiment Analysis using R, Word cloud analysis using R.
Data Input and Output in R Studio: Data structures & Data types (Vectors, Matrices, Factors, Data frames & Lists), Importing data (from csv, text, Excel & other files), Keyboard input (creating input by entering data), Database input (connecting to a database and using the data), Exporting data (exporting files into different formats).

Module III: Data Management with R Studio (Weightage: 15%)
Data management with sequences, Data management with repeats, sorting, ordering, and lists, Vector indexing, factors, Data management with strings, display and formatting, Data management with display, paste, split, find and replacement, manipulations with alphabets, evaluation of strings, Data frames, import of external data in various file formats, statistical functions, compilation of data.

Module IV: Introduction to Python Programming (Weightage: 20%)
Program structure in Python. Execution steps: Interactive Shell, Executable or script files, User Interface or IDE. Memory management and Garbage collection: Object creation and deletion, Object properties. Data Types and Operations: Numbers, Strings, List, Tuple, Dictionary, Other Core Types. Statements and Syntax in Python: Assignments, Expressions and prints, If tests and Syntax Rules, While and For Loops, Iterations and Comprehensions. File Operations: Opening a file, Using Files, and Other File tools. Functions in Python: Function definition and call, Function Scope, Arguments, Function Objects, and Anonymous Functions. Modules and Packages: Module Creation and Usage, Module Search Path, Module vs. Script, Package Creation and Importing.
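A minimal sketch of a few Module IV topics (core data types, comprehensions, a function with a default argument, and an anonymous function). The names `grade`, `scores`, and `cutoffs` and all sample values are illustrative assumptions, not prescribed course material.

```python
# Illustrative only: all names and sample scores below are hypothetical.

def grade(score, pass_mark=50):
    """Return 'pass' or 'fail' for a numeric score (default argument: pass_mark)."""
    return "pass" if score >= pass_mark else "fail"

# Core data types: a dictionary mapping strings to numbers, and a tuple.
scores = {"asha": 72, "ravi": 45, "meera": 88}
cutoffs = (50, 75)

# Dictionary comprehension built by iterating over key/value pairs.
results = {name: grade(s, cutoffs[0]) for name, s in scores.items()}

# List comprehension with a condition, using the second cutoff from the tuple.
distinction = [name for name, s in scores.items() if s >= cutoffs[1]]

# Anonymous (lambda) function used as a sort key, highest score first.
ranked = sorted(scores, key=lambda name: scores[name], reverse=True)

print(results)
print(distinction)
print(ranked)
```

Saved to a script file, this runs under the Python interpreter. The same statements can also be entered one at a time in the interactive shell, which are the first two execution modes the module lists.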
Module V: Analytics through Python Programming (Weightage: 20%)
Data types and data structures in Python; Loops and Conditionals in Python; Defining functions. NumPy: Arrays; Basic array operations; Comparison operators and value testing for arrays; Array item selection and manipulation; Statistics; Random Numbers. Pandas: a quick start; Data structures in pandas; Essential basic functionality of pandas including indexing and accessing data.

Student Learning Outcomes:
On completion, the student will be able to work with:
R & Python for statistical distributions
Data Science Analytics
Hadoop & Big Data

Software Required:
R Studio: Version 1.1.463
Anaconda for Python: Version 3.6

Pedagogy for Course Delivery:
Practical
Assignments
Case Study

Assessment/Examination Scheme:

Theory L/T (%)    Lab/Practical/Studio (%)
50                50

Theory Assessment (L&T):

                          Continuous Assessment/Internal Assessment                              End Term Examination
Components (Drop down)    Class Test    Class Performance    Class attendance    Home Assignments    Total
Weightage (%)             10%           05%                  05%                 10%                 70%

Text & References:
Robert A. Muenchen and Joseph M. Hilbe. R for Stata Users. Statistics and Computing. Springer, 2010.
Rob Kabacoff. R in Action. Manning, 2010.
Big Data: A Revolution That Will Transform How We Live, Work, and Think.
John Paul Mueller and Luca Massaron. Machine Learning (in Python and R) For Dummies.
R. Nageswara Rao. Core Python Programming.