Scikit-Learn - Tutorialspoint


About the Tutorial

Scikit-learn (Sklearn) is a widely used and robust library for machine learning in Python. It provides a selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering and dimensionality reduction, through a consistent interface. The library, which is largely written in Python, is built upon NumPy, SciPy and Matplotlib.
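As a rough illustration of what a consistent interface means in practice, the sketch below applies the same fit/predict and fit/transform pattern to a classifier, a clustering model and a dimensionality-reduction step. The choice of the bundled iris dataset, the particular estimators and their parameter values are assumptions made for this example only; they are not prescribed by the tutorial.

```python
# Minimal sketch (illustrative choices, not the tutorial's own example):
# three different tasks driven through the same estimator-style calls.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Feature matrix X (150 samples, 4 features) and target array y.
X, y = load_iris(return_X_y=True)

# Classification: fit on labelled data, then predict class labels.
clf = LogisticRegression(max_iter=1000).fit(X, y)
class_labels = clf.predict(X)

# Clustering: fit without labels, then assign each sample to a cluster.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_labels = km.predict(X)

# Dimensionality reduction: fit and transform in one call.
X_2d = PCA(n_components=2).fit_transform(X)

print(class_labels.shape, cluster_labels.shape, X_2d.shape)
```

Each model is created, fitted and applied with the same small set of method names, which is why swapping one algorithm for another usually needs only a one-line change.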

Audience

This tutorial will be useful for graduates, postgraduates, and research students who are interested in machine learning or study it as part of their curriculum. The reader can be a beginner or an advanced learner.

Prerequisites

The reader should have basic knowledge of machine learning and be familiar with Python, NumPy, SciPy and Matplotlib. If you are new to any of these, we recommend working through tutorials on them before you go further into this tutorial.

Copyright & Disclaimer

Copyright 2019 by Tutorials Point (I) Pvt. Ltd. All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited from reusing, retaining, copying, distributing or republishing any contents or a part of the contents of this e-book in any manner without the written consent of the publisher. We strive to update the contents of our website and tutorials as promptly and as precisely as possible; however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents, including this tutorial. If you discover any errors on our website or in this tutorial, please notify us at contact@


Table of Contents

About the Tutorial
Audience
Prerequisites
Copyright & Disclaimer
Table of Contents

1. Scikit-Learn - Introduction
    What is Scikit-Learn (Sklearn)?
    Origin of Scikit-Learn
    Community & contributors
    Prerequisites
    Installation
    Features

2. Scikit-Learn - Modelling Process
    Dataset Loading
    Splitting the dataset
    Train the Model
    Model Persistence
    Preprocessing the Data
    Binarisation
    Mean Removal
    Scaling
    Normalisation

3. Scikit-Learn - Data Representation
    Data as table
    Data as Feature Matrix
    Data as Target array

4. Scikit-Learn - Estimator API
    What is Estimator API?
    Use of Estimator API
    Guiding Principles
    Steps in using Estimator API
    Supervised Learning Example
    Unsupervised Learning Example

5. Scikit-Learn - Conventions
    Purpose of Conventions
    Various Conventions

6. Scikit-Learn - Linear Modeling
    Linear Regression
    Logistic Regression
    Ridge Regression
    Bayesian Ridge Regression
    LASSO (Least Absolute Shrinkage and Selection Operator)
    Multi-task LASSO
    Elastic-Net
    MultiTaskElasticNet

7. Scikit-Learn - Extended Linear Modeling
    Introduction to Polynomial Features
    Streamlining using Pipeline tools

8. Scikit-Learn - Stochastic Gradient Descent
    SGD Classifier
    SGD Regressor
    Pros and Cons of SGD

9. Scikit-Learn - Support Vector Machines (SVMs)
    Introduction
    Classification of SVM
    SVC
    NuSVC
    LinearSVC
    Regression with SVM
    SVR
    NuSVR
    LinearSVR

10. Scikit-Learn - Anomaly Detection
    Methods
    Sklearn algorithms for Outlier Detection
    Fitting an elliptic envelope
    Isolation Forest
    Local Outlier Factor
    One-Class SVM

11. Scikit-Learn - K-Nearest Neighbors (KNN)
    Types of algorithms
    Choosing Nearest Neighbors Algorithm

12. Scikit-Learn - KNN Learning
    Unsupervised KNN Learning
    Supervised KNN Learning
    KNeighborsClassifier
    RadiusNeighborsClassifier
    Nearest Neighbor Regressor
    KNeighborsRegressor
    RadiusNeighborsRegressor

13. Scikit-Learn - Classification with Naïve Bayes
    Gaussian Naïve Bayes


In order to avoid copyright disputes, this page is only a partial summary.
