9 - Preprocessing - Data Science Practicum 2021/22, Lesson 9
Marko Tkalcic
Univerza na Primorskem
Marko Tkalcic, DP-202122-09
Table of Contents
- Pre-processing
- Missing Values
- Standardization and Normalization
- Assignment
- References
Pre-processing
The typical machine learning workflow has the following steps:
  1. Acquire data
  2. Pre-process data
  3. Train/learn model
  4. Evaluate model
  5. Deploy model

The pre-processing step can do many things:
  - Data cleaning
    - Missing values management
    - Duplicate values
    - Inconsistent data (e.g. Gender: M, Pregnant: True)
  - Feature scaling
    - Standardization
    - Normalization
  - Binning
  - Dimensionality reduction
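
As a minimal sketch of two of the data-cleaning items above (duplicate values and inconsistent data), assuming pandas: the table, the column names `gender` and `pregnant`, and all values are hypothetical, chosen only to mirror the example on the slide.

```python
import pandas as pd

# Hypothetical records: row 1 repeats row 0, and row 3 is inconsistent.
records = pd.DataFrame({
    "gender": ["F", "F", "M", "M"],
    "pregnant": [True, True, False, True],
})

# Duplicate values: keep only the first occurrence of each identical row.
deduplicated = records.drop_duplicates()

# Inconsistent data: flag rows that combine gender "M" with pregnant == True.
inconsistent = deduplicated[(deduplicated["gender"] == "M") & deduplicated["pregnant"]]

print(deduplicated)
print(inconsistent)
```

Whether a flagged row is corrected, dropped, or sent back to the data source depends on the dataset; the sketch only shows how such rows can be located.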
Missing Values

     A     B     C    D
0  1.0   2.0   3.0  4.0
1  5.0   6.0   NaN  8.0
2  0.0  11.0  12.0  NaN

df.isnull()

       A      B      C      D
0  False  False  False  False
1  False  False   True  False
2  False  False  False   True
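
The slides do not show how `df` is built. The sketch below assumes it is read from an in-memory CSV string whose empty fields become NaN; the variable names `csv_data`, `mask`, and `missing_per_column` are illustrative. It reproduces the boolean mask above and adds a per-column count of missing values.

```python
from io import StringIO

import pandas as pd

# Assumed source of the example table: empty CSV fields are parsed as NaN.
csv_data = """A,B,C,D
1.0,2.0,3.0,4.0
5.0,6.0,,8.0
0.0,11.0,12.0,
"""
df = pd.read_csv(StringIO(csv_data))

# Boolean mask: True wherever a value is missing.
mask = df.isnull()

# Summing the mask counts the missing values in each column.
missing_per_column = mask.sum()

print(mask)
print(missing_per_column)
```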
Missing values
What can we do?
  - Remove data with missing values (rows, columns)
  - Replace missing values (impute) with some data (mean, median, constant, random, ...)
    - Imputed values may be systematically above or below their actual values
    - Rows with missing values may be unique in some other way
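
The two options above can be sketched with pandas on the same 3x4 example table. This is one possible illustration rather than the course's prescribed approach; the variable names and the constant `0.0` are assumptions.

```python
import numpy as np
import pandas as pd

# The example table from the earlier slide, rebuilt by hand.
df = pd.DataFrame({
    "A": [1.0, 5.0, 0.0],
    "B": [2.0, 6.0, 11.0],
    "C": [3.0, np.nan, 12.0],
    "D": [4.0, 8.0, np.nan],
})

# Option 1: remove data with missing values.
rows_dropped = df.dropna()        # drops every row that contains a NaN
cols_dropped = df.dropna(axis=1)  # drops every column that contains a NaN

# Option 2: impute. Each NaN is replaced by a statistic of its own column.
mean_imputed = df.fillna(df.mean())
median_imputed = df.fillna(df.median())
constant_imputed = df.fillna(0.0)  # 0.0 is an arbitrary placeholder constant

print(rows_dropped)
print(cols_dropped)
print(mean_imputed)
```

Dropping rows leaves a single row here, which shows how quickly removal can shrink a small dataset. Imputation keeps every row but fills in values that were never observed, which is where the two caveats above apply.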