Part 2 : Data Frame Continued

5. Traversing DataFrame Elements using iterrows(), iteritems() and itertuples()

To iterate over the elements of a DataFrame, we can use the following functions -

iteritems() - iterate over the columns as (column name, Series) pairs. In newer versions of pandas this function is called items(); iteritems() was removed in pandas 2.0

iterrows() - iterate over the rows as (index, Series) pairs

itertuples() - iterate over the rows as named tuples

Let's take an example:

>>> import pandas as pd
>>> d = {'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'], 'Age': [23, 56, 54, 34]}
>>> df = pd.DataFrame(d)
>>> print(df)
      Name  Age
0  Shalini   23
1   Varsha   56
2   Shanti   54
3    Madhu   34

5.1 USE of iterrows()

>>> print('ITER ROWS')
>>> for key, values in df.iterrows():
...     for val in values:
...         print('Hello', val)
...
ITER ROWS
Hello Shalini
Hello 23
Hello Varsha
Hello 56
Hello Shanti
Hello 54
Hello Madhu
Hello 34

5.2 USE of iteritems()

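>>> print('ITER ITEMS')
>>> for key, values in df.iteritems():
...     for val in values:
...         print('Hello', val)
...
ITER ITEMS
Hello Shalini
Hello Varsha
Hello Shanti
Hello Madhu
Hello 23
Hello 56
Hello 54
Hello 34

The traversal goes column by column, so all the values of Name are printed before the values of Age.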

pythonclassroomdiary.

by Sangeeta M Chauhan, PGT CS , KV NO.3 Gwalior

5.3 USE of itertuples()

>>> print('ITER TUPLES')
>>> for rows in df.itertuples():
...     print(rows)
...
ITER TUPLES
Pandas(Index=0, Name='Shalini', Age=23)
Pandas(Index=1, Name='Varsha', Age=56)
Pandas(Index=2, Name='Shanti', Age=54)
Pandas(Index=3, Name='Madhu', Age=34)
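Each named tuple returned by itertuples() exposes the row label as Index and the column values as attributes, so the fields of a row can be read by name. The following is a minimal sketch that rebuilds the same df as in the example above; the variable name lines is used only for illustration.

```python
import pandas as pd

d = {'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'], 'Age': [23, 56, 54, 34]}
df = pd.DataFrame(d)

lines = []
for row in df.itertuples():
    # row.Index is the row label; row.Name and row.Age are the column values
    lines.append(f"{row.Index}: {row.Name} is {row.Age} years old")

for line in lines:
    print(line)
```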

6. Binary Operations in a DataFrame (add, sub, mul, div, radd, rsub)

Let's take some DataFrames with numeric data:

>>> s1=[[1,2,3],[4,5,6]]

>>> s2=[[3,2,5],[5,7,8]]

>>> s3=[[5,5,5],[4,4,4]]

>>> dfr1=pd.DataFrame(s1)

>>> dfr2=pd.DataFrame(s2)

>>> dfr3=pd.DataFrame(s3)

6.1 ADDITION

Three DataFrames, namely dfr1, dfr2 and dfr3, have been created:

>>> dfr1
   0  1  2
0  1  2  3
1  4  5  6

>>> dfr2
   0  1  2
0  3  2  5
1  5  7  8

>>> dfr3
   0  1  2
0  5  5  5
1  4  4  4

An individual value or a DataFrame can be added to another DataFrame.

>>> dfr1+2
   0  1  2
0  3  4  5
1  6  7  8

Here 2 is added to each element of DataFrame dfr1.


>>> dfr1+dfr2
   0   1   2
0  4   4   8
1  9  12  14

Corresponding elements of dfr1 and dfr2 are added.

>>> dfr1.add(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

It will add corresponding elements of dfr1 and dfr2 (dfr1+dfr2).

>>> dfr1.radd(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

Here 'r' stands for reverse: it will add corresponding elements of dfr2 and dfr1 (dfr2+dfr1).

>>> dfr3+dfr1+dfr2
    0   1   2
0   9   9  13
1  13  16  18

It will add corresponding elements of dfr1, dfr2 and dfr3.

6.2 SUBTRACTION

>>> dfr1-dfr2
   0  1  2
0 -2  0 -2
1 -1 -2 -2

It will subtract corresponding elements of dfr2 from dfr1.

>>> dfr1.sub(dfr2)
   0  1  2
0 -2  0 -2
1 -1 -2 -2

It will subtract corresponding elements of dfr2 from dfr1 (dfr1 - dfr2).

>>> dfr1.rsub(dfr2)
   0  1  2
0  2  0  2
1  1  2  2

Here 'r' stands for reverse: it will subtract corresponding elements of dfr1 from dfr2 (dfr2 - dfr1).

>>> dfr1-2
   0  1  2
0 -1  0  1
1  2  3  4

Here 2 is subtracted from each element of DataFrame dfr1.


In the same way, multiplication can be done with the * operator or the mul() function, and division can be done with the / operator or the div() function.
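Multiplication and division can be tried on the same data as the addition and subtraction examples. The sketch below rebuilds dfr1 and dfr2; the names product, quotient and reverse_quotient are illustrative only, and rdiv() is included to show that the 'r' (reverse) form exists for division as well.

```python
import pandas as pd

dfr1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
dfr2 = pd.DataFrame([[3, 2, 5], [5, 7, 8]])

product = dfr1 * dfr2              # same result as dfr1.mul(dfr2)
quotient = dfr1 / 2                # same result as dfr1.div(2)
reverse_quotient = dfr1.rdiv(12)   # 'r' reverses the operands: 12 / dfr1

print(product)
print(quotient)
print(reverse_quotient)
```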

7. Matching and Broadcasting Operations:

7.1 Matching : Whenever we perform an arithmetic operation on DataFrames, the data is first aligned on the basis of matching indexes (row and column labels) and the arithmetic is then performed on the aligned values; for non-overlapping indexes the result is NaN. This default behaviour of aligning data on the basis of matching indexes is known as MATCHING.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
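If NaN is not wanted at the non-overlapping positions, the add() method accepts a fill_value argument that stands in for a missing entry before the arithmetic is done. This is a small sketch using the same two frames; the names matched and total are illustrative only.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])
dfr2 = pd.DataFrame([[34, 4], [4, 6]])

matched = dfr1 + dfr2                  # column 2 has no partner in dfr2, so it becomes NaN
total = dfr1.add(dfr2, fill_value=0)   # the missing entries of dfr2 are treated as 0

print(matched)
print(total)
```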

7.2 Broadcasting : Broadcasting means enlarging the smaller object in a binary operation by replicating its elements so as to match the shape of the larger object.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)
print('Broadcasting is done')
s3=[3,4,5]
print(dfr1+s3)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
Broadcasting is done
    0   1   2
0  24  56  48
1  44  59  71
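In the example above, the list [3, 4, 5] is matched against the columns and repeated for every row. As a sketch of the other direction, a Series can be broadcast across the columns, one value per row, by passing axis=0 to add(). The names row_wise and col_wise are illustrative only.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])

# One value per column, repeated for every row (same result as dfr1 + [3, 4, 5])
row_wise = dfr1 + pd.Series([3, 4, 5])

# One value per row, repeated across every column
col_wise = dfr1.add(pd.Series([10, 20]), axis=0)

print(row_wise)
print(col_wise)
```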


8. Handling Missing Data :

As data comes in many shapes and forms, pandas aims to be flexible with regard to handling missing data. While NaN is the default missing value marker for reasons of computational speed and convenience, we need to be able to easily detect this value with data of different types: floating point, integer, boolean, and general object. In many cases, however, the Python None will arise and we wish to also consider that "missing" or "not available" or "NA".

Function Name - Use

isnull() - returns True or False for each value in a pandas object, depending on whether or not it is a missing value

notnull() - returns True or False for each value in a pandas object, depending on whether or not it is a data value

dropna() - removes (drops) all the rows which contain a NaN value anywhere in the row

dropna(how='all') - removes only those rows in which all the values are NaN

fillna() - fills the missing values with the value specified
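The functions listed above can be tried on a small frame before moving to a larger example. The sketch below uses made-up data (the frame df and its columns A, B and C are purely illustrative) in which row 0 is complete, row 1 is partly missing and row 2 is entirely missing; both NaN and None are treated as missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan, np.nan],
                   'B': [4, 5, np.nan],
                   'C': ['x', None, None]})

missing = df.isnull()                 # True where a value is missing (NaN or None)
present = df.notnull()                # True where a value is present
complete_rows = df.dropna()           # keeps only row 0, which has no missing value
non_empty_rows = df.dropna(how='all') # drops only row 2, where every value is missing
filled = df[['A', 'B']].fillna(0)     # replaces each NaN in the numeric columns with 0

print(missing)
print(present)
print(complete_rows)
print(non_empty_rows)
print(filled)
```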

>>> import pandas as pd
>>> KV_shift1={'Computer':[20,25,22,50],'Projectors':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr1=pd.DataFrame(KV_shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr1)
>>> KV_shift2={'Computer':[20,25,22,50],'Visualizers':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr2=pd.DataFrame(KV_shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr2)
>>> KV3Gwl=dfr1+dfr2
>>> print(KV3Gwl)

DataFrame : dfr1

            Computer  Projectors  iPad  AppleTv
SrCompLab         20           1     1        1
SecCompLab        25           1     1        1
PriLab            22           1     1        1
Others            50          14     7        7

DataFrame : dfr2

            Computer  Visualizers  iPad  AppleTv
SrCompLab         20            1     1        1
SecCompLab        25            1     1        1
PriLab            22            1     1        1
Others            50           14     7        7

DataFrame : KV3Gwl

            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         NaN          NaN     2
SecCompLab        2        50         NaN          NaN     2
PriLab            2        44         NaN          NaN     2
Others           14       100         NaN          NaN    14

8.1 Use of isnull() and notnull()

print('ISNULL ()')
print(KV3Gwl.isnull())
print('NOTNULL ()')
print(KV3Gwl.notnull())

isnull() will give True if the corresponding element contains NaN.

notnull() will give True if the corresponding element contains data.
