Part 2 : Data Frame Continued

5. Traversing DataFrame Elements using iterrows(), iteritems() and itertuples()

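To iterate over a DataFrame, we can use the following functions:

iteritems() - iterates over the columns as (column name, Series) pairs

iterrows() - iterates over the rows as (index, Series) pairs

itertuples() - iterates over the rows as named tuples

Note: iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0. In newer versions use items(), which behaves the same way.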

Let's take an example:

>>> import pandas as pd
>>> import numpy as np
>>> d={'Name':['Shalini','Varsha','Shanti','Madhu'],'Age':[23,56,54,34]}
>>> df=pd.DataFrame(d)
>>> print(df)
      Name  Age
0  Shalini   23
1   Varsha   56
2   Shanti   54
3    Madhu   34

5.1 USE of iterrows()

>>> print('ITER ROWS')
>>> for key,values in df.iterrows():
...     for val in values:
...         print('Hello',val)

ITER ROWS
Hello Shalini
Hello 23
Hello Varsha
Hello 56
Hello Shanti
Hello 54
Hello Madhu
Hello 34

5.2 USE of iteritems()

>>> print('ITER ITEMS')
>>> for key,values in df.iteritems():
...     for val in values:
...         print('Hello',val)

ITER ITEMS
Hello Shalini
Hello Varsha
Hello Shanti
Hello Madhu
Hello 23
Hello 56
Hello 54
Hello 34

pythonclassroomdiary.

by Sangeeta M Chauhan, PGT CS , KV NO.3 Gwalior

5.3 USE of itertuples()

>>> print('ITER TUPLES')
>>> for rows in df.itertuples():
...     print(rows)

ITER TUPLES
Pandas(Index=0, Name='Shalini', Age=23)
Pandas(Index=1, Name='Varsha', Age=56)
Pandas(Index=2, Name='Shanti', Age=54)
Pandas(Index=3, Name='Madhu', Age=34)
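The three ways of traversing a DataFrame can be compared side by side. The following is a minimal sketch, assuming a recent pandas version in which items() is used in place of iteritems(). The names row_pairs, column_names and first are chosen only for illustration.

```python
import pandas as pd

d = {'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'], 'Age': [23, 56, 54, 34]}
df = pd.DataFrame(d)

# iterrows(): one (index, Series) pair per row
row_pairs = [(idx, row['Name'], row['Age']) for idx, row in df.iterrows()]

# items(): one (column name, Series) pair per column
# (assumption: pandas 2.0 or later, where iteritems() no longer exists)
column_names = [name for name, column in df.items()]

# itertuples(): one named tuple per row; fields are read as attributes
first = next(df.itertuples())
print(first.Index, first.Name, first.Age)
```

Reading fields as attributes (first.Name) is usually the most convenient form when only a few columns are needed from each row.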

6. Binary Operations in a DataFrame (add, sub, mul, div, radd, rsub)

Let's take some DataFrames with numeric data:

>>> s1=[[1,2,3],[4,5,6]]

>>> s2=[[3,2,5],[5,7,8]]

>>> s3=[[5,5,5],[4,4,4]]

>>> dfr1=pd.DataFrame(s1)

>>> dfr2=pd.DataFrame(s2)

>>> dfr3=pd.DataFrame(s3)

6.1 ADDITION

Three DataFrames, namely dfr1, dfr2 and dfr3, have been created:

>>> dfr1
   0  1  2
0  1  2  3
1  4  5  6

>>> dfr2
   0  1  2
0  3  2  5
1  5  7  8

>>> dfr3
   0  1  2
0  5  5  5
1  4  4  4

An individual value or a DataFrame can be added to another DataFrame.

>>> dfr1+2
   0  1  2
0  3  4  5
1  6  7  8

Here 2 is added to each element of DataFrame dfr1.


>>> dfr1+dfr2
   0   1   2
0  4   4   8
1  9  12  14

The corresponding elements of dfr1 and dfr2 are added.

>>> dfr1.add(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

It adds the corresponding elements of dfr1 and dfr2 (dfr1+dfr2).

>>> dfr1.radd(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

Here 'r' stands for reverse: it adds the corresponding elements of dfr2 and dfr1 (dfr2+dfr1).

>>> dfr3+dfr1+dfr2
    0   1   2
0   9   9  13
1  13  16  18

It adds the corresponding elements of dfr1, dfr2 and dfr3.

6.2 SUBTRACTION

>>> dfr1-dfr2
   0  1  2
0 -2  0 -2
1 -1 -2 -2

It subtracts each element of dfr2 from the corresponding element of dfr1.

>>> dfr1.sub(dfr2)
   0  1  2
0 -2  0 -2
1 -1 -2 -2

It subtracts each element of dfr2 from the corresponding element of dfr1 (dfr1 - dfr2).

>>> dfr1.rsub(dfr2)
   0  1  2
0  2  0  2
1  1  2  2

Here 'r' stands for reverse: it subtracts each element of dfr1 from the corresponding element of dfr2 (dfr2 - dfr1).

>>> dfr1-2
   0  1  2
0 -1  0  1
1  2  3  4

Here 2 is subtracted from each element of DataFrame dfr1.


In the same way, multiplication can be done with the * operator or the mul() function, and division can be done with the / operator or the div() function.
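Multiplication and division follow the same pattern as addition and subtraction. The sketch below reuses the values of dfr1 and dfr2 from this section. The result names product, quotient and reverse_quotient are illustrative only, and rdiv() is shown on the assumption that the reverse form is wanted alongside mul() and div().

```python
import pandas as pd

dfr1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
dfr2 = pd.DataFrame([[3, 2, 5], [5, 7, 8]])

product = dfr1 * dfr2                # same result as dfr1.mul(dfr2)
quotient = dfr2 / dfr1               # same result as dfr2.div(dfr1)
reverse_quotient = dfr1.rdiv(dfr2)   # 'r' reverses the operands: dfr2 / dfr1

print(product)
print(quotient)
```

Division always produces floating point values, even when both DataFrames hold integers.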

7. Matching and Broadcasting Operations

7.1 Matching: Whenever we perform an arithmetic operation on two DataFrames, the data is first aligned on the basis of matching indexes (row and column labels) and the operation is then performed on the aligned values. For non-overlapping indexes the result of the operation is NaN. This default behaviour of aligning data on the basis of matching indexes is known as MATCHING.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
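When the NaN produced by matching is not wanted, the add() function accepts a fill_value argument that is used in place of a value missing from one of the two DataFrames. The following is a minimal sketch with the same data as above. The names plain_sum and filled_sum are illustrative, and the choice of 0 as the fill value is an assumption.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])
dfr2 = pd.DataFrame([[34, 4], [4, 6]])

# Column 2 exists only in dfr1, so matching gives NaN there
plain_sum = dfr1 + dfr2

# With fill_value=0 the missing side is treated as 0 before adding
filled_sum = dfr1.add(dfr2, fill_value=0)

print(plain_sum)
print(filled_sum)
```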

7.2 Broadcasting: Enlarging the smaller object in a binary operation by replicating its elements so that it matches the shape of the larger object.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)
print('Broadcasting is done')
s3=[3,4,5]
print(dfr1+s3)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
Broadcasting is done
    0   1   2
0  24  56  48
1  44  59  71
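Broadcasting can run in either direction. The sketch below, using the same dfr1, first repeats a list across the rows (the default) and then repeats a Series down the columns by passing axis=0 to add(). The names row_broadcast, per_row_values and column_broadcast, and the values 10 and 20, are assumptions made for illustration.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])

# Default: a list with one value per column is repeated for every row
row_broadcast = dfr1 + [3, 4, 5]

# axis=0: a Series with one value per row is repeated for every column
per_row_values = pd.Series([10, 20])
column_broadcast = dfr1.add(per_row_values, axis=0)

print(row_broadcast)
print(column_broadcast)
```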


8. Handling Missing Data :

As data comes in many shapes and forms, pandas aims to be flexible with regard to handling missing data. While NaN is the default missing value marker for reasons of computational speed and convenience, we need to be able to easily detect this value with data of different types: floating point, integer, boolean, and general object. In many cases, however, the Python None will arise and we wish to also consider that "missing" or "not available" or "NA".
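To illustrate the point about None, the short sketch below shows that pandas reports both NaN and None as missing. The Series names numbers and labels are hypothetical and are not part of the examples in this handout.

```python
import numpy as np
import pandas as pd

# In a numeric Series, None is stored as NaN
numbers = pd.Series([1.5, None, np.nan])

# In an object Series, None stays None but is still treated as missing
labels = pd.Series(['a', None, np.nan], dtype=object)

print(numbers.isnull().tolist())   # [False, True, True]
print(labels.isnull().tolist())    # [False, True, True]
```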

Function Name        Use

isnull()             Returns True or False for each value in a pandas object, depending on whether it is a missing value or not

notnull()            Returns True or False for each value in a pandas object, depending on whether it is a data value or not

dropna()             Removes (drops) all the rows that contain a NaN value anywhere in the row

dropna(how='all')    Removes only those rows in which all the values are NaN

fillna()             Fills the missing values with the value specified

>>> import pandas as pd
>>> KV_shift1={'Computer':[20,25,22,50],'Projectors':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr1=pd.DataFrame(KV_shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr1)
>>> KV_shift2={'Computer':[20,25,22,50],'Visualizers':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr2=pd.DataFrame(KV_shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr2)
>>> KV3Gwl=dfr1+dfr2
>>> print(KV3Gwl)

DataFrame : dfr1

            Computer  Projectors  iPad  AppleTv
SrCompLab         20           1     1        1
SecCompLab        25           1     1        1
PriLab            22           1     1        1
Others            50          14     7        7

DataFrame : dfr2

            Computer  Visualizers  iPad  AppleTv
SrCompLab         20            1     1        1
SecCompLab        25            1     1        1
PriLab            22            1     1        1
Others            50           14     7        7

DataFrame : KV3Gwl

            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         NaN          NaN     2
SecCompLab        2        50         NaN          NaN     2
PriLab            2        44         NaN          NaN     2
Others           14       100         NaN          NaN    14

8.1 Use of isnull() and notnull()

DataFrame : KV3Gwl

print('ISNULL ()')
print(KV3Gwl.isnull())
print('NOTNULL ()')
print(KV3Gwl.notnull())

isnull() gives True if the corresponding element contains NaN.

notnull() gives True if the corresponding element contains data.

ISNULL ()
            AppleTv  Computer  Projectors  Visualizers   iPad
SrCompLab     False     False        True         True  False
SecCompLab    False     False        True         True  False
PriLab        False     False        True         True  False
Others        False     False        True         True  False
NOTNULL ()
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab      True      True       False        False  True
SecCompLab     True      True       False        False  True
PriLab         True      True       False        False  True
Others         True      True       False        False  True
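Because True counts as 1, the result of isnull() can be summed to find how many values are missing in each column. The following is a small sketch of that idea. The DataFrame stock and the names missing_per_column and rows_with_gap are hypothetical and only loosely modelled on KV3Gwl.

```python
import numpy as np
import pandas as pd

stock = pd.DataFrame(
    {'Computer': [40, 50, 44, 100],
     'Projectors': [np.nan, np.nan, np.nan, np.nan],
     'iPad': [2, 2, 2, 14]},
    index=['SrCompLab', 'SecCompLab', 'PriLab', 'Others'])

# Number of missing values in each column
missing_per_column = stock.isnull().sum()

# True for every row that has at least one missing value
rows_with_gap = stock.isnull().any(axis=1)

print(missing_per_column)
print(rows_with_gap)
```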

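8.2 Use of dropna()

For this example, KV3Gwl holds values in which only some of the entries are NaN:

>>> KV3Gwl
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         NaN          5.0     2
PriLab            2        44         7.0          NaN     2
Others           14       100         5.0          2.0    14

>>> KV3Gwl.dropna()
           AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab        2        40         6.0          2.0     2
Others          14       100         5.0          2.0    14

The rows SecCompLab and PriLab are dropped because each of them contains a NaN value.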
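dropna() can also be told what to drop. The sketch below rebuilds the KV3Gwl values shown in this section and compares the default call with how='all' and with axis=1. The result names rows_dropped, none_dropped and columns_dropped are illustrative only.

```python
import numpy as np
import pandas as pd

KV3Gwl = pd.DataFrame(
    {'AppleTv': [2, 2, 2, 14],
     'Computer': [40, 50, 44, 100],
     'Projectors': [6, np.nan, 7, 5],
     'Visualizers': [2, 5, np.nan, 2],
     'iPad': [2, 2, 2, 14]},
    index=['SrCompLab', 'SecCompLab', 'PriLab', 'Others'])

# Default: drop every row that contains at least one NaN
rows_dropped = KV3Gwl.dropna()

# how='all': drop a row only if all of its values are NaN (none here)
none_dropped = KV3Gwl.dropna(how='all')

# axis=1: drop columns instead of rows
columns_dropped = KV3Gwl.dropna(axis=1)

print(rows_dropped)
print(columns_dropped)
```

None of these calls changes KV3Gwl itself; each returns a new DataFrame.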

8.3 Use of fillna()

>>> KV3Gwl.fillna({'Projectors':0,'Visualizers':0})
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         0.0          5.0     2
PriLab            2        44         7.0          0.0     2
Others           14       100         5.0          2.0    14
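Besides a dictionary of per-column values, fillna() also accepts a single value or a Series. The following is a minimal sketch using the same KV3Gwl values. Filling with the column mean is an assumption made here for illustration and is not taken from the handout, as are the names filled_zero and filled_mean.

```python
import numpy as np
import pandas as pd

KV3Gwl = pd.DataFrame(
    {'AppleTv': [2, 2, 2, 14],
     'Computer': [40, 50, 44, 100],
     'Projectors': [6, np.nan, 7, 5],
     'Visualizers': [2, 5, np.nan, 2],
     'iPad': [2, 2, 2, 14]},
    index=['SrCompLab', 'SecCompLab', 'PriLab', 'Others'])

# A single value fills every NaN in the DataFrame
filled_zero = KV3Gwl.fillna(0)

# A Series indexed by column name fills each column with its own value;
# here each NaN is replaced by the mean of its column
filled_mean = KV3Gwl.fillna(KV3Gwl.mean())

print(filled_zero)
print(filled_mean)
```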

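9. Comparison among pandas Objects (Series, DataFrame)

We can compare pandas objects using the == operator or using the equals() function. The difference between the two is that == compares each element of the first DataFrame with the corresponding element of the second DataFrame and returns a DataFrame of True/False values, in which NaN is never equal to NaN. equals() returns a single True or False for the whole object and treats NaN values in the same position as equal.

Let's make this clear with the following example: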

import pandas as pd
import numpy as np
KV_Shift1={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr1=pd.DataFrame(KV_Shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
KV_Shift2={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr2=pd.DataFrame(KV_Shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
print('Data Frame 1 :')
print(dfr1)
print('Data Frame 2 :')
print(dfr2)
print('Checking Equality using == operator')
print(dfr1==dfr2)
print('Checking Equality using equals() function')
print(dfr1.equals(dfr2))

Data Frame 1 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7
Data Frame 2 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7
Checking Equality using == operator
            Computer  Projectors   iPad  AppleTv
SrCompLab       True        True  False     True
SecCompLab      True        True   True     True
PriLab          True       False   True     True
Others          True        True   True     True
Checking Equality using equals() function
True
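The difference between the two checks shows up only where NaN is present. The sketch below isolates that case with a single column. The DataFrames a and b and the result names are hypothetical and kept small on purpose.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'Projectors': [1, np.nan, 14]})
b = a.copy()

# Element by element: NaN == NaN is False, so the middle entry is False
elementwise = a == b

# Whole object: NaN values in the same position are treated as equal
whole = a.equals(b)

# Reducing the element-wise result therefore reports a mismatch
all_match = (a == b).all().all()

print(elementwise)
print(whole, all_match)
```

For this reason equals() is usually the safer check when the question is whether two DataFrames hold the same data.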

10. Boolean Reduction

With Boolean reduction, you can get an overall result for a row or a column as a single True or False. For this purpose pandas offers the following Boolean reduction functions and attributes:

10.1 empty : Tells whether the DataFrame is empty.

10.2 any() : Returns True if any of the elements is True over the requested axis.

10.3 all() : Returns True if all the elements are True over the requested axis.


import pandas as pd
import numpy as np
df1=pd.DataFrame()
s1=[[2,5,8],[10,5,2]]
df2=pd.DataFrame(s1)

if df1.empty:
    print('Data Frame1 is Empty')

if df2.empty:
    print('Data Frame2 is Empty')
else:
    print('Data Frame2 is not Empty')

print('Data Frame')
print(df2)
print('Used function all()')
print((df2 ................
................
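The listing above is cut off in the source, so the following is not a completion of it but a separate minimal sketch of the three Boolean reductions on the same df2. The condition df2 > 4 and all result names are assumptions chosen for illustration.

```python
import pandas as pd

df1 = pd.DataFrame()
df2 = pd.DataFrame([[2, 5, 8], [10, 5, 2]])

# Element-wise condition (an assumed example): which values exceed 4?
greater_than_4 = df2 > 4

# any(): True for a column if at least one of its values is True
any_per_column = greater_than_4.any()

# all(): True for a column only if every one of its values is True
all_per_column = greater_than_4.all()

# axis=1 reduces along each row instead of each column
all_per_row = greater_than_4.all(axis=1)

print(df1.empty, df2.empty)
print(any_per_column.tolist())
print(all_per_column.tolist())
print(all_per_row.tolist())
```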
