Part 2 : Data Frame Continued
5. Traversing DataFrame Elements using iterrows(), iteritems() and itertuples()
To iterate over a DataFrame, we can use the following functions -
iteritems() - to iterate over the columns as (column name, Series) pairs. It was removed in pandas 2.0; use items() in newer versions.
iterrows() - to iterate over the rows as (index, Series) pairs
itertuples() - to iterate over the rows as named tuples
Let's take an example
>>> import pandas as pd
>>> import numpy as np
>>> d={'Name':['Shalini','Varsha','Shanti','Madhu'],'Age':[23,56,54,34]}
>>> df=pd.DataFrame(d)
>>> print(df)
      Name  Age
0  Shalini   23
1   Varsha   56
2   Shanti   54
3    Madhu   34
5.1 USE of iterrows()
>>> print('ITER ROWS')
>>> for key,values in df.iterrows():
...     for val in values:
...         print('Hello',val)
ITER ROWS
Hello Shalini
Hello 23
Hello Varsha
Hello 56
Hello Shanti
Hello 54
Hello Madhu
Hello 34
5.2 USE of iteritems()
>>> print('ITER ITEMS')
>>> for key,values in df.iteritems():   # in pandas 2.0 and later, write df.items() instead
...     for val in values:
...         print('Hello',val)
ITER ITEMS
Hello Shalini
Hello Varsha
Hello Shanti
Hello Madhu
Hello 23
Hello 56
Hello 54
Hello 34
pythonclassroomdiary.
by Sangeeta M Chauhan, PGT CS , KV NO.3 Gwalior
5.3 USE of itertuples()
>>> print('ITER TUPLES')
>>> for rows in df.itertuples():
...     print(rows)
ITER TUPLES
Pandas(Index=0, Name='Shalini', Age=23)
Pandas(Index=1, Name='Varsha', Age=56)
Pandas(Index=2, Name='Shanti', Age=54)
Pandas(Index=3, Name='Madhu', Age=34)
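The three traversal styles can also be compared side by side. The following is only a small sketch using the same df as above; it uses items(), which behaves like iteritems() and is available in current pandas versions. The variable names column_names, greetings and ages are chosen here for illustration.

```python
import pandas as pd

d = {'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'], 'Age': [23, 56, 54, 34]}
df = pd.DataFrame(d)

# items() yields (column name, Series) pairs, one pair per column
column_names = [key for key, values in df.items()]
print(column_names)

# itertuples() yields named tuples, so each field can be read by its column name
greetings = []
for row in df.itertuples():
    greetings.append('Hello ' + row.Name + ', age ' + str(row.Age))
print(greetings[0])

# iterrows() yields (index, Series) pairs; the Series is indexed by column name
ages = [values['Age'] for key, values in df.iterrows()]
print(ages)
```

Reading fields by name with itertuples() is usually easier to follow than looping over every value of every row.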
6. Binary Operations in a DataFrame (add, sub, mul, div, radd, rsub) :
Let's take DataFrames with numeric data
>>> s1=[[1,2,3],[4,5,6]]
>>> s2=[[3,2,5],[5,7,8]]
>>> s3=[[5,5,5],[4,4,4]]
>>> dfr1=pd.DataFrame(s1)
>>> dfr2=pd.DataFrame(s2)
>>> dfr3=pd.DataFrame(s3)
6.1 ADDITION
Created three data frames namely dfr1, dfr2 and dfr3
>>> dfr1
   0  1  2
0  1  2  3
1  4  5  6
>>> dfr2
   0  1  2
0  3  2  5
1  5  7  8
>>> dfr3
   0  1  2
0  5  5  5
1  4  4  4
An individual value or a DataFrame can be added to another DataFrame
>>> dfr1+2
   0  1  2
0  3  4  5
1  6  7  8
Here 2 is added to each element of DataFrame dfr1
>>> dfr1+dfr2
   0   1   2
0  4   4   8
1  9  12  14
>>> dfr1.add(dfr2)
   0   1   2
0  4   4   8
1  9  12  14
>>> dfr1.radd(dfr2)
   0   1   2
0  4   4   8
1  9  12  14
>>> dfr3+dfr1+dfr2
    0   1   2
0   9   9  13
1  13  16  18
6.2 SUBTRACTION
>>> dfr1-dfr2
   0  1  2
0 -2  0 -2
1 -1 -2 -2
>>> dfr1.sub(dfr2)
   0  1  2
0 -2  0 -2
1 -1 -2 -2
>>> dfr1.rsub(dfr2)
   0  1  2
0  2  0  2
1  1  2  2
>>> dfr1-2
   0  1  2
0 -1  0  1
1  2  3  4
Notes on the operations above:
dfr1+dfr2 - corresponding elements of dfr1 and dfr2 are added
dfr1.add(dfr2) - same as dfr1+dfr2
dfr1.radd(dfr2) - here 'r' stands for reverse; it adds corresponding elements of dfr2 and dfr1 (dfr2+dfr1)
dfr3+dfr1+dfr2 - adds corresponding elements of dfr1, dfr2 and dfr3
dfr1-dfr2 and dfr1.sub(dfr2) - subtract each element of dfr2 from the corresponding element of dfr1
dfr1.rsub(dfr2) - here 'r' stands for reverse; it subtracts each element of dfr1 from the corresponding element of dfr2 (dfr2 - dfr1)
dfr1-2 - here 2 is subtracted from each element of DataFrame dfr1
In the same way, multiplication can be done with the * operator or the mul() function, and division can be done with the / operator or the div() function.
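As a small sketch of multiplication and division, the example below reuses the values of dfr1 and dfr2 from section 6. The names product, quotient and reverse_quotient are chosen here for illustration; rdiv() is the reverse form of div(), in the same way that rsub() is the reverse form of sub().

```python
import pandas as pd

dfr1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
dfr2 = pd.DataFrame([[3, 2, 5], [5, 7, 8]])

# Element-wise multiplication; dfr1.mul(dfr2) gives the same result
product = dfr1 * dfr2
print(product)

# Division by a single value; dfr1.div(2) gives the same result
quotient = dfr1 / 2
print(quotient)

# Reverse division: computes 12 / dfr1 for each element
reverse_quotient = dfr1.rdiv(12)
print(reverse_quotient)
```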
7. Matching and Broadcasting Operations:
7.1 Matching : Whenever we perform arithmetic operations on DataFrames, the data is first aligned on the basis of matching indexes (row and column labels) and then the arithmetic is performed. For non-overlapping indexes the result of the operation is NaN. This default behaviour of aligning data on the basis of matching indexes is known as MATCHING.
import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)
Output
Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
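If the NaN values produced by matching are not wanted, the add() function accepts a fill_value argument. The sketch below assumes that treating the missing column of dfr2 as 0 is acceptable for the data; the names matched and filled are chosen here for illustration.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])
dfr2 = pd.DataFrame([[34, 4], [4, 6]])

# Plain + leaves NaN in column 2, because that column exists only in dfr1
matched = dfr1 + dfr2
print(matched)

# fill_value=0 uses 0 in place of the values missing from dfr2 before adding
filled = dfr1.add(dfr2, fill_value=0)
print(filled)
```

With fill_value=0, column 2 of the result simply keeps the values of dfr1.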
7.2 Broadcasting : Enlarging the smaller object in a binary operation by replicating its elements so as to match the shape of the larger object. In the example below the list s3 has one value for each column of dfr1, so it is repeated for every row.
import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)
print('Broadcasting is done')
s3=[3,4,5]
print(dfr1+s3)
Output
Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
Broadcasting is done
    0   1   2
0  24  56  48
1  44  59  71
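Broadcasting can also run along the rows instead of the columns. The following sketch uses the add() function with its axis argument; the Series values and the names per_column and per_row are made up here for illustration.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])

# A Series of length 3 is matched to the 3 columns and repeated for every row
per_column = dfr1 + pd.Series([3, 4, 5])
print(per_column)

# With axis=0, a Series of length 2 is matched to the 2 rows
# and repeated for every column
per_row = dfr1.add(pd.Series([100, 200]), axis=0)
print(per_row)
```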
8. Handling Missing Data :
As data comes in many shapes and forms, pandas aims to be flexible with regard to handling missing data. While NaN is the default missing value marker for reasons of computational speed and convenience, we need to be able to easily detect this value with data of different types: floating point, integer, boolean, and general object. In many cases, however, the Python None will arise and we wish to also consider that "missing" or "not available" or "NA".
Function Name        Use
isnull()             Returns True or False for each value in a pandas object, depending on whether it is a missing value or not
notnull()            Returns True or False for each value in a pandas object, depending on whether it is a data value or not
dropna()             Removes (drops) all the rows which contain a NaN value anywhere in the row
dropna(how='all')    Removes only those rows in which all the values are NaN
fillna()             Fills missing values with the value specified
>>> import pandas as pd
>>> KV_shift1={'Computer':[20,25,22,50],'Projectors':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr1=pd.DataFrame(KV_shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr1)
>>> KV_shift2={'Computer':[20,25,22,50],'Visualizers':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>> dfr2=pd.DataFrame(KV_shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>> print(dfr2)
>>> KV3Gwl=dfr1+dfr2
>>> print(KV3Gwl)
DataFrame : dfr1
            Computer  Projectors  iPad  AppleTv
SrCompLab         20           1     1        1
SecCompLab        25           1     1        1
PriLab            22           1     1        1
Others            50          14     7        7
DataFrame : dfr2
            Computer  Visualizers  iPad  AppleTv
SrCompLab         20            1     1        1
SecCompLab        25            1     1        1
PriLab            22            1     1        1
Others            50           14     7        7
DataFrame : KV3Gwl
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         NaN          NaN     2
SecCompLab        2        50         NaN          NaN     2
PriLab            2        44         NaN          NaN     2
Others           14       100         NaN          NaN    14
8.1 Use of isnull() and notnull()
print('ISNULL ()')
print(KV3Gwl.isnull())
print('NOTNULL ()')
print(KV3Gwl.notnull())
isnull() will give True if the corresponding element contains NaN
notnull() will give True if the corresponding element contains data
ISNULL ()
            AppleTv  Computer  Projectors  Visualizers   iPad
SrCompLab     False     False        True         True  False
SecCompLab    False     False        True         True  False
PriLab        False     False        True         True  False
Others        False     False        True         True  False
NOTNULL ()
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab      True      True       False        False  True
SecCompLab     True      True       False        False  True
PriLab         True      True       False        False  True
Others         True      True       False        False  True
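Because True counts as 1 and False as 0, the result of isnull() can be summed to count the missing values. This is only a small sketch with a shortened version of the lab data (two columns per shift); the names total and missing_per_column are chosen here for illustration.

```python
import pandas as pd

labs = ['SrCompLab', 'SecCompLab', 'PriLab', 'Others']
shift1 = pd.DataFrame({'Computer': [20, 25, 22, 50], 'Projectors': [1, 1, 1, 14]}, index=labs)
shift2 = pd.DataFrame({'Computer': [20, 25, 22, 50], 'Visualizers': [1, 1, 1, 14]}, index=labs)

# Columns present in only one frame become NaN after matching
total = shift1 + shift2

# Number of missing values in each column
missing_per_column = total.isnull().sum()
print(missing_per_column)

# Number of real data values in the whole DataFrame
data_count = total.notnull().sum().sum()
print(data_count)
```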
8.2 Use of dropna()
For this example, suppose KV3Gwl contains the following values:
>>> KV3Gwl
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         NaN          5.0     2
PriLab            2        44         7.0          NaN     2
Others           14       100         5.0          2.0    14
>>> KV3Gwl.dropna()
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
Others           14       100         5.0          2.0    14
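The difference between dropna() and dropna(how='all') can be seen in a short sketch. The DataFrame below is made up for illustration and is not part of the lab data; row r3 is the only row in which every value is NaN.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, np.nan], 'B': [4, 5, np.nan]},
                  index=['r1', 'r2', 'r3'])

# Default: drop every row that has at least one NaN (keeps only r1)
any_dropped = df.dropna()
print(any_dropped)

# how='all': drop only the rows in which all values are NaN (keeps r1 and r2)
all_dropped = df.dropna(how='all')
print(all_dropped)
```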
8.3 Use of fillna()
>>> KV3Gwl.fillna({'Projectors':0,'Visualizers':0})
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         0.0          5.0     2
PriLab            2        44         7.0          0.0     2
Others           14       100         5.0          2.0    14
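fillna() also accepts a single value, or a Series holding one fill value per column. The sketch below uses only the Projectors and Visualizers columns from the example above; filling with the column mean is just one possible choice, and the names kv, zero_filled and mean_filled are chosen here for illustration.

```python
import pandas as pd
import numpy as np

labs = ['SrCompLab', 'SecCompLab', 'PriLab', 'Others']
kv = pd.DataFrame({'Projectors': [6, np.nan, 7, 5],
                   'Visualizers': [2, 5, np.nan, 2]}, index=labs)

# A single value fills every NaN in the DataFrame
zero_filled = kv.fillna(0)
print(zero_filled)

# kv.mean() is a Series with one mean per column,
# so each NaN is replaced by the mean of its own column
mean_filled = kv.fillna(kv.mean())
print(mean_filled)
```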
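9. Comparison among Pandas Objects (Series, DataFrame)
We can compare pandas objects using the == operator or the equals() function. The difference between the two is that == compares each element of the first DataFrame with the corresponding element of the second DataFrame and returns a DataFrame of True/False values, in which NaN is never equal to NaN. equals() returns a single True or False for the whole object and treats NaN values in the same position as equal.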
Let's make this clear with the following example
import pandas as pd
import numpy as np
KV_Shift1={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr1=pd.DataFrame(KV_Shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
KV_Shift2={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr2=pd.DataFrame(KV_Shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
print('Data Frame 1 :')
print(dfr1)
print('Data Frame 2 :')
print(dfr2)
print('Checking Equality using == operator')
print(dfr1==dfr2)
print('Checking Equality using equals() function')
print(dfr1.equals(dfr2))
Data Frame 1 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7
Data Frame 2 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7
Checking Equality using == operator
            Computer  Projectors   iPad  AppleTv
SrCompLab       True        True  False     True
SecCompLab      True        True   True     True
PriLab          True       False   True     True
Others          True        True   True     True
Checking Equality using equals() function
True
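The same behaviour can be checked on a small Series. This is a minimal sketch with made-up values; it shows why reducing the result of == with all() does not give the same answer as equals() when NaN is present.

```python
import pandas as pd
import numpy as np

s1 = pd.Series([1.0, np.nan, 3.0])
s2 = pd.Series([1.0, np.nan, 3.0])

# Element-wise comparison: the NaN position compares as False
elementwise = s1 == s2
print(elementwise.tolist())

# Reducing the element-wise result gives False because of the NaN position
all_equal = bool(elementwise.all())
print(all_equal)

# equals() treats NaN values in the same position as equal
same = s1.equals(s2)
print(same)
```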
10. Boolean Reduction :
With Boolean reduction, you can get the overall result for a row or a column as a single True or False. For this purpose pandas offers the following Boolean reduction functions and attributes:
10.1 empty : Tells whether the DataFrame is empty.
10.2 any() : Returns True if any of the elements is True over the requested axis.
10.3 all() : Returns True if all the values on an axis satisfy the condition.
import pandas as pd
import numpy as np
df1=pd.DataFrame()
s1=[[2,5,8],[10,5,2]]
df2=pd.DataFrame(s1)
if df1.empty:
    print('Data Frame1 is Empty')
if df2.empty:
    print('Data Frame2 is Empty')
else:
    print('Data Frame2 is not Empty')
print('Data Frame')
print(df2)
print('Used function all()')
print((df2 ................
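The reductions described in 10.1 to 10.3 can be sketched as follows. The data is the same as in df2, but the condition df2 > 4 is chosen here purely for illustration and is not taken from the original example; the result names are also illustrative.

```python
import pandas as pd

df2 = pd.DataFrame([[2, 5, 8], [10, 5, 2]])

# Element-wise condition gives a DataFrame of True/False values
greater_than_4 = df2 > 4
print(greater_than_4)

# all(): True for a column only if every value in it satisfies the condition
col_all = greater_than_4.all()
print(col_all.tolist())

# any(): True for a column if at least one value in it satisfies the condition
col_any = greater_than_4.any()
print(col_any.tolist())

# axis=1 reduces along each row instead of each column
row_all = greater_than_4.all(axis=1)
print(row_all.tolist())

# empty is an attribute, not a function, so it is used without brackets
is_empty = pd.DataFrame().empty
print(is_empty)
```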