Iterating over a dataframe

Class: XII

Class Notes Date: 09-06-2021

Subject: Informatics Practices

Topic: Chapter-2 Python Pandas - II

Iterating over a dataframe:

There are many ways to iterate over a dataframe. The most common are:

1. DataFrame.iterrows(): The iterrows() method views a dataframe as horizontal subsets, i.e. row by row.

For example, we create a dataframe using the following code:

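    import pandas as pd

    # Named 'data' rather than 'dict' so the built-in dict is not shadowed.
    data = {'2019': {'Q1': 125, 'Q2': 230, 'Q3': 275, 'Q4': 320},
            '2020': {'Q1': 105, 'Q2': 130, 'Q3': 145, 'Q4': 210},
            '2021': {'Q1': 195, 'Q2': 290, 'Q3': 105, 'Q4': 120}}
    df = pd.DataFrame(data)
    print(df)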

The output is:

        2019  2020  2021
    Q1   125   105   195
    Q2   230   130   290
    Q3   275   145   105
    Q4   320   210   120

To view the dataframe row by row, add the following code to the previous code:

    for (c, values) in df.iterrows():
        print('RowIndex:', c)
        print('Contains')
        print('\nValues:', '\n', values)

This iterates over the dataframe df row by row. The output will look like:

    RowIndex: Q1
    Contains

    Values:
    2019    125
    2020    105
    2021    195
    Name: Q1, dtype: int64
    RowIndex: Q2
    Contains

    Values:
    2019    230
    2020    130
    2021    290
    Name: Q2, dtype: int64
    RowIndex: Q3
    Contains

    Values:
    2019    275
    2020    145
    2021    105
    Name: Q3, dtype: int64
    RowIndex: Q4
    Contains

    Values:
    2019    320
    2020    210
    2021    120
    Name: Q4, dtype: int64

The iterrows() function is used to iterate over DataFrame rows as (index, Series) pairs. The syntax is:

    DataFrame.iterrows()

Important points about DataFrame.iterrows():

• It does not preserve data types: iterrows() returns the contents of each row as a single Series, so the dtypes of the individual values in the row are not preserved. For example, an integer value in a row that also holds a float is returned as a float.

• We cannot modify the dataframe by changing a row while iterating with iterrows(). The iterator generally returns a copy rather than a view, so any modification made to the returned row contents has no effect on the actual dataframe.
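The two points above can be checked with a short sketch. The frame below and its column names (qty, price) are made up for illustration and are not part of the original notes; it mixes an integer column with a float column so that the dtype change is visible.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
df = pd.DataFrame({'qty': [1, 2, 3], 'price': [1.5, 2.5, 3.5]},
                  index=['a', 'b', 'c'])

# Point 1: each row comes back as one Series, so the integer value
# from 'qty' is upcast to float to share a dtype with 'price'.
first_index, first_row = next(df.iterrows())
row_dtype = first_row.dtype          # float64, although df['qty'] is int64

# Point 2: changing the row inside the loop does not change df.
for index, row in df.iterrows():
    row['qty'] = 0                   # modifies the row copy only

qty_after_loop = df['qty'].tolist()  # still [1, 2, 3]

# To update the dataframe itself, write through df.loc instead.
for index, row in df.iterrows():
    df.loc[index, 'qty'] = int(row['qty']) * 10

qty_updated = df['qty'].tolist()     # [10, 20, 30]
```

Writing through df.loc is only one possible approach; for whole-column changes, an expression such as df['qty'] * 10 is usually simpler than any loop.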

2. DataFrame.iteritems(): The DataFrame class provides a member function iteritems(), which yields an iterator that can be used to iterate over all the columns of a dataframe. For each column in the dataframe, it returns a tuple containing the column name and the column contents as a Series.

Note: iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0. On newer versions, use DataFrame.items(), which works the same way.

Example:

    import pandas as pd

    # Named 'data' rather than 'dict' so the built-in dict is not shadowed.
    data = {'2019': {'Q1': 125, 'Q2': 230, 'Q3': 275, 'Q4': 320},
            '2020': {'Q1': 105, 'Q2': 130, 'Q3': 145, 'Q4': 210},
            '2021': {'Q1': 195, 'Q2': 290, 'Q3': 105, 'Q4': 120}}
    df = pd.DataFrame(data)
    print(df)

    # On pandas 2.0 and later, replace iteritems() with items().
    for (c, values) in df.iteritems():
        print('ColumnName:', c)
        print('Contains')
        print('\nValues:', '\n', values)

The output is:

        2019  2020  2021
    Q1   125   105   195
    Q2   230   130   290
    Q3   275   145   105
    Q4   320   210   120
    ColumnName: 2019
    Contains

    Values:
    Q1    125
    Q2    230
    Q3    275
    Q4    320
    Name: 2019, dtype: int64
    ColumnName: 2020
    Contains

    Values:
    Q1    105
    Q2    130
    Q3    145
    Q4    210
    Name: 2020, dtype: int64
    ColumnName: 2021
    Contains

    Values:
    Q1    195
    Q2    290
    Q3    105
    Q4    120
    Name: 2021, dtype: int64
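A column loop is often used to compute something per column rather than only print it. The sketch below totals each year's column of the same data. It uses DataFrame.items(), which yields the same (column name, Series) pairs as iteritems() and is the spelling available on pandas 2.0 and later. The name yearly_totals is introduced here for illustration.

```python
import pandas as pd

data = {'2019': {'Q1': 125, 'Q2': 230, 'Q3': 275, 'Q4': 320},
        '2020': {'Q1': 105, 'Q2': 130, 'Q3': 145, 'Q4': 210},
        '2021': {'Q1': 195, 'Q2': 290, 'Q3': 105, 'Q4': 120}}
df = pd.DataFrame(data)

# Each iteration yields the column label and the column as a Series.
yearly_totals = {}
for year, column in df.items():
    yearly_totals[year] = int(column.sum())

print(yearly_totals)  # {'2019': 950, '2020': 590, '2021': 710}
```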

Assignment Questions:

1. Create a DataFrame and display its rows one at a time.

2. Create a DataFrame and display its columns one at a time.

Binary Operations in a DataFrame:

A binary operation is an operation that requires two values (operands). On dataframes, these values are picked element-wise.

In binary operations, the data from the two DataFrames are aligned on the basis of their row and column indexes. For each matching row and column index, the given operation is performed; for non-matching row or column indexes, NaN is stored.

DF

       0  1  2
    0  1  2  3
    1  4  5  6
    2  7  8  9

DF1

        0   1   2
    0  10  20  30
    1  40  50  60
    2  70  80  90

DF2

        0   1   2
    0  11  12  13
    1  14  15  16

Binary Operations using +, add() and radd()

DF+DF1

        0   1   2
    0  11  22  33
    1  44  55  66
    2  77  88  99

>>> df.add(df1)

        0   1   2
    0  11  22  33
    1  44  55  66
    2  77  88  99

DF+DF2

          0     1     2
    0  12.0  14.0  16.0
    1  18.0  20.0  22.0
    2   NaN   NaN   NaN

>>> df.radd(df1)

        0   1   2
    0  11  22  33
    1  44  55  66
    2  77  88  99
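The results above can be reproduced with a minimal sketch. The frames are rebuilt here from the DF, DF1 and DF2 tables; the variable names same_shape, add_matches, radd_matches and missing_row are introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df1 = pd.DataFrame([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
df2 = pd.DataFrame([[11, 12, 13], [14, 15, 16]])  # only rows 0 and 1

# All row and column labels match, so every cell is added.
same_shape = df + df1
add_matches = same_shape.equals(df.add(df1))    # add() behaves like +
radd_matches = same_shape.equals(df.radd(df1))  # radd() computes df1 + df

# Row 2 exists only in df, so that row of the result is NaN,
# and the whole result becomes float.
missing_row = df + df2
print(missing_row)
```

Because addition gives the same answer in either order, add() and radd() agree here. The reversed form matters for operations where order changes the result, such as subtraction.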

The same can be practiced for the subtraction, multiplication and division operations.
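As a starting point for that practice, here is a short sketch using the pandas methods sub(), rsub(), mul() and div() on the DF and DF1 values from above. The result variable names are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df1 = pd.DataFrame([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

difference = df1.sub(df)      # same as df1 - df
reverse_diff = df1.rsub(df)   # same as df - df1 (operands swapped)
product = df.mul(df1)         # same as df * df1
quotient = df1.div(df)        # same as df1 / df; the result is float

print(difference)
print(reverse_diff)
print(product)
print(quotient)
```

Unlike addition, subtraction depends on operand order, so sub() and rsub() give results with opposite signs.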
