WORKSHEET Data Handling Using Pandas




1

What will be the output of the following code?

import pandas as pd

s1=pd.Series([1,2,2,7,'Sachin',77.5])

print(s1.head())

print(s1.head(3))

Ans:

0         1
1         2
2         2
3         7
4    Sachin
dtype: object
0    1
1    2
2    2
dtype: object
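The answer above follows from the default argument of head(): with no argument it returns the first five elements, and head(3) returns the first three. A minimal sketch of the same calls, where the variable names first_five and first_three are introduced only for illustration:

```python
import pandas as pd

# Same data as the question: mixing numbers and a string gives dtype 'object'
s1 = pd.Series([1, 2, 2, 7, 'Sachin', 77.5])

first_five = s1.head()     # no argument: the first 5 elements
first_three = s1.head(3)   # explicit n: the first 3 elements

print(first_five)
print(first_three)
```

The sixth element, 77.5, does not appear in either result because it lies beyond the first five positions.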

2

Write a program in Python to find the index of the maximum value in each column of a DataFrame.

Ans:

# importing pandas as pd

import pandas as pd

# Creating the dataframe

df = pd.DataFrame({"A":[4, 5, 2, 6],

"B":[11, 2, 5, 8],

"C":[1, 8, 66, 4]})

# Print the dataframe

print(df)

# Applying the idxmax() function: for each column, it returns the
# row index at which the maximum value occurs

print(df.idxmax(axis = 0))
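It may help to compare idxmax() with max() on the same data: max() returns the largest value in each column, while idxmax() returns the row label at which that value occurs. A short sketch, with col_max and col_max_pos as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 2, 6],
                   "B": [11, 2, 5, 8],
                   "C": [1, 8, 66, 4]})

col_max = df.max(axis=0)         # largest value in each column
col_max_pos = df.idxmax(axis=0)  # row label where that largest value occurs

print(col_max)
print(col_max_pos)
```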

3

What is the purpose of each of the following statements?

1. df.columns

2. df.iloc[ : , :-5]

3. df[2:8]

4. df[ :]

5. df.iloc[ : -4 , : ]

Ans:

1. It displays the names of columns of the Dataframe.

2. It will display all columns except the last 5 columns.

3. It displays all columns for the rows at positions 2 to 7.

4. It displays the entire DataFrame, with all rows and columns.

5. It displays all rows except the last four rows.

4

Write a Python program to sort the following data in ascending order of Age.

Name      Age   Designation
Sanjeev   37    Manager
Keshav    42    Clerk
Rahul     38    Accountant

Ans:

import pandas as pd

name=pd.Series(['Sanjeev','Keshav','Rahul'])

age=pd.Series([37,42,38])

designation=pd.Series(['Manager','Clerk','Accountant'])

d1={'Name':name,'Age':age,'Designation':designation}

df=pd.DataFrame(d1)

print(df)

df1=df.sort_values(by='Age')

print(df1)

5

Write a Python program to sort the following data in descending order of Name.

Name      Age   Designation
Sanjeev   37    Manager
Keshav    42    Clerk
Rahul     38    Accountant

Ans:

import pandas as pd

name=pd.Series(['Sanjeev','Keshav','Rahul'])

age=pd.Series([37,42,38])

designation=pd.Series(['Manager','Clerk','Accountant'])

d1={'Name':name,'Age':age,'Designation':designation}

df=pd.DataFrame(d1)

print(df)



df2=df.sort_values(by='Name',ascending=False)

print(df2)
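The two sorting answers differ only in the by and ascending arguments of sort_values(). The sketch below runs both on the same table and also shows that the sorted result keeps the original row labels; the call to reset_index(drop=True) is an optional extra, not part of the worksheet answer.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Sanjeev', 'Keshav', 'Rahul'],
                   'Age': [37, 42, 38],
                   'Designation': ['Manager', 'Clerk', 'Accountant']})

by_age = df.sort_values(by='Age')                          # ascending is the default
by_name_desc = df.sort_values(by='Name', ascending=False)  # Z to A

# sort_values() returns a new DataFrame and keeps the old row labels;
# reset_index(drop=True) renumbers the rows from 0 if that is wanted
by_age_renumbered = by_age.reset_index(drop=True)

print(by_age)
print(by_name_desc)
```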

6

Which of the following can be data in Pandas?

1. A Python dictionary

2. An ndarray

3. A scalar value

4. All of the above

Ans:

4. All of the above

7

All pandas data structures are ___________ mutable but not always ________ mutable.

1. Size, value

2. Semantic, size

3. Value, size

4. None of the above

Ans:

3. Value, size
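The two terms in the answer can be illustrated as follows: "value-mutable" means that existing elements can be changed, and "size-mutable" means that the structure can grow, for example by inserting a column into a DataFrame. This is only a small sketch with made-up data:

```python
import pandas as pd

s = pd.Series([10, 20, 30])
s[1] = 99              # value mutation: the element with label 1 is replaced

df = pd.DataFrame({'A': [1, 2, 3]})
df['B'] = [4, 5, 6]    # size mutation: a new column is inserted into the DataFrame

print(s)
print(df)
```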

8

Data and index in an ndarray must be of the same length.

1. True

2. False

Ans:

1. True
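The rule can be checked directly: building a Series from an ndarray with an index of a different length raises a ValueError. A minimal sketch with illustrative data and names:

```python
import numpy as np
import pandas as pd

data = np.array([10, 20, 30])

ok = pd.Series(data, index=['a', 'b', 'c'])   # 3 values, 3 labels: accepted

try:
    pd.Series(data, index=['a', 'b'])         # 3 values, 2 labels: rejected
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(mismatch_error)
```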

9

What is the output of the following program?

import pandas as pd

df=pd.DataFrame(index=[0,1,2,3,4,5],columns=['one','two'])

print df['one'].sum()

Ans:

It will produce an error. (In Python 3, print is a function, so the statement print df['one'].sum() without parentheses raises a SyntaxError.)

10

What will be the output of the following code?

Users.groupby('occupation').age.mean()

1. Get mean age of occupation

2. Groups users by mean age

3. Groups user by age and occupation

4. None

Ans:

1. Get mean age of occupation

11

Which object do you get after reading a CSV file using pandas.read_csv()?

1. Dataframe

2. ndarray

3. Char Vector



4. None

Ans:

1. Dataframe
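This can be confirmed without a file on disk by reading CSV text from memory. In the sketch below the CSV content is made up, and io.StringIO stands in for a file path:

```python
import io
import pandas as pd

# Hypothetical CSV content, held in memory so that no file is needed
csv_text = "Name,Age\nSanjeev,37\nKeshav,42\n"

df = pd.read_csv(io.StringIO(csv_text))

print(type(df))   # the type returned by read_csv
print(df)
```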

12

What will be the output of df.iloc[3:7,3:6]?

Ans:

It will display the rows at positions 3 to 6 and the columns at positions 3 to 5 of the DataFrame 'df'.

13

How do you select the rows where age is missing?

1. df[df['age'].isnull]

2. df[df['age']==NaN]

3. df[df['age']==0]

4. None

Ans:

4. None, as the right answer is df[df['age'].isnull()]

14

Consider the following record in dataframe IPL

Player           Team                    Category   BidPrice   Runs
Hardik Pandya    Mumbai Indians          Batsman    13         1000
KL Rahul         Kings Eleven            Batsman    12         2400
Andre Russel     Kolkata Knight riders   Batsman    7          900
Jasprit Bumrah   Mumbai Indians          Bowler     10         200
Virat Kohli      RCB                     Batsman    17         3600
Rohit Sharma     Mumbai Indians          Batsman    15         3700

Retrieve the first 2 and last 3 rows using a Python program.

Ans:

import pandas as pd

# Column names match the table above ('BidPrice' is used in the later questions)
d={'Player':['Hardik Pandya','KL Rahul','Andre Russel','Jasprit Bumrah','Virat Kohli','Rohit Sharma'],
'Team':['Mumbai Indians','Kings Eleven','Kolkata Knight Riders','Mumbai Indians','RCB','Mumbai Indians'],
'Category':['Batsman','Batsman','Batsman','Bowler','Batsman','Batsman'],
'BidPrice':[13,12,7,10,17,15],
'Runs':[1000,2400,900,200,3600,3700]}

df=pd.DataFrame(d)

print(df)

# First 2 rows, all columns
print(df.iloc[:2,:])

# Last 3 rows, all columns
print(df.iloc[-3:,:])
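The iloc slices used in this answer, and in the earlier question on slicing statements, select rows by position; a negative bound counts from the end. The sketch below uses two columns of the IPL table and checks that head() and tail() give the same selections; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Player': ['Hardik Pandya', 'KL Rahul', 'Andre Russel',
                              'Jasprit Bumrah', 'Virat Kohli', 'Rohit Sharma'],
                   'Runs': [1000, 2400, 900, 200, 3600, 3700]})

first_two = df.iloc[:2, :]            # rows at positions 0 and 1, all columns
last_three = df.iloc[-3:, :]          # the last three rows, all columns
all_but_last_four = df.iloc[:-4, :]   # negative stop: everything except the last 4 rows

# head() and tail() are shorthand for the first two selections
same_as_head = first_two.equals(df.head(2))
same_as_tail = last_three.equals(df.tail(3))

print(first_two)
print(last_three)
```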

15

Write a command to find the most expensive player.

Ans:

print(df[df['BidPrice']==df['BidPrice'].max()])

16

Write a command to print the total number of players per team.



Ans:

print(df.groupby('Team').Player.count())

17

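Write a command to find the player who had the highest BidPrice in each team.

Ans:

val=df.groupby('Team')

# idxmax() gives the row label of the highest BidPrice within each team;
# taking max() of Player and BidPrice separately would not keep them paired

print(df.loc[val['BidPrice'].idxmax(),['Team','Player','BidPrice']])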

18

Write a command to find the average runs of each team.

Ans:

print(df.groupby(['Team']).Runs.mean())
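The per-team questions all follow the same pattern: group the rows by Team, pick a column, and apply an aggregate. A self-contained sketch using three columns of the IPL table, with players_per_team and mean_runs as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Player': ['Hardik Pandya', 'KL Rahul', 'Andre Russel',
               'Jasprit Bumrah', 'Virat Kohli', 'Rohit Sharma'],
    'Team': ['Mumbai Indians', 'Kings Eleven', 'Kolkata Knight Riders',
             'Mumbai Indians', 'RCB', 'Mumbai Indians'],
    'Runs': [1000, 2400, 900, 200, 3600, 3700]})

grouped = df.groupby('Team')

players_per_team = grouped['Player'].count()  # number of players in each team
mean_runs = grouped['Runs'].mean()            # average runs in each team

print(players_per_team)
print(mean_runs)
```

Both results are Series indexed by team name, so a single team can be looked up with, for example, mean_runs['RCB'].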

19

Write a command to sort all players according to BidPrice.

Ans:

print(df.sort_values(by='BidPrice'))

20

We need to define an index in pandas.

1. True

2. False

Ans:

2. False

21

Who is a data scientist?

1. Mathematician

2. Statistician

3. Software Programmer

4. All of the above

Ans:

4. All of the above

22

What is the built-in database used for Python?

1. Mysql

2. Pysqlite

3. Sqlite3

4. Pysqln

Ans:

3. Sqlite3

23

How can you drop columns in Python that contain NaN?

Ans:

df1.dropna(axis=1)
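The sketch below applies dropna(axis=1) to a small made-up DataFrame and, for comparison, repeats the earlier row selection with isnull(). The data and the variable names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'age' has one missing value, 'name' has none
df1 = pd.DataFrame({'name': ['A', 'B', 'C'],
                    'age': [25, np.nan, 31]})

no_nan_cols = df1.dropna(axis=1)           # drops every column that contains a NaN
missing_age = df1[df1['age'].isnull()]     # rows where age is missing
via_equality = df1[df1['age'] == np.nan]   # always empty: NaN never compares equal

print(no_nan_cols)
print(missing_age)
```

dropna() returns a new DataFrame and leaves df1 unchanged, so the result has to be assigned if it is needed later.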
