WORKSHEET Data Handling Using Pandas


1 What will be the output of the following code?

    import pandas as pd
    s1 = pd.Series([1, 2, 2, 7, 'Sachin', 77.5])
    print(s1.head())
    print(s1.head(3))

Ans:

    0         1
    1         2
    2         2
    3         7
    4    Sachin
    dtype: object
    0    1
    1    2
    2    2
    dtype: object

2 Write a program in Python to find the index of the maximum value along the index axis of a DataFrame.

Ans:

    # importing pandas as pd
    import pandas as pd

    # Creating the dataframe
    df = pd.DataFrame({"A": [4, 5, 2, 6],
                       "B": [11, 2, 5, 8],
                       "C": [1, 8, 66, 4]})

    # Print the dataframe
    print(df)

    # Applying idxmax(): for each column, returns the row index of its maximum value
    print(df.idxmax(axis=0))
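Since idxmax() returns row labels rather than the values themselves, it may help to compare it with max() on the same data. A minimal sketch (the variable names max_labels and max_values are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 2, 6],
                   "B": [11, 2, 5, 8],
                   "C": [1, 8, 66, 4]})

# idxmax(axis=0): for each column, the row label where the largest value sits.
max_labels = df.idxmax(axis=0)
# max(axis=0): for each column, the largest value itself.
max_values = df.max(axis=0)

print(max_labels)
print(max_values)
```

Here column A has its maximum (6) at row 3, column B (11) at row 0 and column C (66) at row 2.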

3 What is the purpose of the following statements?

1. df.columns
2. df.iloc[ : , :-5]
3. df[2:8]
4. df[ :]
5. df.iloc[ : -4 , : ]

Ans:

1. It displays the names of the columns of the DataFrame.
2. It will display all columns except the last 5 columns.


3. It displays all columns for the rows at positions 2 to 7.
4. It will display the entire DataFrame with all rows and columns.
5. It will display all rows except the last 4 rows.
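The slices in question 3 can be tried on a small frame. This is only a sketch: the 10 x 8 shape and the column names c0 to c7 are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-row, 8-column frame; names c0..c7 are assumptions.
df = pd.DataFrame(np.arange(80).reshape(10, 8),
                  columns=[f"c{i}" for i in range(8)])

col_names = list(df.columns)            # names of all columns
all_but_last_5_cols = df.iloc[:, :-5]   # keeps c0, c1, c2
rows_2_to_7 = df[2:8]                   # row slice, end position excluded
everything = df[:]                      # all rows and columns
all_but_last_4_rows = df.iloc[:-4, :]   # keeps rows 0 to 5

print(all_but_last_5_cols.shape, rows_2_to_7.shape, all_but_last_4_rows.shape)
```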

4 Write a Python program to sort the following data in ascending order of Age.

    Name      Age   Designation
    Sanjeev   37    Manager
    Keshav    42    Clerk
    Rahul     38    Accountant

Ans:

    import pandas as pd
    name = pd.Series(['Sanjeev', 'Keshav', 'Rahul'])
    age = pd.Series([37, 42, 38])
    designation = pd.Series(['Manager', 'Clerk', 'Accountant'])
    d1 = {'Name': name, 'Age': age, 'Designation': designation}
    df = pd.DataFrame(d1)
    print(df)
    df1 = df.sort_values(by='Age')
    print(df1)

5 Write a Python program to sort the following data in descending order of Name.

    Name      Age   Designation
    Sanjeev   37    Manager
    Keshav    42    Clerk
    Rahul     38    Accountant

Ans:

    import pandas as pd
    name = pd.Series(['Sanjeev', 'Keshav', 'Rahul'])
    age = pd.Series([37, 42, 38])
    designation = pd.Series(['Manager', 'Clerk', 'Accountant'])
    d1 = {'Name': name, 'Age': age, 'Designation': designation}
    df = pd.DataFrame(d1)
    print(df)
    # ascending=False sorts from Z to A
    df2 = df.sort_values(by='Name', ascending=False)
    print(df2)

6 Which of the following can be data in Pandas?

1. A Python dictionary
2. An ndarray
3. A scalar value
4. All of the above

Ans: 4. All of the above
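As a quick check of the three kinds of input, a Series can be built from each of them. A minimal sketch with made-up values and labels:

```python
import numpy as np
import pandas as pd

from_dict = pd.Series({"a": 1, "b": 2})          # dictionary keys become the index
from_array = pd.Series(np.array([10, 20, 30]))   # default index 0, 1, 2
from_scalar = pd.Series(5, index=["x", "y"])     # scalar repeated for each label

print(from_dict)
print(from_array)
print(from_scalar)
```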

7 All pandas data structures are ___________-mutable but not always ________-mutable.

1. Size, value
2. Semantic, size
3. Value, size
4. None of the above

Ans: 3. Value, size

8 When a Series is created from an ndarray, the data and the index must be of the same length.

1. True
2. False

Ans: 1. True
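The length rule can be seen by passing an index that is too short. A small sketch, with illustrative values and a hypothetical flag name mismatch_raised:

```python
import numpy as np
import pandas as pd

data = np.array([1, 2, 3])

# Three values with three labels: accepted.
ok = pd.Series(data, index=["a", "b", "c"])

# Three values with two labels: pandas raises a ValueError.
try:
    pd.Series(data, index=["a", "b"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
```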

9 What is the output of the following program?

    import pandas as pd
    df = pd.DataFrame(index=[0, 1, 2, 3, 4, 5], columns=['one', 'two'])
    print df['one'].sum()

Ans: It will produce an error. In Python 3, print is a function, so print df['one'].sum() written without parentheses raises a SyntaxError.

10 What will be the output of the following code?

    Users.groupby('occupation').age.mean()

1. Get mean age of occupation
2. Groups users by mean age
3. Groups user by age and occupation
4. None

Ans: 1. Get mean age of occupation
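To see the grouping in action, the same call can be run on a tiny table. The rows below are invented for illustration; only the column names occupation and age come from the question.

```python
import pandas as pd

# Hypothetical data; the worksheet does not give the contents of Users.
users = pd.DataFrame({
    "occupation": ["engineer", "doctor", "engineer", "doctor"],
    "age": [30, 40, 50, 60],
})

# One mean age per occupation; the occupation names become the index.
mean_age = users.groupby("occupation").age.mean()
print(mean_age)
```

With this data the result holds 50.0 for doctor and 40.0 for engineer.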

11 Which object do you get after reading a CSV file using pandas.read_csv()?

1. Dataframe
2. Nd array
3. Char Vector
4. None

Ans: 1. Dataframe

12 What will be the output of df.iloc[3:7,3:6]?

Ans: It will display the rows at positions 3 to 6 and the columns at positions 3 to 5 of the dataframe df.

13 How to select the rows where age is missing?

1. df[df['age'].isnull]
2. df[df['age']==NaN]
3. df[df['age']==0]
4. None

Ans: 4. None, as the right answer is df[df['age'].isnull()]
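Questions 12 and 13 can both be checked on small frames. This is a sketch under assumptions: the 8 x 7 frame, the column letters and the names in people are made up.

```python
import numpy as np
import pandas as pd

# Question 12: positional block selection (end positions are excluded).
df = pd.DataFrame(np.arange(56).reshape(8, 7), columns=list("abcdefg"))
block = df.iloc[3:7, 3:6]   # rows 3 to 6, columns d, e, f

# Question 13: selecting rows with a missing age.
people = pd.DataFrame({"name": ["Asha", "Ravi", "Meena"],
                       "age": [25, np.nan, 31]})
missing_age = people[people["age"].isnull()]

# Comparing with NaN using == never matches, because NaN is not equal to itself.
eq_nan = people["age"] == np.nan

print(block)
print(missing_age)
```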

14 Consider the following records in dataframe IPL.

    Player           Team                    Category   BidPrice   Runs
    Hardik Pandya    Mumbai Indians          Batsman    13         1000
    KL Rahul         Kings Eleven            Batsman    12         2400
    Andre Russel     Kolkata Knight Riders   Batsman    7          900
    Jasprit Bumrah   Mumbai Indians          Bowler     10         200
    Virat Kohli      RCB                     Batsman    17         3600
    Rohit Sharma     Mumbai Indians          Batsman    15         3700

Retrieve the first 2 and last 3 rows using a Python program.

Ans:

    import pandas as pd

    d = {'Player': ['Hardik Pandya', 'KL Rahul', 'Andre Russel',
                    'Jasprit Bumrah', 'Virat Kohli', 'Rohit Sharma'],
         'Team': ['Mumbai Indians', 'Kings Eleven', 'Kolkata Knight Riders',
                  'Mumbai Indians', 'RCB', 'Mumbai Indians'],
         'Category': ['Batsman', 'Batsman', 'Batsman', 'Bowler', 'Batsman', 'Batsman'],
         'BidPrice': [13, 12, 7, 10, 17, 15],
         'Runs': [1000, 2400, 900, 200, 3600, 3700]}
    df = pd.DataFrame(d)
    print(df)
    print(df.iloc[:2, :])    # first 2 rows
    print(df.iloc[-3:, :])   # last 3 rows

15 Write a command to find the most expensive player.

Ans: print(df[df['BidPrice']==df['BidPrice'].max()])

16 Write a command to print total players per team.


Ans: print(df.groupby('Team').Player.count())

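17 Write a command to find the player who had the highest BidPrice from each team.

Ans:

    # idxmax() gives, for each team, the row label of its highest BidPrice
    idx = df.groupby('Team')['BidPrice'].idxmax()
    print(df.loc[idx, ['Team', 'Player', 'BidPrice']])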

18 Write a command to find the average runs of each team.

Ans: print(df.groupby(['Team']).Runs.mean())

19 Write a command to sort all players according to BidPrice.

Ans: print(df.sort_values(by='BidPrice'))
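The commands for questions 15 to 19 can be run together on the IPL data from question 14. This is a sketch rather than the only way to answer them; for question 17 it uses idxmax() to pick each team's top row, and the result names (most_expensive, top_per_team and so on) are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Player": ["Hardik Pandya", "KL Rahul", "Andre Russel",
               "Jasprit Bumrah", "Virat Kohli", "Rohit Sharma"],
    "Team": ["Mumbai Indians", "Kings Eleven", "Kolkata Knight Riders",
             "Mumbai Indians", "RCB", "Mumbai Indians"],
    "BidPrice": [13, 12, 7, 10, 17, 15],
    "Runs": [1000, 2400, 900, 200, 3600, 3700],
})

# Q15: row(s) whose BidPrice equals the overall maximum
most_expensive = df[df["BidPrice"] == df["BidPrice"].max()]
# Q16: number of players in each team
players_per_team = df.groupby("Team").Player.count()
# Q17: for each team, the row holding its highest BidPrice
top_rows = df.groupby("Team")["BidPrice"].idxmax()
top_per_team = df.loc[top_rows, ["Team", "Player", "BidPrice"]]
# Q18: average runs in each team
avg_runs = df.groupby("Team").Runs.mean()
# Q19: all players ordered by BidPrice, lowest first
by_bid = df.sort_values(by="BidPrice")

print(most_expensive)
print(top_per_team)
```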

20 We need to define an index in pandas.

1. True
2. False

Ans: 2. False

21 Who is a data scientist?

1. Mathematician
2. Statistician
3. Software Programmer
4. All of the above

Ans: 4. All of the above

22 What is the built-in database used for Python?

1. Mysql
2. Pysqlite
3. Sqlite3
4. Pysqln

Ans: 3. Sqlite3

23 How can you drop columns in Python that contain NaN?

Ans: df1.dropna(axis=1)
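A short sketch of the call on made-up data (the column names a, b, c are assumptions). Note that dropna() returns a new DataFrame and leaves df1 unchanged.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3],
                    "b": [4.0, np.nan, 6.0],   # contains a missing value
                    "c": ["x", "y", "z"]})

# axis=1 drops every column that has at least one NaN.
cleaned = df1.dropna(axis=1)

print(cleaned)
```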
