WORKSHEET Data Handling Using Pandas

1
What will be the output of the following code?
import pandas as pd
s1=pd.Series([1,2,2,7,'Sachin',77.5])
print(s1.head())
print(s1.head(3))
Ans:
0         1
1         2
2         2
3         7
4    Sachin
dtype: object
0    1
1    2
2    2
dtype: object
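A short sketch of why the two calls differ: head() with no argument returns the first five elements, head(3) returns the first three, and the mix of numbers and a string makes the dtype object. The variable names first_five and first_three are illustrative only.

```python
import pandas as pd

s1 = pd.Series([1, 2, 2, 7, 'Sachin', 77.5])

first_five = s1.head()    # default n=5, so rows 0 to 4
first_three = s1.head(3)  # explicit n=3, so rows 0 to 2

print(len(first_five), len(first_three))
print(s1.dtype)  # object, because the values mix numbers and a string
```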
2
Write a program in Python to find the index of the maximum value in each column of a DataFrame.
Ans:
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({"A":[4, 5, 2, 6],
"B":[11, 2, 5, 8],
"C":[1, 8, 66, 4]})
# Print the dataframe
print(df)
# Applying idxmax() along axis 0: for each column, the row index of its maximum value.
print(df.idxmax(axis = 0))
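For comparison, here is a small sketch using the same data that contrasts axis=0 (one result per column) with axis=1 (one result per row). The names col_max and row_max are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5, 2, 6],
                   "B": [11, 2, 5, 8],
                   "C": [1, 8, 66, 4]})

# axis=0: for each column, the row label where the maximum occurs
col_max = df.idxmax(axis=0)

# axis=1: for each row, the column label where the maximum occurs
row_max = df.idxmax(axis=1)

print(col_max)
print(row_max)
```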
3
What are the purposes of the following statements?
1. df.columns
2. df.iloc[ : , :-5]
3. df[2:8]
4. df[ :]
5. df.iloc[ : -4 , : ]
Ans:
1. It displays the names of the columns of the DataFrame.
2. It displays all columns except the last 5 columns.
3. It displays all columns for the rows at positions 2 to 7.
4. It displays the entire DataFrame with all rows and columns.
5. It displays all rows except the last 4 rows.
4
Write a Python program to sort the following data in ascending order of Age.
Name      Age   Designation
Sanjeev   37    Manager
Keshav    42    Clerk
Rahul     38    Accountant
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df1=df.sort_values(by='Age')
print(df1)
5
Write a Python program to sort the following data in descending order of Name.
Name      Age   Designation
Sanjeev   37    Manager
Keshav    42    Clerk
Rahul     38    Accountant
Ans:
import pandas as pd
name=pd.Series(['Sanjeev','Keshav','Rahul'])
age=pd.Series([37,42,38])
designation=pd.Series(['Manager','Clerk','Accountant'])
d1={'Name':name,'Age':age,'Designation':designation}
df=pd.DataFrame(d1)
print(df)
df2=df.sort_values(by='Name',ascending=False)
print(df2)
6
Which of the following things can be data in Pandas?
1. A Python dictionary
2. An ndarray
3. A scalar value
4. All of the above
Ans:
4. All of the above
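A minimal sketch of the three kinds of input, each passed to pd.Series. The sample values and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# From a Python dictionary: the keys become the index labels
from_dict = pd.Series({'a': 1, 'b': 2})

# From a NumPy ndarray: a default 0..n-1 index is created
from_array = pd.Series(np.array([10, 20, 30]))

# From a scalar: the value is repeated for every index label supplied
from_scalar = pd.Series(5, index=['x', 'y', 'z'])

print(from_dict, from_array, from_scalar, sep='\n')
```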
7
All pandas data structures are ___________-mutable but not always
________-mutable.
1. Size, value
2. Semantic, size
3. Value, size
4. None of the above
Ans:
3. Value, size
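As a rough illustration, the sketch below changes a value inside a Series (value mutability) and inserts a column into a DataFrame (an example of size mutability). The sample data is hypothetical. The pandas documentation has described the length of a Series as not changeable in place, which is why the statement says "not always".

```python
import pandas as pd

# Value-mutable: an element is changed in place, the length stays 3
s = pd.Series([1, 2, 3])
s.iloc[0] = 100

# Size-mutable: a column is inserted into the existing DataFrame
df = pd.DataFrame({'A': [1, 2, 3]})
df['B'] = [4, 5, 6]

print(s)
print(df)
```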
8
If the data is an ndarray, the index must be of the same length as the data.
1. True
2. False
Ans:
1. True
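A small sketch of this rule, with made-up values: a matching index works, while an index of a different length raises ValueError.

```python
import numpy as np
import pandas as pd

data = np.array([1, 2, 3])

# Index length matches the data length: this works
ok = pd.Series(data, index=['a', 'b', 'c'])

# Index length (2) differs from the data length (3): pandas raises ValueError
try:
    pd.Series(data, index=['a', 'b'])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(ok)
print(mismatch_error)
```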
9
What is the output of the following program?
import pandas as pd
df=pd.DataFrame(index=[0,1,2,3,4,5],columns=['one','two'])
print df['one'].sum()
Ans:
It will produce an error: print df['one'].sum() uses the Python 2 print statement, which is a syntax error in Python 3.
10
What will be the output of the following code:
Users.groupby('occupation').age.mean()
1. Get mean age of occupation
2. Groups users by mean age
3. Groups user by age and occupation
4. None
Ans:
1. Get mean age of occupation
11
Which object do you get after reading a CSV file using pandas.read_csv()?
1. DataFrame
2. ndarray
3. Char Vector
4. None
Ans:
1. DataFrame
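A quick way to check this without a file on disk is to read CSV text from an in-memory buffer. The CSV content below is invented for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content held in a string instead of a file
csv_text = "name,age\nSanjeev,37\nKeshav,42\n"

df = pd.read_csv(io.StringIO(csv_text))

print(type(df))  # the DataFrame class
print(df)
```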
12
What will be the output of df.iloc[3:7,3:6]?
Ans:
It will display the rows at positions 3 to 6 and the columns at positions 3 to 5 of the DataFrame 'df'.
13
How to select the rows where age is missing?
1. df[df['age'].isnull]
2. df[df['age']==NaN]
3. df[df['age']==0]
4. None
Ans:
4. None, as the right answer is df[df['age'].isnull()]
14
Consider the following records in DataFrame IPL:
Player          Team                   Category  BidPrice  Runs
Hardik Pandya   Mumbai Indians         Batsman   13        1000
KL Rahul        Kings Eleven           Batsman   12        2400
Andre Russel    Kolkata Knight Riders  Batsman   7         900
Jasprit Bumrah  Mumbai Indians         Bowler    10        200
Virat Kohli     RCB                    Batsman   17        3600
Rohit Sharma    Mumbai Indians         Batsman   15        3700
Retrieve the first 2 and last 3 rows using a Python program.
Ans:
import pandas as pd
d={'Player':['Hardik Pandya','KL Rahul','Andre Russel','Jasprit Bumrah','Virat Kohli','Rohit Sharma'],
'Team':['Mumbai Indians','Kings Eleven','Kolkata Knight Riders','Mumbai Indians','RCB','Mumbai Indians'],
'Category':['Batsman','Batsman','Batsman','Bowler','Batsman','Batsman'],
'BidPrice':[13,12,7,10,17,15],
'Runs':[1000,2400,900,200,3600,3700]}
df=pd.DataFrame(d)
print(df)
# first 2 rows, all columns
print(df.iloc[:2,:])
# last 3 rows, all columns
print(df.iloc[-3:,:])
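The iloc slices used in this worksheet are position-based and exclude the end position. The sketch below, on a small stand-in DataFrame with hypothetical player labels, shows that iloc[:2] and iloc[-3:] match head(2) and tail(3), and that iloc[1:4] selects positions 1, 2 and 3.

```python
import pandas as pd

# Small stand-in data; the player labels are placeholders
df = pd.DataFrame({'Player': ['P0', 'P1', 'P2', 'P3', 'P4', 'P5'],
                   'Runs': [1000, 2400, 900, 200, 3600, 3700]})

first_two = df.iloc[:2, :]    # positions 0 and 1
last_three = df.iloc[-3:, :]  # positions 3, 4 and 5
middle = df.iloc[1:4, :]      # positions 1, 2 and 3 (4 is excluded)

print(first_two)
print(last_three)
print(middle)
```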
15
Write a command to find the most expensive player.
Ans:
print(df[df['BidPrice']==df['BidPrice'].max()])
16
Write a command to print the total number of players per team.
Ans:
print(df.groupby('Team').Player.count())
17
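Write a command to find the player who had the highest BidPrice in each team.
Ans:
# idxmax() gives, for each team, the row label of the highest BidPrice
idx=df.groupby('Team')['BidPrice'].idxmax()
print(df.loc[idx,['Team','Player','BidPrice']])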
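Note that taking max() of several columns within each group computes the maximum of each column separately, so the name and the price may come from different rows. The sketch below uses hypothetical data to show the difference, and one alternative based on idxmax().

```python
import pandas as pd

# Hypothetical data: in team X the alphabetically last name is not the top bid
df = pd.DataFrame({'Team': ['X', 'X', 'Y'],
                   'Player': ['Zed', 'Amy', 'Bob'],
                   'BidPrice': [5, 9, 7]})

# Column-wise maximum per team: 'Zed' and 9 are paired, but Zed's bid is 5
per_column = df.groupby('Team')[['Player', 'BidPrice']].max()

# Row holding the highest BidPrice in each team
top_rows = df.loc[df.groupby('Team')['BidPrice'].idxmax(),
                  ['Team', 'Player', 'BidPrice']]

print(per_column)
print(top_rows)
```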
18
Write a command to find the average runs of each team.
Ans:
print(df.groupby(['Team']).Runs.mean())
19
Write a command to sort all players according to BidPrice.
Ans:
print(df.sort_values(by='BidPrice'))
20
We need to define an index in pandas.
1. True
2. False
Ans:
2. False
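To illustrate: when no index is given, pandas creates a default integer index starting at 0. Supplying one is optional. The values below are made up.

```python
import pandas as pd

# No index supplied: pandas builds a default RangeIndex 0, 1, 2
default_indexed = pd.Series([10, 20, 30])

# An explicit index is optional
labelled = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

print(default_indexed.index)
print(labelled.index)
```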
21
Who is a data scientist?
1. Mathematician
2. Statistician
3. Software Programmer
4. All of the above
Ans:
4. All of the above
22
What is the built-in database used for Python?
1. Mysql
2. Pysqlite
3. Sqlite3
4. Pysqln
Ans:
3. Sqlite3
23
How can you drop columns in Python that contain NaN?
Ans:
df1.dropna(axis=1)
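A small sketch on hypothetical data, covering both missing-value operations from this worksheet: selecting rows where a value is missing with isnull(), and dropping columns that contain NaN with dropna(axis=1). It also shows why comparing with == NaN does not work: NaN is never equal to anything, including itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age
df1 = pd.DataFrame({'name': ['A', 'B', 'C'],
                    'age': [25, np.nan, 31],
                    'city': ['X', 'Y', 'Z']})

# Rows where age is missing
missing_age = df1[df1['age'].isnull()]

# Comparing with NaN matches nothing
matches_by_equality = (df1['age'] == np.nan).sum()

# Drop every column that contains at least one NaN
no_nan_columns = df1.dropna(axis=1)

print(missing_age)
print(matches_by_equality)
print(no_nan_columns)
```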