Create DataFrame - Be easy in My Python class

What is a DataFrame?

A DataFrame is a 2D (two-dimensional) data structure, i.e., data is arranged in tabular form, in rows and columns.

Or we can say that a pandas DataFrame is similar to an Excel sheet.

Let's understand it through an example.

Row labels 0 to 9: known as Indexes
Column headings: known as Columns
Cell contents: Data Values

Id  Name     Age  Department        Charges  Gender
0   Arprit   62   Surgery           300      M
1   Zarina   22   ENT               250      F
2   Kareem   32   Orthopaedic       200      M
3   Arun     12   Surgery           300      M
4   Zubin    30   ENT               250      M
5   Kettaki  16   ENT               250      F
6   Ankita   29   Cardiology        800      F
7   Zareen   45                     300      F
8   Kush     19   Cardiology        800      M
9   Shilpa   23   Nuclear Medicine  400      F
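To connect the three parts labelled in the table above with pandas itself, here is a small sketch that rebuilds only the first three rows and inspects them. The variable names (`hospital`, `row_labels`, `column_labels`, `data_values`) are assumptions chosen for illustration.

```python
import pandas as pd

# First three rows of the table above; `hospital` is a hypothetical name.
hospital = pd.DataFrame({
    'Name': ['Arprit', 'Zarina', 'Kareem'],
    'Age': [62, 22, 32],
    'Department': ['Surgery', 'ENT', 'Orthopaedic'],
    'Charges': [300, 250, 200],
    'Gender': ['M', 'F', 'M'],
})

row_labels = list(hospital.index)        # the Indexes: 0, 1, 2
column_labels = list(hospital.columns)   # the Columns
data_values = hospital.to_numpy()        # the Data Values as a 2D array

print(row_labels)
print(column_labels)
print(data_values.shape)                 # (rows, columns)
```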

1. Create DataFrame

A pandas DataFrame can be created using the following constructor:

pandas.DataFrame(data, index, columns, dtype, copy)

The parameters of the constructor are as follows:

Sr.No  Parameter  Description
1      data       The data itself. It can take various forms such as ndarray, Series, map, lists, dict, constants and also another DataFrame.
2      index      Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
3      columns    Column labels for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
4      dtype      Data type to force on the columns. Optional; if omitted, pandas infers a type for each column.
5      copy       Whether the input data is copied into the new frame. The default is False in older pandas versions; newer versions default to None and decide based on the type of input.
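The parameters listed above can be tried together in one call. The following is only a sketch: the array `raw` and the labels `r1`, `r2`, `a`, `b`, `c` are made up for illustration.

```python
import numpy as np
import pandas as pd

# A 2 x 3 block of numbers; the labels below are made up for illustration.
raw = np.array([[1, 2, 3], [4, 5, 6]])

frame = pd.DataFrame(
    data=raw,
    index=['r1', 'r2'],         # row labels
    columns=['a', 'b', 'c'],    # column labels
    dtype='float64',            # force one data type on every column
    copy=True,                  # work on a copy, leave `raw` untouched
)

frame.loc['r1', 'a'] = 99.0     # change the frame only

print(frame)
print(raw[0, 0])                # the original array still holds its old value
```

Passing the arguments by keyword, as done here, keeps the call readable and avoids mixing up their order.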

pythonclassroomdiary. by Sangeeta M Chauhan , PGT CS KV NO.3 Gwalior

A pandas DataFrame can be created using various inputs like:
- Lists
- Dictionary
- Series
- NumPy ndarrays
- Another DataFrame
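Lists and dictionaries are covered in the sections that follow. For the Series and "another DataFrame" inputs, a minimal sketch could look like this; the names `ages`, `from_series` and `from_frame` are assumptions used only for this illustration.

```python
import pandas as pd

# From a Series: the Series name becomes the column label.
ages = pd.Series([28, 34, 29], index=['Shraddha', 'Shanti', 'Monica'], name='Age')
from_series = pd.DataFrame(ages)

# From another DataFrame: copy=True gives an independent frame.
from_frame = pd.DataFrame(from_series, copy=True)
from_frame.loc['Shanti', 'Age'] = 35    # does not touch from_series

print(from_series)
print(from_frame)
```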

1.1 Create an Empty DataFrame

>>> import pandas as pd
>>> df=pd.DataFrame()
>>> df
Empty DataFrame
Columns: []
Index: []
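An empty frame can also be checked and filled from a script. This is a small sketch; the column name `Marks` is an assumption for illustration.

```python
import pandas as pd

df = pd.DataFrame()

print(df.empty)    # True when the frame holds no data
print(df.shape)    # (number of rows, number of columns)

# Assigning a list to a new column label fills the empty frame.
df['Marks'] = [10, 20, 30]
print(df)
```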

1.2 Create a DataFrame from Lists

Example 1

>>> MyList=[10,20,30,40]
>>> MyFrame=pd.DataFrame(MyList)
>>> MyFrame
    0
0  10
1  20
2  30
3  40

Example 2: (Nested List)
>>> Friends = [['Shraddha','Doctor'],['Shanti','Teacher'],['Monica','Engineer']]
>>> MyFrame=pd.DataFrame(Friends,columns=['Name','Occupation'])
>>> MyFrame
       Name Occupation
0  Shraddha     Doctor
1    Shanti    Teacher
2    Monica   Engineer
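The two list examples can be combined with the `index` and `columns` parameters of the constructor. A sketch, where the row labels `a` to `d` and the column name `Marks` are assumptions for illustration:

```python
import pandas as pd

marks = [10, 20, 30, 40]

# A flat list becomes one column; `columns` names it, `index` labels the rows.
frame = pd.DataFrame(marks, index=['a', 'b', 'c', 'd'], columns=['Marks'])
print(frame)

# Each inner list of a nested list becomes one row.
friends = [['Shraddha', 'Doctor'], ['Shanti', 'Teacher'], ['Monica', 'Engineer']]
friends_frame = pd.DataFrame(friends, columns=['Name', 'Occupation'])
print(friends_frame.shape)
```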

1.3 Creation of a DataFrame from Dictionary of ndarrays / Lists

- All the ndarrays must be of the same length.
- If an index is passed, then the length of the index should be equal to the length of the arrays.
- If no index is passed, then by default the index will be range(n), where n is the array length.
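The three rules above can be checked directly. In this sketch the dictionaries `good` and `bad` and the label `only_one` are made up for illustration; pandas reports a broken rule by raising a ValueError.

```python
import pandas as pd

good = {'Name': ['Shraddha', 'Shanti'], 'Age': [28, 34]}
bad = {'Name': ['Shraddha', 'Shanti'], 'Age': [28]}    # lengths differ

df = pd.DataFrame(good)
print(list(df.index))          # default index is range(n)

try:
    pd.DataFrame(bad)          # breaks the "same length" rule
    lengths_ok = True
except ValueError as err:
    lengths_ok = False
    print('Rejected:', err)

try:
    pd.DataFrame(good, index=['only_one'])   # index shorter than the lists
    index_ok = True
except ValueError as err:
    index_ok = False
    print('Rejected:', err)
```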


Example 1 (without index)
>>> data = {'Name':['Shraddha', 'Shanti', 'Monica', 'Yogita'],'Age':[28,34,29,39]}
>>> df = pd.DataFrame(data)
>>> df
       Name  Age
0  Shraddha   28
1    Shanti   34
2    Monica   29
3    Yogita   39

Example 2 (with index)
>>> data = {'Name':['Shraddha', 'Shanti', 'Monica', 'Yogita'],'Age':[28,34,29,39]}
>>> df = pd.DataFrame(data, index=['Friend1','Friend2','Relative1','Relative2'])
>>> df
               Name  Age
Friend1    Shraddha   28
Friend2      Shanti   34
Relative1    Monica   29
Relative2    Yogita   39
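Both examples above use lists as the dictionary values. The same call should work when the values are NumPy ndarrays, as in this sketch, which reuses the data of Example 2:

```python
import numpy as np
import pandas as pd

# Same data as Example 2, but each dictionary value is an ndarray.
data = {
    'Name': np.array(['Shraddha', 'Shanti', 'Monica', 'Yogita']),
    'Age': np.array([28, 34, 29, 39]),
}
df = pd.DataFrame(data, index=['Friend1', 'Friend2', 'Relative1', 'Relative2'])

print(df)
print(df.loc['Relative1', 'Name'])    # look up one cell by row and column label
```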

1.4 Create a DataFrame from List of Dictionaries

Here we pass a list of dictionaries to create a DataFrame. The dictionary keys are taken as column names by default.

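Example 1:
>>> Mydict = [{'Won': 15, 'Loose': 2},{'Won': 5, 'Loose': 10},
...           {'Won': 8, 'Loose': 9},{'Won':4}]
>>> df = pd.DataFrame(Mydict)
>>> df
   Loose  Won
0    2.0   15
1   10.0    5
2    9.0    8
3    NaN    4
Notice that the missing value is stored as NaN (Not a Number). The column order shown (Loose before Won) comes from older pandas versions, which sorted the keys; newer versions keep the order in which the keys first appear (Won, Loose).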
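A missing value such as the NaN above can be located and replaced. The sketch below uses `isna()` and `fillna()`, two pandas methods that this handout does not cover; replacing NaN with 0 is an assumption made only for illustration.

```python
import pandas as pd

Mydict = [{'Won': 15, 'Loose': 2}, {'Won': 5, 'Loose': 10},
          {'Won': 8, 'Loose': 9}, {'Won': 4}]
df = pd.DataFrame(Mydict)

missing = df['Loose'].isna()    # True where the value is NaN
print(missing.tolist())

filled = df.fillna(0)           # a new frame; df itself is unchanged
print(filled)
```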

Example 2:
>>> Mydict=[{'Won': 15, 'Loose': 2},{'Won': 5, 'Loose': 10},{'Won': 8, 'Loose': 9}]
>>> df = pd.DataFrame(Mydict, index=['India', 'Pakistan','Australia'])
>>> df
           Loose  Won
India          2   15
Pakistan      10    5
Australia      9    8


Example 3

We can also create a DataFrame by specifying a list of dictionaries, row indices and column indices.

>>> L_dict = [{'Maths': 78, 'Chemistry': 78,'Physics':87},
...           {'Maths': 67, 'Chemistryb': 70},
...           {'Physics':77,'Maths':87}]
>>> df1 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'],
...                    columns=['Physics', 'Chemistry','Maths'])    # A
>>> df1
          Physics  Chemistry  Maths
Student1     87.0       78.0     78
Student2      NaN        NaN     67
Student3     77.0        NaN     87
>>> df2 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'],
...                    columns=['Chemistry','Maths'])    # B
>>> df2
          Chemistry  Maths
Student1       78.0     78
Student2        NaN     67
Student3        NaN     87
>>> df3 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'],
...                    columns=['English','Chemistry','Maths'])    # C
>>> df3
          English  Chemistry  Maths
Student1      NaN       78.0     78
Student2      NaN        NaN     67
Student3      NaN        NaN     87

Observe the lines marked A, B and C above. The output of A, B and C depends on the columns mentioned while creating the DataFrame. If a dictionary key matches a specified column, the corresponding data is shown. If a specified column does not match any key, NaN is displayed.
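The matching rule described above can be sketched in code. The helper names `rows` and `known_keys` are assumptions for illustration; the data is the same `L_dict` as in Example 3, including the key 'Chemistryb', which matches no requested column.

```python
import pandas as pd

L_dict = [{'Maths': 78, 'Chemistry': 78, 'Physics': 87},
          {'Maths': 67, 'Chemistryb': 70},
          {'Physics': 77, 'Maths': 87}]
rows = ['Student1', 'Student2', 'Student3']

df = pd.DataFrame(L_dict, index=rows, columns=['English', 'Chemistry', 'Maths'])

# Every key that occurs in at least one dictionary.
known_keys = set().union(*L_dict)
for col in df.columns:
    status = 'matches a key' if col in known_keys else 'no such key, all NaN'
    print(col, '->', status)

print(df.isna().sum())    # number of NaN values in each column
```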

2. Addition of New Column & Row

2.1 Column Addition

>>> L_dict = [{'Maths': 78, 'Chemistry': 78,'Physics':87},
...           {'Maths': 67, 'Chemistry': 70},
...           {'Physics':77,'Maths':87,'Chemistry':90}]
>>> df3 = pd.DataFrame(L_dict, index=['Student1', 'Student2','Student3'],
...                    columns=['English','Chemistry','Maths'])
>>> df3['Physics']=[45,56,65]


A new column 'Physics' has been added with new data:
>>> df3
          English  Chemistry  Maths  Physics
Student1      NaN         78     78       45
Student2      NaN         70     67       56
Student3      NaN         90     87       65

- We can also update column data using the same method

>>> df3['English']=[78,98,89]
>>> df3
          English  Chemistry  Maths  Physics
Student1       78         78     78       45
Student2       98         70     67       56
Student3       89         90     87       65

- We can add a new column using data stored in the existing frame

>>> df3['Total']=df3.English+df3.Chemistry+df3.Maths+df3.Physics
>>> df3
          English  Chemistry  Maths  Physics  Total
Student1       78         78     78       45    279
Student2       98         70     67       56    291
Student3       89         90     87       65    331
Look, a new column Total has been added with the total of marks in the other subjects.
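Adding each column by name works, but becomes long when there are many subjects. As an alternative sketch, the same Total can be computed with `sum(axis=1)`; the names `marks` and `subjects` are assumptions for illustration, and the values are those of df3 above.

```python
import pandas as pd

marks = pd.DataFrame(
    {'English': [78, 98, 89], 'Chemistry': [78, 70, 90],
     'Maths': [78, 67, 87], 'Physics': [45, 56, 65]},
    index=['Student1', 'Student2', 'Student3'],
)

subjects = ['English', 'Chemistry', 'Maths', 'Physics']
marks['Total'] = marks[subjects].sum(axis=1)    # add across each row

print(marks)
```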

2.2 Row Addition

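i. To add a row by specifying its row index label
(Here df3 holds only the three columns English, Chemistry and Maths, so the list supplies one value per column.)
>>> df3.loc['Student4']=[45,67,45]
>>> df3
          English  Chemistry  Maths
Student1       78         78     78
Student2       98         70     67
Student3       89         90     87
Student4       45         67     45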

ii. To modify a row by specifying its row index number (position)
iloc can only change a row that already exists; unlike loc, it cannot add a new row.
>>> df3.iloc[3]=[45,67,45]
>>> df3
          English  Chemistry  Maths
Student1       78         78     78
Student2       98         70     67
Student3       89         90     87
Student4       45         67     45

>>> df3.iloc[3]=[65,77,90]
>>> df3
          English  Chemistry  Maths
Student1       78         78     78
Student2       98         70     67
Student3       89         90     87
Student4       65         77     90
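As a sketch of how the two approaches differ, the code below rebuilds a three-column df3 with the values shown above, adds a row by label with loc, changes it by position with iloc, and then tries a position that does not exist. The flag name `enlarged` is an assumption for illustration.

```python
import pandas as pd

df3 = pd.DataFrame(
    {'English': [78, 98, 89], 'Chemistry': [78, 70, 90], 'Maths': [78, 67, 87]},
    index=['Student1', 'Student2', 'Student3'],
)

df3.loc['Student4'] = [45, 67, 45]    # new label: a row is added
df3.iloc[3] = [65, 77, 90]            # position 3 exists now: the row is changed

try:
    df3.iloc[4] = [1, 2, 3]           # position 4 does not exist
    enlarged = True
except IndexError as err:
    enlarged = False
    print('iloc refused:', err)

print(df3)
```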
