Python rename columns of dataframe


This introduction to pandas is derived from Data School's pandas Q&A with my own notes and code.

Renaming columns in a pandas DataFrame

    import pandas as pd

    url = '...'  # URL truncated in the source
    ufo = pd.read_csv(url)

    # Show only the column labels
    ufo.columns

Output:

    Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time'], dtype='object')

Method 1: Renaming individual columns

    # inplace=True modifies the DataFrame itself instead of returning a copy
    ufo.rename(columns={'Colors Reported': 'Colors_Reported', 'Shape Reported': 'Shape_Reported'}, inplace=True)
    ufo.columns

Output:

    Index(['City', 'Colors_Reported', 'Shape_Reported', 'State', 'Time'], dtype='object')

Method 2: Renaming all columns at once

    ufo_cols = ['city', 'colors reported', 'shape reported', 'state', 'time']
    # Overwrite every column label; the list must have one entry per column
    ufo.columns = ufo_cols

Method 3: Change columns while reading

    url = '...'  # URL truncated in the source
    # header=0 marks the first row of the file as the existing header, which names= then replaces
    ufo = pd.read_csv(url, names=ufo_cols, header=0)

Method 4: Replacing spaces with underscores for all columns

If you have 100 columns, some of which have spaces in their names, and you want to replace all the spaces with underscores:

    ufo.columns = ufo.columns.str.replace(' ', '_')

Learn how to use the Pandas .rename() method to easily rename Pandas columns! You'll learn a number of different ways to rename your columns, including how to deal with files that arrive with meaningless column names. After reading this post, you'll be able to rename your columns in whichever way best suits your situation.

Loading our data

Let's begin by loading our dataset. We'll use pandas to create the dataframe that we'll use throughout the tutorial.
Let's get started:

    import pandas as pd

    df = pd.DataFrame.from_dict(
        {
            'Name': ['Jane', 'Melissa', 'John', 'Matt'],
            'Age': [23, 45, 35, 64],
            'Age Group': ['18-35', '35-50', '35-50', '65+'],
            'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
            ' Gender of person ': ['Female', 'Female', 'Male', 'Male']
        }
    )
    print(df)

This returns the following dataframe:

          Name  Age Age Group Birth City  Gender of person
    0     Jane   23     18-35     London            Female
    1  Melissa   45     35-50      Paris            Female
    2     John   35     35-50    Toronto              Male
    3     Matt   64       65+    Atlanta              Male

Let's see how we can explore the columns of the dataframe using the .columns attribute. Printing df.columns returns all the column labels:

    print(df.columns)

This returns the following:

    Index(['Name', 'Age', 'Age Group', 'Birth City', ' Gender of person '], dtype='object')

We can see that there are a number of quirks in the column names, such as leading, trailing, and inline spaces.

Overview of Pandas .rename() method

The Pandas .rename() method alters axis labels, either for rows or for columns. We'll focus on the columns parameter for this tutorial. You can pass in a mapper (either a function or a dictionary) to change specific labels; to replace every label with a list, assign the list to the .columns attribute instead. If you do use a mapper, its values must be unique (i.e., a 1-to-1 match), and any label the mapper does not cover is left as-is.

Check out some other Python tutorials on datagy, including our complete guide to styling Pandas and our comprehensive overview of Pivot Tables in Pandas!

How to rename a single Pandas column

There are multiple ways to rename a single column. The easiest is to pass in a dictionary with just a single key:value pair.
Renaming a single column by name

For example, if we wanted to rename the Age Group column to age_group, we could write:

    df = df.rename(columns={'Age Group': 'age_group'})
    print(df.columns)

This returns the following:

    Index(['Name', 'Age', 'age_group', 'Birth City', ' Gender of person '], dtype='object')

Renaming a single column by position

Now, say you didn't know what the first column was called, but you knew you wanted to change it. You could pass in the indexed item of the Index returned by the .columns attribute. Starting again from the original dataframe, if you wanted to change the first column to id, you could write:

    df = df.rename(columns={df.columns[0]: 'id'})
    print(df.columns)

This returns:

    Index(['id', 'Age', 'Age Group', 'Birth City', ' Gender of person '], dtype='object')

What we've done here is use the label in the first position of the column names as the dictionary key.

How to rename multiple Pandas columns

To rename multiple columns in Pandas, we can simply pass in a dictionary with more key:value pairs. We can even combine the two methods above. Let's give this a shot. Starting again from the original dataframe, we'll rename the first column to id and lowercase the Age and Age Group columns.

    df = df.rename(columns={
        df.columns[0]: 'id',
        'Age': 'age',
        'Age Group': 'age group'})
    print(df.columns)

This returns the following:

    Index(['id', 'age', 'age group', 'Birth City', ' Gender of person '], dtype='object')

How to use a list comprehension to rename Pandas columns

When you're working on a large dataset, you will often want to streamline column names. For example, spaces are particularly annoying because they prevent you from using dot notation to access columns. Another common annoyance is inconsistent casing, since Pandas indexing is case-sensitive.

Let's first see how we can remove extra spaces from our columns, replace inline spaces with underscores, and lowercase all our column names. To learn more about list comprehensions, check out my comprehensive tutorial, which is also available in video form.
This method is particularly helpful if you're attempting to make multiple transformations consistently across all columns. Starting from the original column names:

    df.columns = [column.strip().replace(' ', '_').lower() for column in df.columns]
    print(df.columns)

This returns the following:

    Index(['name', 'age', 'age_group', 'birth_city', 'gender_of_person'], dtype='object')

We applied the following transformations: .strip() removes any leading and trailing spaces, .replace() substitutes underscores for the remaining spaces, and .lower() lowercases the names.

Using a mapper function to rename Pandas columns

You can also use mapper functions to rename Pandas columns. Say we simply wanted to lowercase all of our columns. We could do this by passing a mapper function directly into the .rename() method (again starting from the original column names):

    df = df.rename(mapper=str.lower, axis='columns')
    print(df.columns)

We use axis='columns' to specify that the transformation applies to the columns. Equivalently, you could write axis=1. This returns:

    Index(['name', 'age', 'age group', 'birth city', ' gender of person '], dtype='object')

Using a lambda function to rename Pandas columns

You can also use lambda functions to pass in more complex transformations, as we did with our list comprehension. To replicate that example (removing leading and trailing spaces, replacing inline spaces with underscores, and lowercasing everything), we could write:

    df = df.rename(mapper=lambda x: x.strip().replace(' ', '_').lower(), axis=1)
    print(df.columns)

We use axis=1 to specify that the transformation applies to the columns. Equivalently, you could write axis='columns'. This returns the following:

    Index(['name', 'age', 'age_group', 'birth_city', 'gender_of_person'], dtype='object')

Using inplace to rename Pandas columns in place

You may have noticed that in all of our examples we have reassigned the dataframe to itself. We can avoid having to do this by using the boolean inplace= parameter in our method call.
Let's use our previous example to illustrate this:

    df.rename(mapper=lambda x: x.strip().replace(' ', '_').lower(), axis=1, inplace=True)

Raising errors while renaming Pandas columns

By default, the .rename() method will not raise any errors when you include a column that doesn't exist. This can lead to unexpected results: you assume that a column has been renamed when in actuality it hasn't. Passing errors='raise' makes the method raise a KeyError instead. Let's see this in action by attempting to rename a column that doesn't exist:

    df = df.rename(columns={'some silly name': 'column1'}, errors='raise')
    print(df.columns)

This returns the following error:

    Traceback (most recent call last):
      File "/Users/nikpi/Desktop/Rename Your Columns.py", line 13, in <module>
        df = df.rename(columns={'some silly name': 'column1'}, errors='raise')
      File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/util/_decorators.py", line 312, in wrapper
        return func(*args, **kwargs)
      File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/core/frame.py", line 4438, in rename
        return super().rename(
      File "/Users/nikpi/Library/Python/3.8/lib/python/site-packages/pandas/core/generic.py", line 1054, in rename
        raise KeyError(f"{missing_labels} not found in axis")
    KeyError: "['some silly name'] not found in axis"

Renaming Multi-index Pandas Columns

The .rename() method also includes a level argument to specify which level of a multi-index you want to rename. To learn more about Pandas pivot tables, check out my comprehensive overview (complete with a video tutorial!). Say we create a Pandas pivot table and only want to rename a column in the first level. We could write:
    import pandas as pd

    df = pd.DataFrame.from_dict(
        {
            'Name': ['Jane', 'Melissa', 'John', 'Matt'],
            'Age': [23, 45, 35, 64],
            'Age Group': ['18-35', '35-50', '35-50', '65+'],
            'Birth City': ['London', 'Paris', 'Toronto', 'Atlanta'],
            ' Gender of person ': ['Female', 'Female', 'Male', 'Male']
        }
    )

    # Clean the column names first
    df.columns = [column.strip().replace(' ', '_').lower() for column in df.columns]

    pivot = pd.pivot_table(
        data=df,
        columns=['gender_of_person', 'age_group'],
        values='age',
        aggfunc='count'
    )

    # level=0 restricts the rename to the outermost column level
    pivot = pivot.rename(columns={'Male': 'male'}, level=0)
    print(pivot)

This returns the following dataframe:

    gender_of_person Female        male
    age_group         18-35 35-50 35-50 65+
    age                   1     1     1   1

Conclusion

In this post, you learned about the different ways to rename columns in a Pandas dataframe. You learned how to be specific about which columns to rename, how to apply transformations to all columns, and how to rename only columns in a specific level of a MultiIndex dataframe. To learn more about the Pandas .rename() method, check out the official documentation.

While working with data, you may need to change the names of some or all of the columns of a dataframe. Whether you're correcting a typo or simply giving columns more readable names, it's handy to know how to rename columns quickly. In this tutorial, we'll cover some of the different ways to rename columns in pandas, along with examples.

How to rename columns in pandas?

To rename columns of a dataframe you can:

- Use the pandas dataframe rename() function to modify specific column names.
- Use the pandas dataframe set_axis() method to change all your column names.
- Set the dataframe's columns attribute to your new list of column names.

Using pandas rename() function

The pandas dataframe rename() function is a versatile function used to rename not only columns but also row indices. Its main advantage is that you can rename specific columns and leave the rest untouched.
The syntax to change column names using the rename function is:

    df.rename(columns={"OldName": "NewName"})

The rename() function returns a new dataframe with renamed axis labels (i.e. the renamed columns or rows, depending on usage). To modify the dataframe in-place, set the argument inplace to True.

Example 1: Change the name of a specific column

    import pandas as pd

    # create a dataframe
    data = {'Category': ['Dog', 'Cat', 'Rabbit', 'Parrot'],
            'Color': ['brown', 'black', 'white', 'green']}
    df = pd.DataFrame(data)

    # print dataframe columns
    print("Dataframe columns:", df.columns)

    # change column name Category to Pet
    df = df.rename(columns={"Category": "Pet"})

    # print dataframe columns
    print("Dataframe columns:", df.columns)

Output:

    Dataframe columns: Index(['Category', 'Color'], dtype='object')
    Dataframe columns: Index(['Pet', 'Color'], dtype='object')

In the above example, the dataframe df is created with the columns Category and Color. The rename() function is then used to change the column name Category to Pet. It returns a new dataframe, which is assigned back to df.

Example 2: Apply a function to column names

The rename() function also accepts a function, which is applied to each column name.

    import pandas as pd

    # create a dataframe
    data = {'Col1_Category': ['Dog', 'Cat', 'Rabbit', 'Parrot'],
            'Col2_Color': ['brown', 'black', 'white', 'green']}
    df = pd.DataFrame(data)

    # print dataframe columns
    print("Dataframe columns:", df.columns)

    # change column names to the string after the _
    df = df.rename(columns=lambda x: x.split("_")[1])

    # print dataframe columns
    print("Dataframe columns:", df.columns)

Output:

    Dataframe columns: Index(['Col1_Category', 'Col2_Color'], dtype='object')
    Dataframe columns: Index(['Category', 'Color'], dtype='object')

In the above example, we pass a function to rename() to modify the column names. The function is applied to each column name and returns the corresponding new name. Here, we split the column name on _ and use the second piece as the new column name.
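The lambda in Example 2 assumes every column name contains an underscore; x.split("_")[1] raises an IndexError for a name that has none. The following is a minimal sketch of a more forgiving variant, not part of the original tutorial. The helper name strip_prefix and the extra Weight column are hypothetical and exist only for illustration.

```python
import pandas as pd

# Small frame mirroring Example 2, plus a hypothetical column
# ('Weight') that has no underscore in its name.
df = pd.DataFrame({
    'Col1_Category': ['Dog', 'Cat'],
    'Col2_Color': ['brown', 'black'],
    'Weight': [12.5, 4.1],
})

def strip_prefix(name):
    """Return the text after the first underscore, or the name unchanged."""
    # Split at most once; [-1] is the whole name when no underscore is present.
    return name.split('_', 1)[-1]

# rename() calls strip_prefix once per column label and returns a new dataframe
renamed = df.rename(columns=strip_prefix)
print(list(renamed.columns))
```

Splitting with a maximum of one split also keeps any later underscores in the name intact, which may or may not be what you want for your own column naming scheme.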
Using pandas set_axis() function

The pandas dataframe set_axis() method can be used to rename a dataframe's columns by passing a list of all columns with their new names. Note that the length of this list must be equal to the number of columns in the dataframe. The following is the syntax:

    df.set_axis(new_column_list, axis=1)

You have to explicitly specify the axis as 1 or 'columns' to update column names, since the default is 0 (which modifies the row labels). It returns a new dataframe with the updated axis. In pandas 1.x you can modify the dataframe in-place by setting the argument inplace to True; that argument was removed from set_axis() in pandas 2.0, where you assign the result back instead.

Example: Change column names using set_axis

    import pandas as pd

    # create a dataframe
    data = {'Category': ['Dog', 'Cat', 'Rabbit', 'Parrot'],
            'Color': ['brown', 'black', 'white', 'green']}
    df = pd.DataFrame(data)

    # print dataframe columns
    print("Dataframe columns:", df.columns)

    # change column name Category to Pet
    df = df.set_axis(["Pet", "Color"], axis=1)

    # print dataframe columns
    print("Dataframe columns:", df.columns)

Output:

    Dataframe columns: Index(['Category', 'Color'], dtype='object')
    Dataframe columns: Index(['Pet', 'Color'], dtype='object')

In the above example, the set_axis() function is used to rename the column Category to Pet in the dataframe df. Note that we had to provide the list of all columns for the dataframe even though we changed just one column name.

Changing the columns attribute

You can also update a dataframe's columns by setting its columns attribute to your new list of columns. The following is the syntax:

    df.columns = new_column_list

Note that new_column_list must be of the same length as the number of columns in your dataframe.

Example: Change column name by updating the columns attribute
    import pandas as pd

    # create a dataframe
    data = {'Category': ['Dog', 'Cat', 'Rabbit', 'Parrot'],
            'Color': ['brown', 'black', 'white', 'green']}
    df = pd.DataFrame(data)

    # print dataframe columns
    print("Dataframe columns:", df.columns)

    # change column name Category to Pet
    df.columns = ["Pet", "Color"]

    # print dataframe columns
    print("Dataframe columns:", df.columns)

Output:

    Dataframe columns: Index(['Category', 'Color'], dtype='object')
    Dataframe columns: Index(['Pet', 'Color'], dtype='object')

In the above example, we change the column names of the dataframe df by setting df.columns to a new column list. As with the set_axis() function, we had to provide the list of all columns for the dataframe even though we changed just one column name.

With this, we come to the end of this tutorial. The code examples and results presented in this tutorial have been implemented in a Jupyter Notebook with a python (version 3.8.3) kernel having pandas version 1.0.5.
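As a recap of this last tutorial, the three approaches it covers (rename(), set_axis(), and assigning to the columns attribute) can be placed side by side. This is a small sketch rather than anything from the original; the variable names via_rename, via_set_axis, and via_attribute are made up for the comparison.

```python
import pandas as pd

data = {'Category': ['Dog', 'Cat'], 'Color': ['brown', 'black']}

# 1. rename(): list only the columns that change; returns a new dataframe
via_rename = pd.DataFrame(data).rename(columns={'Category': 'Pet'})

# 2. set_axis(): supply every column label; returns a new dataframe
via_set_axis = pd.DataFrame(data).set_axis(['Pet', 'Color'], axis=1)

# 3. columns attribute: supply every column label; changes the dataframe itself
via_attribute = pd.DataFrame(data)
via_attribute.columns = ['Pet', 'Color']

# All three end up with the same column labels
for frame in (via_rename, via_set_axis, via_attribute):
    print(list(frame.columns))
```

rename() tends to be the more convenient choice when only a few labels change, while the other two require the complete list in the right order.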
