
Concatenate Column Values in a Pandas DataFrame
Pandas is a powerful library for data manipulation and analysis in Python. It provides a variety of functions and tools for handling and transforming data, including the ability to concatenate column values in a Pandas DataFrame.
In a Pandas DataFrame, columns represent variables or features of the data. Concatenating column values involves combining the values of two or more columns into a single column. This can be useful for creating new variables, merging data from different sources, or formatting data for analysis.
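To concatenate column values in a Pandas DataFrame, you can use the pd.Series.str.cat() method. This method joins the strings of one Series element by element with those of one or more other Series or list-likes, placing an optional separator (the sep parameter) between them. It works only on string data, so numeric columns must first be converted with astype(str). Alternatively, you can write a small function that builds the combined string for one row and run it on every row with the DataFrame's apply() method.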
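As a minimal sketch of how str.cat() behaves, the snippet below joins two small Series. The Series names (first, last) and their values are made up for illustration. It also shows the na_rep parameter, which supplies a replacement for missing values; without it, a missing value in either input produces a missing value in the result.

```python
import pandas as pd

# Hypothetical sample data; the third first name is missing on purpose.
first = pd.Series(['John', 'Jane', None])
last = pd.Series(['Smith', 'Doe', 'Lee'])

# Element-wise concatenation with a space as the separator.
joined = first.str.cat(last, sep=' ')

# By default a missing value propagates to the result.
# na_rep replaces missing values before joining.
joined_filled = first.str.cat(last, sep=' ', na_rep='?')

print(joined.tolist())
print(joined_filled.tolist())
```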
Pandas offers other tools that can help as well. The pd.concat() function combines whole Series or DataFrames along an axis, so it can place columns side by side, but it does not join strings by itself. The pd.Series.str.join() method joins the items of a list stored in each element of a Series using a delimiter. The + operator also concatenates string columns element by element. Each of these has its own advantages and disadvantages depending on the specific use case.
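To illustrate two of these alternatives, here is a hedged sketch that joins three columns with str.join() and with a row-wise apply(). The sample DataFrame and the variable names (cols, as_lists, via_str_join, via_apply) are assumptions chosen for this example.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'Mexico'],
})
cols = ['Name', 'Age', 'Country']

# Option 1: str.join() expects each element to be a list of strings,
# so first build a Series of per-row lists.
as_lists = pd.Series(df[cols].astype(str).to_numpy().tolist(), index=df.index)
via_str_join = as_lists.str.join('|')

# Option 2: join the string values of each row with apply().
via_apply = df[cols].astype(str).apply(lambda row: '|'.join(row), axis=1)

print(via_str_join.tolist())
print(via_apply.tolist())
```

Both options produce the same strings here. The apply() version is usually easier to read, while the str.join() version fits naturally when the data already holds lists.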
In this tutorial, we will explore two ways of concatenating column values in a DataFrame. We will provide step-by-step instructions and a code example for each approach. By the end of the tutorial, you will understand how to concatenate column values in a Pandas DataFrame and which method is best suited to your specific use case.
Now let's consider two approaches with which we can concatenate column values in a Pandas DataFrame.
Using the pd.Series.str.cat() Method to Concatenate Column Values
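Create a new column in your DataFrame to store the concatenated values.
Convert any non-string column (such as Age) to strings, because string concatenation works only on string data.
Join the values of the columns you want to combine, either with the pd.Series.str.cat() method or with the + operator inside a function that handles one row.
Specify the separator you want between the values: the 'sep' parameter of str.cat(), or a literal such as '|' placed between the operands of +.
If you use a row function, run it on every row with the apply() method and axis=1, then assign the result to the new column.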
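The str.cat() route through these steps can be sketched as follows. This is an illustrative example, assuming a small sample DataFrame with Name, Age, and Country columns; it is a vectorized alternative to a row-wise function, not the only way to do it.

```python
import pandas as pd

# Assumed sample data with one numeric column.
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'Mexico'],
})

# Age is numeric, so cast it to str first; str.cat() accepts only string data.
# Passing a list of Series joins all of them in order, separated by sep.
df['Name_Age_Country'] = df['Name'].str.cat(
    [df['Age'].astype(str), df['Country']],
    sep='|',
)

print(df['Name_Age_Country'].tolist())
```

Because str.cat() operates on whole columns at once, no apply() call is needed in this variant.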
Now that we have outlined the approach step by step, let's put it into code.
Example
Consider the code shown below.
import pandas as pd
from tabulate import tabulate

# Create a sample DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'Mexico']
})

# Create a new column for concatenated values
df['Name_Age_Country'] = ''

# Define a function to concatenate the columns
def concatenate_columns(row):
    """
    Concatenate the values in the 'Name', 'Age', and 'Country' columns
    with a separator of '|'.
    """
    return row['Name'] + '|' + str(row['Age']) + '|' + row['Country']

# Apply the function to each row of the DataFrame
df['Name_Age_Country'] = df.apply(concatenate_columns, axis=1)

# Print the original DataFrame and the concatenated DataFrame
print('Original DataFrame:\n')
print(tabulate(df[['Name', 'Age', 'Country']], headers='keys', tablefmt='psql'))
print('\nConcatenated DataFrame:\n')
print(tabulate(df[['Name_Age_Country']], headers='keys', tablefmt='psql'))
Output
On executing this code, you will get the following output

Observe the original dataframe and how it looks after we concatenate the columns. Now let's consider the second approach.
Using the pd.concat() Function to Concatenate Column Values
First create a list of the columns you want to concatenate.
Use the pd.concat() function with axis=1 to place those columns side by side in a single DataFrame (axis=0 would stack them as rows instead).
Join the values with the + operator, inserting the separator you want between them. Note that pd.concat() has no sep parameter, and numeric columns must be converted with astype(str) before joining.
Use the rename() method to rename the new concatenated column.
Use the drop() method to drop the original columns that were concatenated.
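To make the role of the axis argument concrete, here is a small sketch using two made-up Series (names and countries are hypothetical). It shows that pd.concat() only arranges data along an axis, and that the string joining is a separate step.

```python
import pandas as pd

# Hypothetical sample Series.
names = pd.Series(['John', 'Jane'], name='Name')
countries = pd.Series(['USA', 'Canada'], name='Country')

# axis=0 stacks the Series end to end, giving one Series of length 4.
stacked = pd.concat([names, countries], axis=0)

# axis=1 aligns the Series on their index, giving a 2x2 DataFrame.
side_by_side = pd.concat([names, countries], axis=1)

# The actual string concatenation is done with the + operator.
combined = side_by_side['Name'] + '|' + side_by_side['Country']

print(stacked.shape)
print(side_by_side.shape)
print(combined.tolist())
```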
Example
Consider the code shown below.
import pandas as pd
from tabulate import tabulate

# Create a sample DataFrame
df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Bob'],
    'Age': [25, 30, 35],
    'Country': ['USA', 'Canada', 'Mexico']
})

print("\nOriginal Dataframe:")
print(tabulate(df, headers='keys', tablefmt='psql'))

# Place the columns side by side using the pd.concat() function
concatenated_cols = pd.concat(
    [df['Name'], df['Age'], df['Country']],
    axis=1,
    keys=['Name', 'Age', 'Country']
)

# Join the values with '|'; Age is numeric, so convert it to str first
concatenated_cols['Name_Age_Country'] = (
    concatenated_cols['Name'] + '|'
    + concatenated_cols['Age'].astype(str) + '|'
    + concatenated_cols['Country']
)

# Attach the new column, rename it, and drop the original columns
df = pd.concat([df, concatenated_cols['Name_Age_Country']], axis=1)
df = df.rename(columns={'Name_Age_Country': 'Name|Age|Country'})
df = df.drop(columns=['Name', 'Age', 'Country'])

# Print the concatenated DataFrame
print('\nConcatenated Dataframe:')
print(tabulate(df, headers='keys', tablefmt='psql'))
Output
On executing this code, you will get the following output

Conclusion
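In conclusion, Pandas provides several ways to concatenate column values in a DataFrame. Two approaches were discussed in this tutorial: building the combined string row by row with apply(), for which pd.Series.str.cat() is a vectorized alternative, and arranging the columns with the pd.concat() function before joining them with the + operator.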
Depending on your specific use case, one of these approaches may be more suitable than the other. By leveraging the flexibility and power of Pandas, you can easily manipulate and transform your data to suit your needs.