How To Add A Column To A DataFrame In Python Pandas
Maybe your like
When working with multiple datasets in Python, I often need to combine information from different dataframes. One common task is adding a column from one dataframe to another.
This might seem simple, but there are several methods to accomplish this, each with its advantages.
In this article, I’ll walk you through six practical methods to add a column from another dataframe in Pandas. So let’s get in..
Table of Contents
- Methods to Add a Column From Another Dataframe In Pandas Python
- 1: Use pandas.DataFrame.join() Method
- 2: Use pandas.concat() Method
- 3: Use pandas.DataFrame.merge() Method
- 4: Use pandas.Series.map() Method
- 5: Use DataFrame.assign() Method
- 6: Use DataFrame.loc[] Method
- Real-World Example: Combine Customer and Order Data
- Performance Considerations of All the Methods
Methods to Add a Column From Another Dataframe In Pandas Python
Now, I will explain how to add a column from another Dataframe in Pandas Python.
1: Use pandas.DataFrame.join() Method
The join() method in Python is one of my go-to approaches when I need to add columns from another dataframe. It’s particularly useful when you want to join dataframes based on their indices.
Let’s start with a practical example. Imagine we have two dataframes containing information about sales representatives and their performance:
import pandas as pd # Sales team dataframe sales_team = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'], 'Region': ['East', 'West', 'North', 'South', 'Central'] }) # Sales performance dataframe sales_performance = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Monthly_Sales': [15000, 22000, 18000, 25000, 19500], 'Targets_Met': [3, 4, 3, 5, 4] }) # Set Rep_ID as index for both dataframes sales_team.set_index('Rep_ID', inplace=True) sales_performance.set_index('Rep_ID', inplace=True) # Add Monthly_Sales column from sales_performance to sales_team result = sales_team.join(sales_performance['Monthly_Sales']) print(result)Output:
Name Region Monthly_Sales Rep_ID 101 John Smith East 15000 102 Sarah Johnson West 22000 103 Mike Davis North 18000 104 Emma Wilson South 25000 105 Robert Brown Central 19500I executed the above example code and added the screenshot below

The join() method works beautifully when both dataframes share the same index. It’s clean and efficient for this specific scenario.
2: Use pandas.concat() Method
Python’s concat() method is extremely versatile for combining dataframes. When I need to add columns from one dataframe to another, especially when they share the same structure and index, concat() is often my choice.
Here’s how to use it:
import pandas as pd # Reset indices to work with concat sales_team = sales_team.reset_index() sales_performance = sales_performance.reset_index() # Using concat to add Monthly_Sales column result = pd.concat([sales_team, sales_performance['Monthly_Sales']], axis=1) print(result)Output:
Rep_ID Name Region Monthly_Sales 0 101 John Smith East 15000 1 102 Sarah Johnson West 22000 2 103 Mike Davis North 18000 3 104 Emma Wilson South 25000 4 105 Robert Brown Central 19500I executed the above example code and added the screenshot below

The concat() method combined our dataframes along axis=1 (columns). This works well when dataframes have the same length and order, which is common in many data analysis scenarios.
3: Use pandas.DataFrame.merge() Method
The merge() method in Python is my preferred approach when I need more control over how dataframes are combined. It’s particularly useful for scenarios involving different types of joins (inner, outer, left, right).
Let’s see it in action:
import pandas as pd # Create our sample dataframes again sales_team = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'], 'Region': ['East', 'West', 'North', 'South', 'Central'] }) sales_performance = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Monthly_Sales': [15000, 22000, 18000, 25000, 19500], 'Targets_Met': [3, 4, 3, 5, 4] }) # Merge dataframes on Rep_ID to add Monthly_Sales column result = sales_team.merge(sales_performance[['Rep_ID', 'Monthly_Sales']], on='Rep_ID', how='left') print(result)Output:
Rep_ID Name Region Monthly_Sales 0 101 John Smith East 15000 1 102 Sarah Johnson West 22000 2 103 Mike Davis North 18000 3 104 Emma Wilson South 25000 4 105 Robert Brown Central 19500I executed the above example code and added the screenshot below

The merge() method is incredibly flexible. In this example, I used a left join to ensure all records from sales_team are preserved. This is especially useful when dealing with datasets that might not have perfect one-to-one relationships.
4: Use pandas.Series.map() Method
The map() method in Python is elegant when you need to map values from one dataframe to another based on a key. I frequently use this approach for simple column additions.
Here’s how it works:
import pandas as pd # Create our sample dataframes sales_team = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'], 'Region': ['East', 'West', 'North', 'South', 'Central'] }) sales_performance = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Monthly_Sales': [15000, 22000, 18000, 25000, 19500], 'Targets_Met': [3, 4, 3, 5, 4] }) # Create a mapping dictionary from sales_performance sales_map = dict(zip(sales_performance.Rep_ID, sales_performance.Monthly_Sales)) # Add Monthly_Sales to sales_team using map sales_team['Monthly_Sales'] = sales_team['Rep_ID'].map(sales_map) print(sales_team)Output:
Rep_ID Name Region Monthly_Sales 0 101 John Smith East 15000 1 102 Sarah Johnson West 22000 2 103 Mike Davis North 18000 3 104 Emma Wilson South 25000 4 105 Robert Brown Central 19500The map() method is straightforward and efficient, especially for simple one-to-one mappings. It’s also very readable, making code maintenance easier.
Read Python Pandas Write to Excel
5: Use DataFrame.assign() Method
The assign() method provides a clean, functional approach to adding columns. I often use it when I want to maintain immutability in my data processing pipeline.
Here’s how to use assign():
import pandas as pd # Create our sample dataframes sales_team = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'], 'Region': ['East', 'West', 'North', 'South', 'Central'] }) sales_performance = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Monthly_Sales': [15000, 22000, 18000, 25000, 19500], 'Targets_Met': [3, 4, 3, 5, 4] }) # Create a mapping dictionary sales_map = dict(zip(sales_performance.Rep_ID, sales_performance.Monthly_Sales)) # Use assign to add the column result = sales_team.assign(Monthly_Sales=lambda x: x.Rep_ID.map(sales_map)) print(result)Output:
Rep_ID Name Region Monthly_Sales 0 101 John Smith East 15000 1 102 Sarah Johnson West 22000 2 103 Mike Davis North 18000 3 104 Emma Wilson South 25000 4 105 Robert Brown Central 19500The assign() method doesn’t modify the original dataframe, which is great for maintaining data integrity throughout your analysis process.
Check out Create Plots Using Pandas crosstab() in Python
6: Use DataFrame.loc[] Method
The loc[] method gives you direct, intuitive access to add columns. I find it particularly useful when I need fine-grained control over how data is added.
Let’s see it in action:
import pandas as pd # Create our sample dataframes sales_team = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Name': ['John Smith', 'Sarah Johnson', 'Mike Davis', 'Emma Wilson', 'Robert Brown'], 'Region': ['East', 'West', 'North', 'South', 'Central'] }) sales_performance = pd.DataFrame({ 'Rep_ID': [101, 102, 103, 104, 105], 'Monthly_Sales': [15000, 22000, 18000, 25000, 19500], 'Targets_Met': [3, 4, 3, 5, 4] }) # Set Rep_ID as index for both dataframes for easier alignment sales_team.set_index('Rep_ID', inplace=True) sales_performance.set_index('Rep_ID', inplace=True) # Add Monthly_Sales column using loc sales_team.loc[:, 'Monthly_Sales'] = sales_performance['Monthly_Sales'] print(sales_team.reset_index()) # Reset index for displayOutput:
Rep_ID Name Region Monthly_Sales 0 101 John Smith East 15000 1 102 Sarah Johnson West 22000 2 103 Mike Davis North 18000 3 104 Emma Wilson South 25000 4 105 Robert Brown Central 19500The loc[] method provides a clear, direct way to add columns, especially when working with indexed dataframes.
Read Drop the Header Row of Pandas DataFrame
Real-World Example: Combine Customer and Order Data
Let’s look at a more realistic scenario. Imagine we’re analyzing an e-commerce dataset with customer information in one dataframe and their order history in another:
import pandas as pd # Customer information dataframe customers = pd.DataFrame({ 'Customer_ID': [1001, 1002, 1003, 1004, 1005], 'Name': ['Alice Cooper', 'Bob Dylan', 'Charlie Parker', 'Diana Ross', 'Elton John'], 'State': ['California', 'New York', 'Texas', 'Florida', 'Illinois'], 'Signup_Date': ['2022-01-15', '2022-02-20', '2022-01-30', '2022-03-05', '2022-02-10'] }) # Order history dataframe orders = pd.DataFrame({ 'Customer_ID': [1001, 1002, 1003, 1004, 1005], 'Total_Orders': [12, 8, 15, 6, 10], 'Avg_Order_Value': [85.50, 120.75, 65.30, 210.15, 95.80], 'Last_Order_Date': ['2023-04-10', '2023-04-05', '2023-04-12', '2023-03-28', '2023-04-08'] }) # Let's add the Average Order Value to our customer data using merge enhanced_customers = customers.merge( orders[['Customer_ID', 'Avg_Order_Value', 'Total_Orders']], on='Customer_ID', how='left' ) print(enhanced_customers)Output:
Customer_ID Name State Total_Orders 0 1001 Alice Cooper California 12.0 1 1002 Bob Dylan New York 8.0 2 1003 Charlie Parker Texas 0.0 3 1004 Diana Ross Florida 6.0 4 1005 Elton John Illinois 10.0Using a left join ensures all customers are included, and fillna() helps handle missing values appropriately.
Check out np.where in Pandas Python
Performance Considerations of All the Methods
When working with large datasets, performance becomes crucial. Here’s a quick comparison of methods based on my experience:
- For small to medium datasets:
- merge() and join() are generally most intuitive and flexible
- map() is excellent for simple key-based lookups
- For large datasets:
- merge() with specified columns (rather than entire dataframes) helps reduce memory usage
- Consider using the pd.options.mode.chained_assignment = None setting for complex operations
- For very large datasets:
- Consider using dask or vaex libraries for out-of-memory processing
- Chunking operations can help manage memory constraints
Remember that clean, well-structured data is the foundation of good analysis. Taking the time to properly combine your dataframes will pay dividends when you move to the modeling and visualization stages of your data science workflow.
By mastering these techniques, you’ll be able to integrate data from multiple sources efficiently, creating rich, unified datasets that reveal deeper insights.
Related tutorials:
- Pandas GroupBy Without Aggregation Function in Python
- Pandas Merge Fill NAN with 0 in Python
- Pandas Find Duplicates in Python
Bijay KumarI am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I started working on Python, Machine learning, and artificial intelligence for the last 5 years. During this time I got expertise in various Python libraries also like Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, Tensorflow, Scipy, Scikit-Learn, etc… for various clients in the United States, Canada, the United Kingdom, Australia, New Zealand, etc. Check out my profile.
enjoysharepoint.com/Tag » Add Column From One Dataframe To Another Based On Condition R
-
Add A New Column To A Dataframe Using Matching Values Of Another ...
-
R: Add A Column To Dataframe Based On Other Columns With Dplyr
-
How To Add A Column To A Dataframe In R With Tibble & Dplyr
-
Add New Column In Dataframe Based On Condition From Another ...
-
4 Data Wrangling Tasks In R For Advanced Beginners - Computerworld
-
Add A Column In A Pandas DataFrame Based On An If-Else Condition
-
How To Insert A New Column Based On Condition In Python?
-
How To Add A Column Based On Other Columns In R DataFrame
-
How To Add Column From Another DataFrame In Pandas
-
R: How To Add Column To Data Frame Based On Other Columns
-
Replace Values In Data Frame Conditionally In R (4 Examples)
-
Select Rows In DataFrame By Conditions On Multiple Columns
-
Joining Pandas Dataframes - Data Carpentry
-
How To Assign A Column Value In A Data Frame Based On Another ...