Pandas: split one column into two or multiple columns in Python

In today’s quick tutorial we’ll learn how to re-format your column contents so that you can split data located in a DataFrame column into one or more columns.

Most probably you’ll be acquiring your data from an API, database, text or comma separated value file. But in this example we’ll use a simple dataframe that we’ll define manually out of a dictionary.

# Python3
import pandas as pd
targets = pd.DataFrame({ "manager": ["Johns;Tim ", "Mcgregor; Dave", "DeRocca; Leo", "Haze; Jim"] ,
                     "target": [42000, 85000, 45000, 33000]})

Let’s look at the data:

targets.head()
managertarget
0Johns;Tim42000
1Mcgregor; Dave85000
2DeRocca; Leo45000
3Haze; Jim33000

Split columns in Pandas dataframes

Splitting into multiple columns with str.split()

We would like to split the manager column into two. As you can see the name and family name are separated by a semi-colon.

manager = targets['manager']
targets[['last_name','first_name']] =  manager.str.split(";", n=1, expand=True)
targets

Explanation:

  1. We user str.split() method to first convert the Series to a string.
  2. The str.split() method receives a couple of parameters:
    • First (‘;’) is the split character, in this case a semi-colon.
    • Second (N=1) is the number of splits we would like to define for the string. If you would to divide the column text into more than two columns, set your N parameter accordingly.
    • Expand=True has to be specified, as otherwise the string will not be divided into different columns.
  3. Once split the strings are kept in two columns we’ll add to the dataframe: ‘last_name’,’first_name’

Here’s our DataFrame:

managertargetlast_namefirst_name
0Johns;Tim42000JohnsTim
1Mcgregor; Dave85000McgregorDave
2DeRocca; Leo45000DeRoccaLeo
3Haze; Jim33000HazeJi

Dividing a DataFrame column by comma

If your separator is a comma, then we just need to adjust the separator parameter of the split method.

targets[['last_name','first_name']] =  manager.str.split(",", n=1, expand=True)

Export DataFrame Column values to list

Couple of readers asked me about this topic. We have a detailed post on this topic. Look here.

Drop a redundant Pandas column

if you would like to keep only one of the new columns we just created, you can use the following code:

targets.drop('first_name', axis=1)

More on removing Pandas dataframe columns can be found here.

Leave a Comment