Pandas: split a Series into two or more columns in Python

In today’s quick tutorial we’ll learn how to split data located in a Dataframe column into one or more columns.

Most probably you’ll be acquiring your data from an API, database, text or csv file. But in this example we’ll use a simple dataframe will build.

# Python3
import pandas as pd
targets = pd.DataFrame({ "manager": ["Johns;Tim ", "Mcgregor; Dave", "DeRocca; Leo", "Haze; Jim"] ,
                     "target": [42000, 85000, 45000, 33000]})

Let’s look at the data:

targets.head()
managertarget
0Johns;Tim42000
1Mcgregor; Dave85000
2DeRocca; Leo45000
3Haze; Jim33000

Split columns in Pandas dataframes

Splitting into multiple columns with str.split()

We would like to split the manager column into two.

manager = targets['manager']
targets[['last_name','first_name']] =  manager.str.split(";", n=1, expand=True)
targets

Explanation:

  1. We user str.split() method to first convert the Series to a string.
  2. The str.split() method receives a couple of parameters:
    • First (‘;’) is the split character, in this case a semi-colon.
    • Second (N=1) is the number of splits we would like to define for the string.
    • Expand=True has to be specified, as otherwise the string will not be divided into different columns.
  3. Once split the strings are kept in two columns we’ll add to the dataframe: ‘last_name’,’first_name’

Here’s our dataframe:

managertargetlast_namefirst_name
0Johns;Tim42000JohnsTim
1Mcgregor; Dave85000McgregorDave
2DeRocca; Leo45000DeRoccaLeo
3Haze; Jim33000HazeJi

Dividing a df column by comma

If your separator is a comma, then we just need to adjust the separator parameter of the split method.

targets[['last_name','first_name']] =  manager.str.split(",", n=1, expand=True)

Dataframe Column values to list

Couple of readers asked me about this topic. We have a detailed post on this topic. Look here.

Drop a redundant Pandas column

And what if you would like to keep only one of the new columns we just created.

targets.drop('first_name', axis=1)

More on removing Pandas dataframe columns can be found here.

Leave a Comment