How to create a pandas Series or DataFrame from NumPy arrays in Python?

In today’s recipe we’ll show how to convert NumPy arrays to pandas Series or DataFrame objects in Python.

We’ll look into the following scenarios:

  1. Convert an array to a Series
  2. Convert an ndarray to a DataFrame
  3. Convert a NumPy array to a DataFrame with column names
  4. Insert an array as a new row in an existing DataFrame

Data Preparation

Let’s define a simple two-dimensional array that we’ll use as an example in this recipe. Copy this Python code and paste it into your favorite development environment or a Jupyter notebook.

# Python3
import numpy as np
import pandas as pd
np.random.seed(10)

my_array = np.random.randint(1000, 10000, (6,2))

Note: If you receive a ModuleNotFoundError when importing NumPy, the package is most likely not installed in the Python environment you are running.

Let’s take a look at the auto-generated dataset:

print(my_array)
[[2289 8293]
 [2344 8291]
 [5829 2520]
 [7400 6648]
 [5452 1239]
 [3443 3102]]

We have created an ndarray; let’s verify that:

type(my_array)

numpy.ndarray

Note: Use the shape attribute to find out the number of rows and columns of the ndarray (my_array.shape).
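As a quick illustration of the note above, here is a minimal sketch of reading the dimensions of an ndarray. The name example_array is a stand-in introduced for this sketch; it only mimics the 6x2 shape of my_array.

```python
import numpy as np

# example_array is a hypothetical stand-in with the same 6x2 shape as my_array
example_array = np.arange(12).reshape(6, 2)

# shape is a tuple: (number of rows, number of columns)
rows, cols = example_array.shape
print(rows, cols)

# ndim gives the number of dimensions (2 for a table-like array)
print(example_array.ndim)
```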

Convert Numpy array to Pandas Series/Column

The first case we’ll cover is writing one of the ndarray columns to a pandas Series object. That’s easy with the pd.Series constructor.

# 1. NumPy ndarray to pandas Series
actuals = pd.Series(data=my_array[:, 0], dtype='int32')
actuals.head()

As expected we got a series:

0    2289
1    2344
2    5829
3    7400
4    5452
dtype: int32
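The Series above gets a default integer index and no name. If you want labels, the pd.Series constructor also accepts index and name arguments. The sketch below assumes three values taken from the first column shown earlier; the quarter labels are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd

# First three values of the first column printed earlier
values = np.array([2289, 2344, 5829])

# Hypothetical labels, introduced only for this example
quarters = ['Q1', 'Q2', 'Q3']

# index sets the row labels, name sets the Series name
budget = pd.Series(data=values, index=quarters, name='budget', dtype='int32')

# Values can now be looked up by label
print(budget['Q2'])
```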

Ndarray to Pandas Dataframe

We’ll now write our Numpy ndarray directly to a Pandas Dataframe.

Without column names

#2. array to df
revenue = pd.DataFrame(data = my_array)
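When no column names are passed, pandas labels the columns with integers starting at 0. A minimal sketch, using a small hypothetical two-row array (sample) rather than the full my_array:

```python
import numpy as np
import pandas as pd

# sample is a small stand-in array using the first two rows shown earlier
sample = np.array([[2289, 8293], [2344, 8291]])

df = pd.DataFrame(data=sample)

# Without explicit names, columns (and rows) get default integer labels
print(list(df.columns))
print(list(df.index))
```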

With column names

For better readability and easier data analysis, define column headings as shown below:

#3. Numpy array to Pandas dataframe with columns
revenue = pd.DataFrame(data = my_array, columns= ['budget', 'actual'] )
revenue.head()

Here are the first five rows of our DataFrame:

   budget  actual
0    2289    8293
1    2344    8291
2    5829    2520
3    7400    6648
4    5452    1239
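A one-dimensional array can also be written into a single column of an existing DataFrame, provided its length matches the number of rows. The sketch below is one way to do that; revenue_sample, the forecast column and its values are hypothetical and not part of the dataset above.

```python
import numpy as np
import pandas as pd

# revenue_sample is a small stand-in for the revenue DataFrame
revenue_sample = pd.DataFrame(
    data=np.array([[2289, 8293], [2344, 8291], [5829, 2520]]),
    columns=['budget', 'actual'],
)

# Hypothetical values; the array length must equal the number of rows
forecast = np.array([2500, 2400, 6000])

# Assigning to a new column label adds the array as a column
revenue_sample['forecast'] = forecast
print(revenue_sample)
```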

Add Numpy array to DataFrame row

We’ll now create a simple array and append it to the DataFrame as a new row.

# 4. Insert NumPy array as a new row in an existing DataFrame
new_array = np.array([1850, 1950])
revenue.loc[len(revenue)] = new_array
revenue.tail()

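Our new ndarray was inserted as the last row of our DataFrame. This works because the DataFrame uses the default 0-based index, so len(revenue) is the first unused row label.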
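The .loc assignment adds one row at a time. If you have several rows to add, one possible alternative is to wrap the array in its own DataFrame and combine the two with pd.concat. This is a sketch under that assumption; revenue_sample and the values in new_rows are hypothetical.

```python
import numpy as np
import pandas as pd

# revenue_sample is a small stand-in for the revenue DataFrame
revenue_sample = pd.DataFrame(
    data=np.array([[2289, 8293], [2344, 8291]]),
    columns=['budget', 'actual'],
)

# Hypothetical two-row array to append
new_rows = np.array([[1850, 1950], [2100, 2050]])

# Reuse the existing column names so the frames line up
new_df = pd.DataFrame(data=new_rows, columns=revenue_sample.columns)

# ignore_index=True renumbers the rows 0..n-1 in the result
combined = pd.concat([revenue_sample, new_df], ignore_index=True)
print(combined)
```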

Note: If required, you can reset the DataFrame index using the reset_index method. It returns a new DataFrame, so assign the result back if you want to keep it:

revenue = revenue.reset_index(drop=True)
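Resetting the index is mostly useful after an operation that leaves gaps in the row labels, such as filtering. A minimal sketch, assuming a small hypothetical DataFrame (df) and an arbitrary filter threshold:

```python
import numpy as np
import pandas as pd

# df is a small stand-in DataFrame using rows shown earlier
df = pd.DataFrame(
    data=np.array([[2289, 8293], [2344, 8291], [5829, 2520]]),
    columns=['budget', 'actual'],
)

# Filtering keeps the original row labels, so the index no longer starts at 0
filtered = df[df['budget'] > 2300]
print(list(filtered.index))

# drop=True discards the old labels instead of keeping them as a column
renumbered = filtered.reset_index(drop=True)
print(list(renumbered.index))
```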
