How to count missing values in Pandas?

In this recipe we’ll learn how to easily count missing values in Pandas DataFrames and Series. We’ll specifically look into two key cases:

  • Count missing/null values in a DataFrame column (Series).
  • Count missing values in a specific DataFrame row

Create the Dataset

Type this into your Python environment:

# Python3
# let's create the dataframe 

import pandas as pd
data= [["Debbie", np.nan], ["Tommy", np.nan],[np.nan, 25000],
                 ['Tim', 30000]]

cols = ['person', 'budget']
df = pd.DataFrame(data, columns=cols)
df.head()

Here’s our newly created Dataset:

personbudget
0DebbieNaN
1TommyNaN
2NaN25000.0
3Tim30000.0

Count nan values in Pandas columns

We’ll first subset out the budget column into a Series an apply the isna() Series method.

# count NA in a column
df['budget'].isna().sum()

The result is 2.

Important Note: Using the df[‘budget’].isna().count() statement will simply calculate the Series lenght and provide an erroneous result. In this case – 4.

Count missing values in a row

A bit more interesting scenario is how to calculate the number of nan values ina specific row.

# sum NA in rows
df['num_of_na'] = df.apply(lambda na: na.isnull().sum(), axis=1)

We use a lambda function to calculate the number of missing values. The axis=1 statement allows us to apply the calculation across the different columns of the Dataframe.

personbudgetnum_of_na
0DebbieNaN1
1TommyNaN1
2NaN25000.01
3Tim30000.00

In case of feedback or questions, please leave us a comment.

Leave a Comment