Pandas DataFrame - Select rows without NaN
In Pandas, to select the rows of a DataFrame that contain no NaN values, you can use the notna() method of the DataFrame together with all(axis=1).
The following are the complete steps to select the rows of a DataFrame that do not have NaN.
- Call the notna() method on the given DataFrame. The method returns a boolean DataFrame of the same shape as the original, where a cell is True if the corresponding cell in the original DataFrame is not NaN, and False otherwise.
- Chain the all() method to the notna() call. all(axis=1) checks whether all values in each row are True, meaning there are no NaN values in that row.
- The result of DataFrame.notna().all(axis=1) is a boolean Series where True represents a row without NaN values and False represents a row with at least one NaN value.
- Finally, index the DataFrame with this boolean Series. This returns a DataFrame containing only the rows of the original DataFrame that do not have NaN.
All the above steps can be combined into a single expression, where df is the original DataFrame.
df[df.notna().all(axis=1)]
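To see what each step returns, the single expression can be split into its intermediate results. The following is a minimal sketch, not the only way to write it; the DataFrame with columns 'X' and 'Y' and the variable names cell_mask, row_mask, and result are made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for this illustration
df = pd.DataFrame({'X': [1.0, np.nan, 3.0],
                   'Y': [4.0, 5.0, np.nan]})

# Step 1: element-wise mask, True where the cell is not NaN
cell_mask = df.notna()

# Step 2: reduce each row to one boolean, True only if every cell in the row is True
row_mask = cell_mask.all(axis=1)

# Step 3: boolean indexing keeps only the rows where row_mask is True
result = df[row_mask]

print(row_mask.tolist())  # [True, False, False]
print(result)
```

Here only the first row has no NaN in either column, so it is the only row kept.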
Now, let us go through an example, where we select the rows without NaN values from a DataFrame.
Examples
1. Select rows from given DataFrame without NaN values
In this example, we are given a DataFrame in df. We have to select the rows of this DataFrame that do not have NaN values, using the DataFrame.notna() method.
Steps
- Given a DataFrame in df with three columns 'A', 'B', and 'C', and four rows.
df = pd.DataFrame({
    'A': [1, 2, 3, np.nan],
    'B': [5, np.nan, 7, 8],
    'C': [9, 10, 11, np.nan]})
- Chain the notna() and all() methods on the DataFrame df as shown below.
df.notna().all(axis=1)
This expression returns a boolean Series, with True for the rows without any NaN values and False for the others.
- Use the above expression as a boolean index for the DataFrame.
df[df.notna().all(axis=1)]
This returns a DataFrame containing only the rows of the original DataFrame that do not have any NaN values.
- You may store the returned DataFrame in a variable, and print it to the output.
df_no_nan = df[df.notna().all(axis=1)]
print(df_no_nan)
Program
The following is the complete program to select the rows without NaN from a given DataFrame.
Python Program
import pandas as pd
import numpy as np
# Take a DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, np.nan],
    'B': [5, np.nan, 7, 8],
    'C': [9, 10, 11, np.nan]})
# Select rows without NaN
df_no_nan = df[df.notna().all(axis=1)]
# Print the resulting DataFrame
print(df_no_nan)
Output
     A    B     C
0  1.0  5.0   9.0
2  3.0  7.0  11.0
Only the first and third rows (index labels 0 and 2) do not have NaN values. Therefore, only these rows are returned in the DataFrame, and they keep their original index labels.
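As a quick sanity check on this result, the sketch below rebuilds the same DataFrame and compares the selection against DataFrame.dropna(), which with its default arguments drops every row containing at least one NaN. The comparison is an addition for illustration and is not part of the program above.

```python
import numpy as np
import pandas as pd

# Same data as in the example program
df = pd.DataFrame({
    'A': [1, 2, 3, np.nan],
    'B': [5, np.nan, 7, 8],
    'C': [9, 10, 11, np.nan]})

# Select rows without NaN using the boolean mask
df_no_nan = df[df.notna().all(axis=1)]

# Boolean indexing keeps the original index labels of the selected rows
print(df_no_nan.index.tolist())  # [0, 2]

# With default arguments, dropna() removes rows that have any NaN,
# so it should give the same rows as the mask-based selection
print(df_no_nan.equals(df.dropna()))  # True
```

The mask-based form is useful when you want to keep the boolean Series around, for example to inspect which rows were excluded.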
Summary
In this Pandas Tutorial, we learned how to select rows without NaN values from a given DataFrame, with an example.