Pandas DataFrame.fillna()
The DataFrame.fillna() method fills (replaces) NA or NaN values in a DataFrame with specified values.
fillna() can fill NaN values across the whole DataFrame or only in specific columns, modify the DataFrame in place, limit the number of values filled, and fill along a chosen axis.
In this tutorial, we go through some of these scenarios with example programs.
Syntax of DataFrame.fillna
The syntax of the DataFrame.fillna() method is
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None) → DataFrame or None
where
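- value is the fill value. It can be a scalar, a dictionary, a pandas Series, or a DataFrame.
- method can be one of {'backfill', 'bfill', 'pad', 'ffill', None}. 'pad' and 'ffill' propagate the last valid value forward, while 'backfill' and 'bfill' fill backward from the next valid value. Recent pandas releases deprecate this argument in favor of DataFrame.ffill() and DataFrame.bfill().
- axis can be 0 or 'index', 1 or 'columns'. It is the axis along which missing values are filled.
- inplace is a boolean argument. If True, the DataFrame is modified in place; if False, a new DataFrame with the resulting contents is returned.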
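- limit takes an integer or None. If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. If method is not specified, this is the maximum number of NaN entries that are filled along the axis.
- downcast can be a dictionary or None. It specifies which columns to downcast to a smaller dtype where possible.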
The list above gives the values each argument accepts. The examples below show how fillna() is used in practice.
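As a quick illustration of limit and forward filling, here is a minimal sketch. The sample data and variable names are made up for this illustration. It uses DataFrame.ffill(), the method counterpart of method='ffill', so that it does not depend on the method argument.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with consecutive NaN values in each column.
df = pd.DataFrame({
    'a': [np.nan, 1.0, np.nan, np.nan],
    'b': [5.0, np.nan, np.nan, 8.0],
})

# With a scalar value, limit caps how many NaN entries are filled per column.
limited = df.fillna(0, limit=1)

# Forward fill: carry the last valid value downwards, at most one step per gap.
forward = df.ffill(limit=1)

print(limited)
print(forward)
```

In limited, only the first NaN of each column is replaced with 0. In forward, the leading NaN in column a stays as it is because there is no earlier value to carry forward.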
Examples
1. Replace NaN values with 0 in DataFrame
This is one of the most basic uses of the fillna() method, and a good place to start.
In the following program, we create a DataFrame containing NaN values and use fillna() to replace them with 0. We pass 0 for the value argument of fillna().
Python Program
import pandas as pd
import numpy as np

# Create a DataFrame with some NaN values
df = pd.DataFrame(
    [[np.nan, 72, 67],
     [23, 78, 62],
     [32, 74, np.nan],
     [np.nan, 54, 76]],
    columns=['a', 'b', 'c'])

# Replace every NaN value with 0
df_result = df.fillna(0)

print('Original DataFrame\n', df)
print('\nResulting DataFrame\n', df_result)
Output
Original DataFrame
       a   b     c
0   NaN  72  67.0
1  23.0  78  62.0
2  32.0  74   NaN
3   NaN  54  76.0

Resulting DataFrame
       a   b     c
0   0.0  72  67.0
1  23.0  78  62.0
2  32.0  74   0.0
3   0.0  54  76.0
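The output above shows 0.0 rather than 0 in columns a and c. The following sketch, which reuses the same data, counts the missing values before and after filling and then casts the result to integers. The names missing_before, missing_after and df_int are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[np.nan, 72, 67],
     [23, 78, 62],
     [32, 74, np.nan],
     [np.nan, 54, 76]],
    columns=['a', 'b', 'c'])

# Count missing values per column before and after filling.
missing_before = df.isna().sum()
df_result = df.fillna(0)
missing_after = df_result.isna().sum()

# Columns that held NaN stay float64 after filling.
# Cast explicitly if integer columns are wanted.
df_int = df_result.astype('int64')

print(missing_before)
print(missing_after)
print(df_int.dtypes)
```

The cast is safe here because every value is a whole number and no NaN values remain after filling.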
2. Fill NaN values with column specific values in a DataFrame
The value argument can also take a dictionary of column name and value pairs. NaN values in each listed column are replaced with the value specified for that column.
In the following program, we create a DataFrame containing NaN values and use fillna() to replace them with a different value in each column. We pass a dictionary that maps column names to fill values.
Python Program
import pandas as pd
import numpy as np

# Create a DataFrame with NaN values in every column
df = pd.DataFrame(
    [[np.nan, 72, 67],
     [23, np.nan, 62],
     [32, 74, np.nan],
     [np.nan, 54, 76]],
    columns=['a', 'b', 'c'])

# Fill NaN with 0 in column a, 1 in column b, and 2 in column c
df_result = df.fillna(value={'a': 0, 'b': 1, 'c': 2})

print('Original DataFrame\n', df)
print('\nResulting DataFrame\n', df_result)
Output
Original DataFrame
       a     b     c
0   NaN  72.0  67.0
1  23.0   NaN  62.0
2  32.0  74.0   NaN
3   NaN  54.0  76.0

Resulting DataFrame
       a     b     c
0   0.0  72.0  67.0
1  23.0   1.0  62.0
2  32.0  74.0   2.0
3   0.0  54.0  76.0
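Since value also accepts a pandas Series, a common variation is to fill each column with its own mean. The sketch below reuses the same data. The choice of the mean and the names df_mean and df_partial are assumptions made for illustration, not part of the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[np.nan, 72, 67],
     [23, np.nan, 62],
     [32, 74, np.nan],
     [np.nan, 54, 76]],
    columns=['a', 'b', 'c'])

# df.mean() returns a Series indexed by column name (NaN values are skipped),
# so the gaps in each column are filled with that column's own mean.
df_mean = df.fillna(df.mean())

# A dictionary may name only some columns; the others are left untouched.
df_partial = df.fillna({'a': 0})

print(df_mean)
print(df_partial)
```

In df_partial, only column a is filled, so the NaN values in columns b and c remain.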
3. DataFrame.fillna() with inplace=True
By default, fillna() returns a new DataFrame with the filled data and leaves the original DataFrame unchanged. To modify the original DataFrame in place instead, pass True for the inplace argument.
Python Program
import pandas as pd
import numpy as np

# Create a DataFrame with NaN values in every column
df = pd.DataFrame(
    [[np.nan, 72, 67],
     [23, np.nan, 62],
     [32, 74, np.nan],
     [np.nan, 54, 76]],
    columns=['a', 'b', 'c'])

print('Original DataFrame\n', df)

# Fill NaN values with 0 directly in df
df.fillna(value=0, inplace=True)

print('\nModified DataFrame\n', df)
Output
Original DataFrame
       a     b     c
0   NaN  72.0  67.0
1  23.0   NaN  62.0
2  32.0  74.0   NaN
3   NaN  54.0  76.0

Modified DataFrame
       a     b     c
0   0.0  72.0  67.0
1  23.0   0.0  62.0
2  32.0  74.0   0.0
3   0.0  54.0  76.0
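To make the difference between the two modes concrete, the following sketch counts the NaN values left in the original DataFrame after each kind of call. The small DataFrame and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 23.0], 'b': [72.0, np.nan]})

# Default (inplace=False): a new DataFrame is returned and df keeps its NaN values.
filled = df.fillna(0)
nan_left_in_df = int(df.isna().sum().sum())

# inplace=True: df itself is modified, so keep using df after the call.
df.fillna(0, inplace=True)
nan_after_inplace = int(df.isna().sum().sum())

print(nan_left_in_df)
print(nan_after_inplace)
```

With inplace=True, keep working with df itself rather than assigning the result of the call to a variable.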
Summary
In this tutorial of Python Examples, we learned how to use the DataFrame.fillna() method to replace NaN values with a single value, with column-specific values, and in place.