Pandas DataFrame - Replace values in column based on condition


To replace values in a column based on a condition in a Pandas DataFrame, you can use the DataFrame.loc property, the numpy.where() function, or the DataFrame.where() method.

In this tutorial, we will go through each of these approaches with example programs.


1. Replace values in column by a condition using DataFrame.loc property

To replace values in a column based on a condition using DataFrame.loc, use the following syntax.

DataFrame.loc[condition, column_name] = new_value

In the following program, we will replace those values in the column 'a' that satisfy the condition that the value is less than zero.

Python Program

import pandas as pd

df = pd.DataFrame([
	[-10, -9, 8],
	[6, 2, -4],
	[-8, 5, 1]],
	columns=['a', 'b', 'c'])

df.loc[(df.a < 0), 'a'] = 0
print(df)

Output

   a  b  c
0  0 -9  8
1  6  2 -4
2  0  5  1
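
The condition passed to loc can also combine several comparisons. The following is a minimal sketch, assuming the same sample DataFrame as above; the combined condition (column 'a' is negative and column 'b' is positive) is chosen purely for illustration. Each comparison needs its own parentheses, and the element-wise operators & (and), | (or) and ~ (not) are used in place of the Python keywords.

```python
import pandas as pd

df = pd.DataFrame([
    [-10, -9, 8],
    [6, 2, -4],
    [-8, 5, 1]],
    columns=['a', 'b', 'c'])

# Illustrative condition: 'a' is negative AND 'b' is positive.
# Only the last row of the sample data meets both parts.
condition = (df['a'] < 0) & (df['b'] > 0)

# Replace 'a' with 0 in the rows selected by the combined condition.
df.loc[condition, 'a'] = 0
print(df)
```

With this data, only the value of 'a' in the last row becomes 0; the first row keeps -10 because its 'b' value is not positive.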

You can also replace the values in multiple columns based on a single condition. Pass the columns as a list to loc.

DataFrame.loc[condition, [column_1, column_2]] = new_value

In the following program, we will replace the values in columns 'a' and 'b' with zero in those rows where the value in column 'a' is less than zero.

Python Program

import pandas as pd

df = pd.DataFrame([
	[-10, -9, 8],
	[6, 2, -4],
	[-8, 5, 1]],
	columns=['a', 'b', 'c'])

df.loc[(df.a < 0), ['a', 'b']] = 0
print(df)

Output

   a  b  c
0  0  0  8
1  6  2 -4
2  0  0  1
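
Note that the condition in the program above looks only at column 'a', so the value 5 in column 'b' is replaced even though it is not negative. To test each column against its own values instead, one option is to apply loc once per column. A minimal sketch, assuming the same sample data:

```python
import pandas as pd

df = pd.DataFrame([
    [-10, -9, 8],
    [6, 2, -4],
    [-8, 5, 1]],
    columns=['a', 'b', 'c'])

# Apply the condition to each column separately, so a value is
# replaced only when that value itself is less than zero.
for col in ['a', 'b']:
    df.loc[df[col] < 0, col] = 0

print(df)
```

Here the 5 in column 'b' is left as it is, because the check for column 'b' uses the values of 'b' rather than those of 'a'.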

2. Replace values in column by a condition using numpy.where() function

To replace values in a column based on a condition using numpy.where(), use the following syntax.

DataFrame['column_name'] = numpy.where(condition, new_value, DataFrame.column_name)

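In the following program, we will use the numpy.where() function and replace those values in the column 'a' that satisfy the condition that the value is less than zero. Values that do not satisfy the condition are taken from the third argument, which is the original column.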

Python Program

import pandas as pd
import numpy as np

df = pd.DataFrame([
	[-10, -9, 8],
	[6, 2, -4],
	[-8, 5, 1]],
	columns=['a', 'b', 'c'])

df['a'] = np.where((df.a < 0), 0, df.a)
print(df)

Output

   a  b  c
0  0 -9  8
1  6  2 -4
2  0  5  1
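
Because the third argument of numpy.where() supplies the values used where the condition is False, a second numpy.where() call can be nested there to handle a further condition. The following is a minimal sketch, assuming the same sample data; the upper limit of 5 is an arbitrary value picked for illustration.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame([
    [-10, -9, 8],
    [6, 2, -4],
    [-8, 5, 1]],
    columns=['a', 'b', 'c'])

# First condition: negative values become 0.
# Second condition (illustrative): values above 5 become 5.
# Everything else keeps its original value.
df['a'] = np.where(df['a'] < 0, 0,
                   np.where(df['a'] > 5, 5, df['a']))
print(df)
```

For more than two or three conditions the nesting becomes hard to read, so this form is best kept for short cases.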

3. Replace values in column by a condition using DataFrame.where() method

To replace values in a column based on a condition using where(), use the following syntax.

DataFrame['column_name'] = DataFrame['column_name'].where(~(condition), other=new_value)
  • column_name is the column in which values have to be replaced.
  • condition is a boolean expression that is evaluated for each value in the column.
  • where() keeps the values for which its first argument is True and replaces the rest, so the condition is negated with ~ in order to replace the values that satisfy it.
  • new_value is the replacement value. where() returns a new object, so the result is assigned back to the column.

In the following program, we will use where() and replace those values in the column 'a' that satisfy the condition that the value is less than zero. Since df['a'] is a Series, the call resolves to Series.where(), which works the same way as DataFrame.where().

Python Program

import pandas as pd

df = pd.DataFrame([
	[-10, -9, 8],
	[6, 2, -4],
	[-8, 5, 1]],
	columns=['a', 'b', 'c'])

df['a'] = df['a'].where(~(df.a < 0), other=0)
print(df)

Output

   a  b  c
0  0 -9  8
1  6  2 -4
2  0  5  1
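
pandas also provides mask(), the counterpart of where(): it replaces the values for which the condition is True, so the condition does not need to be negated. A minimal sketch, assuming the same sample data:

```python
import pandas as pd

df = pd.DataFrame([
    [-10, -9, 8],
    [6, 2, -4],
    [-8, 5, 1]],
    columns=['a', 'b', 'c'])

# mask() replaces values where the condition is True,
# which is the opposite of where().
df['a'] = df['a'].mask(df['a'] < 0, 0)
print(df)
```

The resulting DataFrame is the same as the one produced with where() and the negated condition.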

Summary

In this tutorial of Python Examples, we learned how to replace values in a DataFrame column with a new value based on a condition, using DataFrame.loc, numpy.where(), and DataFrame.where().

