Query Rows in DataFrame based on Column Values Condition
Pandas DataFrame - Query based on Columns
To query DataFrame rows based on a condition applied to columns, you can use the pandas.DataFrame.query() method.
By default, query() returns a new DataFrame containing the rows that satisfy the condition and leaves the original DataFrame unchanged. You can also pass the inplace=True argument to filter the original DataFrame itself, in which case the method returns None.
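The default behavior described above can be sketched with a small DataFrame. The data and the names df and filtered are made up for illustration only; the point is that the call hands back a new DataFrame and the original keeps all of its rows.

```python
import pandas as pd

# Hypothetical two-column DataFrame, used only for illustration
df = pd.DataFrame({'a': [10, 60, 70], 'b': [1, 2, 3]})

# Default call: the filtered rows come back as a new DataFrame
filtered = df.query('a > 50')

print(len(filtered))  # number of rows that satisfy the condition
print(len(df))        # the original DataFrame still has every row
```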
Examples
1. Query DataFrame with condition on a column
In this example, we query the DataFrame with a boolean expression on a single column. query() returns a DataFrame containing only the rows that satisfy the expression.
Python Program
import pandas as pd

# initialize a DataFrame
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# query a single column
df1 = df.query('a>50')

# print the filtered DataFrame
print(df1)
Output
    a   b   c
3  73  88  67
7  52  54  76
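For comparison, the same filter can be written with a boolean mask instead of a query string. This is a minimal sketch, not a recommendation of one style over the other; the names by_query and by_mask are chosen here for illustration.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# Filter with a query string
by_query = df.query('a>50')

# Filter with a boolean mask built from the column
by_mask = df[df['a'] > 50]

# Both approaches select the same rows
print(by_query.equals(by_mask))
```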
2. Query DataFrame with condition on multiple columns using AND operator
In this example, we apply conditions on multiple columns and combine them with the and operator. A row is returned only if it satisfies both conditions.
Python Program
import pandas as pd

# initialize a DataFrame
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# query multiple columns with and
df1 = df.query('a>30 and c>60')

# print the filtered DataFrame
print(df1)
Output
    a   b   c
3  73  88  67
5  43  78  69
7  52  54  76
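The thresholds in the expression do not have to be hard-coded. query() lets an expression refer to a Python variable by prefixing its name with @. The sketch below wraps the same AND condition in a helper; the function name filter_rows and the parameters min_a and min_c are hypothetical, introduced here only for illustration.

```python
import pandas as pd

def filter_rows(df, min_a, min_c):
    # filter_rows, min_a and min_c are hypothetical names.
    # The @ prefix makes query() read the Python variables
    # instead of looking for columns with those names.
    return df.query('a > @min_a and c > @min_c')

# Same data as in the example above
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# Same condition as 'a>30 and c>60', with thresholds passed in
df1 = filter_rows(df, 30, 60)
print(df1)
```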
3. Query DataFrame with condition on multiple columns using OR operator
In this example, we apply conditions on multiple columns and combine them with the or operator. A row is returned if it satisfies at least one of the conditions.
Python Program
import pandas as pd

# initialize a DataFrame
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# query multiple columns with or
df1 = df.query('a>50 or c>60')

# print the filtered DataFrame
print(df1)
Output
    a   b   c
0  21  72  67
1  23  78  62
3  73  88  67
5  43  78  69
7  52  54  76
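Inside a query expression, the symbol | can be used in place of the word or. The sketch below, which assumes the same data as above, compares the two spellings; the parentheses around each comparison are added for readability, and the names df_or_word and df_or_symbol are illustrative.

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# OR written with the word operator
df_or_word = df.query('a>50 or c>60')

# OR written with the | symbol
df_or_symbol = df.query('(a>50) | (c>60)')

# Both expressions select the same rows
print(df_or_word.equals(df_or_symbol))
```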
4. Query DataFrame with inplace parameter
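We can pass inplace=True to modify the actual DataFrame we are working on. In that case, query() does not return a new DataFrame; the rows that do not satisfy the condition are dropped from the original.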
Python Program
import pandas as pd

# initialize a DataFrame
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# query the DataFrame with inplace=True
df.query('a>50 and c>60', inplace=True)

# print the modified DataFrame
print(df)
Output
    a   b   c
3  73  88  67
7  52  54  76
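One consequence of inplace=True is easy to miss: the call returns None, so assigning its result back to df would discard the data. A minimal sketch, assuming the same data as above (the variable name result is illustrative):

```python
import pandas as pd

# Same data as in the example above
df = pd.DataFrame(
    [[21, 72, 67],
     [23, 78, 62],
     [32, 74, 56],
     [73, 88, 67],
     [32, 74, 56],
     [43, 78, 69],
     [32, 74, 54],
     [52, 54, 76]],
    columns=['a', 'b', 'c'])

# With inplace=True the call returns None and df itself is filtered
result = df.query('a>50 and c>60', inplace=True)

print(result is None)   # the return value carries no data
print(list(df.index))   # original row labels of the rows that remain
```

If you prefer to keep the return value, drop inplace=True and write df = df.query('a>50 and c>60') instead.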
Summary
In this Pandas Tutorial, we learned how to query a DataFrame with conditions applied on columns.