Pandas DataFrame - Select Rows by Column Value


Pandas DataFrame - Select Rows by Column value

In Pandas DataFrame, you can select rows by column value using boolean indexing or DataFrame query() method.

In this tutorial, we shall go through examples where we shall select rows from a DataFrame, based on a condition applied on a single column.

1. Select rows from DataFrame by column value using boolean indexing

In this example, we are given a DataFrame in df, and we need to select rows from this DataFrame where the column value quantity is greater than 10, using boolean indexing.

Steps

  1. Import pandas library.
import pandas as pd
  1. Given DataFrame in df.
df = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry', 'fig', 'mango'],
    'quantity': [14, 0, 0, 37, 25],
    'price': [100, 50, 20, 30, 150]
})
  1. Use boolean indexing and select rows where the column 'quantity' has a value greater than 10.
df[df['quantity'] > 10]

The expression df['quantity'] > 10 creates a boolean mask where each row is marked as True if the 'quantity' is greater than 10, or False otherwise. Then, this mask is used to index the DataFrame, selecting only the rows where the condition is True.

In total, the expression returns a new DataFrame with the selected rows that satisfy the given condition based on column value. Assign it to a variable, say df_selected_rows.

  1. You may print the resulting DataFrame df_selected_rows to output.

Program

The complete program to select rows from a DataFrame where the value of a specific column 'quantity' is greater than 10, is given below.

Python Program

import pandas as pd

# Take a DataFrame
df = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry', 'fig', 'mango'],
    'quantity': [14, 0, 0, 37, 25],
    'price': [100, 50, 20, 30, 150]
})

# Select rows by column value
df_selected_rows = df[df['quantity'] > 10]

# Print DataFrame
print(f"Original DataFrame\n{df}\n")
print(f"Selected Rows\n{df_selected_rows}")

Output

Original DataFrame
     name  quantity  price
0   apple        14    100
1  banana         0     50
2  cherry         0     20
3     fig        37     30
4   mango        25    150

Selected Rows
    name  quantity  price
0  apple        14    100
3    fig        37     30
4  mango        25    150

2. Select rows from DataFrame by column value using query() method

In this example, we are given a DataFrame in df, and we need to select rows from this DataFrame where the column value quantity is greater than 10, using DataFrame query() method.

Steps

  1. Import pandas library.
import pandas as pd
  1. Given DataFrame in df.
df = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry', 'fig', 'mango'],
    'quantity': [14, 0, 0, 37, 25],
    'price': [100, 50, 20, 30, 150]
})
  1. Call query() method on df DataFrame and pass the query that selects rows where the column 'quantity' has a value greater than 10.
df.query('quantity > 10')

The query method returns a new DataFrame with the rows selected by the given query. Assign it to a variable, say df_selected_rows.

  1. You may print the resulting DataFrame df_selected_rows to output.

Program

The complete program to select rows from a DataFrame where the value of a specific column 'quantity' is greater than 10, is given below.

Python Program

import pandas as pd

# Take a DataFrame
df = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry', 'fig', 'mango'],
    'quantity': [14, 0, 0, 37, 25],
    'price': [100, 50, 20, 30, 150]
})

# Select rows by column value
df_selected_rows = df.query('quantity > 10')

# Print DataFrame
print(f"Original DataFrame\n{df}\n")
print(f"Selected Rows\n{df_selected_rows}")

Output

Original DataFrame
     name  quantity  price
0   apple        14    100
1  banana         0     50
2  cherry         0     20
3     fig        37     30
4   mango        25    150

Selected Rows
    name  quantity  price
0  apple        14    100
3    fig        37     30
4  mango        25    150

3. Select rows from DataFrame where column value is equal to 0 using boolean indexing

In this example, we are given a DataFrame in df, and we need to select rows from this DataFrame where the column value quantity is equal to 0.

Steps

Follow the same steps given in the previous example, except for Step 3, where the condition becomes

df[df['quantity'] == 0]

Program

The complete program to select rows from a DataFrame where the value of a specific column 'quantity' is equal to 0, is given below.

Python Program

import pandas as pd

# Take a DataFrame
df = pd.DataFrame({
    'name': ['apple', 'banana', 'cherry', 'fig', 'mango'],
    'quantity': [14, 0, 0, 37, 25],
    'price': [100, 50, 20, 30, 150]
})

# Select rows by column value
df_selected_rows = df[df['quantity'] == 0]

# Print DataFrame
print(f"Original DataFrame\n{df}\n")
print(f"Selected Rows\n{df_selected_rows}")

Output

Original DataFrame
     name  quantity  price
0   apple        14    100
1  banana         0     50
2  cherry         0     20
3     fig        37     30
4   mango        25    150

Selected Rows
     name  quantity  price
1  banana         0     50
2  cherry         0     20

Summary

In this Pandas Tutorial, we learned how to select rows from a DataFrame based on a column value using boolean indexing, with example programs.