How to Calculate Mean of Pandas DataFrame? Python Examples
Python Pandas - Mean of DataFrame
To calculate the mean of a pandas DataFrame, use the pandas.DataFrame.mean() method. With mean(), you can calculate the mean along an axis or over the complete DataFrame.
In this tutorial, you'll learn how to find the mean of a DataFrame along rows, along columns, or over the complete DataFrame using the DataFrame.mean() method, with examples.
Examples
1. Find Mean along columns of DataFrame
In this example, we will calculate the mean along the columns. This gives the average marks obtained by the students in each subject.
Python Program
import pandas as pd
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
'physics': [68, 74, 77, 78],
'chemistry': [84, 56, 73, 69],
'algebra': [78, 88, 82, 87]}
# Create DataFrame
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)
# Calculate mean of each numeric column; numeric_only=True skips the string column 'names'
mean = df_marks.mean(numeric_only=True)
print('\nMean\n------')
print(mean)
Output
DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87
Mean
------
physics      74.25
chemistry    70.50
algebra      83.75
dtype: float64
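The mean() method returns a pandas Series containing one mean per numeric column. Calculating along the columns is the default behavior, so you need not pass an axis argument. Because the names column holds strings, pass numeric_only=True so that only the numeric columns are averaged; recent pandas versions (2.0 and later) raise a TypeError otherwise. To state explicitly that the mean is calculated along the columns, pass axis=0 as shown below.
df_marks.mean(axis=0, numeric_only=True)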
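As a small illustration of the returned Series, the sketch below checks that omitting axis and passing axis=0 give the same result, and then reads the mean of a single subject by its column label. It reuses the marks data from the example above; the variable names default_mean and column_mean are only illustrative.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# Mean along columns, once with the default axis and once with axis=0
default_mean = df_marks.mean(numeric_only=True)
column_mean = df_marks.mean(axis=0, numeric_only=True)

# Both calls produce the same Series, indexed by column name
print(default_mean.equals(column_mean))

# A single mean can be read from the Series by its label
print(column_mean['physics'])
```

Since the result is an ordinary Series, the usual label-based access works on it.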
2. Find Mean of complete DataFrame
In this example, we will calculate the mean of all the numeric values in the DataFrame.
From the previous example, we have seen that mean() by default calculates the mean along the columns and returns a pandas Series. Apply mean() to the returned Series to get the mean of the complete DataFrame.
Python Program
import pandas as pd
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
'physics': [68, 74, 77, 78],
'chemistry': [84, 56, 73, 69],
'algebra': [78, 88, 82, 87]}
# Create DataFrame
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)
# Calculate mean of the whole DataFrame: column means first, then the mean of those means
mean = df_marks.mean(numeric_only=True).mean()
print('\nMean\n------')
print(mean)
Output
DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87
Mean
------
76.16666666666667
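The mean of the column means matches the mean of all values here because every column has the same number of entries. As a cross-check, the following minimal sketch averages the underlying NumPy array directly and compares the two results. It assumes the same marks data; select_dtypes() and to_numpy() are standard pandas methods, while the variable names are illustrative.

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# Keep only the numeric columns
numeric = df_marks.select_dtypes(include='number')

# Approach used in the example above: mean of the column means
mean_of_means = numeric.mean().mean()

# Cross-check: mean over every value in the underlying array
overall_mean = numeric.to_numpy().mean()

print(mean_of_means)
print(overall_mean)
```

If the columns had different numbers of missing values, the two results could differ, because mean() skips missing values column by column.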
3. Find Mean of DataFrame along Rows
In this example, we will calculate the mean along rows by passing axis=1. Here, the mean along rows gives the average marks obtained by each student, which is also the percentage if each subject is marked out of 100.
Python Program
import pandas as pd
mydictionary = {'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
'physics': [68, 74, 77, 78],
'chemistry': [84, 56, 73, 69],
'algebra': [78, 88, 82, 87]}
# Create dataframe
df_marks = pd.DataFrame(mydictionary)
print('DataFrame\n----------')
print(df_marks)
# Calculate mean along rows; numeric_only=True skips the string column 'names'
mean = df_marks.mean(axis=1, numeric_only=True)
print('\nMean\n------')
print(mean)
# Display names and average marks
print('\nAverage marks or percentage for each student')
print(pd.concat([df_marks['names'], mean], axis=1))
Output
DataFrame
----------
  names  physics  chemistry  algebra
0  Somu       68         84       78
1  Kiku       74         56       88
2  Amol       77         73       82
3  Lini       78         69       87
Mean
------
0    76.666667
1    72.666667
2    77.333333
3    78.000000
dtype: float64
Average marks or percentage for each student
  names          0
0  Somu  76.666667
1  Kiku  72.666667
2  Amol  77.333333
3  Lini  78.000000
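The column label 0 in the output above appears because the Series returned by mean(axis=1) has no name. One possible way to get a descriptive label, sketched here with an illustrative column name average, is to attach the row means with assign().

```python
import pandas as pd

df_marks = pd.DataFrame({
    'names': ['Somu', 'Kiku', 'Amol', 'Lini'],
    'physics': [68, 74, 77, 78],
    'chemistry': [84, 56, 73, 69],
    'algebra': [78, 88, 82, 87],
})

# Mean along rows, skipping the string column 'names'
row_mean = df_marks.mean(axis=1, numeric_only=True)

# Pair each name with its row mean under a descriptive column label
# ('average' is an illustrative name, not part of the original example)
result = df_marks[['names']].assign(average=row_mean)
print(result)
```

The row means are aligned with the names by index, so each student keeps the correct average.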
Summary
In this pandas tutorial, we learned how to calculate the mean of a whole DataFrame, the mean along columns, and the mean along rows.