Pandas groupby sum


By default, groupby() sorts the group keys in ascending order. If you want the keys in descending order, call sort_index(ascending=False) on the grouped result.


You can use the following basic syntax to find the sum of values by group in pandas:

df.groupby(['group1','group2'])['sum_col'].sum().reset_index()

The following examples show how to use this syntax in practice with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'C', 'G', 'F', 'F', 'C'],
                   'points': [25, 17, 14, 9, 12, 9, 6, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

  team position  points  rebounds
0    A        G      25        11
1    A        G      17         8
2    A        F      14        10
3    A        C       9         6
4    B        G      12         6
5    B        F       9         5
6    B        F       6         9
7    B        C       4        12

Example 1: Group by One Column, Sum One Column

The following code shows how to group by one column and sum the values in one column:

#group by team and sum the points
df.groupby(['team'])['points'].sum().reset_index()

  team  points
0    A      65
1    B      31

From the output we can see that:

  • The players on team A scored a sum of 65 points.
  • The players on team B scored a sum of 31 points.

Example 2: Group by Multiple Columns, Sum Multiple Columns

The following code shows how to group by multiple columns and sum multiple columns:

#group by team and position, sum points and rebounds
#(select multiple columns with a list inside the brackets)
df.groupby(['team', 'position'])[['points', 'rebounds']].sum().reset_index()

  team position  points  rebounds
0    A        C       9         6
1    A        F      14        10
2    A        G      42        19
3    B        C       4        12
4    B        F      15        14
5    B        G      12         6

From the output we can see that:

  • The players on team A in the ‘C’ position scored a sum of 9 points and 6 rebounds.
  • The players on team A in the ‘F’ position scored a sum of 14 points and 10 rebounds.
  • The players on team A in the ‘G’ position scored a sum of 42 points and 19 rebounds.

And so on.

Note that the reset_index() function prevents the grouping columns from becoming part of the index.


What does the sum() function do with groupby()?

Using the pandas sum() function on a DataFrameGroupBy object, you can calculate the sum of numeric columns for each group. This article also explains how to get the sum of grouped data for single or multiple columns using other functions such as agg() and transform().


Sum Values by Group in Pandas

Top 9 Ways to Sum Values by Group in Pandas

When working with data in Python, the pandas library is an indispensable tool for data manipulation.


Conclusion

In this article, I have explained the groupby() and sum() functions and how to use them together to group data on single or multiple columns of a DataFrame and calculate the sum for each group, with multiple examples. The groupby() function returns a DataFrameGroupBy object, which can be used to perform aggregate functions on each group.

In this article, I will explain how to use the groupby() and sum() functions together, with examples.

Using transform() Function with DataFrame.groupby().sum()

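You can also apply transform() to the groupby() result. Unlike sum(), which returns one row per group, transform('sum') returns the group total for every original row, so the result keeps the same index as the original DataFrame.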
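As a minimal sketch of transform() with a group sum, assuming a cut-down version of the team/points DataFrame used earlier in this article (the column name team_points is made up for illustration):

```python
import pandas as pd

# Reduced version of the article's DataFrame (team and points only)
df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
})

# transform('sum') returns one value per original row (the total of
# that row's group), so it can be assigned straight back as a column.
# 'team_points' is a hypothetical column name.
df['team_points'] = df.groupby('team')['points'].transform('sum')

print(df)
```

This is handy when you need the group total next to each row, for example to compute each player's share of the team's points.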

Pandas DataFrame.set_index Using Sum with Level

You can also use DataFrame.set_index() to move the grouping columns into the index and then sum by index level, instead of grouping on the columns directly.

Each approach offers slightly different advantages depending on your specific needs and desired output format.
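A hedged sketch of the set_index() approach, assuming the team/position DataFrame from the examples earlier in this article. It uses groupby(level=...) to sum by index level; older pandas releases also accepted a level argument directly on sum(), but that form is not available in pandas 2.x, so it is avoided here. The variable name by_level is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'position': ['G', 'G', 'F', 'C', 'G', 'F', 'F', 'C'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12],
})

# Move the grouping columns into the index, then group on the index levels.
by_level = (
    df.set_index(['team', 'position'])
      .groupby(level=['team', 'position'])
      .sum()
)

print(by_level)
```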


Q: How can I create a pivot table from grouped sums?

ANS: After grouping and summing, you can reshape the result with the pivot() method (or with unstack() if the group keys are still in the index).
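One way this could look, as a sketch that assumes the article's team/position DataFrame and uses pivot() on the flattened sums (the names sums and pivoted are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'position': ['G', 'G', 'F', 'C', 'G', 'F', 'F', 'C'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
})

# Step 1: grouped sums as a flat DataFrame
sums = df.groupby(['team', 'position'])['points'].sum().reset_index()

# Step 2: one row per team, one column per position
pivoted = sums.pivot(index='team', columns='position', values='points')

print(pivoted)
```

Calling unstack() on the grouped Series before reset_index() should give the same layout in a single step.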

Pandas Group By & Sum Using agg() Aggregate Function

Instead of calling the sum() function directly, you can pass 'sum' to the aggregate function, groupby(...).agg('sum'), to aggregate the pandas DataFrame results.

Use this approach when you want a summarized table with one row per group.
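A short sketch comparing agg('sum') with sum(), assuming the article's example DataFrame (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12],
})

grouped = df.groupby('team')[['points', 'rebounds']]

# agg('sum') and sum() produce the same summarized table
via_agg = grouped.agg('sum')
via_sum = grouped.sum()

print(via_agg)
```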

Q: Is there a way to use SQL syntax with pandas?

ANS: Yes, libraries like pandasql allow you to execute SQL queries directly on pandas DataFrames, providing an alternative if you prefer SQL.

Q: What does setting as_index=False in groupby do?

ANS: When as_index=False is set in groupby(), the columns used for grouping are retained as regular columns in the output DataFrame, rather than becoming the index.
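A minimal sketch of the difference, assuming the article's team/points data (variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
})

# Default: 'team' becomes the index of the result
indexed = df.groupby('team')['points'].sum()

# as_index=False: 'team' stays a regular column
flat = df.groupby('team', as_index=False)['points'].sum()

print(flat)
```

The flat result is equivalent to calling reset_index() on the default output.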

Q: How do I apply multiple aggregations at once?

ANS: The agg() method is ideal for this. Pass it a list of aggregation names to get one output column per aggregation.
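For illustration, a sketch that computes the sum, mean, and max of points per team in one call, assuming the article's example data (the name summary is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
})

# One output column per aggregation name in the list
summary = df.groupby('team')['points'].agg(['sum', 'mean', 'max'])

print(summary)
```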

Grouping and summing on single or multiple columns can be accomplished in several ways in pandas, including the sum(), agg(), and transform() functions.

Key Points –

  • The groupby() function is used to group data in a DataFrame based on one or more columns, allowing for aggregation or transformation of the grouped data.
  • After grouping data with groupby(), the sum() method can be used to calculate the sum of numeric values for each group.
  • When applying sum() with groupby(), you can group by multiple columns, and the sum will be computed for each unique combination of the group keys.
  • You can specify which columns to sum after grouping, either by selecting them before applying sum() or by using the agg() method.
  • After performing groupby().sum() on multiple grouping columns, the result has a hierarchical index unless you call reset_index().

    For example, here’s what the output looks like if we don’t use reset_index():

    #group by team and position, sum points and rebounds
    df.groupby(['team', 'position'])[['points', 'rebounds']].sum()

                   points  rebounds
    team position
    A    C              9         6
         F             14        10
         G             42        19
    B    C              4        12
         F             15        14
         G             12         6

    Depending on how you’d like the results to appear, you may or may not choose to use the reset_index() function.

    Additional Resources

    The following tutorials explain how to perform other common grouping operations in pandas:

    How to Count Observations by Group in Pandas
    How to Find the Max Value by Group in Pandas
    How to Calculate Quantiles by Group in Pandas


To apply the sum to a single column, select that column after groupby() and before sum(), for example df.groupby('team')['points'].sum().

Pandas groupby() & sum() on Multiple Columns

You can also pass a list of columns to the groupby() method. This groups by multiple columns and calculates a sum for each combination of group keys.

You can also explicitly specify which column the sum operation should be applied to. For example, selecting a column and calling transform('sum') calculates the total for each group, and the result is a Series with the same index as the original DataFrame.


Q: Can I rename the aggregated sum column?

ANS: Yes, you can use the agg() method with named aggregation. The reset_index() function can then be used to move the group keys from the index back into regular columns.
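A sketch of named aggregation, assuming the article's example DataFrame. The output names total_points and total_rebounds are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12],
})

# Named aggregation: new_column_name=(source_column, aggregation)
renamed = (
    df.groupby('team')
      .agg(total_points=('points', 'sum'),
           total_rebounds=('rebounds', 'sum'))
      .reset_index()  # turn the 'team' index back into a column
)

print(renamed)
```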

If you want to sort by a different key, such as the summed values rather than the group labels, call sort_values() on the result.
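The sorting options can be sketched as follows, assuming the article's team/points data (variable names are illustrative). The first result orders the group labels in descending order, the second orders by the summed values:

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'points': [25, 17, 14, 9, 12, 9, 6, 4],
})

totals = df.groupby('team')['points'].sum()

# Sort by group key, descending: B first, then A
keys_desc = totals.sort_index(ascending=False)

# Sort by the summed values, largest first: A (65), then B (31)
values_desc = totals.sort_values(ascending=False)

print(keys_desc)
print(values_desc)
```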