SQL GROUP BY and Aggregate Functions: Summarize Your Data
Published: 2026-04-08
GROUP BY collapses multiple rows that share the same value in one or more columns into a single summary row. Combined with aggregate functions like COUNT, SUM, AVG, MIN, and MAX, it is the foundation of almost every reporting and analytics query. GROUP BY is supported in all SQL databases including SQLite, PostgreSQL, MySQL, SQL Server, and Oracle.
Sample Data
We use two tables in all examples:
employees table:
| id | name | department | hire_date | salary |
|---|---|---|---|---|
| 1 | Alice | Engineering | 2023-01-15 | 95000 |
| 2 | Bob | Engineering | 2023-06-01 | 85000 |
| 3 | Carol | Engineering | 2024-03-10 | 85000 |
| 4 | Dave | Sales | 2023-02-20 | 72000 |
| 5 | Eve | Sales | 2023-09-05 | 68000 |
| 6 | Frank | Sales | 2024-01-15 | 65000 |
| 7 | Grace | Marketing | 2023-04-01 | 82000 |
| 8 | Henry | Marketing | 2024-06-01 | 70000 |
orders table:
| id | customer | product | category | quantity | unit_price | order_date | status |
|---|---|---|---|---|---|---|---|
| 1 | Acme Corp | Laptop Pro | Electronics | 3 | 1299.99 | 2026-01-05 | delivered |
| 2 | Acme Corp | Wireless Mouse | Electronics | 10 | 29.99 | 2026-01-12 | delivered |
| 3 | Globex | Office Chair | Furniture | 5 | 249.99 | 2026-01-18 | delivered |
| 4 | Globex | Standing Desk | Furniture | 2 | 599.99 | 2026-01-20 | shipped |
| 5 | Initech | Notebook Pack | Stationery | 20 | 12.99 | 2026-02-03 | delivered |
| 6 | Initech | Pen Set | Stationery | 15 | 8.99 | 2026-02-07 | delivered |
| 7 | Acme Corp | Monitor 27" | Electronics | 4 | 399.99 | 2026-02-14 | delivered |
| 8 | Umbrella Ltd | Laptop Pro | Electronics | 2 | 1299.99 | 2026-02-20 | cancelled |
| 9 | Umbrella Ltd | Desk Lamp | Furniture | 8 | 39.99 | 2026-03-01 | delivered |
| 10 | Globex | Wireless Mouse | Electronics | 12 | 29.99 | 2026-03-05 | shipped |
| 11 | Initech | Monitor 27" | Electronics | 1 | 399.99 | 2026-03-10 | delivered |
| 12 | Acme Corp | Standing Desk | Furniture | 1 | 599.99 | 2026-03-15 | processing |
GROUP BY Syntax
SELECT column1, aggregate_function(column2)
FROM table_name
WHERE condition
GROUP BY column1
HAVING aggregate_condition
ORDER BY column1;
- GROUP BY — one or more columns whose distinct value combinations define each group
- Aggregate functions — compute a single value from all rows in the group
- HAVING — filters groups (like WHERE, but for grouped results)
- Every column in SELECT that is not inside an aggregate must appear in GROUP BY
SQL Execution Order
SQL clauses are written in one order but executed in a different order. Understanding this helps explain why HAVING can reference aggregate functions but WHERE cannot, and why you cannot use a SELECT alias in a HAVING clause.
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
| Step | Clause | What happens |
|---|---|---|
| 1 | FROM | Load the table(s) and evaluate any JOINs |
| 2 | WHERE | Filter individual rows — aggregate functions are not allowed here |
| 3 | GROUP BY | Collapse remaining rows into groups |
| 4 | HAVING | Filter groups — aggregate functions are allowed here |
| 5 | SELECT | Compute output columns and aliases |
| 6 | ORDER BY | Sort the final result set |
This is why WHERE status = 'delivered' runs before grouping (fast, row-level filter), while HAVING COUNT(*) > 2 runs after grouping (filters whole groups).
COUNT()
COUNT(*) counts all rows in a group. COUNT(column) counts non-NULL values in that column.
Example: Number of Employees per Department
Example: Number of Orders per Customer
Example: Count Delivered vs. Other Statuses
Count only orders matching a condition using COUNT with a CASE expression:
SUM()
SUM(column) adds up all values in the group for a numeric column.
Example: Total Salary per Department
Example: Total Revenue per Category
Compute revenue as quantity * unit_price:
Example: Revenue by Customer and Category
Group by multiple columns to break revenue down further:
AVG()
AVG(column) returns the arithmetic mean of all non-NULL values in the group.
Example: Average Salary per Department
Example: Average Order Value per Customer
MIN() and MAX()
MIN(column) and MAX(column) return the smallest and largest values in the group.
Example: Salary Range per Department
Example: First and Last Order Date per Customer
Combining Multiple Aggregates
You can combine any number of aggregate functions in a single query.
Example: Full Department Summary
Example: Order Summary per Customer
HAVING
HAVING filters groups after aggregation — it is the equivalent of WHERE for grouped results.
Example: Departments with More Than Two Employees
Example: Customers with Total Revenue Above 2000
Example: Categories with Average Order Value Below 500
WHERE vs. HAVING
WHERE filters individual rows before grouping. HAVING filters groups after aggregation.
Example: Revenue from Delivered Orders Only per Customer
Use WHERE to exclude rows before grouping, then HAVING to filter results:
GROUP BY with JOINs
You can group data that spans multiple tables by first joining them.
Example: Revenue per Department via Employee Join
Example: Order Count and Revenue per Category and Month
GROUP BY with Expressions
You can group by computed expressions, not just plain columns.
Example: Employees Hired per Year
Example: Orders by Month
COUNT(DISTINCT)
COUNT(DISTINCT column) counts the number of unique values in the group.
Example: Unique Products Ordered per Customer
Filtering Groups with Multiple Conditions
HAVING supports any boolean expression, including AND/OR combinations.
Example: High-Value Active Customers
Find customers with more than one order and total revenue above 1000:
Practical Example: Sales Report
Combine everything into a full sales summary report.
Example: Category Performance Report
Example: Top Product by Revenue
GROUP BY vs. Window Functions
GROUP BY collapses rows into one row per group. Window functions keep all rows and add a computed column — see the SQL Window Functions tutorial for details.
Example: GROUP BY Returns One Row per Department
Example: Window Function Keeps All Rows
Common Use Cases for GROUP BY
- Reporting dashboards — Summarize totals, averages, and counts by category or time period
- Headcount analysis — Count employees, users, or records by group
- Revenue breakdown — Total and average revenue by product, region, or customer
- Trend analysis — Group by month or year to spot patterns over time
- Top-N per group — Combine with subqueries or CTEs to find the best performer in each group
- Data validation — Count duplicates or NULL values across groups
- Cohort analysis — Group users by signup date or behavior segment
Performance Tips
- Index GROUP BY columns — Indexes on grouped columns speed up the sort and grouping step
- Filter early with WHERE — Reducing rows before grouping with WHERE is faster than filtering after with HAVING
- Avoid SELECT * — Only select columns you actually need; aggregate operations benefit from narrower data
- Use covering indexes — An index that includes both the GROUP BY column and the aggregated column can avoid a full table scan
- Consider materialized views — For frequently computed aggregates on large tables, some databases support pre-computed summaries
Summary
- GROUP BY collapses rows that share the same column values into one summary row per unique combination
- COUNT(*) counts all rows; COUNT(column) counts non-NULL values; COUNT(DISTINCT column) counts unique values
- SUM totals numeric values; AVG computes the mean; MIN / MAX find the boundaries
- Every SELECT column must either appear in GROUP BY or be wrapped in an aggregate function
- WHERE filters rows before grouping; HAVING filters groups after aggregation
- You can GROUP BY expressions (e.g.,
SUBSTR(date, 1, 7)for month-level grouping) - Joining tables before grouping lets you combine data from multiple sources into one report
- Use window functions instead of GROUP BY when you need group-level metrics without losing individual rows
Try combining GROUP BY with JOINs, CASE expressions, and HAVING clauses to build the reports your data analysis needs!