Day 02 Core Concepts

JOINs, GROUP BY, Aggregations

Combine tables, count rows, sum revenue, and compute averages. This is what data analysis actually looks like — and SQL makes it effortless.

~1 hour Hands-on Precision AI Academy

Today’s Objective

Combine tables, count rows, sum revenue, and compute averages. This is what data analysis actually looks like — and SQL makes it effortless.

A customer revenue report — total spend per customer, number of orders, average order size, and which city generates the most revenue. All from the database you built yesterday.

COUNT, SUM, AVG, MIN, MAX

Aggregate functions collapse many rows into a single value. These five cover 90% of everything you'll do with numbers in SQL.

sql
SQL
-- How many orders total?
SELECT COUNT(*) AS total_orders FROM orders;

-- Total revenue from completed orders
SELECT SUM(amount) AS total_revenue
FROM orders
WHERE status = 'completed';

-- Average order value
SELECT AVG(amount) AS avg_order FROM orders;

-- Largest and smallest single orders
SELECT MAX(amount) AS biggest_order, MIN(amount) AS smallest_order
FROM orders;

COUNT(*) vs COUNT(column): COUNT(*) counts all rows. COUNT(column) counts only rows where that column is NOT NULL. Use COUNT(*) unless you're specifically checking for null values.

GROUP BY — Aggregations Per Group

GROUP BY splits your table into groups and runs the aggregate function on each group. This is the core of almost every analytics query.

sql
SQL
-- Total orders per customer_id
SELECT customer_id, COUNT(*) AS num_orders, SUM(amount) AS total_spent, AVG(amount) AS avg_order
FROM orders
GROUP BY customer_id
ORDER BY total_spent DESC;

-- Revenue breakdown by order status
SELECT status, COUNT(*) AS count, SUM(amount) AS total
FROM orders
GROUP BY status;

The rule: any column in your SELECT that is NOT inside an aggregate function must appear in GROUP BY. Break this rule and you'll get an error.

HAVING — Filtering Groups

WHERE filters rows before grouping. HAVING filters groups after aggregation. You need both when you want "customers who spent more than $10,000."

sql
SQL
-- Customers with more than $10,000 in completed orders
SELECT customer_id, SUM(amount) AS total_spent
FROM orders
WHERE status = 'completed' -- filter rows first
GROUP BY customer_id
HAVING SUM(amount) > 10000 -- then filter groups
ORDER BY total_spent DESC;

JOIN — Combining Tables

A JOIN links two tables on a shared column (usually an ID). Your orders table has a customer_id. The customers table has an id. A JOIN connects them so you can see customer names next to their orders.

INNER JOIN

Returns only rows that have a match in both tables. Most common type.

sql
SQL
-- Orders with customer names (not just IDs)
SELECT c.name, c.city, o.order_date, o.amount, o.status
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
ORDER BY o.amount DESC;

-- Revenue per customer name (combining JOIN + GROUP BY)
SELECT c.name, c.city, COUNT(*) AS orders, SUM(o.amount) AS total_revenue
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'completed'
GROUP BY c.name, c.city
ORDER BY total_revenue DESC;

Table aliases (o and c) are just shorthand. Writing orders o means "call the orders table 'o' in this query." It saves typing and makes complex queries readable.

LEFT JOIN — Include All Left-Side Rows

sql
SQL
-- All customers, even those with no orders (NULL for order fields)
SELECT c.name, COUNT(o.id) AS order_count, COALESCE(SUM(o.amount), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.name
ORDER BY total_spent DESC;

COALESCE(value, 0) replaces NULL with 0. Useful whenever a LEFT JOIN might produce NULLs in your aggregate columns.

40%

Tomorrow: CTEs and window functions

Day 3 is the jump from good to great — subqueries, CTEs, and window functions that most analysts never learn.

Day 3: Subqueries, CTEs & Window Functions

Supporting Resources

Go deeper with these references.

Day 2 Checkpoint

Before moving on, make sure you can answer these without looking:

Continue To Day 3
Subqueries, CTEs, Window Functions