SQL Subqueries: Correlated vs Non-Correlated, Scalar, and EXISTS Explained
Subqueries appear in almost every SQL interview. This guide explains non-correlated vs correlated subqueries, scalar subqueries, EXISTS vs IN, and exactly when to use each — with real interview examples throughout.
Subqueries are one of those SQL concepts that look simple in a tutorial but trip people up badly in interviews.
The reason: there are several distinct types of subqueries, each used differently, and candidates often mix them up or default to the wrong one. This guide breaks down every type clearly.
1. What is a subquery?
A subquery is a SELECT statement nested inside another SQL statement.
sqlSELECT name FROM customers WHERE customer_id IN ( SELECT customer_id -- this is the subquery FROM orders WHERE amount > 1000 );
The inner query runs first, produces a result, and the outer query uses that result.
Subqueries can live in three places:
- The
WHEREclause — filtering based on another query - The
FROMclause — as an inline table (also called a derived table) - The
SELECTclause — as a scalar value
2. Non-correlated subqueries — independent inner query
A non-correlated subquery runs once and independently of the outer query. The inner query has no reference to the outer query.
sql-- Non-correlated: inner query runs ONCE SELECT product_id, name, price FROM products WHERE price > ( SELECT AVG(price) -- runs once, returns one number FROM products );
The subquery SELECT AVG(price) FROM products runs once, returns a single value (say, 45.00), and then the outer query filters all products with price > 45.00.
Non-correlated subqueries are efficient and straightforward.
3. Correlated subqueries — references the outer query
A correlated subquery references a column from the outer query. It runs once per row of the outer query.
sql-- Correlated: inner query runs ONCE PER ROW SELECT e.employee_id, e.name, e.salary, e.department_id FROM employees e WHERE e.salary > ( SELECT AVG(salary) FROM employees WHERE department_id = e.department_id -- references outer query's e.department_id );
For each employee row, the subquery calculates the average salary for THAT employee's department. The e.department_id reference ties the inner query to the current outer row.
Performance note: correlated subqueries can be slow on large tables because they run once per row. A JOIN or window function is often faster — but correlated subqueries are cleaner to read and perfectly acceptable in interviews.
4. Scalar subqueries — return a single value
A scalar subquery returns exactly one row and one column. It can appear anywhere a single value is expected — including in SELECT.
sql-- Scalar subquery in SELECT SELECT product_id, name, price, price - (SELECT AVG(price) FROM products) AS diff_from_avg FROM products;
sql-- Scalar subquery in WHERE SELECT name FROM employees WHERE department_id = ( SELECT id FROM departments WHERE name = 'Engineering' );
If a scalar subquery returns more than one row, it throws an error. Always be certain it returns at most one row — use LIMIT 1 or aggregate functions to guarantee this.
5. Subqueries in the FROM clause — derived tables
A subquery in FROM acts as a temporary table for the outer query.
sqlSELECT country, avg_spend FROM ( SELECT c.country, AVG(o.amount) AS avg_spend FROM customers c JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.country ) AS country_averages WHERE avg_spend > 500 ORDER BY avg_spend DESC;
You must alias derived tables (here: country_averages). This is equivalent to a CTE and often interchangeable — use whichever is more readable.
6. EXISTS vs IN — an important interview distinction
Both EXISTS and IN can answer "does a matching row exist?" but they work differently.
IN with a subquery:
sqlSELECT name FROM customers WHERE customer_id IN ( SELECT customer_id FROM orders WHERE amount > 1000 );
EXISTS:
sqlSELECT name FROM customers c WHERE EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id AND o.amount > 1000 );
Key differences:
IN | EXISTS | |
|---|---|---|
| Handles NULLs | Poorly — IN with NULLs can return no rows | Correctly — EXISTS ignores NULL behavior |
| Performance on large sets | Can be slower | Often faster — stops at first match |
| Readability | Cleaner for simple lists | Better for complex conditions |
| Correlated | No | Yes (references outer query) |
Interview tip: if the subquery could return NULLs, prefer
EXISTS. Also mention thatEXISTSshort-circuits — it stops checking as soon as one matching row is found.
7. NOT EXISTS — the anti-join alternative
"Find customers who have never placed an order."
sqlSELECT c.customer_id, c.name FROM customers c WHERE NOT EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id );
This is cleaner than NOT IN when NULLs might be present in the subquery result. NOT IN with any NULL in the subquery returns zero rows — a common silent bug.
sql-- Dangerous if orders.customer_id has any NULLs: WHERE customer_id NOT IN (SELECT customer_id FROM orders) -- Safe alternative: WHERE NOT EXISTS (SELECT 1 FROM orders WHERE orders.customer_id = customers.customer_id)
8. Subqueries vs CTEs vs JOINs — when to use each
| Situation | Best approach |
|---|---|
| Simple one-time filter | Subquery in WHERE |
| Scalar value in SELECT | Scalar subquery |
| Multi-step logic | CTE |
| Reusing same result multiple times | CTE |
| Combining rows from related tables | JOIN |
| "Does a matching row exist?" | EXISTS |
| "Find rows with no match" | NOT EXISTS or LEFT JOIN + IS NULL |
9. Common interview patterns
Find the second highest salary:
sqlSELECT MAX(salary) AS second_highest FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);
Find employees earning more than their manager:
sqlSELECT e.name, e.salary FROM employees e WHERE e.salary > ( SELECT m.salary FROM employees m WHERE m.employee_id = e.manager_id );
Find products never ordered:
sqlSELECT product_id, name FROM products WHERE product_id NOT IN ( SELECT DISTINCT product_id FROM order_items WHERE product_id IS NOT NULL );
10. Practice subquery problems on TheQueryLab
Subqueries appear in a wide range of interview problems — from simple filters to complex multi-step analytics. The best way to get comfortable is to practice on realistic schemas where the subquery is genuinely necessary.
The SQL Interview Preparation Kit on TheQueryLab includes problems specifically designed to test subquery understanding — including correlated subqueries, EXISTS patterns, and scalar subqueries in SELECT.
Once you can write any subquery type instinctively, SQL interviews become significantly easier.
