Aggregate Commands
Using ORDER BY
The ORDER BY
clause is used to sort the result set.
SELECT * FROM Persons
ORDER BY Age;
SELECT * FROM Persons
ORDER BY LastName DESC;
ASC
is the default order (ascending).Use
DESC
for descending order.
Using Aggregate Functions
Aggregate functions summarize information across multiple rows. Common aggregate functions include:
COUNT()
– Returns the number of rows or non-NULL values.SUM()
– Calculates the total sum.AVG()
– Calculates the average.MAX()
– Returns the maximum value.MIN()
– Returns the minimum value.
SELECT COUNT(*) FROM Persons;
SELECT AVG(Age) FROM Persons;
SELECT MAX(Age), MIN(Age) FROM Persons;
SELECT SUM(Salary) FROM Employees;
More Examples
SELECT COUNT(*) FROM EMPLOYEE;
SELECT COUNT(*)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO = DNUMBER AND DNAME = 'Research';
SELECT COUNT(DISTINCT Salary) FROM EMPLOYEE;
SELECT COUNT(Salary) FROM EMPLOYEE;
SELECT SUM(SALARY), MAX(SALARY), MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE;
SELECT MAX(SALARY), MIN(SALARY), AVG(SALARY)
FROM EMPLOYEE, DEPARTMENT
WHERE DNO = DNUMBER AND DNAME = 'Research';
Using GROUP BY
The GROUP BY
clause groups rows that have the same values in specified columns.
SELECT Age, COUNT(*)
FROM Persons
GROUP BY Age;
All non-aggregated columns in the SELECT
clause must appear in the GROUP BY
clause.
For each department, retrieve the department number, number of employees, and average salary
SELECT DNO, COUNT (*), AVG (SALARY)
FROM EMPLOYEE
GROUP BY DNO;
The EMPLOYEE tuples are divided into groups – each group having the same value for the grouping attribute DNO.
The COUNT and AVG functions are applied to each such group of tuples separately.
The SELECT-clause includes only the grouping attribute and the functions to be applied on each group of tuples.
A join condition can be used in conjunction with grouping
If NULLs exist in the grouping attribute, then a separate group is created for all tuples with a NULL value in the grouping attribute.
Example:
For each project, retrieve the project number, project name, and the number of employees who work on that project.
SELECT PNUMBER, PNAME, COUNT (*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME;
Using HAVING with GROUP BY
The HAVING
clause is used to filter groups after aggregation.
SELECT Age, COUNT(*)
FROM Persons
GROUP BY Age
HAVING COUNT(*) > 1;
To retrieve the values of these functions for only those groups that satisfy certain conditions.
The HAVING-clause is used for specifying a selection condition on groups (rather than on individual tuples).
Example:
For each project on which more than two employees work, retrieve the project number, project name, and the number of employees who work on that project.
SELECT PNUMBER, PNAME, COUNT(*)
FROM PROJECT, WORKS_ON
WHERE PNUMBER=PNO
GROUP BY PNUMBER, PNAME
HAVING COUNT (*) > 2;
WHERE vs HAVING
WHERE
filters rows before grouping.HAVING
filters groups after aggregation.
Incorrect:
SELECT Age
FROM Persons
WHERE COUNT(*) > 1; -- ❌ Invalid
Correct:
SELECT Age
FROM Persons
GROUP BY Age
HAVING COUNT(*) > 1;
Aggregate functions cannot be used in WHERE — because the rows haven’t been grouped yet.
In HAVING, groups have been formed (by GROUP BY), so MySQL knows how to evaluate aggregates like COUNT(*) < 2
.
Example:
For each project, retrieve the project number, the project name, and the number of employees from department 5 who work on the project.
SELECT Pnumber, Pname, COUNT (*)
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber=Pno
AND Ssn=Essn
AND Dno=5
GROUP BY Pnumber, Pname;
Example:
For each department that has more than five employees, retrieve the department number and the number of its employees who are making more than $40,000.
SELECT Dnumber, COUNT (*)
FROM DEPARTMENT, EMPLOYEE
WHERE Dnumber=Dno
AND Salary>40000
AND (SELECT Dno
FROM EMPLOYEE
GROUP BY Dno
HAVING COUNT (*) > 5);
Subqueries with Aggregate Functions
Get Maximum Salary
SELECT MAX(SALARY) FROM EMPLOYEE;
Get Employee(s) with Maximum Salary
SELECT ENAME
FROM EMPLOYEE
WHERE SALARY = (SELECT MAX(SALARY) FROM EMPLOYEE);
Get Second Highest Salary
SELECT MAX(SALARY)
FROM EMPLOYEE
WHERE SALARY < (SELECT MAX(SALARY) FROM EMPLOYEE);
Get Employee(s) with Second Highest Salary
SELECT ENAME
FROM EMPLOYEE
WHERE SALARY = (
SELECT MAX(SALARY)
FROM EMPLOYEE
WHERE SALARY < (SELECT MAX(SALARY) FROM EMPLOYEE)
);
Advanced GROUP BY Queries
Department-wise Employee Count
SELECT DEPT, COUNT(*)
FROM EMPLOYEES
GROUP BY DEPT;
Departments with Less Than 2 Employees
SELECT DEPT, COUNT(*)
FROM EMPLOYEES
GROUP BY DEPT
HAVING COUNT(*) < 2;
Names of Employees in Departments with < 2 Employees
SELECT ENAME
FROM EMPLOYEE
WHERE DEPT IN (
SELECT DEPT
FROM EMPLOYEE
GROUP BY DEPT
HAVING COUNT(*) < 2
);
Important Rule for Aggregates
When using aggregates in SELECT
, all non-aggregated columns must appear in GROUP BY
.
✅ Valid:
SELECT dept, COUNT(*)
FROM employee
GROUP BY dept;
❌ Invalid:
SELECT dept, COUNT(*)
FROM employee; -- No GROUP BY
Department-wise Maximum Salary and Employee Names
Show Highest Salary by Department
SELECT DEPT, MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEPT;
Names of Employees Earning Department-wise Max Salary
SELECT *
FROM EMPLOYEE
WHERE SALARY IN (
SELECT MAX(SALARY)
FROM EMPLOYEE
GROUP BY DEPT
);
EXISTS Subqueries
What is EXISTS?
EXISTS
returnsTRUE
if a subquery returns any row.Used for checking the existence of related data.
Employees Working on at Least One Project
SELECT *
FROM EMPLOYEE1
WHERE EXISTS (
SELECT EID
FROM PROJECTS
WHERE EMPLOYEE1.EID = PROJECTS.EID
);
Combining WHERE, GROUP BY, and HAVING
Average Salary for Each Department (Only if Salary > 4000)
SELECT dept_id, AVG(salary) AS avg_salary
FROM employees
WHERE salary > 4000
GROUP BY dept_id;
Filter Departments Where Average Salary > 6000
SELECT dept_id, AVG(salary) AS avg_salary
FROM employees
WHERE salary > 4000
GROUP BY dept_id
HAVING AVG(salary) > 6000;
Explanation:
WHERE
filters rows (salary > 4000)GROUP BY
groups those filtered rows by departmentHAVING
filters groups where the average salary is > 6000