20 SQL Commands For Your Next Data Science Job
Imagine walking into your next data science interview with confidence, knowing you can answer any SQL question they throw at you.
SQL is a fundamental tool for Data Scientists and one of the most essential skills to have for your Resume. Whether querying a database, aggregating results, or cleaning data, SQL provides a powerful set of commands to handle these tasks. Here are 20 essential SQL statements that will cover 90% of your data science needs.
1. SELECT
The SELECT command finds and gets particular information from one or more data tables.
SELECT column1, column2
FROM table_name
WHERE condition;
SELECT * FROM customers WHERE age >= 18;
This retrieves all records where the customer’s age is at least 18.
2. JOIN (INNER, LEFT, RIGHT, OUTER)
Joins combine data from multiple tables based on a related column.
INNER JOIN: Returns only matching records.
SELECT orders.order_id, customers.customer_name FROM orders INNER JOIN customers ON orders.customer_id = customers.customer_id;
LEFT JOIN: Returns all records from the left table and matching records from the right table.
SELECT customers.customer_name, orders.order_id FROM customers LEFT JOIN orders ON customers.customer_id = orders.customer_id;
RIGHT JOIN: Similar to LEFT JOIN but includes all rows from the right table.
SELECT customers.customer_name, orders.order_id FROM customers RIGHT JOIN orders ON customers.customer_id = orders.customer_id;
FULL OUTER JOIN: Returns all records from both tables, including unmatched ones.
SELECT customers.customer_name, orders.order_idFROM customers FULL OUTER JOIN orders ON customers.customer_id = orders.customer_id ORDER BY customers.customer_name;
3. WHERE
Filters results based on specified conditions.
SELECT name, department, salary
FROM employees
WHERE department = 'Sales' AND salary > 50000;
4. GROUP BY
Groups data and applies aggregate functions.
SELECT department, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
5. HAVING
Filters grouped results.
SELECT customer_id, SUM(quantity) AS total_quantity
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) >= 50;
6. Window Functions
Perform calculations across a subset of rows related to the current row.
SELECT name, department,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;
7. UNION
Combines results from multiple queries.
SELECT name, city FROM customers WHERE city = 'New York'
UNION
SELECT name, city FROM employees WHERE city = 'New York';
8. CREATE
Creates a new table
CREATE TABLE customers(
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(100),
phone VARCHAR(20)
);
9. INSERT
Adds new records to a table
INSERT INTO students (id, name, major, gpa)
VALUES (1234, 'John Doe', 'Computer Science', 3.5);
10. UPDATE
Modifies existing records
UPDATE students
SET major = 'Mathematics', gpa = 3.7
WHERE id = 1234;
11. DELETE
Removes specific records from a table.
DELETE FROM students WHERE id = 1234;
12. DROP
Deletes a table or database object permanently
DROP TABLE students;
13. ALTER
Modifies a table structure
ALTER TABLE employees ADD COLUMN age INT;
14. CASE
Creates conditional expressions.
SELECT name, salary,
CASE
WHEN salary > 70000 THEN 'High'
WHEN salary BETWEEN 50000 AND 70000 THEN 'Medium'
ELSE 'Low'
END AS salary_category
FROM employees;
15. EXISTS
Checks for the existence of records.
SELECT name FROM customers
WHERE EXISTS (SELECT 1 FROM orders WHERE orders.customer_id = customers.customer_id);
16. DISTINCT
Removes duplicate values
SELECT DISTINCT department FROM employees;
17. LIMIT
Restricts the number of results
SELECT * FROM customers LIMIT 10;
18. ORDER BY
Sorts Results
SELECT name, salary FROM employees ORDER BY salary DESC;
19. SUBQUERY
A query within another query.
SELECT name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);
20. COALESCE
Returns the first non-null value.
SELECT name, COALESCE(phone, 'N/A') AS phone_number FROM customers;
SQL is an important tool for working with data. Learning these 20 commands will help you handle a vast range of data tasks quickly and well. Companies often ask about these commands in technical interviews, so it's good to practice using them often!