Preparing for a SQL interview requires a solid understanding of database concepts and the SQL language. This comprehensive list covers 20 important SQL interview questions with detailed answers to help you ace your interview:
1. What is SQL?
Answer: SQL (Structured Query Language) is a standard programming language designed for managing and manipulating relational databases. It allows users to define, query, update, and control data within a database.
2. What are the different types of SQL commands?
Answer: SQL commands are broadly categorized into:
- Data Definition Language (DDL): Commands used to define the database schema, such as `CREATE`, `ALTER`, `DROP`, `TRUNCATE`.
- Data Manipulation Language (DML): Commands used to manipulate data within the database, such as `SELECT`, `INSERT`, `UPDATE`, `DELETE`.
- Data Control Language (DCL): Commands used to control access to data and database objects, such as `GRANT`, `REVOKE`.
- Transaction Control Language (TCL): Commands used to manage transactions within the database, such as `COMMIT`, `ROLLBACK`, `SAVEPOINT`.
- Data Query Language (DQL): Technically a subset of DML, focusing specifically on retrieving data using the `SELECT` command.
3. What is a database schema?
Answer: A database schema is a blueprint of how a database is organized. It defines the tables, columns within each table, relationships between tables, constraints, and other database objects. It essentially describes the structure of the data.
4. What is a primary key?
Answer: A primary key is a column or a set of columns in a table that uniquely identifies each row in that table. It must have the following properties:
- It must contain unique values.
- It cannot contain NULL values.
- A table can have only one primary key.
5. What is a foreign key?
Answer: A foreign key is a column or a set of columns in one table that refers to the primary key (or another unique key) of a table, usually a different one. It establishes a relationship between the two tables and enforces referential integrity: every non-NULL foreign key value must match an existing row in the referenced table.
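As a minimal sketch of questions 4 and 5, the snippet below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and the `demo_keys` helper are hypothetical names chosen for illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other database systems enforce them by default.

```python
import sqlite3

def demo_keys():
    """Return the error messages raised by a duplicate PK and a dangling FK."""
    conn = sqlite3.connect(":memory:")
    # SQLite-specific: foreign key enforcement is off unless enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("""
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount INTEGER
        )""")
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
    conn.execute("INSERT INTO orders (id, customer_id, amount) VALUES (10, 1, 25)")

    errors = []
    bad_statements = (
        # Violates the primary key: id 1 already exists.
        "INSERT INTO customers (id, name) VALUES (1, 'Bob')",
        # Violates the foreign key: there is no customer with id 99.
        "INSERT INTO orders (id, customer_id, amount) VALUES (11, 99, 5)",
    )
    for sql in bad_statements:
        try:
            conn.execute(sql)
        except sqlite3.IntegrityError as exc:
            errors.append(str(exc))
    conn.close()
    return errors

if __name__ == "__main__":
    for message in demo_keys():
        print(message)
```

Both invalid inserts are rejected with an `IntegrityError`, so the tables never hold a duplicate key or an order that points at a missing customer.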
6. What are SQL joins? List different types of joins.
Answer: SQL joins are used to combine rows from two or more tables based on a related column between them. Different types of joins include:
- INNER JOIN: Returns rows only when there is a match in both tables.
- LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table and the matching rows from the right table. If there is no match in the right table, NULL values are returned for the columns of the right table.
- RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table and the matching rows from the left table. If there is no match in the left table, NULL values are returned for the columns of the left table.
- FULL JOIN (or FULL OUTER JOIN): Returns all rows from both tables. Matching rows are combined; where a row has no match in the other table, NULL values are returned for that table's columns. (Note: Some database systems, such as MySQL, do not support `FULL JOIN` directly. A common workaround uses `UNION ALL` to combine a `LEFT JOIN` with the unmatched rows of a `RIGHT JOIN`.)
- CROSS JOIN (or CARTESIAN JOIN): Returns the Cartesian product of all rows from all the joined tables.
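To make the join types concrete, here is a small sketch using Python's `sqlite3` module. The tables, rows, and the `join_examples` helper are made up for illustration; the customer `Cy` deliberately has no orders so the difference between `INNER JOIN` and `LEFT JOIN` is visible.

```python
import sqlite3

def join_examples():
    """Return (inner join rows, left join rows, cross join row count)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1, 25), (11, 1, 40), (12, 2, 15);
    """)
    # INNER JOIN: only customers that have at least one order.
    inner = conn.execute("""
        SELECT c.name, o.id
        FROM customers AS c
        INNER JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.name, o.id""").fetchall()
    # LEFT JOIN: every customer; order columns are NULL (None) when unmatched.
    left = conn.execute("""
        SELECT c.name, o.id
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.name, o.id""").fetchall()
    # CROSS JOIN: every customer paired with every order (3 x 3 rows).
    cross = conn.execute(
        "SELECT COUNT(*) FROM customers CROSS JOIN orders").fetchone()[0]
    conn.close()
    return inner, left, cross

if __name__ == "__main__":
    inner, left, cross = join_examples()
    print(inner)
    print(left)
    print(cross)
```

The left join returns one extra row, `('Cy', None)`, which the inner join drops.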
7. What is the difference between `WHERE` and `HAVING` clauses?
Answer: Both `WHERE` and `HAVING` clauses are used to filter data, but they operate at different stages:
- `WHERE` Clause: Used to filter rows before any grouping occurs. It operates on individual rows and is applied to the data source directly.
- `HAVING` Clause: Used to filter groups after the `GROUP BY` clause has been applied. It operates on the aggregated results of the groups.
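The two filtering stages can be seen side by side in one query. The sketch below assumes a hypothetical `orders` table with a `status` column; `WHERE` removes cancelled orders before grouping, and `HAVING` then keeps only the customers whose remaining total exceeds 50.

```python
import sqlite3

def paid_totals_over_50():
    """Filter rows with WHERE, then filter groups with HAVING."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER, status TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("Ada", 25, "paid"),
            ("Ada", 40, "paid"),
            ("Bob", 15, "paid"),
            ("Bob", 100, "cancelled"),
            ("Cy", 60, "paid"),
        ],
    )
    rows = conn.execute("""
        SELECT customer, SUM(amount) AS total
        FROM orders
        WHERE status = 'paid'        -- row-level filter, applied before grouping
        GROUP BY customer
        HAVING SUM(amount) > 50      -- group-level filter, applied after grouping
        ORDER BY customer""").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(paid_totals_over_50())
```

Bob's cancelled order of 100 never reaches the `SUM`, so his paid total of 15 fails the `HAVING` test even though his raw total would have passed it.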
8. What are aggregate functions in SQL? List some common ones.
Answer: Aggregate functions perform calculations on a set of rows and return a single summary value. Common aggregate functions include:
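- `COUNT()`: Returns the number of rows. `COUNT(*)` counts every row, while `COUNT(column)` counts only the rows where that column is not NULL.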
- `SUM()`: Returns the sum of values in a column.
- `AVG()`: Returns the average of values in a column.
- `MIN()`: Returns the minimum value in a column.
- `MAX()`: Returns the maximum value in a column.
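A quick sketch of these functions on a hypothetical `scores` table follows. One row holds NULL on purpose, because aggregate functions other than `COUNT(*)` skip NULL values, which is a frequent follow-up question.

```python
import sqlite3

def aggregate_summary():
    """Return (COUNT(*), COUNT(score), SUM, AVG, MIN, MAX) for four rows, one NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (score INTEGER)")
    conn.executemany(
        "INSERT INTO scores VALUES (?)", [(10,), (20,), (None,), (30,)]
    )
    row = conn.execute("""
        SELECT COUNT(*),       -- counts all rows, including the NULL one
               COUNT(score),   -- counts only non-NULL scores
               SUM(score),
               AVG(score),     -- average over the non-NULL scores only
               MIN(score),
               MAX(score)
        FROM scores""").fetchone()
    conn.close()
    return row

if __name__ == "__main__":
    print(aggregate_summary())
```

Here `AVG(score)` is 60 / 3 rather than 60 / 4, since the NULL row is ignored.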
9. What is the `GROUP BY` clause in SQL?
Answer: The `GROUP BY` clause is used to group rows that have the same values in one or more columns into summary rows, like “find the number of customers in each country”. It is often used in conjunction with aggregate functions to perform calculations on these groups.
10. What is the `ORDER BY` clause in SQL? How can you sort in ascending and descending order?
Answer: The `ORDER BY` clause is used to sort the result set of a query based on one or more columns. To sort in:
- Ascending order (default): Use the keyword `ASC` after the column name (e.g., `ORDER BY column_name ASC`). Because ascending is the default, `ASC` can be omitted.
- Descending order: Use the keyword `DESC` after the column name (e.g., `ORDER BY column_name DESC`).
You can sort by multiple columns by listing them in the `ORDER BY` clause, separated by commas. Rows are sorted by the first column, and each later column only breaks ties among rows that are equal on the columns before it. Each column can have its own `ASC` or `DESC`.
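As an illustration of multi-column sorting, the sketch below orders a hypothetical `employees` table by department ascending and then by salary descending within each department.

```python
import sqlite3

def sorted_employees():
    """Sort by dept (A to Z), then by salary (highest first) within each dept."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("Cy", "Ops", 90), ("Bob", "Eng", 100), ("Dee", "Ops", 95), ("Ada", "Eng", 120)],
    )
    rows = conn.execute("""
        SELECT name, dept, salary
        FROM employees
        ORDER BY dept ASC, salary DESC""").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in sorted_employees():
        print(row)
```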
11. What are SQL constraints? List different types of constraints.
Answer: SQL constraints are rules applied to the data in a table to ensure data integrity and consistency. Common types of constraints include:
- `NOT NULL`: Ensures that a column cannot have NULL values.
- `UNIQUE`: Ensures that all values in a column are unique.
- `PRIMARY KEY`: A combination of `NOT NULL` and `UNIQUE`, uniquely identifies each row in a table.
- `FOREIGN KEY`: Establishes a link between the data in two tables.
- `CHECK`: Ensures that all values in a column satisfy a specified condition.
- `DEFAULT`: Provides a default value for a column when no value is specified during insertion.
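The constraints above can be tried out with a short sketch. The `products` table and the `demo_constraints` helper are hypothetical; the exact error text comes from SQLite and will differ on other database systems.

```python
import sqlite3

def demo_constraints():
    """Return the defaulted status and the errors raised by three invalid inserts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE products (
            id     INTEGER PRIMARY KEY,
            sku    TEXT NOT NULL UNIQUE,
            price  REAL NOT NULL CHECK (price >= 0),
            status TEXT NOT NULL DEFAULT 'active'
        )""")
    # Valid row: status is omitted, so the DEFAULT value is used.
    conn.execute("INSERT INTO products (sku, price) VALUES ('A-1', 9.5)")
    status = conn.execute(
        "SELECT status FROM products WHERE sku = 'A-1'").fetchone()[0]

    errors = []
    bad_statements = (
        "INSERT INTO products (sku, price) VALUES (NULL, 1.0)",    # NOT NULL
        "INSERT INTO products (sku, price) VALUES ('A-1', 2.0)",   # UNIQUE
        "INSERT INTO products (sku, price) VALUES ('B-2', -1.0)",  # CHECK
    )
    for sql in bad_statements:
        try:
            conn.execute(sql)
        except sqlite3.IntegrityError as exc:
            errors.append(str(exc))
    conn.close()
    return status, errors

if __name__ == "__main__":
    status, errors = demo_constraints()
    print(status)
    for message in errors:
        print(message)
```

Each invalid insert is rejected by the database itself, so the rule holds no matter which application writes to the table.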
12. What is the difference between `DELETE` and `TRUNCATE` commands?
Answer: Both `DELETE` and `TRUNCATE` are used to remove rows from a table, but they differ in several ways:
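- Logging: `DELETE` logs each removed row, so it can be rolled back. `TRUNCATE` is a DDL command that typically deallocates the table's storage with minimal logging, which makes it much faster on large tables. Whether it can be rolled back depends on the database system: SQL Server and PostgreSQL allow a `TRUNCATE` inside a transaction to be rolled back, while Oracle and MySQL commit it implicitly.
- Identity Reset: In SQL Server and MySQL, `TRUNCATE` resets the identity (auto-increment) counter of the table to its initial seed value; in PostgreSQL this happens only with `TRUNCATE ... RESTART IDENTITY`. `DELETE` does not reset the identity counter.
- Triggers: `DELETE` fires any `DELETE` triggers defined on the table. `TRUNCATE` does not fire them (some systems, such as PostgreSQL, offer separate `TRUNCATE` triggers).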
- Locking: `TRUNCATE` typically acquires a table-level lock, while `DELETE` can acquire row-level locks (depending on the database system).
- WHERE Clause: You can use a `WHERE` clause with `DELETE` to remove specific rows. `TRUNCATE` removes all rows from the table.
13. What are SQL indexes? Why are they used?
Answer: SQL indexes are auxiliary lookup structures (most commonly B-trees) that the database engine can use to speed up data retrieval. An index is created on one or more columns of a table. Similar to the index in a book, it allows the database to quickly locate specific rows without scanning the entire table.
Why they are used:
- Improve the speed of `SELECT` queries and other data retrieval operations.
- Can enforce uniqueness on columns (UNIQUE index).
However, indexes can slow down data modification operations (e.g., `INSERT`, `UPDATE`, `DELETE`) because the index also needs to be updated.
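One way to see an index at work is to compare the query plan before and after creating it. The sketch below relies on SQLite's `EXPLAIN QUERY PLAN` statement; the `orders` table, the index name `idx_orders_customer`, and the helper functions are hypothetical, and the exact plan text varies between SQLite versions.

```python
import sqlite3

def plan_details(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]

def compare_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 50, i) for i in range(1000)],
    )
    query = "SELECT * FROM orders WHERE customer_id = ?"

    before = plan_details(conn, query, (42,))   # no index yet: full table scan
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = plan_details(conn, query, (42,))    # the planner can now use the index
    conn.close()
    return before, after

if __name__ == "__main__":
    before, after = compare_plans()
    print(before)
    print(after)
```

Before the index exists the plan reports a scan of `orders`; afterwards it reports a search using `idx_orders_customer`.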
14. What is normalization in SQL? What are the different normal forms?
Answer: Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing larger tables into smaller, more manageable tables and defining relationships between them.
Different Normal Forms:
- 1NF (First Normal Form): Each column contains atomic (indivisible) values, and there are no repeating groups of columns.
- 2NF (Second Normal Form): Must be in 1NF and all non-key attributes are fully functionally dependent on the entire primary key.
- 3NF (Third Normal Form): Must be in 2NF and all non-key attributes are non-transitively dependent on the primary key (no dependency on other non-key attributes).
- BCNF (Boyce-Codd Normal Form): A stricter form of 3NF. Every determinant is a candidate key.
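- 4NF (Fourth Normal Form): Must be in BCNF, and every non-trivial multi-valued dependency must be determined by a superkey.
- 5NF (Fifth Normal Form) or PJNF (Project-Join Normal Form): Deals with join dependencies. Every non-trivial join dependency must be implied by the candidate keys, which eliminates the redundancy such dependencies cause.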
While higher normal forms reduce redundancy, they can sometimes lead to more complex queries involving more joins. The level of normalization to apply often involves a trade-off between redundancy and query performance.
15. What is a subquery in SQL?
Answer: A subquery (or inner query) is a query nested inside another SQL query (the outer query). It is used to retrieve data that will be used by the outer query. Subqueries can appear in the `SELECT`, `FROM`, `WHERE`, and `HAVING` clauses.
Types of Subqueries:
- Scalar subquery: Returns a single value.
- Column subquery: Returns a single column with multiple rows.
- Row subquery: Returns a single row with multiple columns.
- Table subquery: Returns a table (multiple rows and columns), often used in the `FROM` clause.
- Correlated subquery: A subquery that depends on the outer query for its values.
- Non-correlated subquery: A subquery that can be executed independently of the outer query.
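The difference between a non-correlated and a correlated subquery is easiest to see on the same data. This sketch uses a hypothetical `employees` table: the first query compares each salary with the company-wide average, the second with the average of the employee's own department.

```python
import sqlite3

def subquery_examples():
    """Return (names above company average, names above their dept average)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("Ada", "Eng", 120), ("Bob", "Eng", 100), ("Cy", "Ops", 90), ("Dee", "Ops", 95)],
    )
    # Non-correlated scalar subquery: evaluated once, independent of the outer row.
    above_company = conn.execute("""
        SELECT name FROM employees
        WHERE salary > (SELECT AVG(salary) FROM employees)
        ORDER BY name""").fetchall()
    # Correlated subquery: refers to e.dept, so it depends on the current outer row.
    above_dept = conn.execute("""
        SELECT e.name FROM employees AS e
        WHERE e.salary > (SELECT AVG(salary) FROM employees AS d WHERE d.dept = e.dept)
        ORDER BY e.name""").fetchall()
    conn.close()
    return [r[0] for r in above_company], [r[0] for r in above_dept]

if __name__ == "__main__":
    print(subquery_examples())
```

Dee earns less than the company average of 101.25 but more than the Ops average of 92.5, so she appears only in the second result.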
16. What are SQL views? Why are they used?
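Answer: A SQL view is a virtual table based on the result of a SQL statement. It is stored as a named object in the database. Ordinary views do not store data themselves; they simply provide a different way to look at the data in the underlying tables. (Materialized views, which some database systems offer, are the exception: they store the query result and must be refreshed.)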
Why they are used:
- Simplified Queries: Can hide complex query logic, making it easier for users to retrieve data.
- Data Security: Can restrict access to certain columns or rows of a table.
- Data Independence: Changes to the underlying table structure may not affect the applications that use the view, provided the view definition remains consistent.
- Data Integrity: Can enforce certain data constraints or present calculated data consistently.
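A short sketch of a view used for data security follows. The `employees` table and the `employee_directory` view are hypothetical names; the view exposes names and departments but hides the `salary` column, and it reflects new rows immediately because it stores no data of its own.

```python
import sqlite3

def demo_view():
    """Return (view column names, rows before an insert, rows after it)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
    conn.execute("INSERT INTO employees VALUES ('Ada', 'Eng', 120)")
    # The view exposes only non-sensitive columns of the base table.
    conn.execute(
        "CREATE VIEW employee_directory AS SELECT name, dept FROM employees")

    cursor = conn.execute("SELECT * FROM employee_directory ORDER BY name")
    columns = [col[0] for col in cursor.description]
    before = cursor.fetchall()

    # Insert into the base table; the view picks the row up on the next query.
    conn.execute("INSERT INTO employees VALUES ('Bob', 'Ops', 100)")
    after = conn.execute(
        "SELECT * FROM employee_directory ORDER BY name").fetchall()
    conn.close()
    return columns, before, after

if __name__ == "__main__":
    print(demo_view())
```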
17. What are stored procedures in SQL? What are their advantages?
Answer: A stored procedure is a set of SQL statements with an assigned name, which is stored in the database. It can accept input parameters and return output parameters.
Advantages of Stored Procedures:
- Improved Performance: Stored procedures are stored in the database, and many database systems cache their compiled execution plans, which reduces the overhead of parsing and compiling SQL statements each time they are executed.
- Enhanced Security: You can grant users permission to execute stored procedures without giving them direct access to the underlying tables.
- Code Reusability: Stored procedures can be called multiple times by different applications.
- Reduced Network Traffic: Instead of sending multiple SQL statements over the network, an application can execute a single call to a stored procedure.
- Data Integrity: Stored procedures can encapsulate complex business logic and enforce data consistency rules.
18. What are triggers in SQL? When are they used?
Answer: Triggers are database objects that are automatically executed in response to certain events (e.g., `INSERT`, `UPDATE`, `DELETE`) that occur on a table.
When they are used:
- Enforcing complex business rules or data integrity constraints that cannot be defined by standard constraints.
- Auditing changes to data in a table.
- Maintaining derived data (e.g., updating a summary table when details are changed).
- Implementing complex security authorizations.
- Generating automatic notifications or alerts based on data changes.
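The auditing use case can be sketched with a trigger in SQLite. The `accounts` and `balance_audit` tables and the trigger name `log_balance_change` are hypothetical, and trigger syntax differs somewhat between database systems.

```python
import sqlite3

def demo_audit_trigger():
    """Update a balance and return the audit rows written by the trigger."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("""
        CREATE TABLE balance_audit (
            account_id INTEGER, old_balance INTEGER, new_balance INTEGER
        )""")
    # Fires automatically after every UPDATE that touches accounts.balance.
    conn.execute("""
        CREATE TRIGGER log_balance_change
        AFTER UPDATE OF balance ON accounts
        BEGIN
            INSERT INTO balance_audit (account_id, old_balance, new_balance)
            VALUES (OLD.id, OLD.balance, NEW.balance);
        END""")
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    # The application only issues the UPDATE; the audit row is written by the trigger.
    conn.execute("UPDATE accounts SET balance = 70 WHERE id = 1")
    audit = conn.execute("SELECT * FROM balance_audit").fetchall()
    conn.close()
    return audit

if __name__ == "__main__":
    print(demo_audit_trigger())
```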
19. What is a transaction in SQL? What are ACID properties?
Answer: A transaction is a sequence of one or more SQL operations that are treated as a single logical unit of work. Either all operations in a transaction are successfully completed (committed), or none of them are (rolled back).
ACID Properties: Transactions in a relational database management system (RDBMS) typically adhere to the ACID properties:
- Atomicity: The entire transaction is treated as a single, indivisible unit. Either all changes within the transaction are applied, or none are.
- Consistency: A transaction must move the database from one valid state to another. It must preserve all the defined rules and constraints of the database.
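- Isolation: Concurrently executing transactions should not interfere with each other. At the strictest isolation level (serializable), the outcome is the same as if they had executed one after another; weaker levels relax this guarantee in exchange for more concurrency. The intermediate state of one transaction should not be visible to other transactions.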
- Durability: Once a transaction is committed, the changes are permanent and will survive subsequent system failures (e.g., power outages, crashes).
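Atomicity is the easiest of these properties to demonstrate. In the sketch below, a hypothetical `transfer` function credits one account and then debits another inside a single transaction; a `CHECK` constraint rejects overdrafts, and when it fires, the earlier credit is rolled back as well. It relies on the `sqlite3` connection acting as a context manager that commits on success and rolls back on an exception.

```python
import sqlite3

def open_bank():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE accounts (
            id INTEGER PRIMARY KEY,
            balance INTEGER NOT NULL CHECK (balance >= 0)
        )""")
    with conn:  # commit the starting balances
        conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    return conn

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; return True if committed, False if rolled back."""
    try:
        with conn:  # one transaction: commit if both updates succeed, else roll back
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

if __name__ == "__main__":
    bank = open_bank()
    print(transfer(bank, 1, 2, 30), balances(bank))
    print(transfer(bank, 1, 2, 500), balances(bank))
```

The second transfer fails on the debit, and the balances stay exactly as they were after the first transfer: the credit that had already been applied inside the failed transaction is undone.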
20. Explain the concept of SQL injection and how to prevent it.
Answer: SQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution (e.g., to dump the database content to the attacker). This can occur when user input is not properly sanitized or parameterized before being used in a SQL query.
How to prevent SQL injection:
- Use Parameterized Queries (Prepared Statements): This is the most effective way to prevent SQL injection. With parameterized queries, the SQL structure is defined first, and then user-provided values are passed as parameters. The database treats these parameters as data, not as executable SQL code.
- Input Validation and Sanitization: Validate user input to ensure it conforms to expected formats and lengths. Sanitize input by escaping or removing potentially harmful characters. However, this should be used as a secondary defense and not as the primary prevention method.
- Principle of Least Privilege: Grant database users only the permissions they need to perform their tasks. Avoid using the `root` or `administrator` database user in applications.
- Web Application Firewall (WAF): WAFs can help detect and block malicious SQL injection attempts.
- Regular Security Audits and Testing: Conduct regular security assessments of your applications and databases to identify and address potential vulnerabilities.
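To show why parameterized queries are the primary defense, the sketch below runs the same malicious input through a string-built query and a parameterized one. The `users` table, the helper names, and the payload are illustrative assumptions; `find_unsafe` is intentionally vulnerable and should never be used as a pattern.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [("alice", "a1"), ("bob", "b2")])
    return conn

def find_unsafe(conn, name):
    # VULNERABLE: user input is spliced into the SQL text and parsed as SQL.
    sql = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(sql).fetchall()

def find_safe(conn, name):
    # Parameterized: the ? placeholder is bound as data, never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)).fetchall()

# Classic payload: closes the string literal and adds an always-true condition.
payload = "x' OR '1'='1"

if __name__ == "__main__":
    db = make_db()
    print(find_unsafe(db, payload))  # every user is returned
    print(find_safe(db, payload))    # no user has this literal name
```

With the unsafe version the `WHERE` clause becomes `name = 'x' OR '1'='1'`, which matches every row. With the parameterized version the whole payload is compared as a single string, so nothing matches.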