SQL Interview
SQL Interview
SQL Interview
1. What is SQL?
Structured Query Language is a database tool which is used to create
and access database to support software application.
2. What are different types of statements supported by SQL?
3. What is DBMS?
A Database Management System (DBMS) is a program that controls
creation, maintenance and use of a database.
DBMS can be termed as File Manager that manages data in a database
rather than saving it in file systems.
4. What is RDBMS?
RDBMS stands for Relational Database Management System. RDBMS
store the data into the collection of tables, which is related by common
fields between the columns of the table. It also provides relational
operators to manipulate the data stored into the tables.
Example: SQL Server.
5. Why do we use SQL constraints? Which constraints we can use while
creating database in SQL?
Constraints are used to set the rules for all records in the table. If any
constraints get violated then it can abort the action that caused it.
Constraints are defined while creating the database itself with CREATE
TABLE statement or even after the table is created once with ALTER
TABLE statement.
There are 5 major constraints are used in SQL, such as
NOT NULL: That indicates that the column must have some value and
cannot be left null
UNIQUE: This constraint is used to ensure that each row and column
has unique value and no value is being repeated in any other row or
column
PRIMARY KEY: This constraint is used in association with NOT NULL
and UNIQUE constraints such as on one or the combination of more
than one column to identify the particular record with a unique identity.
FOREIGN KEY: It is used to ensure the referential integrity of data in the
table and also matches the value in one table with another using Primary
Key
CHECK: It is used to ensure whether the value in columns fulfills the
specified condition
6. What are different JOINS used in SQL?
7. What is normalization?
Normalization is the process of minimizing redundancy and dependency
by organizing fields and table of a database. The main aim of
Normalization is to add, delete or modify field that can be made in a
single table.
22. How to delete DUPLICATE records from a table using a SQL Query?
Consider the same EMPLOYEE table as source discussed in previous
question
Using ROWID and ROW_NUMBER Analytic Function
23.How to read TOP 5 records from a table using a SQL query?
24. How to read LAST 5 records from a table using a SQL query?
MINUS operator is used to return all rows in the first SELECT statement
that are not present in the second SELECT statement.
25. How to find the employee with second MAX Salary using a SQL
query?
In order to find the second MAX salary, employee record with MAX
salary needs to be eliminated. It can be achieved by using below SQL
query.
26. How to find the employee with third MAX Salary using a SQL query
without using Analytic Functions?
In order to find the third MAX salary, we need to eliminate the top 2
salary records. But we cannot use the same method we used for finding
second MAX salary (not a best practice). Imagine if we have to find the
fifth MAX salary. We should not be writing a query with four nested sub
queries.
STEP-1:
The approach here is to first list all the records based on Salary in the
descending order with MAX salary on top and MIN salary at bottom.
Next, using ROWNUM select the top 2 records.
STEP-2:
Next find the MAX salary from EMPLOYEE table which is not one of top
two salary values fetched in the earlier step.
STEP-3:
In order to fetch the entire employee record with third MAX salary we
need to do a self-join on Employee table based on Salary value.
A CHECK constraint is used to limit the values or type of data that can
be stored in a column. They are used to enforce domain integrity.
31. Assume you have the below tables on sessions that users have, and
a users table. Write a query to get the active user count of daily cohorts.
By definition, daily cohorts are active users from a particular day. First,
we can use a subquery to get the sessions of new users by day using an
inner join with users. This is to filter for only active users by a particular
join date for the cohort. Then we can get a distinct count to return the
active user count:
32. Assume you are given the below table on transactions from users for
purchases. Write a query to get the list of customers where their earliest
purchase was at least $50.
First, we need to get the latest transaction date for each user, along with
the number of products they have purchased. This can be done in a
subquery where we GROUP BY user_id and take a COUNT(DISTINCT
product_id) to get the number of products they have purchased, and a
MAX(transaction_date) to get the latest transaction date (while casting to
a date). Then, using this subquery, we can simply do an aggregation by
the transaction date column in the previous subquery, while doing a
COUNT() on the number of users, and a SUM() on the number of
products:
34. Assume you are given the below tables on users and their time spent
on sending and opening Snaps. Write a query to get the breakdown for
each age breakdown of the percentage of time spent on sending versus
opening snaps.
We can get the breakdown of total time spent on each activity by each
user by filtering out for the activity_type and taking the sum of time spent.
In doing this, we want to do an outer join with the age bucket to get the
total time by age bucket for both activity types. This results in the below
two subqueries. Then, we can use these two subqueries to sum them by
joining on the appropriate age bucket and take the proportion for send
time and the proportion for open time per age bucket:
35. Assume you are given the below table on reviews from users. Define
a top-rated place as a business whose reviews only consist of 4 or 5
stars. Write a query to get the number and percentage of businesses that
are top-rated places.
First, we need to get the places where the reviews are all 4 or 5 stars.
We can do this using a HAVING clause, instead of a WHERE clause
since the reviews need to all be 4 stars or above. For the HAVING
condition, we can use a CASE statement that filters for 4 or 5 stars and
then take a SUM over them. This can then be compared with the total
row count of the particular business_id reviews to ensure that the count
of top reviews matches with the total review count. With the relevant
businesses, we can then do an outer join with the original table on
business_id to get a COUNT of distinct business_id matches, and then
the percentage by comparing the COUNT from the top places with the
overall COUNT of business_id:
36. What is an SQL View?
37.
38.
39.
40.
Which of the following is the correct outcome of the SQL query below?
Solution: A
The query will extract the course ids where student receive the grade “C”
in the course.
The above query first joined the ENROLLED and STUDENT tables then
it will evaluate the where condition and then it will return the name, grade
of the students, those took 15-415 and got a grade ‘A’ or ‘B’ in the
course. But for the given two tables it will give zero records in output.
42. Which of the following query will find all the unique students who
have taken more than one course?
43. What are the tuples additionally deleted to preserve reference
integrity when the rows (2,4) are deleted from the below table. Suppose
you are using ‘ON DELETE CASCADE’.
Query: SELECT Company, AVG(Salary) FROM AV1 HAVING AVG(Salary) > 1200 GROUP BY
Company WHERE Salary > 1000 ;
This query will give the error because ‘WHERE’ is always evaluated before ‘GROUP BY’
and ‘Having’ is always evaluated after ‘GROUP BY’.
There are many ways to find the second highest salary of an Employee
in SQL, you can either use SQL Join or Subquery to solve this problem.
Here is an SQL query using Subquery:
47. SQL Query to find Max Salary from each department.
You can find the maximum salary for each department by grouping all
records by DeptId and then using MAX() function to calculate the
maximum salary in each group or each department.
These questions become more interesting if the Interviewer will ask you
to print the department name instead of the department id, in that case,
you need to join the Employee table with Department using the foreign
key DeptID, make sure you do LEFT or RIGHT OUTER JOIN to include
departments without any employee as well.
In this query, we have used RIGHT OUTER JOIN because we need the
name of the department from the Department table which is on the right
side of the JOIN clause, even if there is no reference of dept_id on the
Employee table.
49. Write an SQL Query to print the name of the distinct employee whose
DOB is between 01/01/1960 to 31/12/1975.
50. Write an SQL Query to find an employee whose salary is equal to or
greater than 10000.
51. Write SQL Query to find duplicate rows in a database? and then write
SQL query to delete them?
to Delete:
52. How do you find all employees who are also managers?
this will show employee name and manager name in two columns like
53. The Trips table holds all taxi trips. Each trip has a unique Id, while
Client_Id and Driver_Id are both foreign keys to the Users_Id at
the Users table. Status is an ENUM type of (‘completed’,
‘cancelled_by_driver’, ‘cancelled_by_client’).
Write a SQL query to find the cancellation rate of requests made by
unbanned users between Oct 1, 2013 and Oct 3, 2013. For the above
tables, your SQL query should return the following rows with the
cancellation rate being rounded to two decimal places.
The solution looks like that:
select
result.Request_at as "Day",
round(sum(case when result.Status = 'completed' then 0 else 1
end) / count(*), 2) as "Cancellation Rate"
from (
select
Driver_Id,
Status,
Request_at
from trips left join users on trips.client_id=users.users_id
where users.banned = 'NO'
) result
left join users on result.driver_id=users.users_id
where
users.Banned ='NO'
and result.Request_at between '2013-10-01' and '2013-10-03'
group by
result.Request_at
54. Write a SQL query to find all duplicate emails in a table named
Person.
For example, your query should return the following for the above table:
Solution:
Since all email are in lowercase we can simply groupby email and print
those that have a count >1.
55. Given a Weather table, write a SQL query to find all dates' Ids with
higher temperature compared to its previous (yesterday's) dates.
The solution is to join the table to itself when the dates differ by one
day (DATEDIFF() function) and make sure that the temperature is higher
than the previous date.
63.What are user defined datatypes and when you should go for them?
User defined datatypes let you extend the base SQL Server datatypes by
providing a descriptive name, and format to the database. Take for
example, in your database, there is a column called Flight_Num which
appears in many tables. In all these tables it should be varchar(8). In this
case you could create a user defined datatype called Flight_num_type of
varchar(8) and use it across all your tables.
Lock escalation is the process of converting a lot of low level locks (like
row locks, page locks) into higher level locks (like table locks). Every lock
is a memory structure too many locks would mean, more memory being
occupied by locks. To prevent this from happening, SQL Server
escalates the many fine-grain locks to fewer coarse-grain locks. Lock
escalation threshold was definable in SQL Server 6.5, but from SQL
Server 7.0 onwards it's dynamically managed by SQL Server.
67. What is a table called, if it has neither Cluster nor Non-cluster Index?
What is it used for?
If a table is dropped, all things associated with the tables are dropped as
well. This includes - the relationships defined on the table with other
tables, the integrity checks and constraints, access privileges and other
grants that the table has. To create and use the table again in its original
form, all these relations, checks, constraints, privileges and relationships
need to be redefined. However, if a table is truncated, none of the above
problems exist and the table retains its original structure.
Data Integrity defines the accuracy as well as the consistency of the data
stored in a database. It also defines integrity constraints to enforce
business rules on the data when it is entered into an application or a
database.
There are a lot of ways to fetch characters from a string. For example:
A primary key is used to uniquely identify all table records. It cannot have
NULL values, and it must contain unique values. A table can have only
one primary key that consists of single or multiple fields.
The key which can accept only the null value and cannot accept the
duplicate values is called Unique Key. The role of the unique key is to
make sure that each column and row are unique.
88. What is the difference between Primary key and Unique Key?
Both Primary and Unique key carry unique values but the primary key
can not have a null value where the Unique key can. And in a table,
there cannot be more than one Primary key but unique keys can be
multiple.
Entities: Entity can be a person, place, thing, or any identifiable object for which data
can be stored in a database.
The stuff function deletes a part of the string and then inserts another
part into the string starting at a specified position.
A stored procedure is a prepared SQL code that can be saved and reused. In
other words, we can consider a stored procedure to be a function
consisting of many SQL statements to access the database system. We
can consolidate several SQL statements into a stored procedure and
execute them whenever and wherever required.
A stored procedure can be used as a means of modular programming,
i.e., we can create a stored procedure once, store it, and call it multiple
times as required. This also supports faster execution when compared to
executing multiple queries.
94. From the following user activity table, write a query to return the
fraction of users who are retained (show some activity) a given number of
days after joining. By convention, users are considered active on their
join day (day 0).
95. From the given trips and users tables for a taxi service, write
a query to return the cancellation rate in the first two days in
October, rounded to two decimal places, for trips not involving
banned riders or drivers. From LeetCode.
96. Explain the difference between OLTP and OLAP.
Self Join in SQL is used for joining a table with itself. Here, depending
upon some conditions, each row of the table is joined with itself and with
other rows of the table.
98. What is the difference between Union and Union All operators?
The Intersect operator helps combine two select statements and returns
only those records that are common to both the select statements. So,
after we get Table A and Table B over here and if we apply the Intersect
operator on these two tables, then we will get only those records that are
common to the result of the select statements of these two.