Complete Guide of CRUD in SQL

Let’s start the journey of learning SQL queries by learning about CRUD Operations. Okay, tell me one thing. Let’s say there is a table in which we are storing information about students. What can we do in that table or its entries?

Correct. Primarily, on any entity stored in a table, there are 4 operations possible:

  • Create (or insert a new entry)
  • Read (fetching some entries)
  • Update (updating information about an entry already stored)
  • Delete (deleting an entry)

For learning CRUD operations, we will use Sakila database.

Today we are going to discuss about these operations in detail. Understand that read queries can get a lot more complex, involving aggregate functions, subqueries etc.

What is the Sakila Database

Let me give you all a brief idea about what the Sakila database represents so that it is easy to relate to the conversations that we shall have around this over the coming weeks. The Sakila database represents a digital video rental store, assuming an old movie rental store before Netflix etc came. It’s designed with functionality that would allow for all the operations of such a business, including transactions like renting films, managing inventory, and storing customer and staff information. For example, it has tables regarding films, actors, customers, staff, stores, payments etc.


Create

CREATE Query

First of all, what is SQL? SQL stands for Structured Query Language. It is a language used to interact with relational databases. It allows you to create tables, fetch data from them, update data, manage user permissions etc. Why “Structured Query” because it allows to query over data arranged in a structured way. Eg: In Relational databases, data is structured into tables.

A simple query to create a table in MySQL has:

  • column names
  • data type of column (integer, varchar, boolean, date, timestamp)
  • properties of column (unique, not null, default)

Similarly, the table could also have properties (primary key, key)

Here we are creating a table called students. Inside brackets, we mention the different columns that this table has. Along with each column, we mention the data type of that column. E.g.: firstName is of type VARCHAR(50). After the data type, we mention any constraints on that column. Eg: NOT NULL means that this column cannot be null. if we try to not put a value in this column, we will get an error. UNIQUE means that this column cannot have duplicate values. If we insert a new row in a table or update an existing row that leads to 2 rows having the same value of this column, the query will fail and we will get an error. DEFAULT specifies that if no value is provided for this column, it will take the given value. For example, for enrollmentDate, it will take the value of current_timestamp, which is the time when you are inserting the row. CHECK (PSP BETWEEN 0.00 AND 100.00) means that the value of this column should be between 0.00 and 100.00. If some other value is put, the query will fail.

INSERT Query

CREATE TABLE students (
    id INT AUTO_INCREMENT,
    firstName VARCHAR(50) NOT NULL,
    lastName VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    dateOfBirth DATE NOT NULL,
    enrollmentDate TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    psp DECIMAL(3, 2) CHECK (psp BETWEEN 0.00 AND 100.00),
    batchId INT,
    isActive BOOLEAN DEFAULT TRUE,
    PRIMARY KEY (id),
);

Now let’s start with the first set of operations for the day: The Create Operation. As the name suggests, this operation is used to create new entries in a table. Let’s say we want to add a new film to the database. How do we do that?

INSERT statement in MySQL is used to insert new entries in a table. Let’s see how we can use it to insert a new film in the film table of the Sakila database.

INSERT INTO film (title, description, release_year, language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features) 
VALUES ('The Dark Knight', 'Batman fights the Joker', 2008, 1, 3, 4.99, 152, 19.99, 'PG-13', 'Trailers'),
       ('The Dark Knight Rises', 'Batman fights Bane', 2012, 1, 3, 4.99, 165, 19.99, 'PG-13', 'Trailers'),
       ('The Dark Knight Returns', 'Batman fights Superman', 2016, 1, 3, 4.99, 152, 19.99, 'PG-13', 'Trailers');

Let’s dive through the syntax of the query. First, we have the INSERT INTO clause, which is used to specify the table in which we want to insert the new entry. Then we have the column names in the brackets, which are the columns in which we want to insert the values. Then we have the VALUES clause, which is used to specify the values that we want to insert in the columns. The values are specified in the same order as the columns are specified in the INSERT INTO clause. So the first value in the VALUES clause will be inserted in the first column specified in the INSERT INTO clause, and so on.

Anyways, an example of a query without column names is as follows:

INSERT INTO film
VALUES (default, 'The Dark Knight', 'Batman fights the Joker', 2008, 1, NULL, 3, 4.99, 152, 19.99, 'PG-13', 'Trailers', default);

NULL is used to specify that the value of that column should be NULL, and default is used to specify that the value of that column should be the default value specified for that column. Example: film_id is an auto-increment column, so we don’t need to specify its value. So we can specify a default for that column, which will insert the next auto-increment value in that column.


Read

SELECT statement is used to read data from a table. Let’s see how we can use it to read data via different queries on the film table of the Sakila database. A basic select query is as follows:

SELECT * FROM film;

Here we are selecting all the columns from the film table. The * is used to select all the columns. This query will give you the value of each column in each row of the film table. If we want to select only specific columns, then we can specify the column names instead of *. Example:

SELECT title, description, release_year FROM film;

Here we are selecting only the title, description and release_year columns from the film table. Note that the column names are separated by commas. Also, the column names are case-insensitive, so the title and TITLE are the same. Example following query would have also given the same result

SELECT TITLE, DESCRIPTION, RELEASE_YEAR FROM film;

Now, let’s learn some nuances about the SELECT statement.

Selecting Distinct Values

Let’s say we want to select all the distinct values of the rating column from the film table. How do we do that? We can use the DISTINCT keyword to select distinct values. Example:

SELECT DISTINCT rating FROM film;

This query will give you all the distinct values of the rating column from the film table. Note that the DISTINCT keyword, as all other keywords in MySQL, is case-insensitive, so DISTINCT and distinct are the same.

We can also use the DISTINCT keyword with multiple columns. Example:

SELECT DISTINCT rating, release_year FROM film;

This query will give you all the distinct values of the rating and release_year columns from the film table.

Let’s try to understand the above query with a pseudo code. The pseudo-code for the above query would be as follows:

answer = []

for each row in film:
    answer.append(row)

filtered_answer = []

for each row in answer:
    filtered_answer.append(row['rating'], row['release_year'])

unique_answer = set(filtered_answer)

return unique_answer

So what you see is that the DISTINCT keyword on multiple columns gives you for all of the rows in the table, the distinct value of a pair of these columns.

Select a statement to print a constant value

Let’s say we want to print a constant value in the output. Eg: The first program that almost every programmer writes: “Hello World”. How do we do that? We can use the SELECT statement to print a constant value. Example:

SELECT 'Hello World';
SELECT title, 'Hello World' FROM film;

That’s it. No form, nothing. Just the value. You can also combine it with other columns. Example:

Operations on Columns

Let’s say we want to select the title and length columns from the film table. If you see, the value of length is currently in minutes, but we want to select the length in hours instead of minutes. How do we do that? We can use the SELECT statement to perform operations on columns. Example:

SELECT title, length/60 FROM film;

Later in the course, we will learn about built-in functions in SQL as well. You can use those functions as well to perform operations on columns. Example:

SELECT title, ROUND(length/60) FROM film;

The ROUND function is used to round off a number to the nearest integer. So the above query will give you the title of the film, and the length of the film in hours, rounded off to the nearest integer.

Inserting Data from Another Table

BTW, select can also be used to insert data in a table. Let’s say we want to insert all the films from the film table into the film_copy table. We can combine the SELECT and INSERT INTO statements to do that. Example:

Here we are using the SELECT statement to select all the columns from the film table, and then using the INSERT INTO statement to insert the selected data into the film_copy table. Note that the column names in the INSERT INTO clause and the SELECT clause are the same, and the values are inserted in the same order as the columns are specified in the INSERT INTO clause. So the first value in the SELECT clause will be inserted in the first column specified in the INSERT INTO clause, and so on.

Till now, we have been doing basic read operations. SELECT query with only a FROM a clause is rarely sufficient. Rarely do we want to return all rows. Often we need to have some kind of filtering logic etc. for the rows that should be returned. Let’s learn how to do that.

WHERE Clause

Let’s say we want to select all the films from the film table which has a rating of PG-13. How do we do that? We can use the WHERE clause to filter rows based on a condition. Example:

SELECT * FROM film WHERE rating = 'PG-13';
INSERT INTO film_copy (title, description, release_year, language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features)
SELECT title, description, release_year, language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features
FROM film;

Here we are using the WHERE clause to filter rows based on the condition that the value of the rating column should be PG-13. Note that the WHERE clause is always used after the FROM clause. In terms of pseudocode, you can think of where clause to work as follows:

answer = []

for each row in film:
    if row.matches(conditions in where clause) # new line from above
        answer.append(row)

filtered_answer = []

for each row in answer:
    filtered_answer.append(row['rating'], row['release_year'])

unique_answer = set(filtered_answer) # assuming we also had DISTINCT

return unique_answer

If you seem where clause can be considered analogous to if in a programming language. With if as well there are many other operators that are used, right. Can you name which operators do we often use in programming languages with if?

AND, OR, NOT

Correct. We use things like and , or! in programming languages to combine multiple conditions. Similarly, we can use ANDORNOT operators in SQL as well. Example: We want to get all the films from the film table which has a rating of PG-13 and a release year of 2006. We can use the AND operator to combine multiple conditions.

SELECT * FROM film WHERE rating = 'PG-13' AND release_year = 2006;

Similarly, we can use the OR operator to combine multiple conditions. Example: We want to get all the films from the film table which has a rating of PG-13 or a release year of 2006. We can use the OR operator to combine multiple conditions.

SELECT * FROM film WHERE rating = 'PG-13' OR release_year = 2006;

Similarly, we can use the NOT operator to negate a condition. Example: We want to get all the films from the film table which does not have a rating of PG-13. We can use the NOT operator to negate the condition.

SELECT * FROM film WHERE NOT rating = 'PG-13';

A piece of advice on using these operators. If you are using multiple operators, it is always a good idea to use parentheses to make your query more readable. Otherwise, it can be difficult to understand the order in which the operators will be evaluated. Example:

SELECT * FROM film WHERE rating = 'PG-13' OR release_year = 2006 AND rental_rate = 0.99;

Here, it is not clear whether the AND an operator will be evaluated first or the OR operator. To make it clear, we can use parentheses. Example:

SELECT * FROM film WHERE rating = 'PG-13' OR (release_year = 2006 AND rental_rate = 0.99);

Till now, we have used only = for doing comparisons. Like traditional programming languages, MySQL also supports other comparison operators like ><>=<=!= etc. Just one special case, != can also be written as <> in MySQL. Example:

SELECT * FROM film WHERE rating <> 'PG-13';

IN Operator

With comparison operators, we can only compare a column with a single value. What if we want to compare a column with multiple values? For example, we want to get all the films from the film table which have a rating of PG-13 or R. One way to do that can be to combine multiple consitions using OR. A better way will be to use the IN operator to compare a column with multiple values. Example:

SELECT * FROM film WHERE rating IN ('PG-13', 'R');

Okay, now let’s say we want to get those films that have ratings anything other than the above 2. Any guesses how we may do that?

Correct! We had earlier discussed NOT. You can also use NOT before IN to negate the condition. Example:

SELECT * FROM film WHERE rating NOT IN ('PG-13', 'R');

Think of IN to be like any other operator. Just that it allows comparison with multiple values.

Hope you had a good break. Let’s continue with the session. In this second part of the session, we are going to start the discussion by discussing about another important keyword in SQL, BETWEEN.

IS NULL Operator

Now we are almost at the end of the discussion about different operators. Do you all remember how we store empties, that is, no value for a particular column for a particular row? We store it as NULL. Interestingly working with NULLs is a bit tricky. We cannot use the = operator to compare a column with NULL. Example:

SELECT * FROM film WHERE description = NULL;

The above query will not return any rows. Why? Because NULL is not equal to NULL. In fact, NULL is not equal to anything. Nor is it not equal to anything. It is just NULL.

Example:

SELECT NULL = NULL;

The above query will return NULL. Similarly, 3 = NULL , 3 <> NULL , NULL <> NULL will also return NULL. So, how do we compare a column with NULL? We use the IS NULL operator. Example:

SELECT * FROM film WHERE description IS NULL;

Similarly, we can use the IS NOT NULL operator to find all the rows where a particular column is not NULL. Example:

SELECT * FROM film WHERE description IS NOT NULL;

In many assignments, you will find that you will have to use the IS NULL and IS NOT NULL operators. Without them you will miss out on rows that had NULL values in them and get the wrong answer. Example: find customers with id other than 2. If you use = operator, you will miss out on the customer with id NULL. Example:

SELECT * FROM customers WHERE id != 2;

The above query will not return the customer with id NULL. So, you will get the wrong answer. Instead, you should use the IS NOT NULL operator. Example:

SELECT * FROM customers WHERE id IS NOT NULL AND id != 2;

Update

Now let’s move to learn U of CRUD. Update and Delete are thankfully much simpler, so don’t worry, we will be able to breeze through it over the coming 20 minutes. As the name suggests, this is used to update rows in a table. The general syntax is as follows:

UPDATE table_name SET column_name = value WHERE conditions;

Example:

UPDATE film SET release_year = 2006 WHERE id = 1;

The above query will update the release_year column of the row with id 1 in the film table to 2006. You can also update multiple columns at once. Example:

UPDATE film SET release_year = 2006, rating = 'PG' WHERE id = 1;

Let’s talk about how the update works. It works as follows:

for each row in film:
    if row.matches(conditions in where clause)
        row['release_year'] = 2006
        row['rating'] = 'PG'

So basically update query iterates through all the rows in the table and updates the rows that match the conditions in the where clause. So, if you have a table with 1000 rows and you run an update query without a where clause, then all the 1000 rows will be updated. So, be careful while running update queries. Example:

UPDATE film SET release_year = 2006;

Delete

Finally, we are at the end of CRUD. Let’s talk about Delete operations. The general syntax is as follows:

DELETE FROM table_name WHERE conditions;

Example:

DELETE FROM film WHERE id = 1;

The above query will delete the row with id 1 from the film table. If you don’t specify a where clause, then all the rows from the table will be deleted. Example:

DELETE FROM film

Let’s talk about how delete works as well in terms of code.

for each row in film:
    if row.matches(conditions in where clause)
        delete row

Delete vs Truncate vs Drop

There are two more commands which are used to delete rows from a table. They are TRUNCATE and DROP. Let’s discuss them one by one.

Truncate

The command looks as follows:

TRUNCATE film;

The above query will delete all the rows from the film table. TRUNCATE command internally works by removing the complete table and then recreating it. So, it is much faster than DELETE. But it has a disadvantage. It cannot be rolled back. We will learn more about rollbacks in the class on Transactions (In short, rollbacks can only happen for incomplete transactions – not committed yet – to be discussed in future classes). But at a high level, this is because as the complete table is deleted as an intermediate step, no log is maintained as to what all rows were deleted, and thus is not easy to revert. So, if you run a TRUNCATE query, then you cannot undo it.

Note: It also resets the primary key ID. For example, if the highest ID in the table before truncating was 10, then the next row inserted after truncating will have an ID of 1.

Drop

The command looks as follows:

DROP TABLE film;

The above query will delete the film table. The difference between DELETE and DROP is that DELETE is used to delete rows from a table and DROP is used to delete the entire table. So, if you run a DROP query, then the entire table will be deleted. All the rows and the table structure will be deleted. So, be careful while running a DROP query. Nothing will be left of the table after running a DROP query. You will have to recreate the table from scratch.

Note that, DELETE:

  1. Removes specified rows one-by-one from a table (may delete all rows if no condition is present in a query but keeps table structure intact).
  2. It is slower than TRUNCATE.
  3. Doesn’t reset the key.
  4. It can be rolled back.

TRUNCATE

  1. Removes the complete table and then recreates it.
  2. Faster than DELETE.
  3. Resets the key.
  4. It can not be rolled back because the complete table is deleted as an intermediate step.

DROP

  1. Removes the complete table and the table structure as well.
  2. It can not be rolled back.

LIKE Operator

LIKE operator is one of the most important and frequently used operators in SQL. Whenever there is a column storing strings, there comes a requirement to do some kind of pattern matching. For example, assume Scaler’s database where we have a batches table with a column called name. Let’s say we want to get the list of Academy batches and the rule is that an Academy batch shall have Academy somewhere within the name. How do we find those? We can use the LIKE operator for this purpose. Example:

SELECT * FROM batches WHERE name LIKE '%Academy%';

Similarly, let’s say in our Sakila database, we want to get all the films that have LOVE their title. We can use the LIKE operator. Example:

SELECT * FROM film WHERE title LIKE '%LOVE%';

Let’s talk about how the LIKE operator works. The LIKE operator works with the help of 2 wildcards in our queries, % and _. The % wildcard matches any number of characters (>= 0 occurrences of any set of characters). The _ wildcard matches exactly one character (any character). Example:

  1. LIKE ‘cat%’ will match “cat”, “caterpillar”, “category”, etc. but not “wildcat” or “dog”.
  2. LIKE ‘%cat’ will match “cat”, “wildcat”, “domesticcat”, etc. but not “cattle” or “dog”.
  3. LIKE ‘%cat%’ will match “cat”, “wildcat”, “cattle”, “domesticcat”, “caterpillar”, “category”, etc. but not “dog” or “bat”.
  4. LIKE ‘_at’ will match “cat”, “bat”, “hat”, etc. but not “wildcat” or “domesticcat”.
  5. LIKE ‘c_t’ will match “cat”, “cot”, “cut”, etc. but not “chat” or “domesticcat”.
  6. LIKE ‘c%t’ will match “cat”, “chart”, “connect”, “cult”, etc. but not “wildcat”, “domesticcat”, “caterpillar”, “category”.

COUNT

The count function takes the values from a particular column and returns the number of values in that set. Umm, but don’t you think it will be exactly the same as the number of rows in the table? Nope. Not true. Aggregate functions only take not null values into account. So, if there are any null values in the column, they will not be counted.

Example: Let’s take a students table with data like follows:

If you will try to run COUNT and give it the values in batch_id column, it will return 3. Because there are 3 not null values in the column. This is different from number of rows in the students table.

Let’s see how do you use this operation in SQL.

SELECT COUNT(batch_id) FROM students;

To understand how aggregate functions work via a pseudo code, let’s see how SQL query optimizer may execute them.

table = []

count = 0

for row in table:
    if row[batch_id] is not null:
        count += 1

print(count)

Few things to note here: While printing, do we have access to the values of row? Nope. We only have access to the count variable. So, we can only print the count. Extrapolating this point, when you use aggregate functions, you can only print the result of the aggregate function. You cannot print the values of the rows.

SELECT COUNT(batch_id), batch_id FROM students;

This will be an invalid query. Because you are trying to print the values of batch_id column as well as the count of batch_id column. But, you can only print the count of batch_id column.

LIMIT Clause

And now let’s discuss the last clause for the day. The LIMIT clause allows us to limit the number of rows returned by a query. Example:

SELECT * FROM film LIMIT 10;

The above query will return only 10 rows from the film table. If you want to return 10 rows starting from the 11th row, you can use the OFFSET keyword. Example:

SELECT * FROM film LIMIT 10 OFFSET 10;

The above query will return 10 rows starting from the 11th row from the film table. Note that in MySQL, you cannot use the OFFSET keyword without the LIMIT keyword. Example:

SELECT * FROM film OFFSET 10;

throws an error.

LIMIT clause is applied at the end. Just before printing the results. Taking the example of pseudo code, it works as follows:

answer = []

for each row in film:
    if row.matches(conditions in where clause) # new line from above
        answer.append(row)

answer.sort(column_names in order by clause)

filtered_answer = []

for each row in answer:
    filtered_answer.append(row['rating'], row['release_year'])

return filtered_answer[start_of_limit: end_of_limit]

Thus, if your query contains ORDER BY clause, then LIMIT clause will be applied after the ORDER BY clause. Example:

SELECT * FROM film ORDER BY title LIMIT 10;

The above query will return 10 rows from the film table in ascending order of the title column.

ORDER BY Clause

Now let’s discuss another important clause. ORDER BY clause allows to return values in a sorted order. Example:

SELECT * FROM film ORDER BY title;

The above query will return all the rows from the film table in ascending order of the title column. If you want to return the rows in descending order, you can use the DESC keyword. Example:

SELECT * FROM film ORDER BY title DESC;

You can also sort by multiple columns. Example:

SELECT * FROM film ORDER BY title, release_year;

The above query will return all the rows from the film table in ascending order of the title column and then in ascending order of the release_year column. Consider the second column as tie breaker. If 2 rows have the same value of title, the release year will be used to break tie between them. Example:

SELECT * FROM film ORDER BY title DESC, release_year DESC;

Above query will return all the rows from the film table in descending order of the title column and if tie on title, in descending order of the release_year column.

By the way, you can ORDER BY on a column which is not present in the SELECT clause. Example:

SELECT title FROM film ORDER BY release_year;

Let’s also build the analogy of this with a pseudo code.

answer = []

for each row in film:
    if row.matches(conditions in where clause) # new line from above
        answer.append(row)

answer.sort(column_names in order by clause)

filtered_answer = []

for each row in answer:
    filtered_answer.append(row['rating'], row['release_year'])

return filtered_answer

If you see, the ORDER BY clause is applied after the WHERE clause. So, first the rows are filtered based on the WHERE clause and then they are sorted based on the ORDER BY clause. And only after that are the columns that have to be printed taken out. And that’s why you can sort based on columns not even in the SELECT clause.