Optimizing MySQL Update Queries for Consecutive Rows

MySQL Update Query Against a Certain Number of Consecutive Rows

In this article, we will explore how to update a table with a specific value in MySQL where the condition is met by checking for a certain number of consecutive rows without that value.

Introduction

Suppose you have a table test with columns id and reservation, and you want to update the reservation column to 18 only when you find the first three consecutive rows without any reservation. In this article, we will discuss how to achieve this using MySQL queries.

Table Data

Let’s start by looking at an example of the table data:

      |   id     | reservation |
      | -------- | ----------- |
      |    1     |      15     |
      |    2     |      0      |
      |    3     |      0      |
      |    4     |      16     |
      |    5     |      16     |
      |    6     |      18     |
      |    7     |      18     |
      |    8     |      18     |
      |    9     |      17     |
      |   10    |      17     |
      |   11    |      17     |
      |   12    |      17     |
      |   13    |      17     |
      |   14    |      17     |
      |   15    |      0      |
      |   16    |      0      |
      |   17    |      0      |
      |   18    |      0      |
      ...

Query Using Recursive Common Table Expressions (CTE)

Recursive CTEs are a powerful feature in MySQL that allow you to perform complex queries. However, they require MySQL 8.0 or later.

Unfortunately, the provided question does not seem to be possible using Recursive CTEs.

Alternative Solution Using Joining Tables

Instead, we can use joining tables to achieve this result. Here’s an example query:

-- First of all, we shall get the initial id which satifies 3 consecutive 0 as the reservation value:
SELECT t1.id
FROM test t1
JOIN test t2 ON t2.id=t1.id+1
JOIN test t3 ON t3.id=t2.id+1
WHERE t1.reservation=0 AND t2.reservation=0 AND t3.reservation=0
ORDER BY id
LIMIT 1;

-- result:
+------+
| id   |
+------+
|    6 |
+------+

-- Then, Use CASE to replace the reservation value whose id has the value between the calculated initial id and two ids right after it by using BETWEEN.
SELECT id,
       CASE
           WHEN id BETWEEN (SELECT t1.id FROM test t1 JOIN test t2 ON t2.id=t1.id+1 JOIN test t3 ON t3.id=t2.id+1 WHERE t1.reservation=0 AND t2.reservation=0 AND t3.reservation=0 ORDER BY id LIMIT 1)
                AND (SELECT t1.id FROM test t1 JOIN test t2 ON t2.id=t1.id+1 JOIN test t3 ON t3.id=t2.id+1 WHERE t1.reservation=0 AND t2.reservation=0 AND t3.reservation=0 ORDER BY id LIMIT 1) +2
           THEN 18
           ELSE reservation END AS reservation
FROM test
ORDER BY id;

This query first finds the initial id that satisfies three consecutive rows with reservation=0. Then, it uses a CASE statement to update the reservation value for each row whose id falls within this range.

Performance Improvement with PRIMARY KEY

To improve performance, we should make sure the id column is set as the primary key of the table. This allows MySQL to use an index on the id column, which can significantly reduce the number of rows that need to be scanned during queries.

ALTER TABLE test ADD PRIMARY KEY (id);

Execution Plan Comparison

Here’s a comparison of the execution plans for both scenarios:

Without PRIMARY KEY

+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+--------+---------------+------+----------+------+------+----------+----------------------------------------------------+
|  1 | PRIMARY     | test  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |   100.00 | Using filesort                                     |
|  3 | SUBQUERY    | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |    10.00 | Using where; Using temporary; Using filesort       |
|  3 | SUBQUERY    | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |     9.09 | Using where; Using join buffer (Block Nested Loop) |
|  3 | SUBQUERY    | t3    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |     9.09 | Using where; Using join buffer (Block Nested Loop) |
|  2 | SUBQUERY    | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |    10.00 | Using where; Using temporary; Using filesort       |
|  2 | SUBQUERY    | t2    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |     9.09 | Using where; Using join buffer (Block Nested Loop) |
|  2 | SUBQUERY    | t3    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   11 |     9.09 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+--------+---------------+------+----------+------+------+----------+----------------------------------------------------+

With PRIMARY KEY

+----+-------------+-------+------------+--------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+---------+----------+------+------+----------+-------------+
|  1 | PRIMARY     | test  | NULL       | index  | NULL          | PRIMARY | 4       | NULL |   18 |   100.00 | NULL        |
|  3 | SUBQUERY    | t1    | NULL       | index  | NULL          | PRIMARY | 4       | NULL |   18 |    10.00 | Using where |
|  3 | SUBQUERY    | t2    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |    10.00 | Using where |
|  3 | SUBQUERY    | t3    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |    10.00 | Using where |
|  2 | SUBQUERY    | t1    | NULL       | index  | NULL          | PRIMARY | 4       | NULL |   18 |    10.00 | Using where |
|  2 | SUBQUERY    | t2    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |    10.00 | Using where |
|  2 | SUBQUERY    | t3    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | func |    1 |    10.00 | Using where |
+----+-------------+-------+------------+--------+---------------+---------+----------+------+------+----------+-------------+

As expected, the execution plan with id as the primary key is significantly faster and requires fewer rows to be examined.


Last modified on 2024-03-26