I have a database table that maintains some information and is required to preserve order. Essentially if I have elements 1 through 5 listed and I want to add a new element, then it could be inserted anywhere in the existing row, either at the last, after 5, the beginning before 1 or somewhere in the middle such as after 3. Is there a way to do this using MySQL INSERT statements and specifying after which row we should insert the index?
I presume not. So my strategy to go about doing this is to create another column 'order_number' that basically records the order of the elements. For instance, if the record table has primary key (record_id) and the order_number listed side by side, it would look like this:
record_id order_number
1 1
2 2
3 3
4 4
5 5
TO add a new element to this row after row 3, the resulting end table will look like this:
record_id order_number
1 1
2 2
3 3
**6** **4** <------ added row
4 **5** <-- changed order_number
5 **6** <-- changed order_number
In such a situation, I can clearly achieve the order that I want by simply selecting the data that i want and providing an Order By order_number asc clause.
However, as you can see, to do a simple Insert, it requires me to update every other row's order_number that appears after it. The table is expected to have an extensive amount of rows to it (magnitude of 100,000) at minimum and simply updating every other row (hence locking the table) at every single insert operation is not at all feasible.
What is a better recommended strategy in this case ?
If the order_number
is not to be shown but only used for ordering, I suggest you use a decimal datatype instead of integer. This way, when you have to insert a row "between" two existing rows, you can set as order_number, the average of the two existing order numbers.
In your example:
record_id order_number
1 1.0
2 2.0
3 3.0
**6** 3.5 <---- added row
4 4.0 <-- no change
5 5.0 <-- no change
There is a problem though that if you keep inserting numbers in the same area, some order numbers may result to be too close for the precision of the datatype you have chosen, close enough as to not be distinguished form one another.
To avoid this, your insert procedure will have to examine whether the two existing order number are too close. In that case, it could reassign some order numbers of other nearby rows, "stretching" the order numbers above and below to "make space" for a new value.
You could also have a "cleanup" procedure that runs periodically and does this "stretching" in the whole or large parts of the table.
I found this answer for a similar question: https://stackoverflow.com/a/6333717/1010050
In summary, it increments all of the record IDs below the one you will be adding, to maintain consistency. That still requires you to update all of the record IDs, so it isn't the most efficient. It does have the benefit, compared to your method, of maintaining the physical order in the database, rather than just a virtual order like you have.
Another way I can think of would be to record the child and parent record IDs for each record, rather than an order number, similar to a Doubly Linked List. Inserting an element in the middle would then only require updating two other records regardless of table size. This has the same disadvantage as your solution where the physical ordering would be wrong, so reading from the table in an ordered fashion would be more costly.
For example:
record_id parent_id child_id
0 NULL 1
1 0 2
2 1 NULL
When we insert a record after record_id = 1
, the table becomes:
record_id parent_id child_id
0 NULL 1
1 0 3
2 3 NULL
3 1 2
Note how only the parent_id
and child_id
for IDs 1 and 2 had to change.
I think between these two solutions, the biggest thing to consider is what is your most common operation: reading out the values in order, or writing a new value in the middle somewhere. If it's reading, then updating the record IDs would be your best option in order to maintain the physical order of the database. If writing, then you can optimize for that by using the method I suggested similar to a doubly linked list, or your own order method.
Summary after question update: Seeing that updating most of the records is not feasible, then the other answer I found is definitely not valid. The solution of treating it similar to a doubly linked list is still plausible, however.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With