There are use cases where we need to update a table containing more than a million rows.
There are several solutions:
- Update row by row. We can do this by writing a script that connects to the database and executes one UPDATE query per row (a minimal sketch follows this list).
- Another good way is to load the new data into a separate staging table, then join it with the target table and update everything in a single statement.
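In the row-wise approach the script issues one statement per record, so every row pays a network round trip plus per-statement overhead, which is why the join-based approach below scales better. The table and column names here match the example later in this post, but the values are purely illustrative:

UPDATE content SET col1 = 'new value' WHERE hash = 'abc123';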
As a preliminary step, if the data to be updated is in a CSV file, we can import the CSV into a MySQL table.
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed.
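The staging table has to exist before the import. A minimal definition matching the columns used below (the column types are assumptions; hash is the join key):

CREATE TABLE content_csv_dump (
    hash VARCHAR(64) NOT NULL,
    col1 VARCHAR(255),
    col2 VARCHAR(255),
    col3 VARCHAR(255),
    KEY idx_hash (hash)
);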
-- the CSV is expected to contain the join key (hash) plus the columns to update
LOAD DATA LOCAL INFILE 'abc.csv'
INTO TABLE content_csv_dump
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(hash, col1, col2, col3);
Then, to apply the update, join the staging table with the table to be updated:
UPDATE content a
JOIN content_csv_dump b ON a.hash = b.hash
SET a.col1 = b.col1,
    a.col2 = b.col2,
    a.col3 = b.col3;
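The join is only efficient if the join column is indexed on both sides. If the target table lacks an index on hash, it is worth adding one before running the update (the index name here is illustrative):

ALTER TABLE content ADD INDEX idx_hash (hash);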
This is an efficient approach for updating millions of rows; a single join-based update is far faster than updating each record individually.
We can also run the update in batches this way, using a stored procedure (or an external script) that puts a short delay between batch executions to keep load and lock contention down.
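A minimal sketch of such a procedure, assuming content has an auto-increment id column to slice the batches on (the batch size of 10,000 rows and the 2-second delay are arbitrary choices):

DELIMITER //
CREATE PROCEDURE batch_update_content()
BEGIN
    DECLARE batch_start INT DEFAULT 1;
    DECLARE max_id INT;
    SELECT MAX(id) INTO max_id FROM content;
    WHILE batch_start <= max_id DO
        -- update one slice of the target table per iteration
        UPDATE content a
        JOIN content_csv_dump b ON a.hash = b.hash
        SET a.col1 = b.col1, a.col2 = b.col2, a.col3 = b.col3
        WHERE a.id BETWEEN batch_start AND batch_start + 9999;
        -- short pause so the server can serve other traffic between batches
        DO SLEEP(2);
        SET batch_start = batch_start + 10000;
    END WHILE;
END //
DELIMITER ;

CALL batch_update_content();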