CSV stands for Comma-Separated Values, and CSV files are text files that look something like this:
Username,Email,Address,Contact johndoe,firstname.lastname@example.org,"71 Pickering Street, Singapore, Singapore",+65-91234567 janedoe,email@example.com,"24 Raffles Lane, Singapore, Singapore",+65-81234567 marysmith,firstname.lastname@example.org,"83 Riveting Road, Singapore, Singapore",+65-97654321 bobsmith,email@example.com,"84 Riveting Road, Singapore, Singapore",+65-87654321
Essentially, the CSV file format is meant to represent tabular data. The above CSV file represents the following table:
|firstname.lastname@example.org||71 Pickering Street, Singapore, Singapore||+65-91234567|
|email@example.com||24 Raffles Lane, Singapore, Singapore||+65-81234567|
|firstname.lastname@example.org||83 Riveting Road, Singapore, Singapore||+65-97654321|
|email@example.com||84 Riveting Road, Singapore, Singapore||+65-87654321|
Due to their tabular nature, data in a CSV file can very easily be imported into and stored in an SQL table. The commands to do that, however, are not very well-documented online.
If a CSV file does not open as a text file on your computer, that’s because your computer is opening the file with a spreadsheet program such as Microsoft Excel. In such a case, to see the file as text, you will want to open these files on a text editing software such as Notepad.
Importing the file into an SQL table
So how do we load a CSV file into an SQL table? If we were loading the CSV into an SQL table with the exact same structure, we can use the following SQL code:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `users` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
You’ll need to replace the highlighted portions above with:
- The path to your CSV file;
- The name of the table you are importing the CSV into, and;
- The appropriate newline character(s) for your platform, which is
\nfor macOS / Linux / Unix-based systems, and
\r\nfor Windows-based systems.
Importing a file with different columns
In practice, your SQL table structure is going to different from your CSV file’s structure most of the time, so you’ll need to specify which columns of the CSV file go into which columns in your table. For example, if your SQL table structure looks like this:
Then, you’ll need to add the following line into your SQL code:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `users` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS (`Username`,`User_Email`,`User_Address`,`User_Contact`);
In the highlighted section above, we are specifying that column 1 of the CSV file goes into the
Username field in the SQL table, column 2 goes into
User_Email, and etc. Note also that the
User_Address and the
User_Contact fields are swapped, because the address field appears first in the CSV file, and that the
User_ID field is auto-generated.
Article continues after the advertisement:
Processing column data before insertion
In some cases, you might want to process the data in specific columns before inserting them into your table. For example, you might want to split the:
- Address field into Street, City and State, and the;
- Contact field into Country Code and Phone Number.
This gives us the following database structure:
Instead of splitting the CSV file into these different columns so that we can import the data, we can use
LOAD DATA INFILE to process the column data for us:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `user` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS (`Username`, `User_Email`, @User_Address, @User_Contact) SET `User_Country_Code` = SUBSTRING(@User_Contact, 1, 3), `User_Phone` = SUBSTRING(@User_Contact, 5), `User_Street_Address` = SUBSTRING_INDEX(@User_Address, ",", 1), `User_City` = SUBSTRING_INDEX(SUBSTRING_INDEX(@User_Address, ",", -2), ",", 1), `User_Country` = SUBSTRING_INDEX(@User_Address, ",", -1);
Essentially, what’s happening above is that:
- Instead of assigning every column directly to a field, we assign the last 2 columns to variables
@User_Contact(highlighted green above);
- Then, using the
SETkeyword, we set the algorithms that determine the values of each field (highlighted yellow above). The SQL
SUBSTRING_INDEX()functions are used to extract parts of the string from
Unlike most other programming languages, the first index of a string is numbered 1 — not 0 — in MySQL. This means that, for example, to capture the first character in a substring in SQL, we would use
SUBSTRING(@str, 1, 1) instead of
SUBSTRING(@str, 0, 1).
After running the above query, the CSV data should be input nicely into our database table as follows:
|firstname.lastname@example.org||+65||91234567||71 Pickering Street||Singapore||Singapore|
|email@example.com||+65||81234567||24 Raffles Lane||Singapore||Singapore|
|firstname.lastname@example.org||+65||97654321||83 Riveting Road||Singapore||Singapore|
|email@example.com||+65||87654321||84 Riveting Road||Singapore||Singapore|
Common issues and errors
Here are some of the common errors that people run into when using
LOAD DATA INFILE:
- For Windows machines, note that when you’re copying filepaths, the system will always use a
\as a directory separator by default. In MySQL strings, the
\is an escape character, so you’ll need to double up the character for it to be read properly in a string (for e.g.
C:\\users\\john). Alternatively, you can replace all
\with the more conventional
- If the lines are not reading correctly, try replacing the newline character with the opposite set (i.e.
\r\nand vice versa). Sometimes, the CSV files you receive have been created in the other operating system, so they will not contain the newline character that is appropriate to your platform.
An escape character is a special character that is necessary for certain character entities to be represented in a string. Take for example the
' character, which is used to open and close strings in MySQL. For it to be represented in a string without closing the string, it will need to be escaped with a preceding
'Tommy \'s phone'. An escape character is also be used to represent characters that cannot be easily-represented otherwise, such as Tab or newline characters.
Have you run into any issues uploading your CSV file onto MySQL? Let us know in the comments below, so we can add them as solutions to this guide and help future readers.
This post has been made possible by our Patreon supporters below:
Article continues after the advertisement: