Importing CSV files into a MySQL table

Importing a CSV file into an SQL table

CSV stands for Comma-Separated Values, and CSV files are text files that look something like this:

example.csv

Username,Email,Address,Contact
johndoe,john@example.com,"71 Pickering Street, Singapore, Singapore",+65-91234567
janedoe,jane@website.com,"24 Raffles Lane, Singapore, Singapore",+65-81234567
marysmith,mary@smith.com,"83 Riveting Road, Singapore, Singapore",+65-97654321
bobsmith,bob@smith.com,"84 Riveting Road, Singapore, Singapore",+65-87654321

Essentially, the CSV file format is meant to represent tabular data. The above CSV file represents the following table:

UsernameEmailAddressContact
johndoejohn@example.com71 Pickering Street, Singapore, Singapore+65-91234567
janedoejane@website.com24 Raffles Lane, Singapore, Singapore+65-81234567
marysmith mary@smith.com 83 Riveting Road, Singapore, Singapore +65-97654321
bobsmith bob@smith.com 84 Riveting Road, Singapore, Singapore +65-87654321

Due to their tabular nature, data in a CSV file can very easily be imported into and stored in an SQL table. The commands to do that, however, are not very well-documented online.

If a CSV file does not open as a text file on your computer, that’s because your computer is opening the file with a spreadsheet program such as Microsoft Excel. In such a case, to see the file as text, you will want to open these files on a text editing software such as Notepad.

Importing the file into an SQL table

So how do we load a CSV file into an SQL table? If we were loading the CSV into an SQL table with the exact same structure, we can use the following SQL code:

LOAD DATA INFILE '/path/to/example.csv' 
INTO TABLE `users`
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

You’ll need to replace the highlighted portions above with:

  1. The path to your CSV file;
  2. The name of the table you are importing the CSV into, and;
  3. The appropriate newline character(s) for your platform, which is \n for macOS / Linux / Unix-based systems, and \r\n for Windows-based systems.

Importing a file with different columns

In practice, your SQL table structure is going to different from your CSV file’s structure most of the time, so you’ll need to specify which columns of the CSV file go into which columns in your table. For example, if your SQL table structure looks like this:

Users
PKUser_ID
Username
User_Email
User_Contact
User_Address

Then, you’ll need to add the following line into your SQL code:

LOAD DATA INFILE '/path/to/example.csv' 
INTO TABLE `users`
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(`Username`,`User_Email`,`User_Address`,`User_Contact`);

In the highlighted section above, we are specifying that column 1 of the CSV file goes into the Username field in the SQL table, column 2 goes into User_Email, and etc. Note also that the User_Address and the User_Contact fields are swapped, because the address field appears first in the CSV file, and that the User_ID field is auto-generated.


Article continues after the advertisement:


Processing column data before insertion

In some cases, you might want to process the data in specific columns before inserting them into your table. For example, you might want to split the:

  1. Address field into Street, City and State, and the;
  2. Contact field into Country Code and Phone Number.

This gives us the following database structure:

Users
PKUser_ID
Username
User_Email
User_Country_Code
User_Phone
User_Street_Address
User_City
User_Country

Instead of splitting the CSV file into these different columns so that we can import the data, we can use LOAD DATA INFILE to process the column data for us:

LOAD DATA INFILE '/path/to/example.csv'
INTO TABLE `user`
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(`Username`, `User_Email`, @User_Address, @User_Contact)
SET `User_Country_Code` = SUBSTRING(@User_Contact, 1, 3),
`User_Phone` = SUBSTRING(@User_Contact, 5),
`User_Street_Address` = SUBSTRING_INDEX(@User_Address, ",", 1),
`User_City` = SUBSTRING_INDEX(SUBSTRING_INDEX(@User_Address, ",", -2), ",", 1),
`User_Country` = SUBSTRING_INDEX(@User_Address, ",", -1);

Essentially, what’s happening above is that:

  1. Instead of assigning every column directly to a field, we assign the last 2 columns to variables @User_Address and @User_Contact (highlighted green above);
  2. Then, using the SET keyword, we set the algorithms that determine the values of each field (highlighted yellow above). The SQL SUBSTRING() and SUBSTRING_INDEX() functions are used to extract parts of the string from @User_Address and @User_Contact.

Unlike most other programming languages, the first index of a string is numbered 1 — not 0 — in MySQL. This means that, for example, to capture the first character in a substring in SQL, we would use SUBSTRING(@str, 1, 1) instead of SUBSTRING(@str, 0, 1).

After running the above query, the CSV data should be input nicely into our database table as follows:

User_IDUsernameUser_EmailUser_Country_CodeUser_PhoneUser_Street_AddressUser_CityUser_Country
1johndoejohn@example.com+659123456771 Pickering StreetSingaporeSingapore
2janedoejane@website.com+658123456724 Raffles LaneSingaporeSingapore
3marysmithmary@smith.com+659765432183 Riveting RoadSingaporeSingapore
4bobsmithbob@smith.com+658765432184 Riveting RoadSingaporeSingapore

Common issues and errors

Here are some of the common errors that people run into when using LOAD DATA INFILE:

  1. For Windows machines, note that when you’re copying filepaths, the system will always use a \ as a directory separator by default. In MySQL strings, the \ is an escape character, so you’ll need to double up the character for it to be read properly in a string (for e.g. C:\users\john becomes C:\\users\\john). Alternatively, you can replace all \ with the more conventional / to get C:/users/john.
  2. If the lines are not reading correctly, try replacing the newline character with the opposite set (i.e. \n to \r\n and vice versa). Sometimes, the CSV files you receive have been created in the other operating system, so they will not contain the newline character that is appropriate to your platform.

An escape character is a special character that is necessary for certain character entities to be represented in a string. Take for example the ' character, which is used to open and close strings in MySQL. For it to be represented in a string without closing the string, it will need to be escaped with a preceding \, e.g. 'Tommy \'s phone'. An escape character is also be used to represent characters that cannot be easily-represented otherwise, such as Tab or newline characters.

Conclusion

Have you run into any issues uploading your CSV file onto MySQL? Let us know in the comments below, so we can add them as solutions to this guide and help future readers.

This post has been made possible by our Patreon supporters below:

  1. Kayla Marie
  2. Amar AM7

Article continues after the advertisement:


Leave a Reply

Your email address will not be published.

For security, use of Google's reCAPTCHA service is required which is subject to the Google Privacy Policy and Terms of Use.

I agree to these terms.