Importing CSV files into a MySQL table

Importing a CSV file into an SQL table

CSV stands for Comma-Separated Values, and CSV files are text files that look something like this:

example.csv

Username,Email,Address,Contact
johndoe,john@example.com,"71 Pickering Street, Singapore, Singapore",+65-91234567
janedoe,jane@website.com,"24 Raffles Lane, Singapore, Singapore",+65-81234567
marysmith,mary@smith.com,"83 Riveting Road, Singapore, Singapore",+65-97654321
bobsmith,bob@smith.com,"84 Riveting Road, Singapore, Singapore",+65-87654321

Essentially, the CSV file format is meant to represent tabular data. The above CSV file represents the following table:

UsernameEmailAddressContact
johndoejohn@example.com71 Pickering Street, Singapore, Singapore+65-91234567
janedoejane@website.com24 Raffles Lane, Singapore, Singapore+65-81234567
marysmith mary@smith.com 83 Riveting Road, Singapore, Singapore +65-97654321
bobsmith bob@smith.com 84 Riveting Road, Singapore, Singapore +65-87654321

Due to their tabular nature, data in a CSV file can very easily be imported into and stored in an SQL table. The commands to do that, however, are not very well-documented online.

If a CSV file does not open as a text file on your computer, that’s because your computer is opening the file with a spreadsheet program such as Microsoft Excel. In such a case, to see the file as text, you will want to open these files on a text editing software such as Notepad.

  1. Importing the file into an SQL table
  2. Importing a file with different columns
  3. Processing column data before insertion
  4. Common issues and errors
    1. Using the correct file path separators
    2. Setting the correct newline characters
    3. Permission denied
  5. Conclusion

1. Importing the file into an SQL table

So how do we load a CSV file into an SQL table? If we were loading the CSV into an SQL table with the exact same structure, we can use the following SQL code:

LOAD DATA INFILE '/path/to/example.csv' 
INTO TABLE `users`
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

You’ll need to replace the highlighted portions above with:

  1. The path to your CSV file;
  2. The name of the table you are importing the CSV into, and;
  3. The appropriate newline character(s) for your platform, which is \n for macOS / Linux / Unix-based systems, and \r\n for Windows-based systems.

2. Importing a file with different columns

In practice, your SQL table structure is going to different from your CSV file’s structure most of the time, so you’ll need to specify which columns of the CSV file go into which columns in your table. For example, if your SQL table structure looks like this:

Users
PKUser_ID
Username
User_Email
User_Contact
User_Address

Then, you’ll need to add the following line into your SQL code:

LOAD DATA INFILE '/path/to/example.csv' 
INTO TABLE `users`
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(`Username`,`User_Email`,`User_Address`,`User_Contact`);

In the highlighted section above, we are specifying that column 1 of the CSV file goes into the Username field in the SQL table, column 2 goes into User_Email, and etc. Note also that the User_Address and the User_Contact fields are swapped, because the address field appears first in the CSV file, and that the User_ID field is auto-generated.


Article continues after the advertisement:


3. Processing column data before insertion

In some cases, you might want to process the data in specific columns before inserting them into your table. For example, you might want to split the:

  1. Address field into Street, City and State, and the;
  2. Contact field into Country Code and Phone Number.

This gives us the following database structure:

Users
PKUser_ID
Username
User_Email
User_Country_Code
User_Phone
User_Street_Address
User_City
User_Country

Instead of splitting the CSV file into these different columns so that we can import the data, we can use LOAD DATA INFILE to process the column data for us:

LOAD DATA INFILE '/path/to/example.csv'
INTO TABLE `user`
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS
(`Username`, `User_Email`, @User_Address, @User_Contact)
SET `User_Country_Code` = SUBSTRING(@User_Contact, 1, 3),
`User_Phone` = SUBSTRING(@User_Contact, 5),
`User_Street_Address` = SUBSTRING_INDEX(@User_Address, ",", 1),
`User_City` = SUBSTRING_INDEX(SUBSTRING_INDEX(@User_Address, ",", -2), ",", 1),
`User_Country` = SUBSTRING_INDEX(@User_Address, ",", -1);

Essentially, what’s happening above is that:

  1. Instead of assigning every column directly to a field, we assign the last 2 columns to variables @User_Address and @User_Contact (highlighted green above);
  2. Then, using the SET keyword, we set the algorithms that determine the values of each field (highlighted yellow above). The SQL SUBSTRING() and SUBSTRING_INDEX() functions are used to extract parts of the string from @User_Address and @User_Contact.

Unlike most other programming languages, the first index of a string is numbered 1 — not 0 — in MySQL. This means that, for example, to capture the first character in a substring in SQL, we would use SUBSTRING(@str, 1, 1) instead of SUBSTRING(@str, 0, 1).

After running the above query, the CSV data should be input nicely into our database table as follows:

User_IDUsernameUser_EmailUser_Country_CodeUser_PhoneUser_Street_AddressUser_CityUser_Country
1johndoejohn@example.com+659123456771 Pickering StreetSingaporeSingapore
2janedoejane@website.com+658123456724 Raffles LaneSingaporeSingapore
3marysmithmary@smith.com+659765432183 Riveting RoadSingaporeSingapore
4bobsmithbob@smith.com+658765432184 Riveting RoadSingaporeSingapore

4. Common issues and errors

Here are some of the common errors that people run into when using LOAD DATA INFILE:

a. Using the correct file path separators

For Windows machines, note that when you’re copying filepaths, the system will always use a \ as a directory separator by default. In MySQL strings, the \ is an escape character, so you’ll need to double up the character for it to be read properly in a string. Alternatively, you can also use / in place of all \.

This means that the path C:\users\john can be typed out as:

  • C:\\users\\john, or;
  • C:/users/john

An escape character is a special character that is necessary for certain character entities to be represented in a string. Take for example the ' character, which is used to open and close strings in MySQL. For it to be represented in a string without closing the string, it will need to be escaped with a preceding \, e.g. 'Tommy \'s phone'. An escape character is also be used to represent characters that cannot be easily-represented otherwise, such as Tab or newline characters.

b. Setting the correct newline characters

In our example above, in the line LINES TERMINATED BY '\n', we used \n for the newline character, which is the newline character used for operating systems except of Windows. If the CSV file was generated in Windows, the file may be using \r\n to represent newlines instead, which means that you will have to change the LINES TERMINATED BY line to the following:

LINES TERMINATED BY '\r\n'

c. Permission denied

For some devices, when you upload the CSV file using the LOAD DATA INFILE command, you may get a Permission denied error, like shown below:

Permission denied for LOAD DATA INFILE
Permission denied for the LOAD DATA INFILE command.

Most of the time, rectifying this issue is simply a matter of moving the CSV file you want to load into a folder XAMPP can read. The best choice for this is XAMPP’s install folder, which is in:

  • C:/xampp/htdocs for Windows
  • /Application/XAMPP/xamppfiles for macOS

5. Conclusion

Have you run into any issues uploading your CSV file onto MySQL? Let us know in the comments below, so we can add them as solutions to this guide and help future readers.


Article continues after the advertisement:


Leave a Reply

Your email address will not be published. Required fields are marked *

Note: You can use Markdown to format your comments.

This site uses Akismet to reduce spam. Learn how your comment data is processed.