CSV stands for Comma-Separated Values, and CSV files are text files that look something like this:
example.csv
Username,Email,Address,Contact johndoe,john@example.com,"71 Pickering Street, Singapore, Singapore",+65-91234567 janedoe,jane@website.com,"24 Raffles Lane, Singapore, Singapore",+65-81234567 marysmith,mary@smith.com,"83 Riveting Road, Singapore, Singapore",+65-97654321 bobsmith,bob@smith.com,"84 Riveting Road, Singapore, Singapore",+65-87654321
Essentially, the CSV file format is meant to represent tabular data. The above CSV file represents the following table:
Username | Address | Contact | |
---|---|---|---|
johndoe | john@example.com | 71 Pickering Street, Singapore, Singapore | +65-91234567 |
janedoe | jane@website.com | 24 Raffles Lane, Singapore, Singapore | +65-81234567 |
marysmith | mary@smith.com | 83 Riveting Road, Singapore, Singapore | +65-97654321 |
bobsmith | bob@smith.com | 84 Riveting Road, Singapore, Singapore | +65-87654321 |
Due to their tabular nature, data in a CSV file can very easily be imported into and stored in an SQL table. The commands to do that, however, are not very well-documented online.
If a CSV file does not open as a text file on your computer, that’s because your computer is opening the file with a spreadsheet program such as Microsoft Excel. In such a case, to see the file as text, you will want to open these files on a text editing software such as Notepad.
- Importing the file into an SQL table
- Importing a file with different columns
- Processing column data before insertion
- Common issues and errors
- Conclusion
1. Importing the file into an SQL table
So how do we load a CSV file into an SQL table? If we were loading the CSV into an SQL table with the exact same structure, we can use the following SQL code:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `users` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
You’ll need to replace the highlighted portions above with:
- The path to your CSV file;
- The name of the table you are importing the CSV into, and;
- The appropriate newline character(s) for your platform, which is
\n
for macOS / Linux / Unix-based systems, and\r\n
for Windows-based systems.
2. Importing a file with different columns
In practice, your SQL table structure is going to different from your CSV file’s structure most of the time, so you’ll need to specify which columns of the CSV file go into which columns in your table. For example, if your SQL table structure looks like this:
Users | |
---|---|
PK | User_ID |
Username User_Email User_Contact User_Address |
Then, you’ll need to add the following line into your SQL code:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `users` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS (`Username`,`User_Email`,`User_Address`,`User_Contact`);
In the highlighted section above, we are specifying that column 1 of the CSV file goes into the Username
field in the SQL table, column 2 goes into User_Email
, and etc. Note also that the User_Address
and the User_Contact
fields are swapped, because the address field appears first in the CSV file, and that the User_ID
field is auto-generated.
3. Processing column data before insertion
In some cases, you might want to process the data in specific columns before inserting them into your table. For example, you might want to split the:
- Address field into Street, City and State, and the;
- Contact field into Country Code and Phone Number.
This gives us the following database structure:
Users | |
---|---|
PK | User_ID |
Username User_Email User_Country_Code User_Phone User_Street_Address User_City User_Country |
Instead of splitting the CSV file into these different columns so that we can import the data, we can use LOAD DATA INFILE
to process the column data for us:
LOAD DATA INFILE '/path/to/example.csv' INTO TABLE `user` FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' IGNORE 1 ROWS (`Username`, `User_Email`, @User_Address, @User_Contact) SET `User_Country_Code` = SUBSTRING(@User_Contact, 1, 3), `User_Phone` = SUBSTRING(@User_Contact, 5), `User_Street_Address` = SUBSTRING_INDEX(@User_Address, ",", 1), `User_City` = SUBSTRING_INDEX(SUBSTRING_INDEX(@User_Address, ",", -2), ",", 1), `User_Country` = SUBSTRING_INDEX(@User_Address, ",", -1);
Essentially, what’s happening above is that:
- Instead of assigning every column directly to a field, we assign the last 2 columns to variables
@User_Address
and@User_Contact
(highlighted green above); - Then, using the
SET
keyword, we set the algorithms that determine the values of each field (highlighted yellow above). The SQLSUBSTRING()
andSUBSTRING_INDEX()
functions are used to extract parts of the string from@User_Address
and@User_Contact
.
Unlike most other programming languages, the first index of a string is numbered 1 — not 0 — in MySQL. This means that, for example, to capture the first character in a substring in SQL, we would use SUBSTRING(@str, 1, 1)
instead of SUBSTRING(@str, 0, 1)
.
After running the above query, the CSV data should be input nicely into our database table as follows:
User_ID | Username | User_Email | User_Country_Code | User_Phone | User_Street_Address | User_City | User_Country |
---|---|---|---|---|---|---|---|
1 | johndoe | john@example.com | +65 | 91234567 | 71 Pickering Street | Singapore | Singapore |
2 | janedoe | jane@website.com | +65 | 81234567 | 24 Raffles Lane | Singapore | Singapore |
3 | marysmith | mary@smith.com | +65 | 97654321 | 83 Riveting Road | Singapore | Singapore |
4 | bobsmith | bob@smith.com | +65 | 87654321 | 84 Riveting Road | Singapore | Singapore |
4. Common issues and errors
Here are some of the common errors that people run into when using LOAD DATA INFILE
:
a. Using the correct file path separators
For Windows machines, note that when you’re copying filepaths, the system will always use a \
as a directory separator by default. In MySQL strings, the \
is an escape character, so you’ll need to double up the character for it to be read properly in a string. Alternatively, you can also use /
in place of all \
.
This means that the path C:\users\john
can be typed out as:
C:\\users\\john
, or;C:/users/john
An escape character is a special character that is necessary for certain character entities to be represented in a string. Take for example the '
character, which is used to open and close strings in MySQL. For it to be represented in a string without closing the string, it will need to be escaped with a preceding \
, e.g. 'Tommy \'s phone'
. An escape character is also be used to represent characters that cannot be easily-represented otherwise, such as Tab or newline characters.
b. Setting the correct newline characters
In our example above, in the line LINES TERMINATED BY '\n'
, we used \n
for the newline character, which is the newline character used for operating systems except of Windows. If the CSV file was generated in Windows, the file may be using \r\n
to represent newlines instead, which means that you will have to change the LINES TERMINATED BY
line to the following:
LINES TERMINATED BY '\r\n'
c. Permission denied
For some devices, when you upload the CSV file using the LOAD DATA INFILE
command, you may get a Permission denied error, like shown below:
Most of the time, rectifying this issue is simply a matter of moving the CSV file you want to load into a folder XAMPP can read. The best choice for this is XAMPP’s install folder, which is in:
C:/xampp/htdocs
for Windows/Application/XAMPP/xamppfiles
for macOS
5. Conclusion
Have you run into any issues uploading your CSV file onto MySQL? Let us know in the comments below, so we can add them as solutions to this guide and help future readers.