Once upon a time, gathering the data you needed to make smart business decisions was difficult, time-consuming, and rather hit-or-miss. The shift to digital technologies has largely alleviated this problem through streamlined communication, interactivity, and vastly improved metadata collection. But now that it's easier to collect oceans of data, folks navigating these seas of information face a new issue: how to manage all of that data effectively without drowning. One of the earliest attempts to tame the chaos came in the form of the comma-separated values (.csv or CSV) file format.
More than 40 years after its debut, the .csv format is still an essential part of effective data management for companies around the globe. Getting the most from this simple, but powerful, file type, however, requires a full understanding of what it is, where it came from, and how it works.
The Basics of CSV Files
If you find yourself asking, “What’s a .CSV file?”, take heart. Files of this type hold tabular data as a series of records, each with one or more fields. Commas (or other characters) separate the fields within a record, and line breaks separate one record from the next. It’s a plain-text format, which makes it extremely versatile and easy to adapt for different uses, although its primary use remains transferring or converting data from one database to another database or format.
While the file format is called “comma-separated values” (also called character-separated values or comma-delimited files), in practice you may find fields and data records set apart by a variety of other characters, including:
- Single or Double Quotation Marks
So while the comma is certainly the most common punctuation used to delineate separate pieces of information, it’s not the only option. That said, every .csv file follows the same overall format: every column of data is set apart by a punctuation mark (such as a comma), and every new line indicates a new row of data.
So, for example, if you wanted to create a .csv file of your own, you could do it in any plain-text editor, like so:
In the example, you can see that the data is continuous across each line, with each field separated from the others by only a comma. If, for example, you use a spreadsheet program like Google Sheets, OpenOffice Calc, Microsoft Excel, etc. to open the .csv file, it will put the first line into separate cells to create headers and then the following rows in the corresponding cells in the same column, like so:
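As a rough sketch of that header-and-rows mapping, the snippet below uses Python's standard `csv` module to read a small piece of .csv text and treat the first line as column headers, much as a spreadsheet program would. The sample data and the `read_records` helper are made up for illustration and are not taken from a real data set.

```python
import csv
import io

# Hypothetical sample: the first line holds the headers, each later line is one record.
SAMPLE = "name,department,start_year\nAda,Engineering,2019\nGrace,Research,2021\n"

def read_records(text):
    """Return (headers, rows), where each row is a dict keyed by header name."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows

headers, rows = read_records(SAMPLE)
print(headers)
for row in rows:
    print(row)
```

Every value comes back as text; deciding that `start_year` is a number is left to whatever application reads the file.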
This deceptively simple plain-text file is a powerful tool for collecting, sorting, and managing thousands of entries of varying complexity. The format has no single official standard, but the Internet Engineering Task Force (IETF) has published general guidelines for it (RFC 4180) to create some semblance of formalization. The World Wide Web Consortium (W3C) has also released its own set of recommendations in an attempt to modernize and bring consistency to .csv files for all applications.
The CSV File: A Brief History
Back in the 1970s, programs written in languages like IBM's FORTRAN were still using punched cards to read data. The comma-separated values format was a natural match for the straightforward, uniform lists of numbers those cards carried during that era.
In a serendipitous turn of events, many popular database formats still in use today were being developed at the same time as the .csv file format. The simplicity of this format, which not only supports huge file sizes but is largely consistent and easily read by both humans and computers, made it a natural fit for database integration.
That compatibility, and its well-deserved reputation for reliability, carried it forward to today, where spreadsheet programs, database applications, and other data management tools continue to offer native support for importing and exporting .csv files.
Working with CSV Files
Like a lot of other popular file formats, .csv files can be opened with a variety of applications and edited, modified, and used in a number of ways.
Viewing a CSV File
If you’d like to preview a .csv file before you open it in another application, you can use a text editor, such as Notepad or Notepad++ (in Windows) or TextEdit (Mac), to open it. You can either choose to open the file using the Open or Edit dialogue built into the operating system, or open your text editor first and then open the file by navigating to it and choosing “Open.”
You’ll be presented with a plain-text list of data fields, separated into columns with punctuation and rows with line breaks. If your editor has a “Word Wrap” feature, be sure to turn it off, as it can wreak havoc with readability and flow.
You can also edit a .csv file in a plain-text editor, but be aware that you’ll need to follow the formatting used exactly in order to avoid breaking the file. You may be more successful editing the file in another application, such as a spreadsheet application, and then exporting the results as a .csv file to avoid fatally misplacing a comma or a set of double quotes.
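To see why a stray comma or quotation mark matters, here is a minimal sketch that lets Python's standard `csv` module do the quoting instead of typing it by hand. The record and the `to_csv_line` helper are hypothetical; the quoting shown is the `csv` module's default behavior, which wraps a field in double quotes when it contains a comma or a quote and doubles any quote inside it.

```python
import csv
import io

def to_csv_line(fields):
    """Serialize one record, letting the csv module add quotes where needed."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()

# Hypothetical record: the second field contains a comma,
# the third contains double quotes.
record = ["Widget", "Portland, OR", 'The "deluxe" model']
line = to_csv_line(record)
print(line, end="")

# Reading the line back recovers the original three fields.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed == record)
```

Drop one of those quotation marks in a text editor and the second field would split in two, which is exactly the kind of break the spreadsheet route helps you avoid.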
Opening a CSV File
Two of the most common uses for .csv files are to store data and manipulate it in spreadsheet programs and database applications. The latter of these includes everything from offline, bespoke databases for custom applications to cloud-based data repositories for high-level Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), and automation-driven procurement packages.
- Spreadsheet Programs make it much easier to view and work with .csv files by breaking the columns and rows into discrete cells. Generally, double-clicking on the .csv file will open it in its associated application, so if you have that file type associated with Microsoft Excel, Google Sheets, LibreOffice Calc, or another spreadsheet program, it will open automatically when double-clicked.
If you don’t have a file association in your system for .csv files, you can right-click (control-click on a Mac) on the file and choose “Open with…” and then select the program of your choice. To open a .csv file inside a program you’re already using, choose File>Open… and then navigate to the file you want, then either double-click or choose “Open” from the dialogue box.
- Database Applications are designed for more advanced integrations than spreadsheet programs. Many database programs support building new databases from existing .csv files, but to add data to an existing database, you’ll need to import it. Choosing “Import,” “Import .csv,” “Import data from .csv file,” etc. from your File menu will bring the information into the database, usually through a dialogue that lets you verify that the information to be imported is correct and confirm which character is used as the delimiter.
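The import step can also be scripted. The sketch below is one possible approach, not how any particular database program does it: it uses Python's standard `csv.Sniffer` to guess the delimiter from a short list of candidates, then loads the rows into an in-memory SQLite table. The `import_csv` helper, the table name, and the sample data are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

def import_csv(conn, table, text, candidates=",;\t|"):
    """Load CSV text into a new table and return the delimiter that was detected."""
    # Guess the delimiter, considering only the candidate characters.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    reader = csv.reader(io.StringIO(text), dialect)
    header = next(reader)
    # Every column is stored as TEXT; the header row supplies the column names.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return dialect.delimiter

# Hypothetical semicolon-delimited export.
SAMPLE = "name;city\nAda;London\nLinus;Helsinki\n"
conn = sqlite3.connect(":memory:")
delimiter = import_csv(conn, "people", SAMPLE)
imported = conn.execute("SELECT name, city FROM people ORDER BY rowid").fetchall()
print(repr(delimiter))
print(imported)
```

Delimiter guessing is a heuristic, so it is still worth checking the detected character against the file, just as an import dialogue asks you to. Table and column names are pasted straight into the SQL here, which is fine for a trusted file but not for untrusted input.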
Simplicity Has Its Limits
In addition, there’s no way to indicate inside a .csv file which character set (encoding) it uses. That information must be supplied separately, or worked out by the application reading the .csv file. This is usually manageable for applications using Structured Query Language (SQL), which generally support multiple character sets, but may prove to be a roadblock for other applications that don’t.
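Because the file itself can't say how it was encoded, a reading application often has to try. The following is a minimal sketch of that idea, assuming just two candidate encodings (UTF-8 and Windows-1252); the `read_csv_bytes` helper and the sample row are hypothetical, and real files may need a different or longer candidate list.

```python
import csv
import io

def read_csv_bytes(raw, encodings=("utf-8-sig", "cp1252")):
    """Decode raw bytes with the first encoding that works, then parse as CSV."""
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # Not valid in this encoding; try the next candidate.
        return encoding, list(csv.reader(io.StringIO(text)))
    raise ValueError("none of the candidate encodings could decode the data")

# Hypothetical data: the same two lines saved under two different encodings.
CONTENT = "name,city\nZoë,Köln\n"
utf8_result = read_csv_bytes(CONTENT.encode("utf-8"))
legacy_result = read_csv_bytes(CONTENT.encode("cp1252"))
print(utf8_result)
print(legacy_result)
```

Trial decoding only proves that the bytes are valid in an encoding, not that it's the one the author used, so it's safer to record the encoding alongside the file whenever you control the export.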
Finally, as database technology grows more complex, the simplicity of .csv files can work against them: more robust databases that contain multiple related tables cannot be exported or converted to a single .csv file. The same goes for spreadsheets with multiple tabs or sheets, which won’t make the jump in one piece, so you’ll have to create a separate .csv file for each table or sheet when you export the data.
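In practice, that workaround looks something like the sketch below, which writes each sheet of a made-up workbook to its own .csv file using Python's standard `csv` module. The `export_sheets` helper, the sheet names, and the data are hypothetical, and a plain dictionary stands in for a real spreadsheet file.

```python
import csv
import tempfile
from pathlib import Path

def export_sheets(sheets, out_dir):
    """Write each sheet (a list of rows) to its own .csv file; return the paths."""
    paths = []
    for name, rows in sheets.items():
        path = Path(out_dir) / f"{name}.csv"
        # newline="" lets the csv module control line endings itself.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(path)
    return paths

# Hypothetical workbook with two sheets that refer to each other by customer id.
workbook = {
    "customers": [["id", "name"], [1, "Ada"], [2, "Grace"]],
    "orders": [["order_id", "customer_id"], [100, 1], [101, 2]],
}
out_dir = tempfile.mkdtemp()
paths = export_sheets(workbook, out_dir)
print([p.name for p in paths])
```

The link between the two files lives only in the matching id values; nothing in the .csv format itself records that relationship.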
You’re CSV Certified
As a reliable and ubiquitous presence in data management for more than four decades, .csv files have a unique place in a field that’s known for treasuring novelty over the tried-and-true. Now that you understand their capabilities, and their limitations, you, too, can benefit from this old-school file format and make it a valuable part of your data management toolkit.