One of the trickiest things for a newbie in litigation support to learn is the terminology we use when discussing load files. There are many overlapping meanings, and it can be quite confusing until the day it “just clicks”. As with many learning curves, the big picture is difficult to grasp when we are focused so much on the details. I've seen it happen many times: one day the light bulb goes on and the newbie all of a sudden gets it.
The term “text file” is a generic term. As a definition, it is a file that contains data as plain, human-readable characters, organized in a specific format for a specific use. There are many different formats of text files.
The term “ASCII delimited text file” is another generic term. It has the same definition as above. The data within a text file can be organized by using “delimiters”. A delimiter is a character that separates one value from another in the text file. Different text file formats can use different delimiters, depending on the intended use. One example is a CSV file. If you use File | Save As in Excel and save a spreadsheet to a CSV file type, the resulting text file will contain data values separated from one another by a comma, which is the delimiter.
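To make the idea of a delimiter concrete, here is a minimal Python sketch that reads a small comma-delimited sample. The field names and values are invented for illustration and do not come from any real production. Notice that a value containing a comma is wrapped in double quotes, so the comma inside it is not treated as a delimiter.

```python
import csv
import io

# Hypothetical sample data; field names and values are made up for illustration.
sample = 'BegDoc,EndDoc,Author\r\nABC0001,ABC0003,"Smith, John"\r\n'

# csv.reader splits each line on the comma delimiter and honors the quotes.
rows = list(csv.reader(io.StringIO(sample, newline="")))

for row in rows:
    print(row)
```

Splitting each line on every comma by hand would break the author's name into two values, which is why a quote-aware reader is the safer choice.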
In the context of litigation support, a “text file” can also be a specific term instead of a generic one. Some examples of these would be a TXT file type that contains OCR text or a TXT file type that contains extracted text from an e-mail message or an MS Word document.
Technically, all load files for litigation support databases are “text files” as defined above.
For databases like Summation and Concordance, the load files are formatted differently, contain different delimiters, and are named with different file extensions. A Summation database load file must have data in the correct sort order. The delimiters can be anything you want. The file extension is usually .TXT. A Concordance database load file can have the data in any sort order. It should have Concordance-specific delimiters. The file extension should be .DAT.
Note: Some people will name a Summation load file with a .DAT file extension as if it doesn't matter. I consider this to be wrong and misleading.
For database image viewers, the load file for a Summation database is called a DII file. This load file requires a very specific format. Concordance has two primary image viewers. When Concordance is purchased, you have the option to purchase IPRO or Concordance Image (formerly called Opticon) as your image viewer. The load file for a Concordance database with an IPRO image viewer is called an LFP file. This load file requires a completely different format. The load file for a Concordance database with a Concordance Image viewer is called an OPT file or a LOG file interchangeably. This load file has its own format as well. All of these image viewer load files are very recognizable by both the file extension and the format once the file is opened in a text editor.
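As an illustration of how recognizable these files are, here is a rough Python sketch that groups pages into documents from an OPT file. It assumes the commonly seen layout of seven comma-separated values per line (image key, volume, image path, document break, folder break, box break, page count), with a Y in the fourth position marking the first page of each document. That layout, the sample lines, and the function name group_pages are assumptions for this example, so check them against a real file before relying on them.

```python
# Hypothetical OPT content: one line per page image, no header row.
sample_opt = (
    "ABC0001,VOL001,IMAGES\\001\\ABC0001.TIF,Y,,,3\n"
    "ABC0002,VOL001,IMAGES\\001\\ABC0002.TIF,,,,\n"
    "ABC0003,VOL001,IMAGES\\001\\ABC0003.TIF,,,,\n"
    "ABC0004,VOL001,IMAGES\\001\\ABC0004.TIF,Y,,,1\n"
)


def group_pages(opt_text):
    """Return a list of documents, each a list of image keys (assumed layout)."""
    documents = []
    for line in opt_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split(",")
        image_key, doc_break = fields[0], fields[3]
        # A "Y" in the fourth position is assumed to start a new document.
        if doc_break == "Y" or not documents:
            documents.append([])
        documents[-1].append(image_key)
    return documents


documents = group_pages(sample_opt)
print(documents)
```

With the sample above, the first document holds three pages and the second holds one.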
For the searchable text that gets loaded into a Summation or Concordance database, these files are generically referred to as “text files”. They have a .TXT file extension. The caveat to this is that Concordance offers an alternate way to load searchable text. Instead of separate .TXT files, the searchable text can be added as the last field in the DAT file. This is a preference decision.
Database load files (TXT or DAT) should contain a “header row” (with delimiters) that lists the field names, in the same order as the values that appear between the delimiters in each row. Each row in the file corresponds to a record in the database, and each value between delimiters corresponds to a field within that record. Image viewer load files do not contain a header row.
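The relationship between header row, rows, and fields can be sketched in a few lines of Python. This example assumes the Concordance default delimiters, which are commonly ASCII character 20 as the field separator and ASCII character 254 (þ) as the text qualifier. A given DAT file may use other characters, and the field names, values, and helper names here are hypothetical.

```python
FIELD_DELIM = chr(20)  # assumed default field separator
QUOTE = chr(254)       # assumed default text qualifier (þ)


def make_row(values):
    """Build one delimited row; used here only to create sample data."""
    return FIELD_DELIM.join(QUOTE + value + QUOTE for value in values)


def unquote(value):
    """Remove one text qualifier from each end of a value, if present."""
    if len(value) >= 2 and value[0] == QUOTE and value[-1] == QUOTE:
        return value[1:-1]
    return value


def split_row(line):
    return [unquote(value) for value in line.split(FIELD_DELIM)]


def parse_dat(text):
    """Turn DAT text into one dict per record, keyed by the header row."""
    lines = [ln.rstrip("\r") for ln in text.split("\n") if ln.strip()]
    header = split_row(lines[0])  # first row lists the field names
    return [dict(zip(header, split_row(ln))) for ln in lines[1:]]


sample = "\n".join([
    make_row(["BEGDOC", "ENDDOC", "AUTHOR"]),
    make_row(["ABC0001", "ABC0003", "Smith, John"]),
]) + "\n"

records = parse_dat(sample)
print(records)
```

Each dictionary is one database record, and each key is a field name taken from the header row. This sketch does not handle multi-line field values, which real load files often encode with a separate newline substitute character.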
As you can see, you might need to read this article several times for it to sink in. I promise there is a rhyme and reason to all of this. If you think I did a pretty good job of making sense of it, let me know in the comment area below.
Our goal: We should be able to look at a file extension and know, for the most part, what the file will contain.
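That goal can be written down as a simple lookup. The Python sketch below maps each file extension discussed in this article to what it would be expected to contain. The table reflects the naming conventions described here, not an industry standard, and the function name guess_contents is made up for this example.

```python
from pathlib import PurePath

# Expected contents by extension, following the conventions in this article.
EXPECTED_CONTENTS = {
    ".dat": "Concordance database load file",
    ".dii": "Summation image load file",
    ".lfp": "IPRO image load file",
    ".opt": "Concordance Image (Opticon) load file",
    ".log": "Concordance Image (Opticon) load file",
    ".csv": "comma-delimited text file",
    ".txt": "Summation database load file, or searchable text (OCR or extracted)",
}


def guess_contents(filename):
    """Guess what a file holds from its extension alone."""
    extension = PurePath(filename).suffix.lower()
    return EXPECTED_CONTENTS.get(extension, "unknown; open it in a text editor")


print(guess_contents("VOL001.DAT"))
print(guess_contents("VOL001.opt"))
```

The .TXT entry is deliberately ambiguous, because that extension is used both for Summation load files and for searchable text.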