A Text File Disguised as a Load File

One of the trickiest things for a newbie in litigation support to learn is the terminology we use when discussing load files. There are many overlapping meanings, and it can be quite confusing until the day it “just clicks”. As with many learning curves, the big picture can be difficult to grasp when we are focused so much on the details. I've seen it happen many times: one day the light bulb goes on and the newbie all of a sudden gets it.

The term “text file” is a generic term. By definition, it is a file of plain characters, readable in any text editor, whose data is organized in a specific format for a specific use. There are many different formats of text files.

The term “ASCII delimited text file” is another generic term. The same definition applies, with one addition: the data within the file is organized by using “delimiters”. A delimiter is a character that separates one value from another in the text file. Different text file formats can use different delimiters depending on the intended use. One example is a CSV file. If you use File | Save As in Excel and save a spreadsheet as a CSV file type, the resulting text file will contain data values separated from one another by a comma character (the delimiter).
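To make the idea of a delimiter concrete, here is a minimal Python sketch that reads comma-delimited text with the standard csv module. The field names and values are made up for illustration. Note the quoted value: the quotes let a value contain the delimiter itself.

```python
import csv
import io

# Hypothetical sample data: one header row plus two records.
sample = (
    "BegDoc,EndDoc,Author\n"
    'ABC0001,ABC0003,"Smith, Jane"\n'
    "ABC0004,ABC0004,Jones\n"
)

def split_csv(text):
    """Return each line of comma-delimited text as a list of values."""
    return list(csv.reader(io.StringIO(text)))

rows = split_csv(sample)
for row in rows:
    print(row)
```

Each printed list is one line of the file, already split into its values at the commas.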

In the context of litigation support, a “text file” can also be a specific term instead of a generic one.  Some examples of these would be a TXT file type that contains OCR text or a TXT file type that contains extracted text from an e-mail message or an MS Word document.

Technically, all load files for litigation support databases are “text files” as defined above.

For databases like Summation and Concordance, the load files are formatted differently, contain different delimiters, and are named with different file extensions.  A Summation database load file must have data in the correct sort order.  The delimiters can be anything you want. The file extension is usually .TXT. A Concordance database load file can have the data in any sort order. It should have Concordance-specific delimiters. The file extension should be .DAT.
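The delimiters themselves are not spelled out above. The sketch below assumes the commonly cited Concordance defaults: ASCII 20 as the field separator, þ (ASCII 254) wrapped around each value, and ® (ASCII 174) standing in for a line break inside a value. Treat these as assumptions and check the export settings of your own database. The function names are hypothetical.

```python
# Assumed default delimiters (not confirmed by this article).
FIELD_SEP = chr(20)          # separates one field from the next
QUOTE = chr(254)             # þ, wraps each value
NEWLINE_IN_FIELD = chr(174)  # ®, replaces a line break inside a value

def build_dat_line(values):
    """Join values into one DAT-style line using the assumed delimiters."""
    wrapped = (QUOTE + v.replace("\n", NEWLINE_IN_FIELD) + QUOTE for v in values)
    return FIELD_SEP.join(wrapped)

def parse_dat_line(line):
    """Split one DAT-style line back into its plain values."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD_SEP):
        # Strip the surrounding quote characters when both are present.
        if len(raw) >= 2 and raw.startswith(QUOTE) and raw.endswith(QUOTE):
            raw = raw[1:-1]
        values.append(raw.replace(NEWLINE_IN_FIELD, "\n"))
    return values

line = build_dat_line(["ABC0001", "Line one\nLine two"])
print(parse_dat_line(line))
```

A Summation-style load file could be handled the same way by swapping in whatever delimiter was chosen for it.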

Note: Some people will name a Summation load file with a .DAT file extension as if it doesn't matter. I consider this to be wrong and misleading.

For database image viewers, the load file for a Summation database is called a DII file. This load file requires a very specific format. Concordance has two primary image viewers. When Concordance is purchased, you have the option to purchase IPRO or Concordance Image (formerly called Opticon) as your image viewer. The load file for a Concordance database with an IPRO image viewer is called an LFP file. This load file requires a completely different format. The load file for a Concordance database with a Concordance Image viewer is called either an OPT file or a LOG file; the two names are used interchangeably. This load file has its own format as well. All of these image viewer load files are easy to recognize by both the file extension and the format once the file is opened in a text editor.
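As one illustration of how recognizable these formats are, here is a hedged Python sketch that reads a single line of an OPT file. It assumes the commonly documented layout of seven comma-separated values per page: image key, volume, image path, document break (Y), folder break, box break, and page count. That layout is an assumption, not something this article defines, and the sample values are made up. The sketch also assumes the image path contains no commas.

```python
from typing import NamedTuple, Optional

class OptLine(NamedTuple):
    image_key: str
    volume: str
    path: str
    doc_break: bool       # True on the first page of a document
    folder_break: bool
    box_break: bool
    page_count: Optional[int]

def parse_opt_line(line):
    """Parse one line of an OPT file under the assumed seven-value layout."""
    parts = line.rstrip("\r\n").split(",")
    if len(parts) != 7:
        raise ValueError(f"expected 7 values, found {len(parts)}")
    key, volume, path, doc, folder, box, pages = parts
    return OptLine(
        key,
        volume,
        path,
        doc.upper() == "Y",
        folder.upper() == "Y",
        box.upper() == "Y",
        int(pages) if pages else None,
    )

first_page = parse_opt_line(r"ABC0001,VOL01,IMAGES\001\ABC0001.TIF,Y,,,2")
print(first_page)
```

A DII or LFP file would need its own parser, since each viewer expects a completely different format.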

The files containing the searchable text that gets loaded into a Summation or Concordance database are generically referred to as “text files”. They have a .TXT file extension. The caveat to this is that Concordance offers an alternate way to load searchable text. Instead of separate .TXT files, the searchable text can be added as the last field in the DAT file. Which approach to use is a matter of preference.

Database load files (TXT or DAT) should contain a “header row” (with delimiters) that lists the field names, in the same order as the values that appear between the delimiters in each row. Each row in the file is equal to a record in the database. Each value between delimiters is equal to a field within a record in the database. Image viewer load files do not contain a header row.
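The row-equals-record, value-equals-field relationship can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple pipe delimiter and made-up field names; the function name is hypothetical.

```python
def load_records(lines, delimiter):
    """Turn delimited lines (header row first) into a list of records."""
    rows = [
        line.rstrip("\r\n").split(delimiter)
        for line in lines
        if line.rstrip("\r\n")  # skip empty lines
    ]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    records = []
    for number, row in enumerate(data, start=2):
        # Every row must supply one value per field named in the header.
        if len(row) != len(header):
            raise ValueError(
                f"row {number}: expected {len(header)} values, found {len(row)}"
            )
        records.append(dict(zip(header, row)))
    return records

sample = [
    "BEGDOC|ENDDOC|AUTHOR\n",
    "ABC0001|ABC0003|Smith\n",
    "ABC0004|ABC0004|Jones\n",
]
records = load_records(sample, "|")
print(records[0]["AUTHOR"])
```

The length check is worth keeping: a row with too few or too many values usually means a delimiter is missing or a stray one slipped into the data.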

As you can see, you might need to read this article several times for it to sink in. I promise there is a rhyme and reason to all of this. If you think I did a pretty good job of making sense of it, let me know in the comment area below.

Our goal:  We should be able to look at a file extension and know, for the most part, what the file will contain.
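That goal can be written down as a simple lookup. The Python sketch below maps each extension discussed in this article to what we would expect to find inside. The table and function names are hypothetical, and the mapping only holds when files are named the way this article recommends.

```python
from pathlib import PurePath

# Expected contents by file extension, as described in this article.
EXPECTED_CONTENTS = {
    ".dat": "Concordance database load file",
    ".txt": "Summation database load file, or searchable text (OCR or extracted)",
    ".dii": "Summation image viewer load file",
    ".lfp": "IPRO image viewer load file for a Concordance database",
    ".opt": "Concordance Image (Opticon) image viewer load file",
    ".log": "Concordance Image (Opticon) image viewer load file",
    ".csv": "comma-delimited text file",
}

def expected_contents(filename):
    """Guess what a file holds from its extension alone."""
    extension = PurePath(filename).suffix.lower()
    return EXPECTED_CONTENTS.get(
        extension, "unknown; open it in a text editor to check"
    )

print(expected_contents("VOL001.DAT"))
print(expected_contents("VOL001.DII"))
```

The .TXT entry shows why the terminology is confusing: the same extension covers two different kinds of content, so it is the one case where opening the file is still necessary.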

 

    • Philip Hallquist

      Hi Amy: A VERY good post. It’s your most valuable and informative yet. Thanks very much!

      • Anonymous

        Thanks Philip, I know the newbies really want to learn the nitty gritty.  As it turns out, much of this site’s audience is made up of people already in litigation support looking for additional guidance.  As I’ve mentioned before, a litigation support professional only knows what they have been exposed to. Hopefully they will learn something from this site right alongside the newbies.  I appreciate your feedback.  It is rewarding to know when the audience is learning.  That’s why I’m here.

    • Jere Wilson

      Hello Amy – I only recently “found” your blog.  That is partly due to your being much more active in various LinkedIn groups, but also due to the quality and value of your articles (which leads to MANY groups re-posting your blogs!).

      I admire what you are doing, and appreciate the quality and the value you bring to your blog.

      The above blog on text/load file basics is right on.  As you point out, some of us have learned to do things in a particular way, and don’t often have the time to explore other, sometimes more productive ways of doing things.  So never worry about us long-timers getting bored or insulted – ain’t gonna happen!

      Keep up the good work! 

      • Anonymous

        Thanks so much for the feedback, Jere. I really appreciate you taking the time to comment. I have been planning to launch this site for over 5 years and as you can imagine, I’ve had trouble finding enough hours in the day between work, life, and Women in eDiscovery. But I promised myself that 2011 is the year.

        I have been touched by some of the positive feedback I have received from seasoned people in our industry and on a daily basis I meet someone who wants to join us. How cool is that? It really is a great career path that most don’t even know exists.

    • laura

      This is excellent. Thanks!

      • LitSuppGuru

        Thanks for the feedback, Laura. It means a lot.

    • DigitalCrimeInv

      Hi Amy .. excellent article … 

      • LitSuppGuru

        Thanks. Glad you enjoyed it. Keep in touch.

    • DigitalCrimeInv

      Hi Amy … want to do an article on “Image Disguised As A Picture”? … Discuss the numerous ways we use the term “image” … That term can be very confusing to those new to E-Discovery and Information System Forensics.

      • LitSuppGuru

        LOL, that’s a great idea! Thanks for the suggestion. Send me an email with your name and I will give you credit.

        Similar to the military using acronyms for everything, we tend to use terms that slightly overlap or appear duplicative but they are not.

    • Roland Jones

      I remember the first (and last) time this happened to me, and the vendor acted like I was crazy because I didn't see that the DAT file was in fact a text file. My own supervisor, who was the head of IT, didn't even know but felt it necessary to yell at me for it. Needless to say, it was the last time this ever happened, and I left that firm years later with a valuable lesson.

      • LitSuppGuru

        Yup, we all learn on-the-job in litigation support and this is a particularly confusing thing for newbies to learn at first.

    • LitSuppGuru

      Agreed. Yes, I do have experience with Ringtail.

    • Trina Adams

      Ammmmmmyyyyyyyyyy. 

      This is very well put. I have been in litigation support for 15 years. Ten years ago I mostly trained some of my peers on unitization, coding, scanning, QCing, and committing batches. I became an Analyst 6 years ago and am now a Project Manager. I trained an operator in 2012 before I resigned from my job due to my 5-month-old. I trained a good friend of mine who was very excited to learn, since no one wanted to show him anything. So I took it upon myself and made cheat sheets and screenshots as I processed and built out load files and converted from IPRO and Concordance. What he had a hard time understanding was all the TEXT load files (LFP/DAT/TXT/DII/CSV/OPT, etc.), as he felt many contained the same information, and he didn't understand why so many were created and why we use a conversion tool on some to get other load files. To make a long story short, I wish I had read this 3 months ago to get a quicker breakdown. I did get the breakdown to my newbie, but it took him 1 month to understand.
      All he knew was scanning, QC, and repairing machines, so I didn't blame him. He was very shocked to learn that there is so much more to be done after scanning, coding, and QC. I tell him that is ONLY THE BEGINNING…

      AMY, thank you… Looking forward to reading more of your blogs/articles.

      • LitSuppGuru

        Thanks so much for sharing your experience, Trina. That’s pretty cool that you helped show your friend litigation support tasks.

    • Imran

      Hello Amy… the article is well framed and it cleared most of the doubts I had.

      I still have a couple of doubts to clarify. Could you please provide your email address so I can get in touch with you?

    • tope philips

      Amy this is perfect, look forward to more KB.