Electronic Discovery – Indexing

By Amy Bowser-Rollins 07/14/2017 08/17/2021

In a previous article on the topic of electronic discovery, I began the conversation about a term we use called “processing”. As I mentioned in that article, the term “processing” encompasses many steps, which can make it one of the most difficult topics to teach to newbies in litigation support.

Some of the “processing” steps that I've covered so far are DeNIST, Deduplication, Embedded Objects, Exceptions, Password Protected Files, and Time Zone.

Indexing is another step in the “processing” stage and it is one of the steps that can take some time to complete, depending on the volume of data that needs to be indexed.

If an attorney is anxious (read: impatient) about the turnaround time to process a new set of electronic data, I usually explain to them that there are many steps in the process and, more specifically, we will not be able to search against the document database until the indexing step has been completed. If they are interested in hearing more, I explain what indexing is, and how it improves database searches.

In simple terms, the indexing process reads every word in every document and then generates a list of those words, sorted in alphanumeric order, along with the documents in which each word appears. This makes information retrieval faster and more accurate when we perform a full-text search across all of the documents in the database.

A database index is comparable to an index at the end of a book or to a concordance (index) at the end of a transcript.

For example, let's say we need to search for the term “maintenance”. If there is no index, the search will take much longer as it reads one document after another, looking for the term we want. Alternatively, if an index exists, the search will be much faster because it can jump right to the term within the sorted list of words and the index already knows everywhere that term exists.
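
To make the comparison concrete, here is a minimal Python sketch of the idea. It is not how dtSearch or any other product actually works internally; the document IDs, the sample text, and the function names are all invented for illustration. It builds a simple word-to-documents index and then runs the “maintenance” search both ways.

```python
import re
from collections import defaultdict

# Hypothetical sample documents; the IDs and text are invented for illustration.
documents = {
    "DOC-001": "The maintenance log was updated on Monday.",
    "DOC-002": "Invoice for elevator repair and inspection.",
    "DOC-003": "Scheduled maintenance of the elevator is overdue.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search_without_index(docs, term):
    """No index: read one document after another, looking for the term."""
    return sorted(doc_id for doc_id, text in docs.items() if term in tokenize(text))

def search_with_index(index, term):
    """With an index: jump straight to the term and read off its documents."""
    return sorted(index.get(term, set()))

index = build_index(documents)

# The words in the index, in sorted order (what a book index would list).
print(sorted(index))

# Both searches return the same documents; only the amount of work differs.
print(search_without_index(documents, "maintenance"))
print(search_with_index(index, "maintenance"))
```

Both searches find the same two documents. The difference is that the first one has to read all of the text every time, while the second one does that reading once, up front, when the index is built.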

Keep in mind that if an index already exists and we are adding new electronic data, we will need to perform a re-index so that all of the new words can be added, as well as sorted, along with all the previous words in the pre-existing index. Additionally, if documents are removed from a database, we also perform a re-index.
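
The need for a re-index can be shown with another small Python sketch. Again, this is only an illustration under my own assumptions: the documents and names are invented, and the sketch simply rebuilds the whole index from scratch, which may not be how your particular database tool handles updates.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

# Hypothetical database with one document already loaded and indexed.
database = {"DOC-001": "Maintenance log for the elevator."}
index = build_index(database)

# New data is loaded, but the index has not been rebuilt yet.
database["DOC-002"] = "Deposition transcript of the building manager."
stale_hits = sorted(index.get("deposition", set()))  # index is out of date

# Re-index: the new words are now included and searchable.
index = build_index(database)
fresh_hits = sorted(index.get("deposition", set()))

# A document is removed; until we re-index, the index still points to it.
del database["DOC-001"]
before_reindex = sorted(index.get("maintenance", set()))
index = build_index(database)
after_reindex = sorted(index.get("maintenance", set()))

print(stale_hits, fresh_hits)
print(before_reindex, after_reindex)
```

Until the re-index runs, a search misses the newly loaded document and still reports the removed one. That is the practical reason the legal team has to wait for this step.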

Now, there is much more to this topic, and the details depend on which search engine is being used to create the index. Some of the search engines you might hear about from a software provider are dtSearch, Lucene, SQL Server and Elasticsearch. There are also different types of indices and indexing techniques, such as an “inverted index” and “latent semantic indexing”. In addition, many of our document databases have multiple indices that are used in different types of targeted searches.

But I want you to focus on understanding that indexing is one of the electronic discovery “processing” steps, and I want you to be able to explain to an attorney how indexing improves the legal team's database searching.

Don't forget – if you're trying to search your database and you're getting frustrated with the results, you might want to learn about Stop Words.

2 Comments

mgolab says:

07/15/2017 at 4:52 pm

Very good. I like the paragraph that starts with “If an attorney is anxious”, as it gives me hope that somewhere out there is a world where lawyers/attorneys aren’t anxious and actually take what you say at face value, and don’t keep pestering you to tell them how long it ‘really takes’, i.e., they appear to operate under the assumption that we’re outright lying about how long things take. After all, you just click a button, right!?

1. Amy Bowser-Rollins says:
  
  07/16/2017 at 5:23 pm
  
  Hey Matthew – That’s why I’m here, to give you some hope. Ha! You described the reality of our role, for sure.
  