Electronic Discovery – Keyword Searches


After electronic discovery has been “processed” and “indexed”, the legal team can perform “keyword searches”. There is much debate in our industry about how effective and accurate keyword searches are at isolating the subsets of data to be reviewed and analyzed prior to production. Even so, the legal team performs numerous keyword searches throughout a litigation matter in an effort to locate specific documents related to the case.

I believe there is an art to keyword searches.

It is a best practice to learn some basic syntax (format or pattern). The syntax will differ slightly depending on the document database software.

Here are some techniques you can research to learn more. I suggest running test searches in any database you have access to, including Google.

  1. Less is more. Try using as few keywords as possible in your initial searches.
  2. Add quotes to search for phrases. For example: “keyword searches” vs. keyword searches. These will yield different results.
  3. Use Boolean operators. For example: OR, AND, NOT.
  4. Avoid using noise words or stop words.
  5. Be careful using wildcards (will*) and avoid using wildcards at the beginning of a keyword.
  6. Test several levels of fuzzy searching.
  7. Find out how your document database software handles a space or a special character (&, !, #).
  8. Take advantage of proximity searches (board w/2 game).
  9. Use parentheses to clarify which part of the search is evaluated first. For example: ((Jim OR James) w/2 Wood).
  10. Avoid searching for company names or acronyms that are repeated over and over in document footers.
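
To make tips 2, 8 and 9 more concrete, here is a minimal Python sketch of how an unquoted search, a quoted phrase search, and a proximity search can return different results on the same text. It does not show how any particular document database works. The function names are my own, and I am assuming that "w/2" means "within two words, in either direction". Your software may count differently, so check its documentation.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has_all_keywords(text, keywords):
    """Unquoted search: every keyword appears somewhere, in any order."""
    tokens = set(tokenize(text))
    return all(k.lower() in tokens for k in keywords)

def has_phrase(text, phrase):
    """Quoted search: the words appear next to each other, in order."""
    tokens = tokenize(text)
    words = tokenize(phrase)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))

def within(text, left_terms, right_terms, distance):
    """Proximity search (assumed meaning of w/N): any left term is
    at most `distance` words away from any right term."""
    tokens = tokenize(text)
    left = [i for i, t in enumerate(tokens) if t in left_terms]
    right = [i for i, t in enumerate(tokens) if t in right_terms]
    return any(abs(i - j) <= distance for i in left for j in right)

doc_a = "The team ran keyword searches before the review."
doc_b = "Searches by keyword were run before the review."
doc_c = "Please forward this to James T. Wood today."

# keyword searches (no quotes) matches both documents
print(has_all_keywords(doc_a, ["keyword", "searches"]))
print(has_all_keywords(doc_b, ["keyword", "searches"]))

# "keyword searches" (quoted) matches only the first one
print(has_phrase(doc_a, "keyword searches"))
print(has_phrase(doc_b, "keyword searches"))

# ((Jim OR James) w/2 Wood): the OR group is resolved first,
# then the distance to "wood" is checked
print(within(doc_c, {"jim", "james"}, {"wood"}), 2) if False else print(within(doc_c, {"jim", "james"}, {"wood"}, 2))
```

In the last example the middle initial sits between "James" and "Wood", so a quoted search for “James Wood” would miss the document while the proximity search still finds it.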

To learn more about search syntax, check out this Fast Tip Friday tutorial I recorded entitled Google Tips for Efficient Legal Research.

If you are asking your service provider to perform keyword searches, you should expect to receive an Excel report that displays five to ten columns of totals per search term. Some of the columns might be:

Total Hits

Total Unique Hits (Documents where the term is the only hit)

Total Documents

Total Family Documents

Total Excel Files

Custodian

Date Range
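
If you are curious how a few of these totals might be calculated, here is a rough Python sketch. The document IDs, family IDs, and field names are made up for illustration. I am also assuming that "Total Family Documents" means every document in any family that contains a hit. Service providers define these columns differently, so ask yours for their definitions.

```python
import re
from collections import Counter

def term_report(documents, terms):
    """Build one row of totals per search term.

    documents: list of dicts with hypothetical keys "id", "family", "text".
    terms: list of single-word search terms (case-insensitive).
    """
    terms = [t.lower() for t in terms]

    # For each document, record how many times each term occurs in it.
    hits_per_doc = {}
    for doc in documents:
        counts = Counter(re.findall(r"[a-z0-9']+", doc["text"].lower()))
        hits_per_doc[doc["id"]] = {t: counts[t] for t in terms if counts[t] > 0}

    family_of = {doc["id"]: doc["family"] for doc in documents}
    family_sizes = Counter(family_of.values())

    rows = []
    for term in terms:
        docs_with_term = [d for d, hits in hits_per_doc.items() if term in hits]
        # Unique hits: documents where this term is the only term that hit.
        unique_docs = [d for d in docs_with_term if len(hits_per_doc[d]) == 1]
        families = {family_of[d] for d in docs_with_term}
        rows.append({
            "term": term,
            "total_hits": sum(hits_per_doc[d][term] for d in docs_with_term),
            "total_documents": len(docs_with_term),
            "total_unique_hits": len(unique_docs),
            "total_family_documents": sum(family_sizes[f] for f in families),
        })
    return rows

# Made-up sample data: DOC-001 and DOC-002 belong to the same family.
documents = [
    {"id": "DOC-001", "family": "FAM-1", "text": "The merger contract is attached."},
    {"id": "DOC-002", "family": "FAM-1", "text": "Contract draft. Contract terms follow."},
    {"id": "DOC-003", "family": "FAM-2", "text": "Lunch on Friday?"},
    {"id": "DOC-004", "family": "FAM-3", "text": "Merger timeline for the board."},
]

report = term_report(documents, ["contract", "merger"])
for row in report:
    print(row)
```

Because each term gets its own row, you can see at a glance which term is pulling in the most documents and which documents would drop out if a term were removed.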

The best way to become proficient at keyword searches is to practice. It is a good feeling to be the person on the legal team who finds the “right” documents that assist the attorneys in practicing law.

 


    • mgolab

      Very good, Amy. Some further considerations:
      a) Consider the email footer text for your client and for the other side, and specifically whether any of your keywords will have hits in the footer.
      b) Case-sensitive searches: does the search engine treat everything as the same case?
      c) Non-alphabetic characters, i.e. numbers and symbols: can the search engine search these, or does it exclude them?
      d) Proximity searches: does the proximity number count characters or words? For example, in "within 20", what does 20 mean?
      e) Split your searches into separate, granular searches rather than grouping many terms together. This can be tedious to administer (although remarkably your care factor on this diminishes dramatically if someone else is conducting the search); however, the benefit is that you can determine which search terms are returning the largest results.
      f) Ask whether the search engine has a word wheel function, where you can test a search term against the corpus to see its variants.

      Matthew

      • Thank you for the additions, Matthew. The article could be a novel considering all the tips we could offer from experience, right?

        My favorite is E, and I should write a separate article on F because I teach my students how valuable it can be.