In litigation cases, one of the primary reasons for creating document databases is to be able to search across all of the documents. Paralegals, attorneys and litigation support professionals will all perform database searches. Service providers will also perform searches at the request of their clients.
When searches are conducted, the search results will almost always include what we refer to as “false hits” or “false positives”. The search results contain every hit on the search terms within any document, but some of those hits are not relevant to the search request. Such a hit is accurate in a literal sense, but the context in which the search term appears is not the one the searcher had in mind.
For example, if a search term is the word “bond” and the search results include the two sentences below, one of them would most likely be a “false hit”.
A bond that offers no interest rate on its face, but that allows investors to convert to stock if the stock price reaches a level higher than its current price on the open market.
The bond has strength because the adhesive is hard enough to resist flow when stress is applied to the bond.
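The effect can be reproduced with a minimal Python sketch. It assumes the two sentences above are the only documents in the collection, and the helper name keyword_hits is hypothetical, not a function from any review platform. It only illustrates why a single-word search cannot tell the two meanings apart.

```python
import re

# The two example sentences, treated here as two separate documents.
documents = [
    "A bond that offers no interest rate on its face, but that allows "
    "investors to convert to stock if the stock price reaches a level "
    "higher than its current price on the open market.",
    "The bond has strength because the adhesive is hard enough to resist "
    "flow when stress is applied to the bond.",
]

def keyword_hits(term, docs):
    """Return the index of every document containing the term as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [i for i, text in enumerate(docs) if pattern.search(text)]

# Both documents are returned: the financial bond and the adhesive bond.
hits = keyword_hits("bond", documents)
print(hits)
```

Both documents come back as hits, and nothing in the search itself indicates which one is relevant to the request.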
There are best practices for running these searches that reduce the volume of “false hits”. There are too many to discuss in this article, but two at the top of the list would be to search for phrases or by proximity instead of for individual words, where it makes sense. A phrase search uses quotes, such as “false hit”, instead of a search for false or hit. In this example, the search will only return hits where the word false comes immediately before the word hit. A proximity search might look like false w/2 hit, which translates to the word false within 2 words of the word hit.
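The difference between the two approaches can be sketched in a few lines of Python. This is an illustration only, not how any particular review platform implements its search. The function names are hypothetical, and the proximity check assumes that w/2 is unordered and means the two words sit at most 2 word positions apart. Real platforms differ on both points, so check the documentation for the tool in use.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def phrase_match(text, phrase):
    """True if the words of the phrase appear consecutively, in order."""
    words = tokenize(text)
    target = tokenize(phrase)
    if not target:
        return False
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def proximity_match(text, first, second, distance):
    """True if the two words occur within `distance` positions of each other.

    Assumption: order does not matter, and adjacent words are 1 apart.
    """
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= distance
               for i in first_positions for j in second_positions)

samples = [
    "The reviewer flagged a false hit in the results.",
    "The hit was false in context.",
    "A false statement about the biggest hit of the year.",
]

for text in samples:
    print(phrase_match(text, "false hit"),
          proximity_match(text, "false", "hit", 2),
          text)
```

Under these assumptions, the first sample satisfies both searches, the second satisfies only the proximity search, and the third satisfies neither, even though all three would be returned by a plain search for false or hit.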