Thursday, January 23, 2014

Real-Time Predictive Coding Added in Clustify 4.0

Hot Neuron LLC announces version 4.0 of its Clustify™ software, the first technology-assisted review tool to offer real-time predictive coding.

Predictive coding is a machine learning technique where software learns to predict appropriate issue codes, tags, or categories for documents based on examples provided by a human reviewer, significantly reducing the time and expense of the document review phase of e-discovery.  The real-time predictive coding capability added in Clustify 4.0 updates the predicted relevance scores for the entire document population each time a document is reviewed, showing the impact on the progress pie and the precision-recall curve instantly.  The software also warns the user immediately, while the user's reasoning about the document is still fresh, if the tags applied to a document seem inconsistent with the tags the reviewer applied to other documents, helping to avoid errors.
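To make the idea of updating after every reviewed document concrete, here is a minimal sketch in Python. It is not Clustify's algorithm, which this announcement does not describe. It assumes a toy online logistic regression over word counts, and every name in it (OnlineRelevanceModel, review, rescore, warn_threshold) is hypothetical. Each call to review applies one reviewer decision, flags the decision if it strongly contradicts what the model had learned from earlier decisions, and then updates the model so the whole population can be rescored right away.

```python
import math
from collections import Counter


class OnlineRelevanceModel:
    """Toy online logistic regression over bag-of-words counts (illustrative only)."""

    def __init__(self, learning_rate=0.5):
        self.weights = Counter()  # word -> weight; unseen words count as 0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def _features(self, text):
        return Counter(text.lower().split())

    def score(self, text):
        """Predicted probability that the document is relevant."""
        z = self.bias + sum(self.weights[w] * c for w, c in self._features(text).items())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def review(self, text, relevant, warn_threshold=0.9):
        """Apply one reviewer decision; return True if it looks inconsistent."""
        predicted = self.score(text)
        label = 1.0 if relevant else 0.0
        # Flag the tag if it strongly disagrees with the prediction made
        # from earlier decisions (hypothetical threshold).
        inconsistent = abs(label - predicted) > warn_threshold
        # Single gradient step on this one document.
        error = label - predicted
        for word, count in self._features(text).items():
            self.weights[word] += self.learning_rate * error * count
        self.bias += self.learning_rate * error
        return inconsistent

    def rescore(self, population):
        """Recompute relevance scores for every document in the population."""
        return [self.score(doc) for doc in population]


model = OnlineRelevanceModel()
population = ["contract pricing terms", "friday lunch plans"]
model.review("merger contract pricing", relevant=True)
model.review("lunch menu friday", relevant=False)
scores = model.rescore(population)  # refreshed after every single review
```

A production system would need a far faster rescoring step than this loop to reach the response times reported for Clustify; the sketch only shows the review, warn, and rescore cycle.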

Clustify 4.0 offers powerful sampling capabilities.  It allows both random and judgmental sampling when choosing documents to train the algorithm.  It offers several different active learning algorithms that suggest training documents for review that will help the system to learn efficiently.  It also allows the user to specify a diversity level when choosing training documents to ensure that no training documents are too similar to documents that have already been reviewed.
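The combination of active learning and a diversity level can be illustrated with a short Python sketch. The announcement does not say which active learning algorithms Clustify uses, so this example assumes uncertainty sampling (suggest the documents whose predicted score is closest to 0.5) and uses Jaccard similarity on word sets as a stand-in similarity measure. The function names and the max_similarity parameter are hypothetical. As an extra assumption, the sketch also keeps the suggested documents dissimilar from one another, not only from documents already reviewed.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two documents."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def suggest_training_docs(candidates, scores, reviewed, k=2, max_similarity=0.5):
    """Pick up to k candidates nearest score 0.5, skipping near-duplicates.

    max_similarity plays the role of a diversity level: a lower value
    forces the suggestions to differ more from reviewed documents.
    """
    # Most uncertain first: smallest distance from a score of 0.5.
    ranked = sorted(zip(candidates, scores), key=lambda pair: abs(pair[1] - 0.5))
    chosen = []
    for doc, _ in ranked:
        if len(chosen) >= k:
            break
        # Compare against reviewed documents and earlier suggestions.
        others = list(reviewed) + chosen
        if all(jaccard(doc, other) <= max_similarity for other in others):
            chosen.append(doc)
    return chosen


candidates = [
    "contract pricing terms",
    "contract pricing terms draft",
    "holiday party schedule",
    "lunch menu friday",
]
scores = [0.52, 0.49, 0.45, 0.05]
reviewed = ["lunch menu today"]
suggestions = suggest_training_docs(candidates, scores, reviewed)
```

In this example the two contract documents are near-duplicates, so only the more uncertain of the pair is suggested and the second slot goes to a different document.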

"This is really the next generation of predictive coding," according to Hot Neuron CEO Bill Dimm.  "The software doesn't get in your way.  You can review whatever documents you want, and the software shows you the progress you are making every time you review a training document without needing to wait until you've completed a batch.  It's like a teacher having a continuous one-on-one interaction with a student, rather than lecturing a student for an entire semester and finding out whether he/she learned anything via the final exam."

In a test run on a modest desktop computer with 1.3 million documents totaling 3.3 gigabytes of text, Clustify took an average of a tenth of a second to update the relevance scores for the entire population each time a training document was reviewed.  Speed will depend on the details of the document set.

Clustify 4.0 will be unveiled at the LegalTech trade show in New York on February 4-6, 2014.  A video preview is available at:

By Guest Blogger: Clustify / Hot Neuron LLC