Problems
Classify each document correctly along four dimensions, including case type and winner.
The documents have undergone OCR and are machine-readable. They can be written in multiple languages and come from a variety of courts and legal systems across the world.
Solution
A model factory that trains one model per combination of language, court, and classification dimension, so that each model is fine-tuned to its specific combination of legal system, language, etc.
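To make the "one model per combination" idea concrete, here is a minimal sketch. It assumes that training rows carry a language, a court, and a dimension name; the names `build_models` and `MajorityClassifier` are hypothetical, and the majority-vote model is only a stand-in for whatever model the factory actually trains.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed key: one model per (language, court, dimension) combination.
ModelKey = Tuple[str, str, str]
Example = Tuple[str, str]  # (document text, label)


class MajorityClassifier:
    """Placeholder model: always predicts the most frequent training label."""

    def fit(self, examples: List[Example]) -> "MajorityClassifier":
        counts = Counter(label for _, label in examples)
        self.label = counts.most_common(1)[0][0]
        return self

    def predict(self, text: str) -> str:
        return self.label


def build_models(
    rows: Iterable[Tuple[str, str, str, str, str]]
) -> Dict[ModelKey, MajorityClassifier]:
    """Group rows of (language, court, dimension, text, label) and fit one model per group."""
    grouped: Dict[ModelKey, List[Example]] = defaultdict(list)
    for language, court, dimension, text, label in rows:
        grouped[(language, court, dimension)].append((text, label))
    return {key: MajorityClassifier().fit(examples) for key, examples in grouped.items()}


# Made-up illustration data, not taken from the project.
rows = [
    ("de", "court_a", "case_type", "text 1", "civil"),
    ("de", "court_a", "case_type", "text 2", "civil"),
    ("de", "court_a", "case_type", "text 3", "criminal"),
    ("en", "court_b", "winner", "text 4", "plaintiff"),
]
models = build_models(rows)
```

At prediction time, a document would be routed to the model stored under its own (language, court, dimension) key.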
Technical Highlights
The model factory automatically optimizes the hyperparameters of the whole ML pipeline, estimates performance and its stability, adjusts the manual-review flag thresholds so that a minimum performance level is met, trains the model, and prepares it for deployment.
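One way the manual-review threshold adjustment could work is sketched below. It is an assumption, not the project's documented method: it supposes each validation prediction has a confidence score and a correctness flag, and it picks the lowest confidence cutoff at which the automatically accepted predictions still reach a target accuracy. The names `pick_review_threshold` and `needs_review` are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_review_threshold(
    scored: List[Tuple[float, bool]], min_accuracy: float
) -> Optional[float]:
    """Return the lowest cutoff t such that predictions with confidence >= t
    reach min_accuracy on the validation set, or None if no cutoff does.

    scored: (confidence, is_correct) pairs from a held-out validation set.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    best = None
    correct = 0
    for i, (confidence, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        # Evaluate only at the end of a group of equal confidences,
        # so the cutoff includes or excludes tied predictions together.
        group_ends = i == len(ranked) or ranked[i][0] != confidence
        if group_ends and correct / i >= min_accuracy:
            best = confidence
    return best


def needs_review(confidence: float, threshold: Optional[float]) -> bool:
    """Flag a prediction for manual review if it falls below the cutoff.
    With no valid cutoff, everything is flagged."""
    return threshold is None or confidence < threshold


# Made-up validation results, for illustration only.
validation = [
    (0.95, True), (0.90, True), (0.80, True),
    (0.70, False), (0.60, True), (0.50, False),
]
threshold = pick_review_threshold(validation, min_accuracy=0.9)
```

With this data, predictions at or above the chosen cutoff are accepted automatically and the rest go to manual review. A lower accuracy target would push the cutoff down and reduce the review workload.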