Natural Language Processing in Law

As an outcome of the TÜBİTAK 1001 project “Natural Language Processing and Machine Learning Techniques for Applications of Artificial Intelligence in Legal Texts”, KocLab is pleased to announce that a journal paper has been published in Information Processing & Management. The paper, “Natural Language Processing in Law: Prediction of outcomes in the Higher Courts of Turkey”, is a pioneering work that addresses the Turkish legal system comprehensively and develops Natural Language Processing (NLP) techniques for the legal system of the Republic of Turkey.

The problem of predicting the decisions of the Turkish higher courts (Constitutional Court, Civil Court of Appeal, Criminal Court of Appeal, Administrative Court of Appeal and Court of Appeal on Taxation) is tackled using various Machine Learning (ML) algorithms. The decisions of the courts are predicted using only fact descriptions. Several algorithms are developed specifically for the Turkish legal system, which is codified in Turkish: Decision Trees (DTs), Random Forests (RFs), Support Vector Machines (SVMs) and three state-of-the-art Deep Learning (DL) algorithms (Gated Recurrent Units (GRUs), Long Short-Term Memory networks (LSTMs) and bidirectional LSTMs (BiLSTMs)). An attention mechanism specific to legal texts has also been integrated into the DL algorithms.
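To give a rough idea of what an attention mechanism contributes to a recurrent model, the sketch below pools a sequence of hidden states into a single vector using softmax-normalized dot-product scores. This announcement does not describe the paper's legal-text-specific attention design, so this is a generic illustration only: the function names, the toy hidden states and the context vector are all hypothetical, and in a real model the hidden states would come from a GRU, LSTM or BiLSTM and the context vector would be learned.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Generic dot-product attention pooling (not the paper's exact design).

    hidden_states: one vector per token, e.g. recurrent layer outputs.
    context: a query vector (hypothetical here; learned in a real model).
    Returns the attention weights and the weighted sum of hidden states.
    """
    # One relevance score per token: dot product with the context vector.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(context)
    # Weighted average of the hidden states, one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Toy input: three token representations of dimension 2 (made-up values).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
```

The pooled vector would then be passed to a classification layer that outputs the predicted ruling; the weights indicate which parts of the fact description the model relied on most.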

This study is broader than similar works in the literature on other legal systems and languages, both in the number of courts investigated and in the number of algorithms used. It also discusses some practical applications, as well as the legal and ethical implications of using such machine-based predictive systems. Hence, it not only pioneers NLP and AI research for our country’s legal system but also provides a reference point and baseline for further studies in this field.

The paper can be accessed here.

Abstract:

Natural language processing (NLP) based approaches have recently received attention for legal systems of several countries. It is of interest to study the wide variety of legal systems that have so far not received any attention. In particular, for the legal system of the Republic of Turkey, codified in Turkish, no works have been published. We first review the state-of-the-art of NLP in law, and then study the problem of predicting verdicts for several different courts, using several different algorithms. This study is much broader than earlier studies in the number of different courts and the variety of algorithms it includes. Therefore it provides a reference point and baseline for further studies in this area. We further hope the scope and systematic nature of this study can set a framework that can be applied to the study of other legal systems. We present novel results on predicting the rulings of the Turkish Constitutional Court and Courts of Appeal, using only fact descriptions, and without seeing the actual rulings. The methods that are utilized are based on Decision Trees (DTs), Random Forests (RFs), Support Vector Machines (SVMs) and state-of-the-art deep learning (DL) methods; specifically Gated Recurrent Units (GRUs), Long Short-Term Memory networks (LSTMs) and bidirectional LSTMs (BiLSTMs), with the integration of an attention mechanism for each model. The prediction results for all algorithms are given in a comparative and detailed manner. We demonstrate that outcomes of the courts of Turkish legal system can be predicted with high accuracy, especially with deep learning based methods. The presented results exhibit similar performance to earlier work in the literature for other languages and legal systems.