Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective

PeerJ Computer Science
An amicus curiae (friend of the court) is a person or organisation that offers testimony before the Court in the context of a particular case without being a formal party to the proceedings.
ECtHR provisional annual report for the year 2015: http://www.echr.coe.int/Documents/Annual_report_2015_ENG.pdf.
HUDOC ECHR Database: http://hudoc.echr.coe.int/.
Nonetheless, not all cases that pass this first admissibility stage are decided in the same way. While the individual judge’s decision on admissibility is final and does not comprise the obligation to provide reasons, a Committee deciding a case may, by unanimous vote, declare the application admissible and render a judgment on its merits, if the legal issue raised by the application is covered by well-established case-law by the Court.
Rules of ECtHR, http://www.echr.coe.int/Documents/Rules_Court_ENG.pdf.
The data set is publicly available for download from https://figshare.com/s/6f7d9e7c375ff0822564.
Note that all the cases used as examples in this section are taken from the data set we used to perform the experiments.

Main article text

 

Introduction

Materials and Methods

European Court of Human Rights

Case processing by the court

Main premise

Case structure

  • Procedure: This section contains the procedure followed before the Court, from the lodging of the individual application until the judgment was handed down.

  • The facts: This section comprises all material which is not considered as belonging to points of law, i.e., legal arguments. It is important to stress that the facts in the above sense do not just refer to actions and events that happened in the past as these have been formulated by the Court, giving rise to an alleged violation of a Convention article. The ‘Facts’ section is divided into the following subsections:

    • The circumstances of the case: This subsection has to do with the factual background of the case and the procedure (typically) followed before domestic courts before the application was lodged with the Court. This is the part that contains materials relevant to the individual applicant’s story in its dealings with the respondent state’s authorities. It comprises a recounting of all actions and events that have allegedly given rise to a violation of the ECHR. With respect to this subsection, a number of crucial clarifications and caveats should be stressed. To begin with, the text of the ‘Circumstances’ subsection has been formulated by the Court itself. As a result, it should not always be understood as a neutral mirroring of the factual background of the case. The choices made by the Court when it comes to formulations of the facts incorporate implicit or explicit judgments to the effect that some facts are more relevant than others. This leaves open the possibility that the formulations used by the Court may be tailor-made to fit a specific preferred outcome. We openly acknowledge this possibility, but we believe that there are several ways in which it is mitigated. First, the ECtHR has limited fact-finding powers and, in the vast majority of cases, it defers, when summarizing the factual background of a case, to the judgments of domestic courts that have already heard and dismissed the applicants’ ECHR-related complaint (Leach, Paraskeva & Uelac, 2010; Leach, 2013). While domestic courts do not necessarily hear complaints on the same legal issues as the ECtHR does, by virtue of the incorporation of the Convention by all States Parties (Helfer, 2008), they typically have powers to issue judgments on ECHR-related issues. Domestic judgments may also reflect assumptions about the relevance of various events, but they also provide formulations of the facts that have been validated by more than one decision-maker.
Second, the Court cannot openly acknowledge any kind of bias on its part. This means that, on their face, summaries of facts found in the ‘Circumstances’ section have to be at least framed in as neutral and impartial a way as possible. As a result, for example, clear displays of partiality, such as failing to mention certain crucial events, seem rather improbable. Third, a cursory examination of many ECtHR cases indicates that, in the vast majority of cases, parties do not seem to dispute the facts themselves, as contained in the ‘Circumstances’ subsection, but only their legal significance (i.e., whether a violation took place or not, given those facts). As a result, the ‘Circumstances’ subsection contains formulations on which, in the vast majority of cases, disputing parties agree. Last, we hasten to add that the above three kinds of considerations do not logically entail that other forms of non-outright or indirect bias in the formulation of facts are impossible. However, they suggest that, in the absence of access to other kinds of textual data, such as lodged applications and briefs, the ‘Circumstances’ subsection can reasonably perform the function of a (sometimes crude) proxy for a textual representation of the factual background of a case.

    • Relevant law: This subsection of the judgment contains all legal provisions other than the articles of the Convention that can be relevant to deciding the case. These are mostly provisions of domestic law, but the Court also frequently invokes other pertinent international or European treaties and materials.

  • The law: The law section considers the merits of the case, through the use of legal argument. Depending on the number of issues raised by each application, the section is further divided into subsections that examine individually each alleged violation of some Convention article (see below). However, the Court in most cases refrains from examining all such alleged violations in detail. Insofar as the same claims can be made by invoking more than one article of the Convention, the Court frequently decides only those that are central to the arguments made. Moreover, the Court frequently refrains from deciding on an alleged violation of an article, if it overlaps sufficiently with some other violation it has already decided on.

    • Alleged violation of article x: Each subsection of the judgment examining alleged violations in depth is divided into two sub-sections. The first one contains the Parties’ Submissions. The second one comprises the arguments made by the Court itself on the Merits.

      • Parties’ submissions: The Parties’ Submissions typically summarise the main arguments made by the applicant and the respondent state. Since in the vast majority of cases the material facts are taken for granted, having been authoritatively established by domestic courts, this part has almost exclusively to do with the legal arguments used by the parties.

      • Merits: This subsection provides the legal reasons that purport to justify the specific outcome reached by the Court. Typically, the Court places its reasoning within a wider set of rules, principles and doctrines that have already been established in its past case-law and attempts to ground the decision by reference to these. It is to be expected, then, that this subsection refers almost exclusively to legal arguments, sometimes mingled with bits of factual information repeated from previous parts.

  • Operative provisions: This is the section where the Court announces the outcome of the case, which is a decision to the effect that a violation of some Convention article either did or did not take place. Sometimes it is coupled with a decision on the division of legal costs and, much more rarely, with an indication of interim measures, under Rule 39 of the Rules of Court.
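The section layout described above can be turned into a simple text splitter. The following is only a minimal sketch: the heading patterns in `SECTION_HEADINGS`, the function name `split_sections` and the toy judgment text are all assumptions made for illustration, and real judgments vary in wording and numbering, so the patterns would need checking against actual HUDOC documents.

```python
import re

# Assumed heading patterns (hypothetical); real judgments may word these differently.
SECTION_HEADINGS = {
    "procedure": r"^PROCEDURE[ \t]*$",
    "facts": r"^THE FACTS[ \t]*$",
    "law": r"^THE LAW[ \t]*$",
    "operative": r"^FOR THESE REASONS, THE COURT\b.*$",
}


def split_sections(text):
    """Return {section_name: body_text} for every assumed heading found in text."""
    hits = []
    for name, pattern in SECTION_HEADINGS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            hits.append((match.start(), match.end(), name))
    hits.sort()  # order sections by where their heading appears in the text

    sections = {}
    for i, (_, end, name) in enumerate(hits):
        # A section body runs from the end of its heading to the next heading.
        next_start = hits[i + 1][0] if i + 1 < len(hits) else len(text)
        sections[name] = text[end:next_start].strip()
    return sections


# Toy text invented for this example; it is not taken from any real judgment.
sample = """PROCEDURE
1. The case originated in an application lodged with the Court.
THE FACTS
2. The applicant was born in 1970.
THE LAW
3. The applicant complained under Article 6.
FOR THESE REASONS, THE COURT UNANIMOUSLY
Holds that there has been a violation.
"""

sections = split_sections(sample)
for name, body in sections.items():
    print(name, "->", body)
```

The same approach could be applied one level down to separate the ‘Circumstances’ and ‘Relevant Law’ subsections, again subject to the actual headings used.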

Data

Description of textual features

  • N-gram features: The Bag-of-Words (BOW) model (Salton, Wong & Yang, 1975; Salton & McGill, 1986) is a popular semantic representation of text used in NLP and Information Retrieval. In a BOW model, a document (or any text) is represented as the bag (multiset) of its words (unigrams) or N-grams, without taking into account grammar, syntax and word order. This results in a vector space representation where documents are represented as m-dimensional vectors over a set of m N-grams. N-gram features have been shown to be effective in various supervised learning tasks (Bamman, Eisenstein & Schnoebelen, 2014; Lampos & Cristianini, 2012). For each set of cases in our data set, we compute the top-2000 most frequent N-grams where N ∈ {1, 2, 3, 4}. Each feature represents the normalized frequency of a particular N-gram in a case or a section of a case. This can be considered as a feature matrix, C ∈ ℝ^(c×m), where c is the number of cases and m = 2,000. We extract N-gram features for the Procedure (Procedure), Circumstances (Circumstances), Facts (Facts), Relevant Law (Relevant Law), Law (Law) and the Full case (Full) respectively. Note that the representation of the Facts is obtained by taking the mean vector of Circumstances and Relevant Law. In a similar way, the representation of the Full case is computed by taking the mean vector of all of its sub-parts.

  • Topics: We create topics for each article by clustering together N-grams that are semantically similar, leveraging the distributional hypothesis, which suggests that similar words appear in similar contexts. We thus use the C feature matrix (see above), which is a distributional representation (Turney & Pantel, 2010) of the N-grams given the case as the context; each column vector of the matrix represents an N-gram. Using this vector representation of N-grams, we compute N-gram similarity using the cosine metric and create an N-gram by N-gram similarity matrix. We finally apply spectral clustering (von Luxburg, 2007), which performs graph partitioning on the similarity matrix, to obtain 30 clusters of N-grams. For Articles 6 and 8, we use the Article 3 data for selecting the number of clusters T, where T ∈ {10, 20, …, 100}, while for Article 3 we use Article 8. Given that the obtained topics are hard clusters, an N-gram can only be part of a single topic. A representation of a cluster is derived by looking at the most frequent N-grams it contains. The main advantages of using topics (sets of N-grams) instead of single N-grams are that it reduces the dimensionality of the feature space, which is essential for feature selection, that it limits overfitting to training data (Lampos et al., 2014; Preoţiuc-Pietro, Lampos & Aletras, 2015; Preoţiuc-Pietro et al., 2015), and that it provides a more concise semantic representation.
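The two feature types above can be illustrated with a small, standard-library-only sketch. It builds the feature matrix C from toy texts, averages section vectors, and computes the N-gram by N-gram cosine similarity matrix. Several things here are assumptions rather than details given in the text: whitespace tokenisation, normalising each count by the total number of N-grams in the document, and all function names. The spectral clustering step itself is not shown; the similarity matrix produced here is what such a routine would take as input.

```python
import math
from collections import Counter


def ngrams(tokens, max_n=4):
    """Yield every N-gram (space-joined) for N = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def feature_matrix(docs, top_k=2000):
    """Return (vocab, C): C[i][j] is the normalized frequency of N-gram j in doc i."""
    counts = [Counter(ngrams(doc.lower().split())) for doc in docs]
    total = Counter()
    for doc_counts in counts:
        total.update(doc_counts)
    vocab = [gram for gram, _ in total.most_common(top_k)]
    matrix = []
    for doc_counts in counts:
        # Assumed normalization: divide by the document's total N-gram count.
        doc_total = sum(doc_counts.values()) or 1
        matrix.append([doc_counts[gram] / doc_total for gram in vocab])
    return vocab, matrix


def mean_vector(*vectors):
    """Element-wise mean, e.g. Facts = mean(Circumstances, Relevant Law)."""
    return [sum(values) / len(values) for values in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ngram_similarity(matrix):
    """Cosine similarity between the columns of C, i.e. between N-grams."""
    columns = list(zip(*matrix))
    return [[cosine(u, v) for v in columns] for u in columns]


# Toy documents invented for this example.
docs = [
    "the applicant was detained",
    "the applicant was released",
    "the court dismissed the appeal",
]
vocab, C = feature_matrix(docs)
S = ngram_similarity(C)
facts = mean_vector([1.0, 0.0], [0.0, 1.0])
print(len(vocab), "N-grams;", len(C), "cases")
```

Averaging section vectors only makes sense when both sections are represented over the same vocabulary, which this sketch takes for granted.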

Classification model

The problem of predicting the decisions of the ECtHR is defined as a binary classification task. Our goal is to predict whether, in the context of a particular case, there is a violation or non-violation in relation to a specific Article of the Convention. For that purpose, we use each set of textual features, i.e., N-grams and topics, to train Support Vector Machine (SVM) classifiers (Vapnik, 1998). An SVM is a machine learning algorithm that has shown particularly good results in text classification, especially using small data sets (Joachims, 2002; Wang & Manning, 2012). We employ a linear kernel since that allows us to identify important features that are indicative of each class by looking at the weight learned for each feature (Chang & Lin, 2008). We label all the violation cases as +1, while no violation is denoted by −1. Therefore, features assigned positive weights are more indicative of violation, while features assigned negative weights are more indicative of no violation.
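To make the link between weight sign and class concrete, here is a minimal sketch of a linear SVM trained on toy data. It is not the solver used in the experiments: it swaps in plain batch subgradient descent on the L2-regularized hinge loss so that the example needs no external library. The function names, the hyperparameters (`lam`, `epochs`, `lr`), the feature names and the toy data are all assumptions made for illustration.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimize (lam / 2) * ||w||^2 + mean hinge loss by batch subgradient descent."""
    n, m = len(X), len(X[0])
    w = [0.0] * m
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for j in range(m):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (violation) or -1 (no violation) from the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data invented for this example: two hypothetical features, four cases.
feature_names = ["ngram_a", "ngram_b"]
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y = [1, 1, -1, -1]  # +1 = violation, -1 = no violation

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
for name, weight in zip(feature_names, w):
    label = "violation" if weight > 0 else "no violation"
    print(f"{name}: weight {weight:+.3f} -> more indicative of {label}")
```

Because the kernel is linear, each learned weight attaches to exactly one feature, which is what allows features to be ranked by how strongly they indicate one class or the other.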

Results and Discussion

Predictive accuracy

Discussion

Legal formalism and realism

Topic analysis

Conclusions

Additional Information and Declarations

Competing Interests

Nikolaos Aletras is an employee of Amazon.com, Cambridge, UK, but the work was completed while at University College London.

Author Contributions

Nikolaos Aletras and Vasileios Lampos conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.

Dimitrios Tsarapatsanis conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Daniel Preoţiuc-Pietro conceived and designed the experiments, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Data Availability

The following information was supplied regarding data availability:

ECHR dataset: https://figshare.com/s/6f7d9e7c375ff0822564.

Funding

DPP received funding from Templeton Religion Trust (https://www.templeton.org) grant number: TRT-0048. VL received funding from Engineering and Physical Sciences Research Council (http://www.epsrc.ac.uk) grant number: EP/K031953/1. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

 