Legal NLP Releases Law Stack Exchange Classifier, Subpoena NER and more

09.08.2023

David Cecchini

Data Scientist at John Snow Labs

The latest version of Legal NLP comes with a new classification model on Law Stack Exchange questions and Named-Entity Recognition on Subpoenas, which will raise the efficiency of NLP for contracts.

Law Stack Exchange Classifier

This new model is a text classifier trained on the Law Stack Exchange dataset. It can predict the topic of the question into one of the pre-selected law-related categories:

business
constitutional law
contract law
copyright
criminal law
employment
liability
privacy
tax law
trademark

Please note that some categories from the original data were removed (e.g., “internet”) due to limited examples. To use the model, simply download it from Spark NLP Models Hub on your pipeline:

sequenceClassifier = (
    legal.BertForSequenceClassification.pretrained(
        "legclf_law_stack_exchange", "en", "legal/models"
    )
    .setInputCols(["document", "token"])
    .setOutputCol("class")
)

With the model, questions can be categorized. For example, the following text is categorized by the model as belonging to the copyright category.

I have been helping a nonprofit by developing a piece of software that they needed. The software is more-or-less built to their specs in a ‘functional’ way, but I wrote 100% of the code: they are not programmers. Anyhow, we didn’t make any kind of contract at the beginning verbally or otherwise. Who owns the copyright to all of this? Do they have any rights to it at all for providing ‘ideas’?

Subpoena NER

This model is trained on an in-house dataset to identify entities in Subpoena documents. The list of entities that the model can identify is the following:

STATE
COUNTY
COURT
CASE
RECEIVER
ADDRESS
APPOINTMENT_DATE
APPOINTMENT_HOUR
DOCUMENT_TYPE
DOCUMENT_PERSON
DOCUMENT_DATE_FROM
DOCUMENT_DATE_TO
SUBPOENA_DATE
SIGNER
COURT_ADDRESS
MATTER
MATTER_VS
DOCUMENT_TOPIC
DOCUMENT_DATE_YEAR
DEADLINE_DATE

To use the model, you can simply download it from Spark NLP Models Hub in your pipeline:

ner = legal.NerModel.pretrained()

Then it can be used to identify the entities in new subpoena documents.

Example of identified entities

Fancy trying?

We’ve got 30-days free licenses for you with technical support from our legal team of technical and SME. This trial includes complete access to more than 926 models, including Classification, NER, Relation Extraction, Similarity Search, Summarization, Sentiment Analysis, Question Answering, etc. and 120+ legal language models.

Just go to https://www.johnsnowlabs.com/install/ and follow the instructions!

Don’t forget to check our notebooks and demos.

How to run

Legal NLP is extremely easy to run on both clusters and driver-only environments using johnsnowlabs library:

! pip install johnsnowlabs

from johnsnowlabs import nlp

nlp.install(force_browser=True)

# Start Spark Session
spark = nlp.start()

# Import the Legal NLP module
from johnsnowlabs import legal

For alternative installation methods of how to install in specific environments, please check the docs.

How useful was this post?

David Cecchini

Data Scientist at John Snow Labs

Our additional expert:

Ph.D. at Tsinghua-Berkeley Shenzhen Institute | Data Scientist

Legal NLP releases new Contract NLI model and more

David Cecchini

The latest 1.16.0 version of Legal NLP releases a new Contract NLI model (NLP for contracts) and added new demonstration apps for...

Legal NLP Releases Law Stack Exchange Classifier, Subpoena NER and more

Law Stack Exchange Classifier

Subpoena NER

Fancy trying?

How to run

Legal NLP releases new Contract NLI model and more

Recommended For You