I have a corpus of a few hundred thousand legal documents (mostly from the European Union): laws, commentary, court documents, etc. I am trying to make some sense of them algorithmically.
I have modeled the known relationships (temporal, this-changes-that, etc.). But at the single-document level, I wish I had better tools for fast comprehension. I am open to ideas, but here's a more specific question:
For example: are there NLP methods to determine the relevant/controversial parts of documents as opposed to boilerplate? The recently leaked TTIP papers are thousands of pages with data tables, but one sentence somewhere in there may destroy an industry.
I have played around with Google's new Parsey McParface and other NLP solutions in the past, but while they work impressively well, I am not sure how good they are at isolating meaning.
NLP-powered legal search engines can translate plain language into “legalese,” making it easier to sift through relevant documents and cases. More advanced NLP programs can search for concepts, not just specific keywords, helping lawyers find what they need faster.
NLP and text mining differ in the goal for which they are used. NLP is used to understand human language by analyzing text, speech, or grammatical syntax. Text mining is used to extract information from unstructured and structured content. It focuses on structure rather than the meaning of content.
... Contract review is one of the main commercial applications of NLP in the legal domain (Dale, 2019).
Email filters are one of the most basic and initial applications of NLP online. It started out with spam filters, uncovering certain words or phrases that signal a spam message.
To make sense of documents, you need to perform some sort of semantic analysis. There are two main options, each with an example implementation:
Use Frame Semantics: http://www.cs.cmu.edu/~ark/SEMAFOR/
Use Semantic Role Labeling (SRL): http://cogcomp.org/page/demo_view/srl
Once you are able to extract information from the documents, you can apply some post-processing to determine which information is relevant. What counts as relevant depends on the task, and I don't think you can find a generic tool that extracts "the relevant" information.
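To give a rough idea of what such post-processing could look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Frame` class is a hypothetical container and not the actual output format of SEMAFOR or the SRL demo, the `NORMATIVE_PREDICATES` list is one made-up, task-specific choice, and the sample sentences are invented. It only shows the idea of filtering extracted predicate-argument structures by a task-specific rule.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One predicate-argument structure (hypothetical format, not a real parser's output)."""
    sentence: str                               # sentence the frame was extracted from
    predicate: str                              # lemma of the verb or frame-evoking word
    roles: dict = field(default_factory=dict)   # role label -> text span


# Task-specific assumption: predicates that tend to signal obligations,
# prohibitions, or exemptions in legal text.
NORMATIVE_PREDICATES = {"prohibit", "require", "ban", "oblige", "permit", "exempt"}


def relevant_frames(frames, predicates=NORMATIVE_PREDICATES, watch_terms=()):
    """Keep frames whose predicate is in `predicates` and, when `watch_terms`
    is non-empty, whose role fillers mention at least one of those terms."""
    terms = [t.lower() for t in watch_terms]
    kept = []
    for frame in frames:
        if frame.predicate.lower() not in predicates:
            continue
        role_text = " ".join(frame.roles.values()).lower()
        if terms and not any(t in role_text for t in terms):
            continue
        kept.append(frame)
    return kept


# Invented example frames, standing in for what a parser might have extracted.
frames = [
    Frame("The Parties shall meet annually.", "meet",
          {"A0": "The Parties"}),
    Frame("Member States shall prohibit the import of untreated timber.", "prohibit",
          {"A0": "Member States", "A1": "the import of untreated timber"}),
    Frame("The Commission may exempt small producers.", "exempt",
          {"A0": "The Commission", "A1": "small producers"}),
]

hits = relevant_frames(frames, watch_terms=["timber"])
for frame in hits:
    print(frame.sentence)
```

The predicate list and the watch terms are exactly the task-related part: someone tracking the timber industry would choose different values than someone reviewing data-protection clauses, which is why a generic "relevance" tool is hard to imagine.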