LANGUAGE.AI

Natural Language Processing

Information Extraction

Named Entity Recognition

Document Classification

Machine Learning

Unlock the Value of Your Data

Data Transformations

Training Data Crowdsourcing

Machine Learning and Ad-hoc Rule-based Solutions

In-house and Cloud-based Software

Data Quality Assurance

services

Is your data piling up but never used? Do you need data analysis and mining, but your data is in free-text form? Are you relying on data entry and manual processes to structure and export your data? Do you need to organize and search through your documents?

DATA CLEANSING AND TRANSFORMATIONS

We work with free-form or semi-structured textual data. Do you have a variety of document formats, some of them scans and images? No worries, we use state-of-the-art OCR and document transformation solutions.

INFORMATION EXTRACTION

You need to use, analyze, and make predictions based on structured data, but all you have is free-form text? Our speciality is Information Extraction, the science of automatically extracting structured information from unstructured text.
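In its simplest rule-based form, extracting structured information from free text can be sketched as below. The patterns, the field names (`dates`, `amounts`), and the sample sentence are assumptions made for illustration only; they do not describe any particular production system.

```python
import re

# Hypothetical patterns chosen for this example only.
DATE_RE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")           # ISO-style dates
AMOUNT_RE = re.compile(r"\$(\d+(?:,\d{3})*(?:\.\d{2})?)")      # dollar amounts


def extract_record(text):
    """Return a dict of structured fields found in free-form text."""
    dates = ["-".join(m.groups()) for m in DATE_RE.finditer(text)]
    amounts = [float(m.group(1).replace(",", "")) for m in AMOUNT_RE.finditer(text)]
    return {"dates": dates, "amounts": amounts}


record = extract_record("Invoice issued on 2016-03-14 for $1,250.00, due 2016-04-13.")
print(record)
```

Hand-written rules like these are easy to audit but cover only the formats they anticipate, which is one reason rule-based and machine learning approaches are often combined.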

MACHINE LEARNING AND AD-HOC RULE-BASED SOLUTIONS

We value common sense and efficiency above complicated research solutions. After analyzing your dataset and problem, we will suggest the most efficient approach: rule-based, machine learning, or a combination of the two.

CLOUD-BASED AND IN-HOUSE SOFTWARE

We offer one-time text processing, SaaS solutions with data model updates, or in-house software for private and sensitive data.

DATA ANALYSIS AND PERFORMANCE ESTIMATES

Some problems are easier than others. Before we embark on a solution, we analyze your data and create a scientific performance estimation model. Statistics don't lie: you will know what to expect.

DOCUMENT CLASSIFICATION

We use algorithmic document classification techniques and provide solutions for automatic document categorization for electronic discovery and routing, sentiment analysis, email routing, and spam filtering.

PERFORMANCE METRICS AND DOCUMENTATION

We provide various performance measures on a statistically representative sample of your data. Both rule-based and machine learning solutions are thoroughly documented.
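Precision, recall, and F1 are among the measures commonly reported for classification and extraction tasks. The sketch below computes them on a small labelled sample; the function name, the labels, and the toy data are hypothetical and serve only as an illustration.

```python
def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall, and F1 for one class on a labelled sample."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy sample: gold labels versus system output (illustrative data only).
gold = ["spam", "ham", "spam", "spam", "ham", "ham"]
pred = ["spam", "spam", "spam", "ham", "ham", "ham"]
precision, recall, f1 = precision_recall_f1(gold, pred, positive="spam")
print(precision, recall, f1)
```

How trustworthy such numbers are depends on how the sample was drawn, which is why the sample needs to be statistically representative of the full dataset.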

DATA QA AND CROWDSOURCING

Manual data cleansing, quality assurance, and training data generation are provided via managed crowdsourcing or in-house personnel.

As a Relativity Developer Partner, we offer development of custom Text Analytics / Machine Learning applications and scripts built on kCura’s Relativity eDiscovery platform.

Sample applications / scripts include:

  • Privileged document discovery
  • Contract review: identifying contract clauses of interest, semantic document comparison
  • Identification of sentences/paragraphs within a document that pertain to an issue
  • High-accuracy Machine Learning-based Document Classification, complementing the Relativity Assisted Review functionality
  • Machine Learning-based highlights and redactions
  • Intelligent Concept Discovery (including synonyms, spelling variations, typos, and excluding negated concepts)
  • Integrated Document Annotation Interface
  • Generation of Custom Reports, Document Samples, and Batches
  • Intelligent Deduplication
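To give a flavor of what deduplication beyond exact matching can involve, the sketch below flags near-duplicate documents by comparing overlapping word sequences (shingles) with Jaccard similarity. This is one common textbook technique, not a description of the applications listed above; the function names, shingle size, threshold, and sample documents are assumptions chosen for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicates(docs, threshold=0.5):
    """Return index pairs of documents whose shingle sets overlap enough."""
    sets = [shingles(d) for d in docs]
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if jaccard(sets[i], sets[j]) >= threshold
    ]


docs = [
    "Please review the attached contract before Friday",
    "Please review the attached contract before Monday",
    "Lunch menu for the week is posted",
]
pairs = near_duplicates(docs)
print(pairs)
```

The pairwise comparison is quadratic in the number of documents, so large collections typically need an approximate scheme such as MinHash with locality-sensitive hashing.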
Years NLP has been around
Zettabytes of data generated by 2020
Percent of unstructured data in organizations
Percent growth in NLP job postings

about

We offer consulting services around small, medium, and big textual data. Our specialties are Natural Language Processing, Machine Learning, and Information Extraction. We are a team of NLP researchers (PhDs), experienced polyglot software engineers, and data QA specialists.

NLP and ML

Our infrastructure of NLP and ML tools allows us to quickly build both prototypes and production-ready applications. We have built solutions using established NLP frameworks, as well as a variety of open-source and in-house ML algorithm implementations.

Software Development

We are full-stack developers and polyglots and have developed projects using Java, Python, Scala, Clojure, and C#.

Out-of-the-box Solutions

We are familiar with most available third-party NLP solutions. When applicable, we evaluate the performance of existing solutions.

Data Transformations

We extract and clean text from a variety of document formats (e.g. PDF, HTML, RTF, Word, Images and Scans). We have built an infrastructure for manual data cleansing and training data generation.