John Snow Labs’ Large Language Models and AWS Marketplace

28.07.2023

David Talby

Chief technology officer at John Snow Labs

Medical Large Language Models (LLMs)

In recent years, Large Language Models (LLMs) have revolutionized various industries through their ability to process and generate human-like text. In healthcare, LLMs hold immense potential for a wide range of applications, from clinical note summarization to medical data de-identification. John Snow Labs has introduced its Healthcare NLP library and a suite of healthcare-specific LLMs, offering industry-leading accuracy and privacy. You can check out our Healthcare NLP Medical Language Models here: https://www.johnsnowlabs.com/healthcare-nlp/

Accuracy: John Snow Labs’ benchmarking results reveal a significant leap in accuracy when compared to general-purpose LLMs like BART, Flan-T5, Pegasus, ChatGPT, and GPT-4. The findings demonstrate that John Snow Labs’ models outperform these state-of-the-art LLMs in crucial tasks such as clinical note summarization, clinical entity recognition, medical data de-identification, and extracting ICD-10-CM codes. For instance, the extraction of ICD-10-CM codes achieves an impressive 76% success rate, which is a substantial improvement over GPT-3.5 (26%) and GPT-4 (36%). These results highlight the exceptional accuracy that healthcare-specific LLMs bring to clinical and biomedical text mining.

Privacy and Control: Instead of relying on cloud APIs and sharing sensitive data with external companies, these healthcare-specific LLMs can be deployed within an organization’s infrastructure. The models operate behind the organization’s firewall and under their security controls, ensuring that no text is ever transmitted to third-party or cloud services. This approach safeguards both the privacy and intellectual property of healthcare providers, allowing them to harness the power of LLMs while maintaining full control over their data.

Customization and Adaptability: Our models can be trained and fine-tuned on the organization’s specific data and tasks, encompassing patient data, clinical guidelines, clinical trial protocols, and biomedical research. This customization enables healthcare providers to unlock the full potential of LLMs in answering medical questions, understanding research findings, generating clinical text, and summarizing clinical encounters. Additionally, the models support various content types, including structured data, unstructured text, PDF documents, and custom data formats, offering a comprehensive solution for healthcare text analysis.

Real-time Updates and Scalability: John Snow Labs’ LLM platform ensures that the information provided remains up-to-date and relevant. Organizations can update documents in near-real-time without the need for retraining the entire system, guaranteeing that users receive the latest answers based on the most recent data. Furthermore, the platform offers scalability, allowing organizations to process millions or billions of documents. This flexibility makes it suitable for healthcare providers with varying volumes of data and diverse use cases, ensuring the models can handle their specific requirements.

Medical Chatbot

John Snow Labs’ Medical Chatbot, powered by healthcare-specific Large Language Models (LLMs), offers healthcare organizations a comprehensive and secure conversational healthcare generative AI tool tailored to their specific needs. With robust security and privacy controls, scalability, and the ability to provide accurate and explainable answers, this AI Healthcare Chatbot is transforming healthcare conversations.

To learn about the features of Generative AI in Healthcare and to try the state-of-the-art Generative AI for Healthcare solution, see: https://www.johnsnowlabs.com/generative-ai-in-healthcare/

John Snow Labs’ Medical Chatbot is an advanced conversational AI platform designed for high-compliance healthcare environments. It leverages Healthcare NLP, Visual NLP, and Healthcare LLMs to preprocess documents, understand user queries, and provide accurate answers in natural language. The platform excels in smart information extraction, allowing it to extract key facts from complex documents. This streamlines access to essential data, ensuring users receive comprehensive and relevant answers. Additionally, the Medical Chatbot incorporates pre-built knowledge bases, including medical terminologies and biomedical research, provided by John Snow Labs. These knowledge bases are regularly updated, guaranteeing access to the latest medical knowledge.

The platform operates within a private, single-tenant environment, ensuring that no data is shared with any third-party APIs. It incorporates a suite of security and privacy controls, including role-based access, AD/LDAP integration, audit trails, and LLM safety checks. By running in a self-contained Kubernetes environment, the platform offers scalability, accommodating varying content volumes and user demands. These security measures provide healthcare organizations with peace of mind, knowing that sensitive patient data remains confidential and protected.

Unlike general-purpose LLMs, the Medical Chatbot avoids the risk of hallucinations by design. Instead of directly generating answers, the healthcare-specific LLM is employed to form queries to a set of medical knowledge bases. The platform supports the addition of project-specific knowledge bases, where answers are restricted to the content within those documents. It enables scaling the knowledge bases to handle a large number of documents, even reaching into the millions. Documents can be updated regularly, ensuring that the chatbot provides up-to-date information in near real-time.
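The idea of answering only from a knowledge base can be illustrated with a minimal Python sketch. It is not the Medical Chatbot's implementation: the KnowledgeBase class, the answer function, the sample policy text, and the keyword-overlap search (a stand-in for the LLM-formed queries described above) are all hypothetical. It shows two properties from the paragraph above: answers are restricted to stored documents, and an updated document is usable immediately without retraining.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class KnowledgeBase:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    documents: dict[str, str] = field(default_factory=dict)

    def upsert(self, doc_id: str, text: str) -> None:
        # Adding or replacing a document takes effect on the next query;
        # nothing is retrained.
        self.documents[doc_id] = text

    def search(self, query: str, min_overlap: int = 2) -> list[tuple[str, str]]:
        """Return (doc_id, text) pairs that share enough words with the query."""
        query_tokens = tokenize(query)
        scored = []
        for doc_id, text in self.documents.items():
            overlap = len(query_tokens & tokenize(text))
            if overlap >= min_overlap:
                scored.append((overlap, doc_id, text))
        # Highest overlap first; ties broken by document id for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [(doc_id, text) for _, doc_id, text in scored]


def answer(kb: KnowledgeBase, question: str) -> dict:
    """Answer only from retrieved documents; decline when nothing matches."""
    hits = kb.search(question)
    if not hits:
        return {"answer": None, "sources": []}
    doc_id, text = hits[0]
    return {"answer": text, "sources": [doc_id]}


if __name__ == "__main__":
    kb = KnowledgeBase()
    # Made-up sample text, not a real clinical policy.
    kb.upsert("policy-1", "Sample policy: discharge summaries are reviewed within 24 hours.")
    print(answer(kb, "When are discharge summaries reviewed?"))
    print(answer(kb, "Who won the football match?"))
```

Returning the source document id alongside the answer is one simple way to keep answers traceable; a production system would use a far stronger retrieval method than word overlap.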

Introduction to John Snow Labs on Amazon Web Services (AWS) Marketplace

Our software is now available on Amazon Web Services (AWS), which allows users to access our software on a pay-as-you-go basis. This means that you can use our software without the need for any upfront capital expenditures, licenses, or contracts. With this option, you only pay for the computing resources that you actually use, so you can adjust your usage based on your business requirements. Using our software on AWS can help you reduce your overall IT costs.

Solutions on the AWS marketplace

https://aws.amazon.com/marketplace/seller-profile?id=961e2d20-005b-4aba-a82b-6fb560567d01&ref=dtl_prodview-kpac4xtqkxuqu

Healthcare NLP: This product offers state-of-the-art Natural Language Processing (NLP) libraries and Python notebooks that are specifically tuned for the Healthcare domain. The software includes licensed software & models for text mining, deep learning, and visual model training, tuning, and testing. It is available for use at prices ranging from $1.86 to $253.56/hr, in addition to AWS usage fees.
NLP Lab: This is a free end-to-end, no-code platform for data labeling and deep learning model training. It lets you extract data from PDFs, text documents, or images and train models that will automatically predict those facts on new documents.
Finance and Legal NLP: Similar to the Healthcare NLP, this product offers NLP libraries and Python notebooks, but these are specifically tuned for the Finance and Legal domains. The pricing for this product is the same as for the Healthcare NLP, ranging from $1.86 to $253.56/hr plus AWS usage fees.
NLP Libraries Prepaid: This product offers NLP libraries and Python notebooks, including licensed software & models for text mining, deep learning, and visual model training, tuning, and testing.

When you subscribe through the AWS Marketplace, you deal directly with AWS, and all billing and technical questions must be directed to them. A free trial of our NLP Library is also available. You use your own AWS account, AWS charges you for any usage, and you receive a single invoice from them.

Note: Free Trials will automatically convert to a paid subscription upon expiration and you will be charged for additional usage above the free units provided.
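As a rough illustration of how the hourly charges add up, the sketch below combines the software rate from a Marketplace listing with the AWS infrastructure rate. The function name, the infrastructure rate of $0.50/hr, and the assumption of simple per-hour billing are hypothetical; only the $1.86/hr software rate comes from the listing range quoted above. Check the product page for the actual billing terms.

```python
def estimate_cost(hours: float, software_rate: float, infra_rate: float) -> float:
    """Estimated charge in USD: (software + infrastructure hourly rate) * hours."""
    if hours < 0 or software_rate < 0 or infra_rate < 0:
        raise ValueError("hours and rates must be non-negative")
    return round(hours * (software_rate + infra_rate), 2)


if __name__ == "__main__":
    # 1.86 is the lowest software rate quoted in the listing;
    # 0.50 is a hypothetical placeholder for the AWS instance rate.
    print(estimate_cost(hours=40, software_rate=1.86, infra_rate=0.50))
```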

Installing John Snow Labs NLP Libraries via AWS Marketplace

The libraries provided by John Snow Labs are accessible via the AWS Marketplace. You need an AWS account to access them. Once you have an account, you can follow these general steps:

Navigate to the AWS Marketplace.
Search for “John Snow Labs” in the search bar or click here: https://aws.amazon.com/marketplace/seller-profile?id=961e2d20-005b-4aba-a82b-6fb560567d01&ref=dtl_prodview-kpac4xtqkxuqu
Click on the product you’re interested in. You’ll be taken to a page with more details about the product.
On this page, you’ll see an option to “Continue to Subscribe” or similar. Click on this button.
Review the pricing details and the terms of the subscription, then click “Continue to Configuration.”
After subscribing, you’ll see various options to configure the software. Choose your fulfillment options, software version and region.
Review your settings and then click “Launch.”
You’ll now have access to the software.

Additional AWS installation resources

Installing John Snow Labs NLP Libraries via AWS Marketplace: A Step-by-Step Tutorial

Setup the NLP Lab on AWS Marketplace

NLP Server deployment on AWS

Important User Information

Please note that the above is a general guideline and the exact steps may vary slightly based on the specific library and AWS’s interface at the time of your access.

After you have launched the software, it will be accessible through the AWS Management Console, Command Line Interface (CLI), or Software Development Kit (SDK) depending on the specific product and its configurations.

Each library will come with its own specific instructions for use, so you should read the product’s documentation to learn exactly how to use it. Some products might include Jupyter notebooks, Python packages, or other software that you can use directly in your own applications.

Keep in mind that usage of these libraries will incur charges based on the pricing outlined on the product’s page on the AWS Marketplace, in addition to any AWS infrastructure costs associated with running the software (like compute, storage, and data transfer costs).

No sensitive information related to you or your organization will be managed or stored by the John Snow Labs NLP library on the instance(s) where you deploy the product.

It is the user’s responsibility to safely manage sensitive documents processed using the John Snow Labs NLP libraries. All document processing is done locally and documents are not copied nor shared outside of the running instances.

On instance startup, navigate to http://<PUBLIC_IP_OF_INSTANCE> for information about the included libraries and links to ready-to-use notebooks for the most popular tasks. Click on one of the library boxes to navigate to your John Snow Labs – NLP Jupyter server and run the notebooks on your data.

The password for connecting to Jupyter Lab is the instance id. To run one of the notebooks, just click on it to open it and then click Run.
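The two access details above (the landing page at the instance's public IP, and the instance id as the Jupyter Lab password) can be derived from an EC2 instance description. The following is a minimal sketch, assuming the product was launched as an EC2 instance; the function names are hypothetical, and describe_instance needs the boto3 package and valid AWS credentials.

```python
def jupyter_access(instance: dict) -> dict:
    """Derive the landing page URL and Jupyter password from an EC2 instance description."""
    public_ip = instance.get("PublicIpAddress")
    if public_ip is None:
        raise ValueError("instance has no public IP address")
    # Per the instructions above, the password is the instance id.
    return {"url": f"http://{public_ip}", "password": instance["InstanceId"]}


def describe_instance(instance_id: str, region: str) -> dict:
    """Fetch one instance description with boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so jupyter_access works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]


if __name__ == "__main__":
    # Placeholder values from the documentation address range, not a real instance.
    sample = {"InstanceId": "i-0123456789abcdef0", "PublicIpAddress": "203.0.113.10"}
    print(jupyter_access(sample))
```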

Prepaid NLP Libraries on AWS Marketplace

Offered as a prepaid subscription, this library allows businesses to better predict and control their costs, and eliminates pay-as-you-go microtransactions. The listing of prepaid NLP Libraries on the AWS Marketplace reflects a commitment to ease of access, user convenience, and seamless implementation. With the prepaid model, you pay upfront for a specific time period, typically a year, so you know your costs in advance and budget planning becomes simpler and more reliable. For intensive projects with higher usage, pay-as-you-go can sometimes lead to unexpectedly high costs. With prepaid libraries, you make a single payment and have access to the tools and resources you need for the duration of the period.
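One way to compare the two models is to compute the usage level at which a prepaid subscription costs the same as pay-as-you-go. The sketch below is only an illustration: the function name and both prices are hypothetical, since the prepaid price is not stated here, and AWS infrastructure fees are ignored because they apply in both cases.

```python
def break_even_hours(prepaid_price: float, hourly_rate: float) -> float:
    """Hours of use at which pay-as-you-go software fees equal the prepaid price."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return prepaid_price / hourly_rate


if __name__ == "__main__":
    # Hypothetical figures: a 10,000 USD prepaid term versus 2.00 USD per hour.
    hours = break_even_hours(prepaid_price=10_000, hourly_rate=2.0)
    print(f"Prepaid is cheaper above {hours:.0f} hours of use in the term")
```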

AWS Architecture

Any structured or unstructured data can be ingested into AWS in a format of choice and processed by AWS Glue, Amazon Kinesis, or a third-party tool of choice.
All data is staged in an Amazon S3 bucket for further consumption. The S3 bucket serves as a data source for Spark NLP as well as for the lakehouse layers.
Customers can choose between a managed environment (e.g. Amazon SageMaker, Databricks on AWS, or Amazon EMR) and the AWS Marketplace offerings to run Spark NLP. Both options require compute and storage, which AWS charges for, and both require a customer license, which is how we monetize the software.
NLP Libraries are deployed on a VM of choice, which AWS charges for. The VM enables the compute to interact with NLP Libraries Prepaid. The NLP Server works in a way similar to API interactions.
NLP Lab is a free environment, deployed on a VM of choice (which AWS charges for), that enables low-code/no-code NLP. We only monetize it if a customer uses our proprietary models with their license.
NLP Libraries are deployed on a VM of choice, which AWS charges for. This service gives customers code-level access to Spark NLP and more.
Unstructured text data from the S3 bucket is cleaned with Spark NLP's available tasks to prepare the text for proper processing. Common tasks are splitting, cleaning, grammar understanding, and document classification.
Unstructured text data from the S3 bucket is analyzed with Spark NLP's available tasks to retrieve information, e.g. entities or summaries. Common tasks are named entity recognition, translation, summarization, and information extraction.
Unstructured text data from the S3 bucket is further enhanced with Spark NLP's available tasks to create additional business value for downstream engines, e.g. search, graphs, or industry models. A common task here is relationship creation/extraction, to understand the linguistic relationships between entities.
Customers can consume any output through the Marketplace or a cloud service. A common pattern for QuickSight tools or web apps is to write the data back to a storage location and then consume it from there to decrease runtimes.
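The three text-processing stages in this architecture (clean, analyze, enhance) can be sketched as a small staged pipeline. The code below is plain Python and does not use Spark NLP: the function names, the tiny term dictionary standing in for a trained entity recognizer, and the co-occurrence rule standing in for relation extraction are all hypothetical simplifications meant only to show how the output of each stage feeds the next.

```python
import re
from itertools import combinations

# Hypothetical term dictionary standing in for a trained entity recognizer.
GAZETTEER = {"aspirin": "DRUG", "ibuprofen": "DRUG", "headache": "SYMPTOM"}


def clean(text: str) -> list:
    """Stage 1 (clean): normalize whitespace and split the text into sentences."""
    normalized = re.sub(r"\s+", " ", text).strip()
    return [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]


def analyze(sentence: str) -> list:
    """Stage 2 (analyze): tag words found in the term dictionary."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [(word, GAZETTEER[word]) for word in words if word in GAZETTEER]


def enhance(entities: list) -> list:
    """Stage 3 (enhance): pair entities that co-occur in the same sentence."""
    return list(combinations(entities, 2))


def run(text: str) -> list:
    """Run all three stages and return one record per sentence."""
    records = []
    for sentence in clean(text):
        entities = analyze(sentence)
        records.append(
            {"sentence": sentence, "entities": entities, "relations": enhance(entities)}
        )
    return records


if __name__ == "__main__":
    for record in run("Aspirin  relieved the headache.\nNo other findings."):
        print(record)
```

In the architecture described above, the records produced by the last stage would be written back to storage for downstream tools rather than printed.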

Our additional expert:

David Talby is a chief technology officer at John Snow Labs, helping healthcare & life science companies put AI to good use. David is the creator of Spark NLP – the world’s most widely used natural language processing library in the enterprise. He has extensive experience building and running web-scale software platforms and teams – in startups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK. David holds a PhD in computer science and master’s degrees in both computer science and business administration.
