Analyzing a Large-Scale Healthcare Providers Data Set using GraphLab Create

The IPython notebook shared in this post was built by Mark Pinches, a data scientist with years of expertise in pharma, and the founder of

The notebook uses the US healthcare providers dataset from John Snow Labs to answer questions through slicing, joins, aggregations, and visualizations. The dataset is tabular, with almost 5 million rows and over 300 columns.

This analysis makes heavy use of the SFrame object in GraphLab Create from Turi (formerly Dato, and one of our data science partners), taking advantage of its optimizations and its scalability for large datasets that do not fit in memory.
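
To make the three kinds of operations mentioned above concrete (slicing, joins, and aggregations), here is a minimal sketch in plain Python on a handful of invented rows. It is not the SFrame API and not the code from the notebook. The column names (`npi`, `state`, `taxonomy_code`, `specialty`), the lookup table, and the helper functions are all hypothetical and chosen only for illustration. SFrame provides its own equivalents for these operations and runs them against data stored on disk.

```python
from collections import defaultdict

# Hypothetical provider rows (invented for illustration only).
# The real dataset has almost 5 million rows and over 300 columns.
providers = [
    {"npi": 1, "state": "CA", "taxonomy_code": "A"},
    {"npi": 2, "state": "CA", "taxonomy_code": "B"},
    {"npi": 3, "state": "NY", "taxonomy_code": "A"},
    {"npi": 4, "state": "TX", "taxonomy_code": "B"},
]

# Hypothetical lookup table mapping a taxonomy code to a specialty name.
taxonomy = {"A": "Family Medicine", "B": "Cardiology"}


def slice_rows(rows, column, value):
    """Slicing: keep only the rows whose `column` equals `value`."""
    return [row for row in rows if row[column] == value]


def join_specialty(rows, lookup):
    """Join: attach a specialty name to each row (inner join on taxonomy_code)."""
    return [
        dict(row, specialty=lookup[row["taxonomy_code"]])
        for row in rows
        if row["taxonomy_code"] in lookup
    ]


def count_by(rows, column):
    """Aggregation: count the rows for each distinct value of `column`."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[column]] += 1
    return dict(counts)


california = slice_rows(providers, "state", "CA")
joined = join_specialty(providers, taxonomy)
per_specialty = count_by(joined, "specialty")
print(per_specialty)
```

The same three steps, applied to the full dataset, are where a disk-backed table structure matters: Python lists like these must fit in memory, which a table of this size may not.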

We’d love to hear your feedback on this example, which shows how quickly and simply an expert data scientist can get answers when pairing high-quality data with a cutting-edge platform.

Please view the analysis here, and feel free to contact us with any technical troubleshooting questions or queries at

