The IPython notebook shared in this post was built by Mark Pinches, a data scientist with years of experience in pharma.

It uses the US healthcare providers dataset from John Snow Labs to answer queries using slicing, joins, aggregations and visualizations. The dataset is tabular and has almost 5 million rows and over 300 columns.
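
For readers new to these terms, here is a minimal sketch of what slicing, joining and aggregating mean on tabular data. It is not taken from the notebook: it uses plain Python rather than SFrame, and the rows, column names (`provider_id`, `state`, `specialty_code`) and lookup table are all hypothetical toy values, not the real dataset's schema.

```python
from collections import Counter

# Hypothetical toy rows; the real dataset is far larger and has its own schema.
providers = [
    {"provider_id": 1, "state": "CA", "specialty_code": "A"},
    {"provider_id": 2, "state": "CA", "specialty_code": "B"},
    {"provider_id": 3, "state": "NY", "specialty_code": "A"},
    {"provider_id": 4, "state": "TX", "specialty_code": "B"},
]

# Hypothetical lookup table that maps a code to a readable label.
specialties = {"A": "Cardiology", "B": "Pediatrics"}


def slice_rows(rows, column, value):
    """Slicing: keep only the rows whose column equals the given value."""
    return [row for row in rows if row[column] == value]


def join_lookup(rows, lookup, key, new_column):
    """Joining: attach the lookup value for each row's key (inner join)."""
    return [
        dict(row, **{new_column: lookup[row[key]]})
        for row in rows
        if row[key] in lookup
    ]


def count_by(rows, column):
    """Aggregating: count rows per distinct value of a column."""
    return dict(Counter(row[column] for row in rows))


california = slice_rows(providers, "state", "CA")
labelled = join_lookup(providers, specialties, "specialty_code", "specialty")
per_specialty = count_by(labelled, "specialty")
```

Each helper returns a new list or dict and leaves its input untouched, so the steps can be chained in any order. A dataframe library expresses the same three operations in a line or two each.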

This analysis makes heavy use of GraphLab's SFrame object from Turi (formerly Dato, and one of our data science partners), taking advantage of its optimizations and its ability to scale to large datasets that do not fit in memory.
VIEW THE ANALYSIS HERE