The IPython notebook shared in this post was built by Mark Pinches, a data scientist with years of experience in pharma and the founder of Alderly.ai.
It uses the US healthcare providers dataset from John Snow Labs to answer queries using slicing, joins, aggregations and visualizations. The dataset is tabular and has almost 5 million rows and over 300 columns.
This analysis makes heavy use of GraphLab’s SFrame object from Turi (formerly Dato), one of our data science partners, taking advantage of its optimizations and its ability to scale to datasets too large to fit in memory.
We’d love to hear your feedback on this example, which shows how fast and simple it is for an expert data scientist to get answers when pairing high-quality data with a cutting-edge platform.