Data scientists and the teams they serve can easily spend 50 percent to 80 percent of their time preparing and standardizing data for analysis. In the healthcare industry, this includes regularly updating medical indexes and dictionaries to comply with federal mandates, as well as curating and verifying lists of known cybersecurity threats. This work is called data operations (“DataOps”) or, among more beleaguered data scientists, “data wrangling.”
For Atigeo, a technology company whose flagship big data analytics product, xPatterns, generates knowledge from all available data to deliver insights, predict outcomes, and mitigate risks, DataOps is a necessary priority—but not its first. To reduce the time spent finding, cleaning, formatting, updating, and publishing data for analysis, the team of healthcare data analysts at Atigeo partnered with John Snow Labs, a DataOps company whose mission is to “accelerate the use of data to improve human well-being.” On both a practical and ideological level, John Snow Labs was an apt fit for Atigeo, whose own charge in healthcare is to help organizations leverage its technology to drive operational efficiency, improve clinical and financial performance, and provide the most compassionate, patient-centric care.
Working with John Snow Labs and its data libraries—which include healthcare provider indexes, billing codes, and sets of approved drugs and medical terminology—Atigeo saves an estimated 4,096 hours preparing data every month.
Inside The Numbers
xPatterns must ingest approximately…
- 200 frequently changing datasets, updated monthly on average, such as the Master Provider Index (MPI)
- 100 complex datasets, updated twice a year, with each update now taking half a day instead of a week
- 250 infrequently changing datasets, updated once a year on average
Now that both frequently and infrequently changing datasets take about ten minutes instead of two days to update, and complex datasets take about half a day instead of one week to update, Atigeo saves an estimated…
- 3,166 man-hours per month on monthly updates to frequently changing datasets
- 600 man-hours per month, averaged over the year, on twice-yearly updates to complex datasets
- 330 man-hours per month, averaged over the year, on yearly updates to infrequently changing datasets
…for a grand total of 4,096 man-hours saved on average, every month.
This equates to 24 man-months of work every month, the equivalent of 24 full-time employees doing nothing but updating datasets.
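The totals above can be reproduced with a little arithmetic. The sketch below is one possible reconstruction, not the published method: it assumes an 8-hour working day and a 40-hour working week (neither is stated in the case study), and the function and variable names are purely illustrative. Under those assumptions the results land within about an hour of the figures quoted above.

```python
# Assumptions (not stated in the source): 8-hour working day, 40-hour working week.
HOURS_PER_DAY = 8
HOURS_PER_WEEK = 40
TEN_MINUTES = 10 / 60  # ten minutes expressed in hours


def monthly_hours_saved(datasets, updates_per_year, hours_before, hours_after):
    """Average hours saved per month across all datasets in one category."""
    saved_per_update = hours_before - hours_after
    return datasets * updates_per_year * saved_per_update / 12


# Frequently changing: 200 datasets, monthly, two days -> ten minutes.
frequent = monthly_hours_saved(200, 12, 2 * HOURS_PER_DAY, TEN_MINUTES)

# Complex: 100 datasets, twice a year, one week -> half a day.
complex_ = monthly_hours_saved(100, 2, HOURS_PER_WEEK, HOURS_PER_DAY / 2)

# Infrequently changing: 250 datasets, yearly, two days -> ten minutes.
infrequent = monthly_hours_saved(250, 1, 2 * HOURS_PER_DAY, TEN_MINUTES)

total = frequent + complex_ + infrequent

print(f"frequent:   {frequent:.1f} hours/month")
print(f"complex:    {complex_:.1f} hours/month")
print(f"infrequent: {infrequent:.1f} hours/month")
print(f"total:      {total:.1f} hours/month")
print(f"per person if split 24 ways: {total / 24:.1f} hours/month")
```

Dividing the total by 24 gives roughly 171 hours per person per month, which is close to a full-time schedule and is consistent with the 24 man-month figure.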
Domain Expertise Delivers Quality Data For Wiser Decisions
Domain expertise drives the high quality and corresponding ease of use of John Snow Labs datasets. As a company specializing in DataOps for healthcare analytics, our teams contain real doctors, among them data researchers with medical degrees from Harvard and doctorates in cognitive psychology. Experts of this caliber act as…
- librarians, dissecting broad healthcare inquiries to find data scientists the exact information they need
- domain experts, directing data scientists to the most relevant datasets for a given healthcare problem
- alchemists, researching and simulating data to fill privacy-related gaps in datasets
- and engineers, recommending the right tools for specific datasets
The partnership with John Snow Labs means that the team at Atigeo is now able to update the MPI 96 times faster than before.
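The 96x figure appears to follow from the update times quoted earlier: about ten minutes now, versus two days before. A minimal check, assuming an 8-hour working day (an assumption, not something the case study states):

```python
# Assumption (not stated in the source): a working day is 8 hours.
HOURS_PER_DAY = 8

before_minutes = 2 * HOURS_PER_DAY * 60  # two working days, in minutes
after_minutes = 10                       # about ten minutes per update now

speedup = before_minutes / after_minutes
print(f"speedup: {speedup:.0f}x")
```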
Please download the full case study here: DataOps Case Study – xPatterns John Snow Labs