The healthcare industry is revolutionizing oncology, and it would not be an overstatement to hold Big Data largely responsible for this revolution. Taking breast cancer as an example, Big Data has given rise to a multitude of treatment options, with the potential to further personalize diagnosis.

Unlike the IT, finance, and automobile sectors, the healthcare industry long remained deprived of the benefits of Big Data. Without Big Data to draw insightful analysis from, doctors would scrutinize only a handful of data sets and prescribe treatments based on that limited analysis.

Such treatments, however, are more general than personal. For instance, statistics show that only 3-4% of cancer patients participate in clinical trials, while the data from the remaining 96-97% of the cancer population remains largely untapped. These 3-4% of patients are mostly young, healthy, and highly likely to benefit from treatment. In practice, the people who end up receiving the medications resulting from those trials do not match this profile, and thus largely remain uncured. It is estimated that nearly 75% of cancer treatments don’t work, and billions of dollars spent on cancer drugs are wasted in the process.

The healthcare industry's relative lag in adopting Big Data can largely be attributed to concerns over patient confidentiality and to the lack of IT infrastructure able to store and handle such large-scale, heterogeneous data, which comprises each patient’s personal, clinical, genetic, and molecular information from the pre-diagnostic and pre-treatment phases through the post-diagnostic and post-treatment phases.

However, with the introduction of Big Data in the form of electronic medical records (EMRs) and accessible databases, doctors can start to learn from more than just the patients enrolled in clinical trials.

The figures below show the rates of breast cancer diagnoses and deaths in nine areas of the US (San Francisco, Connecticut, Detroit, Hawaii, Iowa, New Mexico, Seattle, Utah, and Atlanta) from 1975 to 2012 (Data Source: National Center for Health Statistics, Centers for Disease Control and Prevention). According to estimates up to 2015, approximately 2 million people were affected by breast cancer in the US alone.

Figure 1: Diagnosed breast cancer cases per 100,000. The rate was highest for white females (around 130) and lowest for white males (around 2).

Figure 2: Deaths resulting from breast cancer per 100,000. The rate was highest for black females (around 38) and lowest for white males (around 1).

Figure 3 is of prime importance, as it reflects the year-over-year change in the death rate from 1975 to 2012. The lowest panel of the time series plot shows a steady decline in the death rate, from 19 to 12 per 100,000, from 1990 onwards. This marked decrease is likely attributable to the introduction and advancement of Big Data in hospitals and cancer diagnosis.
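To put the size of that decline in perspective, the short Python sketch below computes the relative decrease between the two values quoted above (19 and 12). The function name `relative_decline` is purely illustrative, and the calculation assumes both values are measured on the same scale.

```python
def relative_decline(start, end):
    """Return the relative decrease from start to end as a fraction of start."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (start - end) / start


# Values quoted in the text for the start (1990) and end (2012) of the decline.
start_value = 19.0
end_value = 12.0

decline = relative_decline(start_value, end_value)
print(f"Relative decline: {decline:.1%}")  # prints "Relative decline: 36.8%"
```

In other words, the quoted figures correspond to a drop of roughly a third over about two decades.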

With Big Data, there is a promising opportunity to bring about a significant decrease in death rates from cancer and other life-threatening diseases. With data analytics, doctors would be able to see, for example, which therapies worked best for most patients in similar circumstances, or to evaluate their own outcomes with, say, breast cancer treatment against those of other specialists across the nation and correct any deficiencies quickly.
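As a rough illustration of what "seeing which therapies worked best for patients in similar circumstances" might look like in practice, here is a minimal Python sketch. Everything in it is an assumption made for the example: the records are made up, the field names (`stage`, `therapy`, `responded`) are hypothetical, and "similar circumstances" is reduced to sharing the same cancer stage. A real analysis would draw on actual EMR data and far richer patient profiles.

```python
from collections import defaultdict

# Made-up records for illustration only; not real patient data.
records = [
    {"stage": "II", "therapy": "A", "responded": True},
    {"stage": "II", "therapy": "A", "responded": True},
    {"stage": "II", "therapy": "A", "responded": False},
    {"stage": "II", "therapy": "B", "responded": True},
    {"stage": "II", "therapy": "B", "responded": False},
    {"stage": "III", "therapy": "B", "responded": True},
]


def response_rates(records, stage):
    """Return the response rate per therapy among patients at the given stage."""
    counts = defaultdict(lambda: [0, 0])  # therapy -> [responders, total]
    for record in records:
        if record["stage"] != stage:
            continue  # only compare patients in similar circumstances
        counts[record["therapy"]][1] += 1
        if record["responded"]:
            counts[record["therapy"]][0] += 1
    return {therapy: resp / total for therapy, (resp, total) in counts.items()}


rates = response_rates(records, "II")
best = max(rates, key=rates.get)  # therapy with the highest response rate
print(rates, best)
```

The same grouping could just as well be keyed on a specialist or a hospital instead of a therapy, which is the kind of comparison against other specialists described above.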

They would also be able to track in real time what large numbers of patients experience during treatment and, later, during recovery. Beyond this, the data could shine a spotlight on cost-effective therapies and, conversely, on wasteful health care spending. It could also help match more patients with suitable clinical trials, thus speeding up the development and approval of new medicines.