Disease Pattern Analysis using Big Data in Healthcare

Big data analytics has become one of the most widely adopted and in-demand technologies in healthcare, where extracting useful insights from the available data sets is now essential. This article looks at analyzing healthcare data to identify disease patterns using big data techniques: Hadoop MapReduce, followed by k-means clustering and prediction methods.

Big data and data mining techniques have brought substantial improvements to healthcare in areas such as feature extraction and the availability and accessibility of healthcare data sets.

Big data typically refers to the following types of data:

  • Traditional enterprise data – information from CRM systems, ERP data, etc.
  • Machine-generated/sensor data – call detail records (CDRs), weblogs, etc.
  • Social data

Need for Big Data Analytics in Healthcare:

  • Providing patient-centric services
  • Detecting the spread of diseases
  • Improving hospital quality
  • Improving treatment methods

Goals and objectives to achieve:

  • Identify the root cause of disease spread
  • Identify disease patterns in sample data
  • Provide medical assistance and improve quality of care
  • Control the cost of medical treatments

Implementation of disease pattern analysis:

The basic elements of this architecture are the data source, MapReduce, k-means clustering, classification, the analytical part and data visualization.

Data source: It supplies the collected data sets on which the Hadoop framework runs the MapReduce process. Medical data sets are usually unstructured.

Map-Reduce: It is a central part of big data analytics. The MapReduce framework consists of two main tasks, map and reduce. The map task takes a set of data and converts it into another set of data in which individual elements are broken down into key/value pairs. The reduce task takes the output of the map task as its input and combines the values that share a key into a smaller set of results. A MapReduce program executes in three stages: the map stage, the shuffle stage (which groups the map output by key) and the reduce stage.
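The three stages can be sketched in plain Python. This is only an in-memory illustration of the map, shuffle and reduce flow, not Hadoop code; the record layout (patient id, region, diagnosis), the sample values and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (patient_id, region, diagnosis). Illustrative data only.
RECORDS = [
    ("p1", "north", "diabetes"),
    ("p2", "north", "asthma"),
    ("p3", "south", "diabetes"),
    ("p4", "north", "diabetes"),
    ("p5", "south", "asthma"),
]


def map_record(record):
    """Map stage: emit one ((region, diagnosis), 1) pair per record."""
    _patient_id, region, diagnosis = record
    yield (region, diagnosis), 1


def shuffle(pairs):
    """Shuffle stage: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce stage: combine the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = (pair for record in records for pair in map_record(record))
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


counts = run_job(RECORDS)
for (region, diagnosis), total in sorted(counts.items()):
    print(region, diagnosis, total)
```

On a real cluster the framework distributes the map and reduce calls across nodes and performs the shuffle itself; the per-key counts produced here are the kind of aggregate that a later clustering step could consume.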

Analytics part: Once clustering has been performed, the clustered data is passed to the analytical part, which predicts disease patterns from the provided attributes.
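The clustering step that feeds this part can be sketched as a minimal k-means (Lloyd's algorithm) in plain Python. The two features (age and fasting glucose), the sample values and the choice of the first k points as starting centroids are assumptions made to keep the example small and repeatable; a real pipeline would normally scale the features and use a better initialization.

```python
import math

# Hypothetical patient features: (age, fasting glucose). Illustrative data only.
PATIENTS = [(25, 85), (62, 160), (30, 90), (28, 88), (58, 150), (65, 170)]


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm using the first k points as initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids stopped moving.
        centroids = new_centroids
    return labels, centroids


labels, centroids = kmeans(PATIENTS, k=2)
print(labels)
print(centroids)
```

Each label is the index of the cluster a patient falls into, and each centroid summarizes a group's typical attribute values, which is what the analytical part would then interpret.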

Visualization part: It presents the predicted disease patterns in pictorial form.

Information Governing System: It stores all the processed information and is capable of decision making.


Big data analytics (BDA) can help make healthcare more effective and efficient. It can be used for a range of operations, from disease management and prevention to medical research, and it leads to insights that support healthcare providers in making more timely and informed decisions about the populations they manage. Combining MapReduce with k-means clustering can be an effective approach to analyzing disease patterns and providing evidence-based care.



Raju R

Software Engineer
