The client was facing issues because of high query-response latency in their databases. They had approximately 1.8 TB of structured data stored in relational databases, and they also wanted to analyze unstructured data generated on the web by their customers. Answering questions such as the percentage increase in usage of a particular financial channel (ATMs, internet banking, mobile banking, etc.) over the past few years, or how one channel compared with another, was simply not possible. The data was already too large for the existing setup and was growing daily, so an infrastructure suited to Big Data workloads was required.
To start with, we implemented a complete open-source analytics workbench on a 5-node Hadoop cluster, as requested by the client. The cluster is fully operational and can be scaled out easily as and when the need arises. It provides fast access to a large data set (approx. 4 TB). We also built and deployed a few big data analytics models to fulfill various client requirements. We are presently developing some customized analytics models for the client as per their requirements.