Government data is typically published as MS Word or PDF documents with no uniform format. There are thousands of files, each containing tens of thousands of data elements. The data is also strongly location-specific and crop-dependent. Blending and cleansing the data, and then applying models to it, is needed before this wealth of data can support structured decision-making.
Business managers want to drill down into and visualize the data, and to run scenario analyses of the form "If rainfall in district X is N cm in the next season, how many tons of product K should we ship?"
A single, central data mart is designed. Data is automatically loaded from the Excel files into a staging area, with format variations handled along the way. Tools are built to verify data quality and integrity automatically. Crop-wise, region-wise data is then extracted into sub-tables and cleansed.
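The staging and verification step can be illustrated with a minimal Python sketch. It is not the delivered tooling: the header aliases, the column names (district, crop, year, acreage_ha) and the specific checks are assumptions chosen for illustration. It maps varying column headers to canonical names, then rejects rows with missing fields, non-numeric values, negative acreage, or a repeated district/crop/year key.

```python
# Hypothetical header variants mapped to canonical column names (assumed, not from the source).
HEADER_ALIASES = {
    "district": "district", "dist.": "district", "district name": "district",
    "crop": "crop", "crop name": "crop",
    "year": "year", "season year": "year",
    "area (ha)": "acreage_ha", "acreage": "acreage_ha", "area_ha": "acreage_ha",
}
REQUIRED = ("district", "crop", "year", "acreage_ha")


def normalize_row(raw):
    """Map raw headers to canonical names; unrecognized headers are dropped."""
    row = {}
    for key, value in raw.items():
        canonical = HEADER_ALIASES.get(key.strip().lower())
        if canonical:
            row[canonical] = value.strip() if isinstance(value, str) else value
    return row


def validate(rows):
    """Return (clean_rows, issues); issues is a list of (row_index, message)."""
    clean, issues, seen = [], [], set()
    for i, raw in enumerate(rows):
        row = normalize_row(raw)
        missing = [c for c in REQUIRED if row.get(c) in (None, "")]
        if missing:
            issues.append((i, "missing: " + ", ".join(missing)))
            continue
        try:
            row["year"] = int(row["year"])
            # Tolerate thousands separators such as "1,200".
            row["acreage_ha"] = float(str(row["acreage_ha"]).replace(",", ""))
        except ValueError:
            issues.append((i, "non-numeric year or acreage"))
            continue
        if row["acreage_ha"] < 0:
            issues.append((i, "negative acreage"))
            continue
        key = (str(row["district"]).lower(), str(row["crop"]).lower(), row["year"])
        if key in seen:
            issues.append((i, "duplicate district/crop/year"))
            continue
        seen.add(key)
        clean.append(row)
    return clean, issues


sample_rows = [
    {"District Name": "Pune", "Crop": "Rice", "Year": "2019", "Area (ha)": "1,200"},
    {"Dist.": "Pune", "Crop Name": "rice", "Season Year": 2019, "Acreage": 1300},
    {"district": "Nashik", "crop": "Wheat", "year": "2019", "acreage": "n/a"},
    {"district": "Nashik", "crop": "", "year": "2019", "acreage": "5"},
]
clean_rows, issues = validate(sample_rows)
```

In this sample only the first row passes; the other three are reported as a duplicate, a non-numeric value and a missing field. Returning the rejected rows together with a reason, rather than silently dropping them, keeps the quality checks auditable.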
These data sets are used to build crop-wise, region-wise acreage prediction models using time series, dynamic regression and other methods. Acreage and crop sowing practices in turn determine expected fertilizer consumption. The deliverable is a software tool that integrates the ensemble models.
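As a rough illustration of how an ensemble acreage forecast feeds a fertilizer estimate, the sketch below averages two deliberately simple forecasters: a least-squares linear trend and simple exponential smoothing. These stand in for the time series and dynamic regression models mentioned above and are not the models actually used. The function names, the equal weights, the smoothing factor and the application rate of 120 kg/ha are all assumptions.

```python
def linear_trend_forecast(series):
    """One-step-ahead forecast from a least-squares line fitted to the series."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    slope = sxy / sxx
    return y_mean + slope * (n - x_mean)  # evaluate the fitted line at x = n


def exp_smoothing_forecast(series, alpha=0.5):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ensemble_forecast(series, weights=(0.5, 0.5)):
    """Weighted average of the two forecasters (equal weights are an assumption)."""
    forecasts = (linear_trend_forecast(series), exp_smoothing_forecast(series))
    return sum(w * f for w, f in zip(weights, forecasts)) / sum(weights)


def fertilizer_tonnes(acreage_ha, rate_kg_per_ha):
    """Expected consumption in tonnes for a given acreage and application rate."""
    return acreage_ha * rate_kg_per_ha / 1000.0


# Hypothetical acreage history (hectares) for one crop in one region.
history = [100.0, 110.0, 120.0, 130.0]
predicted_acreage = ensemble_forecast(history)
expected_fertilizer = fertilizer_tonnes(predicted_acreage, rate_kg_per_ha=120.0)
```

In practice the weights would be fitted per crop and region from hold-out error, and the application rate would come from the recorded sowing practices rather than a constant.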