Teachers who need reading materials at different complexity levels often search various online sources for them. Unfortunately, this process is difficult and time-consuming, and teachers are often forced to rewrite the material themselves to suit the needs of different students. Applying text mining and machine learning approaches to reading materials automates the process of determining their complexity.
We collected data from the Common Core State Standards. Each data entry has a grade level, which indicates the complexity of the reading comprehension text. Using this data as a training set, we applied text mining techniques to extract useful information from it. Each text document is represented as a graph, and the features that describe the document are extracted by applying social network analysis to that graph.
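To make the graph representation concrete, the following is a minimal sketch of one way a text could be turned into a graph and summarized with network measures. The abstract does not specify how the graph is built or which measures are used, so everything here is an assumption for illustration: the word co-occurrence construction, the `window` parameter, the helper names `build_cooccurrence_graph` and `graph_features`, and the particular measures (density, average degree, maximum degree centrality).

```python
from collections import defaultdict


def build_cooccurrence_graph(text, window=2):
    """Build an undirected graph whose nodes are words.

    Assumption for illustration: two words are linked when they occur
    within `window` tokens of each other. The source does not state
    how its graphs are constructed.
    """
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        graph[word]  # register the word as a node even if it gets no edges
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def graph_features(graph):
    """Compute a few simple graph and network measures as document features."""
    n = len(graph)
    # Each undirected edge is stored twice, once per endpoint.
    m = sum(len(neighbours) for neighbours in graph.values()) // 2
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    avg_degree = 2 * m / n if n else 0.0
    max_degree_centrality = (
        max(len(neighbours) for neighbours in graph.values()) / (n - 1)
        if n > 1
        else 0.0
    )
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "avg_degree": avg_degree,
        "max_degree_centrality": max_degree_centrality,
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(graph_features(build_cooccurrence_graph(sample, window=1)))
```

Each document then yields one fixed-length feature vector, which is the form a classifier expects. A larger window produces denser graphs, so the window size would need to be held constant across all documents for the features to be comparable.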
A combination of social network measures and graph properties of the text serves as the feature set for each document. After features are extracted from all the text documents, a machine learning classification algorithm is applied. The accuracy of the resulting model was commendable on the test data provided by the university.
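The classification step can be sketched as follows. The abstract does not name the algorithm that was used, so a nearest-centroid classifier is swapped in here purely because it is short enough to write with the standard library. The function names `fit_centroids` and `predict` are hypothetical, and the feature values and grade labels are made-up toy numbers, not data or results from this work.

```python
import math
from collections import defaultdict


def fit_centroids(features, labels):
    """Average the feature vectors of each grade level into one centroid."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, vec):
    """Return the grade level whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


if __name__ == "__main__":
    # Toy, made-up feature vectors: [graph density, average degree].
    train_features = [[0.50, 2.0], [0.45, 2.2], [0.20, 3.5], [0.25, 3.8]]
    train_grades = [2, 2, 8, 8]
    centroids = fit_centroids(train_features, train_grades)
    print(predict(centroids, [0.48, 2.1]))
```

Because the sketch uses raw Euclidean distance, features on larger scales dominate the result; in practice the features would likely need to be normalized first, and any standard classifier could take the place of the centroid rule.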