With the increase in the number of students for whom English is a second language, a one-model-fits-all approach fails: using the same reading material and teaching style for every student does not work in these scenarios. Reading plays a crucial role in educational development, but finding appropriate reading materials for students at different reading complexity levels is often difficult. To address this problem, teachers often search various online sources for materials, or rewrite the material themselves to suit the needs of different students. Unfortunately, this process is difficult and time-consuming. The client wanted us to build a tool that uses text mining and machine learning approaches to automate the process of determining the complexity of the material.
We collected data from the Common Core State Standards. Each data entry has a grade level, which indicates the complexity of the reading material. With this data as the training set, we applied text-mining techniques to extract useful information from the data set. Each text document is a node on a graph; a combination of the social network aspects of the text and the graph properties of the text acts as a composite feature that defines a specific document. After features are extracted from all the text documents, a machine-learning classification algorithm is applied.
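The pipeline described above (text, then graph-based features, then a classifier) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the project: the word co-occurrence graph, the three features (node count, average degree, density), the nearest-centroid classifier, the function names, and the toy passages, which are invented and are not Common Core data.

```python
from collections import defaultdict
import math
import re


def cooccurrence_graph(text, window=2):
    """Build an undirected word co-occurrence graph as an adjacency dict.

    Assumption: two words are linked when they appear within `window`
    positions of each other. The project's actual graph may differ.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for i, word in enumerate(words):
        graph[word]  # make sure every word is a node, even if isolated
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def graph_features(text):
    """Return [node count, average degree, density] for a text's graph."""
    graph = cooccurrence_graph(text)
    n = len(graph)
    m = sum(len(neighbors) for neighbors in graph.values()) // 2
    avg_degree = 2 * m / n if n else 0.0
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return [float(n), avg_degree, density]


def train_centroids(texts, grades):
    """Average the feature vectors of the training texts per grade level."""
    sums, counts = {}, defaultdict(int)
    for text, grade in zip(texts, grades):
        feats = graph_features(text)
        if grade not in sums:
            sums[grade] = [0.0] * len(feats)
        sums[grade] = [s + f for s, f in zip(sums[grade], feats)]
        counts[grade] += 1
    return {g: [s / counts[g] for s in total] for g, total in sums.items()}


def predict_grade(centroids, text):
    """Assign the grade whose centroid is closest in feature space."""
    feats = graph_features(text)
    return min(centroids, key=lambda g: math.dist(feats, centroids[g]))


# Toy, made-up passages with hypothetical grade labels.
train_texts = [
    "the cat sat",
    "the quick brown fox jumps over the lazy dog near the quiet river bank",
]
train_grades = [1, 5]
centroids = train_centroids(train_texts, train_grades)
predicted = predict_grade(centroids, "a dog ran")
```

In practice the features would likely need scaling before distances are compared, since the node count dominates the other two values here, and a real classifier would be trained on many labeled passages per grade level.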