The company shared various types of text documents, such as employee offer letters, policy documents, client-related documents, and HR documents, among its users through an online intranet repository. These documents were often lengthy and difficult to understand, especially for readers from a different domain (for example, HR personnel reading a financial document), because of the domain-specific jargon they contained. Our aim was to make these documents jargon-free. This project was part of the client's Plain Language initiative. Plain language, as defined by the U.S. government, is language that is jargon-free and that a person can understand the first time they read or hear it. Because the client had a huge volume of text documents in the repository, they wanted us to build a text analytics tool to do this.
We collated documents from various verticals of the company, such as HR, Finance, Marketing, and Project Management, and applied text-mining techniques to build the required model. The model comprised two modules: Auto Analyzer, which analyzes a document, and Auto Guidance, which helps with writing and correcting it. Auto Guidance also generated suggestions for correction when the writer was in doubt. For example, if a sentence is in the passive voice, it suggests an active-voice alternative; if a complex word is used, it suggests a commonly used word to replace it. Auto Analyzer analyzes the document across several dimensions, such as the design of the document, the organization of content, the language used, and the context. In addition, the end user can view the areas for improvement and make the required modifications. Finally, we deployed the models on the system. Most corrections and modifications were automatic, but some required human intervention. In those cases, the user selected a sentence from the few choices displayed on the screen or replaced it with a more appropriate sentence.
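To make the two checks mentioned above concrete (flagging passive voice and suggesting a simpler word), here is a minimal rule-based sketch in Python. It is not the deployed model, whose internals are not described here. The word list `PLAIN_WORDS`, the pattern `PASSIVE_RE`, and the function names `analyze` and `apply_word_suggestions` are all hypothetical, chosen only for illustration. The passive-voice pattern is a rough heuristic that will miss irregular participles and can flag some adjectives.

```python
import re

# Hypothetical mapping from complex words to plainer alternatives (illustrative only).
PLAIN_WORDS = {
    "utilize": "use",
    "commence": "start",
    "terminate": "end",
    "remuneration": "pay",
}

# Rough heuristic: a form of "to be", an optional adverb, then a word ending in "ed"/"en".
# A production tool would rely on a syntactic parser instead of a regular expression.
PASSIVE_RE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def analyze(text):
    """Return a list of findings, one per detected issue, in reading order."""
    findings = []
    for sentence in split_sentences(text):
        if PASSIVE_RE.search(sentence):
            findings.append({
                "sentence": sentence,
                "issue": "passive_voice",
                "suggestion": "Consider rewriting in the active voice.",
            })
        for word in re.findall(r"[A-Za-z]+", sentence):
            plain = PLAIN_WORDS.get(word.lower())
            if plain:
                findings.append({
                    "sentence": sentence,
                    "issue": "complex_word",
                    "word": word,
                    "suggestion": plain,
                })
    return findings


def apply_word_suggestions(text):
    """Replace listed complex words with their plain alternatives, keeping capitalization."""
    def replace(match):
        word = match.group(0)
        plain = PLAIN_WORDS.get(word.lower())
        if plain is None:
            return word
        return plain.capitalize() if word[0].isupper() else plain

    return re.sub(r"[A-Za-z]+", replace, text)


if __name__ == "__main__":
    sample = "The remuneration was approved by the committee. We utilize the portal."
    for finding in analyze(sample):
        print(finding["issue"], "->", finding["suggestion"])
    print(apply_word_suggestions(sample))
```

In this sketch, word replacement is applied automatically, while passive-voice findings are only reported. That mirrors the split described above between automatic corrections and those that need a person to choose or write the replacement sentence.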