Maharashtra State Board Class 10 Science-2 Question Bank
As state education boards plan to launch a new examination in the second week of October, School Education Minister Uday Samant invited us to share how our paper-based machine learning and deep learning models have been used in these examinations. The results of the State Examinations were announced in last week’s issue of School Magazine (an initiative of Sefal), along with other major tests such as the GATE exams and English Paper 2. It is an honor to share the machine learning model that our Research Engineer team at SVHCL developed for conducting these examinations.
Our data scientists and engineers worked on two of the most important phases of NLP: first the “Evaluation of Deep Models on Common Language Data”, and then the use of those models to analyze both the question papers and the answers submitted to them. All papers on which SVHCL has helped us are mentioned below.
The first phase used various ML algorithms to identify keywords in all the papers and convert them into TF-IDF vectors. We extracted many keywords from millions of documents and generated our TF-IDF vector representations from them. In the next step, we converted those keywords into word embeddings, one embedding per keyword. These word embeddings then served as input to a classifier that decides whether a sentence belongs to a particular class. After this classification, the text can be given a label (e.g. student or teacher), and once it has one it can be passed on to the remaining classifier. This helps us classify whether the writer is a teacher or a student.
Once we had these classes, we trained more than 20 separate models and tested them on the dataset mentioned above. Google Colab gave us access to Python libraries including TensorFlow and scikit-learn. For training, we used standard datasets such as MNIST, Stanford Sentiment Treebank, IMDB Movie Reviews, Twitter and Facebook posts, and Wikipedia articles. We also wanted to test the generalization ability of our models by training them on different language/topic combinations as well as on unseen documents.
We noticed that the accuracy score of our models decreased after some time during training. We took this to mean that the models were suffering from unbalanced input, which may lead to problems in the final prediction. We tried several things to overcome the problem, such as using small batches and increasing the number of epochs. None of these efforts proved fruitful, so we decided to use cross_validation to see what happens when the training set changes. Cross-validation lets us train the model multiple times, each time on a different part of the dataset, and evaluate it on the part that was held out.
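To make the TF-IDF step from the first phase more concrete, here is a minimal sketch in plain Python. It is not our production code: the function names `tokenize` and `tfidf_vectors`, the whitespace tokenizer, the three sample sentences, and the smoothed IDF formula are all assumptions chosen for illustration. A library such as scikit-learn would normally do this work.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; a real pipeline
    # would also handle punctuation, stop words, and so on.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each token occur?
    doc_freq = Counter(tok for doc in tokenized for tok in set(doc))

    # One common smoothed IDF variant: log((1 + N) / (1 + df)) + 1
    idf = {
        tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
        for tok in vocabulary
    }

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero on empty documents
        # Term frequency (relative count) times inverse document frequency
        vectors.append([counts[tok] / length * idf[tok] for tok in vocabulary])
    return vocabulary, vectors


# Hypothetical sample sentences, not taken from the actual papers
docs = [
    "define acceleration and state its unit",
    "state the unit of force",
    "the teacher explains the unit of force",
]
vocabulary, vectors = tfidf_vectors(docs)

for doc, vec in zip(docs, vectors):
    # Show the highest-weighted keyword of each document
    best = max(range(len(vocabulary)), key=lambda i: vec[i])
    print(f"{doc!r}: top keyword = {vocabulary[best]!r}")
```

In this sketch a word that occurs in every document, such as "unit", gets a lower weight than a word that occurs in only one, which is the property that makes TF-IDF useful for picking out keywords.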
Because across the cross-validation rounds the model ends up seeing every sample, including those used for testing, the weight of every parameter of the model changes. If we then test a model trained on the entire dataset, the accuracy score can go up or down with every parameter change. We found that cross_validation worked better and faster than hyperparameter tuning, and we finally found a way to reduce the parameters of our models by changing their ranges. We tried various ranges of values and settled on 0.5 for the range of the hyperparameters and 20 epochs (the maximum limit for the epoch parameter). As a result, only 30 parameters remain in the optimization process (after dividing that variable by the maximum possible value of one line), and the accuracy score goes up even after a few iterations (50–50,000 iterations). The validation scores also went up, even before the start of training. Training was still a little difficult because of this problem and took some time to converge, and cross_validation alone didn’t show much improvement.
The reason for such a large difference between the training and testing scores is simple: in the original setup the whole train-test split was kept fixed, while cross_validation changed the training set against the single test set. When this was done, the validation scores went down and never reached the level of the training scores. Now that we have reduced the dimensionality of the feature space of our ML model, we get good validation scores, which is enough for predictions on future sentences. No other parameter values were changed for cross_validation, so it did not otherwise affect our model.
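For reference, the difference between a fixed train-test split and a split that changes on every round can be sketched with ordinary k-fold splitting. This is only an illustration of the standard technique, not the exact procedure we ran: the helper name `k_fold_indices`, the sample count, the number of folds, and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for standard k-fold splitting."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly

    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]

    for held_out in range(k):
        validation = folds[held_out]
        # Every fold except the held-out one is used for training
        training = [
            idx
            for j, fold in enumerate(folds)
            if j != held_out
            for idx in fold
        ]
        yield training, validation


# Hypothetical sizes: 10 samples split into 5 folds
splits = list(k_fold_indices(n_samples=10, k=5))
for round_number, (training, validation) in enumerate(splits, start=1):
    print(f"round {round_number}: train={sorted(training)} validate={sorted(validation)}")
```

Each sample is held out exactly once and used for training in all other rounds, so the training set differs from round to round, whereas a fixed split always trains on the same samples.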
However, cross_validation also caused trouble in the implementation of the model: it did away with the weights that were not considered in the original optimization process, and these weights can no longer be optimized with proper techniques. So we started looking for alternative ways to increase model accuracy. One idea was to increase both the batch size and the number of epochs, but this didn’t work because all of our models had already grown very large and showed no improvement. In the end, we concluded that the best solution for us was to increase the batch size and call it a day, to get better performance for upcoming examinations like these. We should also point out that our procedure shouldn’t be confused with regular cross-validation and should instead be called multi-cross validation. At the end of these exercises we didn’t gain any meaningful knowledge, but we are happy to be back with a top-notch model.