Automatic computer science domain multiple-choice questions generation based on informative sentences

PeerJ Computer Science


Introduction

Introduction to informative sentences

Summarization

Quality phrases

Introduction to distractors

  • Searching words relevant to key

  • Create a list of distractors

  • Choosing random words from the list
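The last step above, choosing random words from a candidate list, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `pick_distractors`, the default of three distractors, and the optional `seed` parameter are all assumptions made for the example.

```python
import random


def pick_distractors(candidates, key, k=3, seed=None):
    """Pick k distinct distractors at random from a candidate list.

    The key itself is excluded so that the correct answer never appears
    among the distractors. `seed` is a hypothetical convenience for
    reproducible runs.
    """
    rng = random.Random(seed)
    # Remove duplicates and the key (case-insensitive comparison).
    pool = sorted({c for c in candidates if c.lower() != key.lower()})
    if len(pool) < k:
        raise ValueError("not enough candidates to choose from")
    return rng.sample(pool, k)


candidates = ["mouse", "monitor", "keyboard", "printer", "scanner", "mouse"]
distractors = pick_distractors(candidates, key="keyboard", k=3, seed=0)
```

Sorting the pool before sampling keeps the result reproducible for a given seed, since set iteration order is not guaranteed.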

Introduction to WordNet

Wiktionary

Google search results

The basic flow of the system

  • informative sentence extraction

  • key identification

  • distractor generation

Problem description

Specific objective

  • To make a desktop-based application that could generate MCQs from the unstructured text of the computer science domain.

Scope of the system

  • To generate informative-sentence-based MCQs for the computer science domain from the given unstructured text.

Background and literature review

Purpose of MCQ generation systems

Preprocessing

Sentence selection

Summarization

BERT for text embedding

Clustering embeddings

Keyphrase extraction algorithm

TF-IDF

Key selection

Question formation

Distractor generation

Post-processing

MCQ system evaluation

Evaluation of stem and key

Evaluation of distractors

Methodology

Dataset acquisition

Method

  • preprocessing

  • sentence selection

  • key selection

  • question formation

  • distractor generation

Preprocessing

  • sentence tokenization

  • removing special characters

  • word tokenization

  • stop word removal

  • conversion of words to lowercase

  • lemmatization of words

  • word frequency counting

  • parts of speech tagging

  • named entity recognition
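Several of the preprocessing steps listed above can be sketched with the Python standard library alone. This is a rough illustration under stated assumptions: the sentence splitter is a naive regular expression, `STOP_WORDS` is a tiny made-up list, and lemmatization, part-of-speech tagging, and named entity recognition are left out because they need an NLP library that this outline does not name.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "that", "it"}


def split_sentences(text):
    """Naive sentence tokenization: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def clean_tokens(sentence):
    """Remove special characters, lowercase, tokenize on whitespace, drop stop words."""
    no_special = re.sub(r"[^A-Za-z0-9\s]", " ", sentence)
    tokens = no_special.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    """Return the sentences, their cleaned tokens, and overall word frequencies."""
    sentences = split_sentences(text)
    tokenized = [clean_tokens(s) for s in sentences]
    frequencies = Counter(t for tokens in tokenized for t in tokens)
    return sentences, tokenized, frequencies


sentences, tokenized, frequencies = preprocess(
    "A compiler translates source code. The compiler reports errors!"
)
```

Here `frequencies` maps each remaining word to its count across the whole text, which later scoring steps could reuse.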

Informative sentence extractor

Summarization module

Scoring module

Quality phrases

Popularity

Concordance

Informativeness

Completeness

TF-IDF

Number of nouns and verbs

Number of stop words

Jaccard similarity of the title with sentences
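As a small worked example of this feature, Jaccard similarity is the size of the intersection of two word sets divided by the size of their union. The sketch below assumes simple lowercase word sets with no stop-word removal or lemmatization; how the system actually tokenizes the title and sentences is not stated in this outline, and the function names are illustrative.

```python
import re


def word_set(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(title, sentence):
    """|A ∩ B| / |A ∪ B| over the word sets of the title and the sentence."""
    a, b = word_set(title), word_set(sentence)
    if not a and not b:
        return 0.0  # convention chosen here for two empty inputs
    return len(a & b) / len(a | b)


score = jaccard_similarity(
    "Operating system scheduling",
    "The operating system uses scheduling algorithms",
)
# Three shared words out of six distinct words overall, so score is 0.5.
```

A sentence that shares more of its vocabulary with the title scores closer to 1.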

Sentence selection module

Stem and distractor generation module

Key selection

  • Skimming the sentence

  • Finding domain-relevant keys in the sentence with the help of the dataset.

Stem formation

  1. Scan sentence

  2. Select key

  3. Replace the key with fill in the blank
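The three steps above amount to a fill-in-the-blank substitution, which might look like the sketch below. The function name `make_stem`, the blank marker, and the choice to replace only the first whole-word occurrence of the key are assumptions made for illustration.

```python
import re

BLANK = "______"


def make_stem(sentence, key):
    """Replace the first whole-word, case-insensitive occurrence of key with a blank."""
    pattern = re.compile(r"\b" + re.escape(key) + r"\b", re.IGNORECASE)
    stem, count = pattern.subn(BLANK, sentence, count=1)
    if count == 0:
        raise ValueError(f"key {key!r} not found in sentence")
    return stem


stem = make_stem("A compiler translates source code into machine code.", "compiler")
# stem is "A ______ translates source code into machine code."
```

Matching on word boundaries avoids blanking out part of a longer word, for example the "code" inside "decoder".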

Distractor generation

Creating a list of distractors

  • By using WordNet, finding synonyms of key

  • Finding synonyms on Wiktionary one by one

  • Providing derived words of key

  • Repeating the process for all synonyms

  • Adding all results to a list of dictionaries

  • Finding list items on Google search

  • Picking one item from the list

  • Appending the “AND” operator to the item to form a search query, e.g., “Keyboard AND”

  • Searching each suggested Google query one by one

  • Scanning results of the searched query for relevant keywords

  • Adding the distinct results to the list of dictionaries
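The flow of this list can be sketched with the three lookups passed in as plain functions. Everything named here is hypothetical: `synonyms_of`, `derived_of`, and `related_from_search` stand in for the WordNet, Wiktionary, and Google search steps, and the stub data exists only so the example runs without network access. It is not the authors' implementation.

```python
def collect_candidates(key, synonyms_of, derived_of, related_from_search):
    """Gather distinct distractor candidates as a list of dictionaries."""
    entries, seen = [], set()

    def add(word, source):
        w = word.strip().lower()
        if w and w != key.lower() and w not in seen:
            seen.add(w)
            entries.append({"word": w, "source": source})

    # Synonyms of the key, then derived words for the key and each synonym.
    synonyms = synonyms_of(key)
    for syn in synonyms:
        add(syn, "synonym")
    for term in [key, *synonyms]:
        for derived in derived_of(term):
            add(derived, "derived")

    # Search each collected item with the "AND" operator appended.
    for item in [e["word"] for e in entries]:  # snapshot before adding more
        for word in related_from_search(f"{item} AND"):
            add(word, "search")
    return entries


# Stub lookups standing in for the real sources (illustrative data only).
SYNONYMS = {"keyboard": ["keypad"]}
DERIVED = {"keyboard": ["keyboarding"]}
SEARCH = {"keypad AND": ["mouse", "monitor"], "keyboarding AND": ["mouse", "typing"]}

candidates = collect_candidates(
    "keyboard",
    synonyms_of=lambda w: SYNONYMS.get(w, []),
    derived_of=lambda w: DERIVED.get(w, []),
    related_from_search=lambda q: SEARCH.get(q, []),
)
```

Keeping the lookups as parameters means the sources can be swapped or mocked without changing the collection logic.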

System design and implementation

Front end

Back end

System requirements

Results and discussion

Providing unstructured text

Processing of text

MCQs generated by the system

Evaluation of the system

Conclusion and future work

Advantages and assumptions

Advantages

  • Fast and efficient results

  • Free of bulky computation devices

  • Improving the learning process

  • Easily accessible

Achieving research objectives

  • We can reduce the research gap

  • MCQs are based on informative sentences

  • These reduce the cost and time of finding informative sentences, keys, and appropriate distractors.

Assumption

Future work

Supplemental Information

Additional Information and Declarations

Competing Interests

Muhammad Asif is an Academic Editor for PeerJ.

Author Contributions

Farah Maheen conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Muhammad Asif conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Haseeb Ahmad conceived and designed the experiments, performed the experiments, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Shahbaz Ahmad conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Fahad Alturise analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Othman Asiry analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Yazeed Yasin Ghadi conceived and designed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The code and data are available in the Supplemental Files.

Funding

The authors received no funding for this work.

