Effectiveness of the Structured and Conventional Methods of Viva Examination in Medical Education: A Systematic Review and Meta-analysis
Correspondence Address:
Dr. K Anbarasi,
Professor, Department of Dental Education, Unit Sri Ramachandra Institute of Higher Education and Research, Chennai, Tamil Nadu, India.
E-mail: anbarasi815@gmail.com
Introduction: Oral examination (viva voce) is one of the common assessment methods for medical students. The literature shows that the Conventional Oral Examination (COE) is widely adopted and uses a consolidated scoring system. The Structured Oral Examination (SOE) emerged as an alternative that uses a recommended rating scale (prevalidated questions and markings). Its emergence raised the research question of whether the conventional or the structured oral examination is more effective in assessing medical students.
Aim: To evaluate the effectiveness of traditional and structured viva-voce examination across the specialties in medical education.
Materials and Methods: A systematic review was conducted on 18 peer-reviewed articles about conventional and structured oral examination among medical students. Medical Education Research Study Quality Instrument (MERSQI) was used to assess the quality of evidence.
Results: The level of evidence was moderate, with MERSQI scores ranging from 7.5 to 15.5 for the 18 articles included in the review. SOE goes beyond COE by assessing students’ cognitive skills, communication skills, behaviour, and attitude, whereas COE principally assesses recall of knowledge. Analytical and reasoning power remains the predominant domain in SOE. With psychometric properties such as good reliability, sensitivity, and acceptability, SOE remains the best strategy for the evaluation of medical students. Pooled results in the forest plot showed no difference in viva voce marks between COE and SOE, with a mean difference of 0.46 (p=0.53).
Conclusion: The review revealed no difference in the mean marks scored in COE and SOE. However, an SOE allows examiners to assess medical students’ learning achievement with no partiality, stress, and anxiety compared to COE.
Assessment, Dental education, Structured oral exam, Unstructured oral exam, Viva-voce
The knowledge and skills of medical students have been assessed using written and oral examinations since 1950. An oral examination (viva-voce) is an interview between a candidate and one or more examiners, and it holds an important place in medical examinations (1). It assesses the candidates’ ability to understand and express ideas on particular topics and judges how deeply they understand them (2).
The conventional, traditional, or unstructured oral examination is an interview or discussion between examiner(s) and student in the absence of patients (3). The COE mainly focuses on the professional aspects of medical subjects such as practice-oriented knowledge, mental sharpness, positive verbal communication, and subtle decision making (4),(5). In this method, students receive questions that differ in content and difficulty, along with different levels of prompting or help. It has been claimed that this oral examination format is not uniform, is too subjective, and is more prone to errors (6),(7).
SOE has recently come into use for assessment in medical education, including the basic medical subjects. SOE assesses the knowledge, skills, and attitude of the students using a set of predetermined questions (8). It is well planned in terms of the content and competencies to be assessed within a specific duration and is supported by a checklist. Though SOE is well framed, its implementation raises apprehension among students (difficulty level of questions, problem-solving type of questions, direct feedback) and reluctance among faculty members (SOE demands detailed planning, prevalidated well-structured questions, scoring criteria, resources, and manpower) (9). It is therefore timely to determine whether COE or SOE better supports the examination of medical students.
This systematic review aimed to evaluate the effectiveness of the COE and SOE in all disciplines of medical education and consolidate the results based on students’ test scores.
The present study was a systematic review and meta-analysis. No language restriction was applied, and articles published from 2010 to March 2019 were included. This time frame was selected because the structured viva examination entered its major application in medical education in the previous decade (10). The study was conducted from August 2021 to February 2022. As a review of published literature, this work did not require ethical approval or informed consent.
Search strategy: MEDLINE, Cochrane, and Google Scholar were searched. Search criteria were first built from MeSH terms and then refined using keywords of published articles. The search terms were connected by the Boolean operators ‘AND’, ‘OR’, and ‘NOT’ to find all relevant articles. The search terms used were oral examination, assessment tool, viva, viva-voce, interactive exam, structured, traditional, medical education, medical students, dental students, and reliability.
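The way search terms are combined with Boolean operators can be illustrated with a minimal sketch. The review does not report its exact search string, so the grouping of terms into three concept blocks below (examination, format, population) is an assumption made purely for illustration, and the helper name `or_group` is hypothetical.

```python
# Illustrative only: the grouping of terms into concept blocks is an assumption,
# not the search string actually used in the review.

def or_group(terms):
    """Join synonyms with OR, quoting phrases that contain a space or hyphen."""
    quoted = [f'"{t}"' if (" " in t or "-" in t) else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Terms taken from the list reported in the search strategy.
exam_terms = ["oral examination", "viva", "viva-voce", "interactive exam", "assessment tool"]
format_terms = ["structured", "traditional"]
population_terms = ["medical education", "medical students", "dental students"]

# Synonyms within a concept are ORed; the concept blocks are ANDed together.
query = " AND ".join(or_group(g) for g in (exam_terms, format_terms, population_terms))
print(query)
```

Combining synonyms with OR broadens each concept, while AND between blocks narrows the result to records that touch all three concepts. A NOT clause could be appended in the same way to drop an unwanted concept.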
Inclusion criteria: Articles published in peer-reviewed journals with comparative analysis of SOE and COE in medical and dental education were included in this review.
Exclusion criteria: Only oral examinations in undergraduate medical and dental education were considered; studies involving nurses, physical therapists, pharmacologists, and other healthcare professionals were excluded. The Objective Structured Clinical Examination (OSCE), multiple mini-interview types of assessment, and narrative or literature reviews describing the importance of structured oral examination were also excluded from the review.
Selection process: Articles retrieved from the database search and hand search were screened by title, and duplicates were excluded. Three researchers read the abstracts and full texts of the selected articles separately and then discussed their findings. The review process continued once the researchers agreed. In case of any disagreement, all researchers read the articles again for further discussion and decision.
Data extraction: A data extraction spreadsheet was developed using Microsoft Excel®. This sheet was divided into study identification (author, year), study population and settings (number of participants, subject), study design (intervention, comparison), study method and measurement, study outcomes, and study citation parts. The data extraction sheet was pilot tested with five articles. After making necessary corrections to the sheet, it was applied to all the selected studies. A double review of the abstracts and full-text articles was conducted.
Quality Assessment: The MERSQI scale (11),(12) was used for quality assessment as it assesses the methodological rigor of articles. The MERSQI tool consists of six domains: study design, sampling, type of data, validity of evaluation instrument, data analysis, and outcomes. Scoring is based on 10 items within the six domains, with each domain carrying a maximum score of 3. Thus, the maximum score for an article is 18 (Table/Fig 1).
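The scoring arithmetic described above (six domains, a maximum of 3 points each, 18 in total) can be sketched as follows. The function name and the example scores are hypothetical and do not correspond to any article in this review; item-level scoring rules within each domain are not modelled.

```python
# Minimal sketch of MERSQI totalling. Only the per-domain cap of 3 and the
# overall maximum of 18 come from the text; everything else is illustrative.

MERSQI_DOMAINS = (
    "study design",
    "sampling",
    "type of data",
    "validity of evaluation instrument",
    "data analysis",
    "outcomes",
)
MAX_PER_DOMAIN = 3.0
MAX_TOTAL = MAX_PER_DOMAIN * len(MERSQI_DOMAINS)

def mersqi_total(domain_scores):
    """Sum the six domain scores after checking that each lies within 0 to 3."""
    missing = set(MERSQI_DOMAINS) - set(domain_scores)
    if missing:
        raise ValueError(f"missing domains: {sorted(missing)}")
    for name in MERSQI_DOMAINS:
        score = domain_scores[name]
        if not 0 <= score <= MAX_PER_DOMAIN:
            raise ValueError(f"{name}: score {score} is outside 0 to {MAX_PER_DOMAIN}")
    return sum(domain_scores[name] for name in MERSQI_DOMAINS)

# Hypothetical article, for illustration only.
example_article = {
    "study design": 1.5,
    "sampling": 2.0,
    "type of data": 3.0,
    "validity of evaluation instrument": 1.0,
    "data analysis": 3.0,
    "outcomes": 1.5,
}
print(mersqi_total(example_article), "out of", MAX_TOTAL)
```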
The scale is comprehensive with its list of 10 review items and also has evidence for its validity. This scale adopts Kirkpatrick’s four-level model (13) to approach the effectiveness construct. The first level (reaction) focuses on the participants’ perceptions of the intervention, the second level (learning) evaluates knowledge, skills, and attitudinal change, and the third level (behaviour) measures changes in behaviour. The fourth level (results) focuses on the organisation’s benefits arising from the intervention.
Statistical Analysis
Descriptive statistics such as percentages were used to analyse the data based on MERSQI domain perspectives. The MERSQI score for each article, based on all sections, was calculated. The total number and percentage of articles for each MERSQI domain were also calculated. Two reviewers conducted the meta-analysis using RevMan 5.4 (Cochrane Collaboration, Copenhagen, Denmark). Mean±Standard Deviation (SD) was chosen for expressing the results of the continuous outcome (mean viva voce marks). The I² statistic was used to assess heterogeneity. A random-effects model was selected to pool data if I² > 40%; otherwise, a fixed-effect model was used. The 95% Confidence Interval (CI) was adopted in this review.
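The model-selection rule described above can be sketched in a few lines. This is a generic inverse-variance illustration, assuming a DerSimonian-Laird estimate of between-study variance for the random-effects case; it is not a reproduction of RevMan's internal computation, and the function name and input data are hypothetical.

```python
import math

def pool_mean_differences(studies, i2_threshold=40.0):
    """Pool mean differences by inverse variance.

    Each study is a tuple (mean_a, sd_a, n_a, mean_b, sd_b, n_b).
    A random-effects model (DerSimonian-Laird, an assumption here) is used
    when I² exceeds the threshold; otherwise a fixed-effect model is used.
    """
    diffs, variances = [], []
    for mean_a, sd_a, n_a, mean_b, sd_b, n_b in studies:
        diffs.append(mean_a - mean_b)
        variances.append(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

    # Fixed-effect weights and Cochran's Q.
    weights = [1.0 / v for v in variances]
    fixed = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, diffs))
    df = len(diffs) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        # Between-study variance (tau²) widens every study's variance.
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (v + tau2) for v in variances]
        model = "random"

    pooled = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return {"model": model, "i2": i2, "md": pooled, "ci95": ci}

# Hypothetical data: two studies pointing in opposite directions.
hypothetical = [(12.0, 2.0, 30, 10.0, 2.0, 30), (10.0, 2.0, 30, 12.0, 2.0, 30)]
result = pool_mean_differences(hypothetical)
print(result["model"], round(result["i2"], 1), result["md"], result["ci95"])
```

Because the two hypothetical studies disagree strongly, I² is well above 40% and the sketch switches to the random-effects weights, which produces a wider confidence interval than the fixed-effect weights would.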
The initial search through PubMed, Cochrane Library, Google Scholar, and hand search, using the search terms and MeSH terms, yielded 63 relevant articles. During the first stage of screening, five duplicates were removed, leaving 58 articles. A further 38 articles were then removed by screening titles and abstracts. After the full texts were assessed, two articles were excluded for not fulfilling the inclusion criteria. Eighteen articles (14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31) were finally included in the qualitative synthesis, and eight articles were included in the quantitative synthesis (15),(17),(21),(22),(23),(28),(30),(31) (Table/Fig 2).
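The screening counts above can be checked with simple arithmetic. The variable names are illustrative; the number of full texts assessed (20) is not stated in the text and is derived here from the reported figures.

```python
# Counts reported in the selection flow.
identified = 63
duplicates_removed = 5
excluded_on_title_abstract = 38
excluded_on_full_text = 2

after_deduplication = identified - duplicates_removed                    # reported as 58
full_texts_assessed = after_deduplication - excluded_on_title_abstract  # derived, not stated
included_qualitative = full_texts_assessed - excluded_on_full_text      # reported as 18

print(after_deduplication, full_texts_assessed, included_qualitative)
```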
Qualitative assessment: In most of the studies (94.4%), the SOE under evaluation was administered at only one institution. The included articles (14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31) reported that structured viva-voce had positive effects on the overall experience and student satisfaction compared to traditional viva-voce. However, the level of scientific evidence and effectiveness varied among the studies. The MERSQI scale helped us to identify the quality of evidence, which was moderate for all included articles (Table/Fig 1).
Out of 18 articles, only two (11.1%) used a Randomised Controlled Trial (RCT) design to test the effectiveness of SOE. The study design used in most of the articles was a single group with pretest and post-test (55.6%), followed by a non-randomised two-group design (27.8%). The single-group pretest and post-test design received the highest MERSQI score (15), followed by the non-randomised two-group and RCT designs with scores of 10 and 6, respectively. The single-group cross-sectional design received the lowest MERSQI score (1).
In terms of outcomes, eight articles assessed both the mean viva voce marks and students’ perception (15),(16),(17),(21),(23),(26),(28),(31). Two studies (18),(19) assessed students’ perception alone, and one study assessed both students’ and teachers’ perception (25). Three studies assessed viva voce marks and the perception of both students and teachers (22),(24),(30). Three studies analysed the structured viva voce questionnaire and conducted a perception survey among the participants (19),(20),(27), and one study (14) explored the reliability of structured viva voce and mean marks.
To evaluate the students’ perception regarding SOE, 12 studies (66.7%) used open-ended and closed-ended questionnaires. Closed-ended responses were collected as student feedback on a 2-point (yes or no) or a 5-point (strongly agree to strongly disagree) Likert scale. Most of the studies (66.7%, N=12) assessed the outcome both subjectively and objectively. Also, about 77.8% (N=14) of the test instruments had internal validity tests. The authors of 16 articles (88.9%) used appropriate statistical tests according to MERSQI. Similarly, 66.7% of the studies used inferential statistics besides descriptive statistics. All studies included in this review had an excellent response rate of 75%.
The MERSQI score that a study can obtain ranges from 5 to 18 points. According to Table/Fig 3, the highest score for an article was 18 and the lowest score was 7.5. All the studies (100%) framed the SOE as question set cards or question templates. The questions in the question sets were drawn from must-know (core) and nice-to-know (non-core) areas. They were set with increasing grades of difficulty, from easy to very difficult, and covered recall, analytical, and reasoning power types.
The participants of all studies were undergraduate medical and dental students. The structured viva voce questions were developed from the following specialties: community medicine, physiology, pathology, microbiology, biochemistry, pharmacology, periodontology, molecular biology, integrated basic science, forensic medicine, and anatomy. Most of the studies (83.3%, N=15) (15),(16),(17),(18),(19),(20),(21),(22),(23),(26),(27),(28),(29),(30),(31) compared structured viva voce against traditional viva voce; one study measured the reliability of structured viva voce, and one reported its sensitivity and specificity (14),(18).
The structured viva voce strategy was stated explicitly as “recall, analytical and reasoning power” and “must know, good to know, and nice to know” types in 14 articles (77.8%) (15),(16),(17),(18),(19),(20),(21),(22),(24),(25),(26),(27),(30),(31). The remaining four articles (14),(23),(28),(29) did not mention the strategy. Of the 13 articles that assessed viva voce marks, three (16.7% of all included articles) reported that the marks obtained by the students were higher in traditional viva voce than in structured viva voce. Almost 78% of the participants in all studies felt that SOE can be introduced in formative assessment. Eleven articles (61.1%) (14),(15),(17),(20),(21),(22),(25),(26),(27),(29),(30) mentioned the time allotted for the structured viva examination, which ranged from 5 to 15 minutes, whereas no time frame was mentioned for the traditional viva examination.
Meta-analysis: Eight studies compared the mean viva voce marks. The forest plot was produced from the mean viva voce marks of the conventional and structured oral examinations. The meta-analysis showed no significant difference (p=0.53) in mean viva voce marks between the conventional and structured oral examinations (MD, 0.46; 95% CI, -0.99 to 1.92) (Table/Fig 4). A random-effects model was adopted because of high heterogeneity, with a total sample of 81.
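As a rough consistency check on the pooled estimate, the standard error can be back-calculated from the reported 95% CI and used to approximate the p-value. The sketch assumes a symmetric interval built with a normal approximation (z = 1.96), which the source does not state explicitly; the variable names are illustrative.

```python
import math

# Values reported for the pooled mean difference.
md = 0.46
ci_low, ci_high = -0.99, 1.92

# Assumption: symmetric 95% interval, md ± 1.96 × SE.
se = (ci_high - ci_low) / (2 * 1.96)
z = md / se

# Two-sided p-value under the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"SE={se:.3f}, z={z:.3f}, p={p_value:.3f}")
```

Under these assumptions the recovered p-value lands close to the reported 0.53, with small differences attributable to rounding of the published interval. The interval also spans zero, which agrees with the reported absence of a significant difference.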
This systematic review was planned to find out whether structured or traditional viva voce is more effective in terms of assessment scores, perception, and reliability in the evaluation of medical students. In line with the structured viva voce scheme, 77.8% of the included studies followed the recall, analytical, and reasoning power domains for viva voce. This finding suggests that formative assessment in medical education focuses on these three domains rather than on any additional domains. Viva voce is the most effective approach for evaluating clinical reasoning skills, an essential component of medical practice, and it requires sound psychometric properties in terms of reliability and validity (32),(33).
Based on the results of this review, there was no significant difference in the marks scored by medical and dental students in COE and SOE. However, structured viva voce reduces inappropriate bias through careful selection and training of examiners, the use of more formal structured questions, and the application of this structure when assessing the candidate, which makes it a reliable and valid approach. It has been suggested that rating candidates separately in three fields (recall, analytical, and problem-solving) will improve reliability (34). Providing training sessions for examiners to promote scoring consistency and conducting mock examinations for implementation integrity will make this approach most effective (35).
Of the two articles that assessed the reliability of structured viva voce, one compared the reliability of the system by administering it on the 7th day and the 14th day after a one-month lecture (14). The other compared inter-rater and internal consistency reliability between structured and traditional viva voce (15). These studies indicate that structured viva voce has good reliability among students and examiners. Besides reliability and validity, the acceptability of structured viva voce among students and teachers was also assessed in all studies. Students expressed that structured viva voce was better than traditional viva voce on certain criteria assessed by the closed-ended questionnaire: it had a well-organised system, covered most of the topics in the syllabus, drew questions from all levels, allotted adequate time, and used comprehensive questions. In the open-ended questionnaire, students and teachers in all studies felt that structured viva voce had no partiality and no cross-questions, encouraged deep learning, and was transparent and fair, but required training.
In this review, 16.7% of the articles reported that mean viva voce marks were lower in SOE than in COE. The reason given was that structuring exposes students to all types of questions, from easy to difficult, whereas traditional viva voce may have students answer several easy or several difficult questions (16). Of the 1,311 students across all studies, 78% reported that SOE covered a wide range of topics, was less stressful, was not exhausting, and positively influenced learning patterns. It has been suggested that the structured viva voce examination can be improved by increasing the number of examiners. Although the MERSQI scale indicated a moderate level of evidence, the feasibility and acceptability of a change to structured viva examinations in formative assessment have increased among students and faculty (17).
Limitation(s)
The limitation of this review was related to the MERSQI outcome domains. The scale is good for assessing evidence on effectiveness, but it makes no differentiation between knowledge and skills. Future work in this domain may develop this feature of the MERSQI scale. Also, the MERSQI scale does not consider the statistical power of the included studies, which is necessary to establish the levels of evidence in a well-organised manner. All the included studies were limited by being conducted over a short term and on a single topic in a single specialty. High-quality crossover randomised controlled trials comparing the conventional and structured oral examinations will help to derive a more convincing inference.
This review and meta-analysis showed no difference in the mean viva voce marks scored by students in COE and SOE. Though structured viva voce is generally accepted, future research based on learning domains (cognitive, psychomotor, affective, and communication) is needed to assess its effectiveness in assessing the progress of learning.
DOI: 10.7860/JCDR/2022/57445.16977
Date of Submission: Apr 30, 2022
Date of Peer Review: May 28, 2022
Date of Acceptance: Jul 22, 2022
Date of Publishing: Sep 01, 2022
AUTHOR DECLARATION:
• Financial or Other Competing Interests: None
• Was informed consent obtained from the subjects involved in the study? NA
• For any images presented appropriate consent has been obtained from the subjects. NA
PLAGIARISM CHECKING METHODS:
• Plagiarism X-checker: May 05, 2022
• Manual Googling: Jul 20, 2022
• iThenticate Software: Aug 26, 2022 (12%)