Multiple‐Choice versus Open‐Ended Questions in Advanced Clinical Neuroanatomy: Using a National Neuroanatomy Assessment to Investigate Variability in Performance Using Different Question Types

Methods of assessment in anatomy vary across medical schools in the United Kingdom (UK) and beyond; common methods include written, spotter, and oral assessment. However, there is limited research evaluating these methods with regard to student performance and perception. The National Undergraduate Neuroanatomy Competition (NUNC) is held annually for medical students throughout the UK. Prior to 2017, the competition asked open‐ended questions (OEQ) in the anatomy spotter examination, and in subsequent years also asked single best answer (SBA) questions. The aim of this study is to assess medical students’ performance on, and perception of, SBA and OEQ methods of assessment in a spotter style anatomy examination. Student examination performance was compared between OEQ (2013–2016) and SBA (2017–2020) for overall score and each neuroanatomical subtopic. Additionally, a questionnaire explored students’ perceptions of SBAs. A total of 631 students attended the NUNC in the studied period. The average mark was significantly higher in SBAs compared to OEQs (60.6% vs. 43.1%, P < 0.0001); this was true for all neuroanatomical subtopics except the cerebellum. Students felt that they performed better on SBAs than OEQs, and the diencephalon was felt to be the most difficult neuroanatomical subtopic (n = 38, 34.8%). Students perceived SBA questions to be easier than OEQs and performed significantly better on them in a neuroanatomical spotter examination. Further work is needed to ascertain whether this result is replicable throughout anatomy education.


Importance of Anatomy in Undergraduate Medical Education
Anatomy is fundamental within undergraduate medical education and is strongly associated with clinical competency (Sbayeh et al., 2016). Medical students consider anatomy teaching an essential part of the medical curriculum and deem it vital for future clinical practice (Ali et al., 2015). However, both students themselves and the senior NHS clinicians who train them in the workplace report that their anatomical knowledge is insufficient at the point of graduation (Gogalniceanu et al., 2009; Monrouxe et al., 2017; O'Keeffe et al., 2019). This may be due in part to the globally evolving nature of modern curricula, in which anatomy has featured less heavily over recent decades (Heylings, 2002; Chapman and Hakeem, 2015; Memon, 2018; Trautman et al., 2019). Considering that assessment is one of the main extrinsic motivators that drive learning, it is important that anatomy assessments are fit for purpose and examine both deep knowledge acquisition and higher order thinking, rather than superficial factual recall (Gogalniceanu et al., 2009). The issue of appropriate assessment strategies is a wider consideration for all disciplines within higher education, and is an area that has been extensively researched.

Research into Higher Education Assessment Practices
Many disciplines have historically been dominated by the multiple-choice question (MCQ) examination and the traditional evaluation by written essay style questions (Sambell et al., 1997). In most cases, students prefer the MCQ format and its variants to an essay style or short/long structured written type of examination. They reach this conclusion for a whole host of reasons, including perceived difficulty, anxiety, complexity, success expectancy, and feeling at ease; this is true for both male and female students (Zeidner, 1987). However, these conclusions are not reached by all students. There is evidence to suggest that those learners who excel or who are confident in their abilities prefer writing free text answers, where they have an opportunity to fully demonstrate their talents (Birenbaum and Feldman, 1998).
A long-standing viewpoint within this field suggests that MCQ formats or assessments that focus on providing detailed factual answers motivate students more toward using a surface approach, while free text questions tend to encourage a deeper strategy (Entwistle and Entwistle, 1991). This perspective appears to be corroborated by research implying that when students shift from the former to the latter they transfer from a surface approach toward a deep approach (Thomas and Bain, 1984).
Despite this, over time, the MCQ style of question writing has evolved and the current evidence strongly suggests that a well-constructed single best answer (SBA) question can engage problem solving strategies as much as open-ended questions (OEQ) (Hift, 2014). There is no clear consensus on how student performance compares between the two question types, and this is true across all disciplines within higher education. This has been investigated in a number of subject areas in higher education, often resulting in mixed outcomes (Badger and Thomas, 1991; Bridgeman, 1992; Wolfson et al., 2001; Schuwirth and van der Vleuten, 2004a; Ozuru et al., 2013; Wooton et al., 2014; Brenner et al., 2015; Thompson and O'Loughlin, 2015; Biria and Dehghan, 2016; Adamu et al., 2018; Hauer et al., 2020).

Changes to Modern Day Anatomical Curricula and Assessment Methods
There is currently considerable variation in the way anatomical education is delivered and assessed between medical schools in the United Kingdom (UK) and globally (Turney, 2007). Multiple types of assessment methods are used, mostly falling into the categories of oral, written, or practical forms.
The repertoire and balance of assessment methods appear to have changed over time, but without any firm or robust conclusions to support one particular approach (Rowland et al., 2011; Papa and Vaccarezza, 2013). Ultimately, the purpose of all curricular assessment is to identify a threshold level of knowledge at which it is appropriate to allow students to progress (Rowland et al., 2011; Papa and Vaccarezza, 2013). Practical anatomy has typically been assessed via a spotter assessment, also known as the steeplechase examination in some countries (Smith and McManus, 2015; Brenner et al., 2015; Geoghegan et al., 2019; Tirpude et al., 2019). Spotter examinations typically take place in the Dissecting Room and consist of 30-40 stations. Students move around the stations at a predetermined pace (typically 1-1.5 minutes per station) until all stations are completed. Questions may include various resources including cadaveric specimens, models, imaging, photomicrographs, and diagrams. Anatomy has often been integrated within written papers to assess conceptual and applied aspects of the discipline, and that has certainly been the case at Southampton (Smith and McManus, 2015). Typically, questions require long or short structured answers, and are presented as a clinical case.

Evolution of the Spotter Style Examination Approach
This shift in assessment practice is evidence of a gradual evolution of the spotter style assessment. Historically, candidates have mainly been required to identify structures; however, this has been frequently criticized for only assessing superficial "low level" knowledge based on visual recall. It is now more common to include function-based questions or those that link to clinical understanding. This is considered pedagogically sound due to its capability of testing deeper understanding (Smith and McManus, 2015; Brenner et al., 2015; Choudhury et al., 2016). Furthermore, the standalone spotter style assessment may involve either open-ended questions (OEQ) or a form of multiple-choice question (MCQ) known as single best answers (SBAs) (Sam et al., 2016). However, sometimes the decision on which one to adopt is based on logistical convenience and staffing rather than academic principles (Collins, 2006). Additionally, OEQ marking may involve unwanted variation in scores due to subjective interpretations by teaching staff, and so requires standardized processes (Wass et al., 2001; Mehta et al., 2016). As the emphasis of modern medical curricula has shifted, including notable changes to how practical anatomy is taught and learned, it has become increasingly important to establish evidence-based assessment practices in anatomy, which can ensure safe clinical practice for all future medical graduates (Sugand et al., 2010).
One modern variation of the spotter examination is the Objective Structured Practical Examination (OSPE). Each station is designed to test a skill, such as relating clinical knowledge to anatomical or histological structures (Yaqinuddin et al., 2013; Choudhury and Freemont, 2017; Tirpude et al., 2019). The integrated anatomy practical paper (IAPP) is another example. The title is intended to acknowledge the inclusion of function and clinical application questions. This format is considered to be an effective and reliable method of assessing anatomical knowledge.
Although various teaching and assessment methods are being utilized in anatomy, it can often be ethically challenging to undertake assessment-related research within a live medical curriculum, due to the possibility of disadvantaging students through the experimental manipulation of existing procedures. This could easily lead to high-stakes consequences, such as incorrect decisions concerning student progression, or students being held back unfairly.

The Potential Role of Extracurricular Anatomy Assessments
Extracurricular assessments are rare in anatomy and there are seldom opportunities for students to earn academic awards in niche subject areas (Hall et al., 2020).
Clinical neuroanatomy is one particular specialist field that has been a hot spot for innovative educational interventions because many anatomists struggle to make it understandable and digestible (Chang and Molnár, 2015). Despite this, there are clearly a number of enthusiasts among the medical student population. The National Undergraduate Neuroanatomy Competition (NUNC) is an annual competition held at the University of Southampton run in partnership between medical students, qualified doctors, and academic staff with support from The Anatomical Society (Geoghegan et al., 2019). The aim was to provide an educational and intellectually stimulating event, in which students can enhance their neuroanatomy knowledge, strengthen their portfolios, and help to vertically integrate the subject at all stages of the curriculum. The impact upon Southampton's Centre for Learning Anatomical Sciences has been to increase transparency of the curriculum through a partnership model and to expand the quality of cadaveric resources for undergraduate teaching purposes.
In the current context, the NUNC creates an ideal opportunity to compare performance on SBA and OEQ types using a large cross section of self-selecting medical students, representing different institutions and all stages of training. Therefore, the current study utilized this event to assess the impact of different question types on neuroanatomy performance in a very traditional spotter test.
We also took the opportunity to ask students about their perceptions of assessment practice at their host institutions. This is important because the learner's assessment experience will determine the way in which the student considers future learning strategies (Ramsden, 1997) and, equally, the way in which students think about learning and studying influences the way in which they tackle assessments.
The aims of this study are to investigate: (1) Medical students' performance in the SBA and OEQ components of the NUNC spotter assessment and (2) The perceptions of medical students toward the use of SBA and OEQ as a method of assessment of anatomical knowledge in the NUNC.

Examination Structure
The NUNC incorporates two examination papers: a written SBA paper and a practical spotter style examination using brain prosections.
The spotter style examination comprised between 40 and 70 stations, a number that has increased over the years in line with the evolving nature, and growing attendance, of the NUNC. Each station contained a prosection, or an image, of the brain, spine, or spinal cord with two questions per station. At each station, students had 1 minute to answer both parts of the question. All questions were answered under examination conditions and adhered to the standard examination rubric, in line with University of Southampton assessment regulations. For full details see Hall et al. (2016). Delegates were allocated to either a preclinical or clinical category, depending on whether they had started the formal clinical years of medical school (usually third year), and were only compared to other delegates in the same category.
In its original format, the NUNC 2013-2016 spotter examinations comprised only OEQs. In 2017-2020, the spotter format was changed so that one of the two questions at each station was an SBA question using three distractors, while the second question at each station remained in the original OEQ format (Table 1). Example questions are demonstrated in Online Appendix 1.
Performance data from SBA questions in the 2017, 2018, and 2020 papers (SBA group) were compared to the performance from OEQs in the 2013-2016 papers (OEQ group). Performance data for SBA questions and OEQs within the 2017, 2018, and 2020 papers were further compared to each other. The 2019 data were not available for comparison and were therefore excluded from the study. At the end of each competition, the scores from the written SBA paper and the spotter examination were collated to provide an overall mark for each student.

Quality Assurance and Standard Setting Procedures
The questions for the NUNC were provided by clinicians, members of university faculty, members of the Anatomical Society, and senior medical students on the NUNC committee. All the questions were peer-reviewed by the NUNC committee to ensure quality and consistency. The group of question writers consisted of approximately 10 core members, and the questions were reviewed by up to another 20 individuals both within and external to the university. Questions were written based on a number of predefined neuroanatomical subtopics. The number of questions covering each subtopic was discussed and weighted proportionately to the depth and breadth of the subtopic, and also to its perceived importance in the curriculum. A modified Angoff approach to standard setting was adopted, although without the need for determining a passing score (Yim, 2018).
The questions were categorized by neuroanatomical subtopic (cerebrum, diencephalon, vascular, brainstem, spinal cord, cerebellum, and other). The anatomical subtopic "other" typically refers to questions regarding the ventricular system and skull as these questions did not fit into any other category.

Questionnaire
At the end of the 2013-2017 NUNC, students were invited to answer a questionnaire (see Online Appendix 2) which aimed to gauge students' career intentions and their perceptions of learning neuroanatomy. The items for the questionnaire were chosen with the aim of better understanding students' experiences of neuroanatomy at their home institution, and also their perceptions of different assessment styles. Additional information such as student demographics including gender, host institution, and year of study was collected during the online registration process. In 2017, we asked about students' perceptions of different assessment methods.

Ethical Approval
This research received ethical approval from the University of Southampton Faculty of Medicine Ethics Committee (Approval ID: 9351).

Statistical Analysis
Performance and feedback responses were compared using descriptive statistics, the Mann-Whitney U test, the Wilcoxon signed-rank test, or Cohen's d as appropriate. Statistical significance was set at P < 0.05. Statistical analysis was performed in Prism software, version 7.02 (GraphPad Software Inc., La Jolla, CA).
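As an illustration of two of the measures named above, the following is a minimal sketch in standard-library Python of the Mann-Whitney U statistic and Cohen's d for two independent groups. It is not the study's analysis, which was run in Prism; the function names and the scores are invented for illustration, Cohen's d is assumed to use the pooled sample standard deviation, and no P value is computed.

```python
from math import sqrt
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented percentage scores, for illustration only (not NUNC data).
sba_scores = [62.0, 58.0, 71.0, 55.0, 66.0]
oeq_scores = [41.0, 47.0, 39.0, 52.0, 44.0]

u_sba, u_oeq = mann_whitney_u(sba_scores, oeq_scores)
effect_size = cohens_d(sba_scores, oeq_scores)
```

In this invented example every SBA score exceeds every OEQ score, so U for the SBA group takes its maximum possible value (the product of the two group sizes) and U for the OEQ group is zero.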

Demographics of Studied Population
Across the seven studied NUNC events, 631 delegates attended; 579 (91.8%) stated their gender and of these 349 (60.3%) were male and 230 (39.7%) female. There were significantly more males in the OEQ group than the SBA group (66.4% vs. 55.6%, P = 0.01).
Of the 631 delegates, 336 were in the clinical category and 295 were in the preclinical category. There was no significant difference in the distribution by category between OEQ (57.2% clinical) and SBA (50.1% clinical) groups (P = 0.09).

The Spotter Examination
In 2013-2016, 278 delegates completed the spotter examination in the OEQ format, and in 2017-2020 (excluding 2019), 353 delegates completed the new spotter examination format, with half of the questions as OEQs and the remaining half as SBAs.
The subtopics most heavily represented were the cerebrum, vascular system, and brainstem/cranial nerves, whereas spinal cord and cerebellum questions were less common (Table 2). There was no difference in the distribution of questions by subtopic between the two groups (chi-square 2.702, P = 0.85).
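The reported chi-square result can be checked with a short calculation. The sketch below assumes the test was a test of independence across the seven subtopic categories listed in the methods and the two groups, giving 6 degrees of freedom; the text does not state the degrees of freedom, so this is an assumption. The function name is hypothetical, and the closed form it uses applies only to even degrees of freedom.

```python
from math import exp, factorial


def chi_square_sf_even_df(statistic, df):
    """Upper-tail probability of a chi-square variable when df is even.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = statistic / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


# Assumption: 7 subtopic categories x 2 groups gives (7 - 1) * (2 - 1) = 6 df.
n_subtopics, n_groups = 7, 2
df = (n_subtopics - 1) * (n_groups - 1)

# 2.702 is the chi-square statistic reported in the text.
p_value = chi_square_sf_even_df(2.702, df)
```

Under that assumption the upper-tail probability comes out at roughly 0.85, in line with the reported P value.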

Examination Performance
Overall, students performed significantly better on the SBA questions than on OEQs (60.6% vs. 43.1%, P < 0.0001) (Table 3). Students performed better on SBA questions in all subtopics with the exception of the cerebellum, where they performed significantly worse. Students between 2017 and 2020, who undertook both OEQ and SBA questions within the examination, also performed significantly better on the SBA component of the examination (P < 0.0001, Wilcoxon matched pairs signed rank test) (Table 4). Students performed significantly worse on cerebellum OEQ compared to SBA questions (P < 0.0001, Wilcoxon matched pairs signed rank test).
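For readers unfamiliar with the matched pairs test used for the 2017-2020 comparison, the following is a minimal sketch of how the Wilcoxon signed-rank sums are formed from paired scores. The function name and the paired scores are invented, zero differences are dropped (one common convention, assumed here), and no P value is computed.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    by_size = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(by_size):
        end = pos
        # Tied absolute differences share the mean of their rank positions.
        while (end + 1 < len(by_size)
               and abs(diffs[by_size[end + 1]]) == abs(diffs[by_size[pos]])):
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[by_size[k]] = shared
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Invented paired percentage scores for six hypothetical delegates.
sba_component = [64, 58, 70, 49, 61, 55]
oeq_component = [52, 60, 55, 49, 50, 47]

w_plus, w_minus = wilcoxon_signed_rank(sba_component, oeq_component)
```

A large imbalance between the two rank sums, as in this invented example, is what indicates a systematic difference between the paired components.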
From the 2017 questionnaire, students agreed that they perform better on SBA questions than OEQs with an average rating of 7.47 ± 2.40 (1 = disagree, 10 = agree). They also agreed that a timed spotter style assessment is the most effective way of assessing anatomical knowledge (6.54 ± 2.04), although they also considered it to be more stressful due to the time restraints per station (6.23 ± 2.60). From the questionnaire, the subtopic most commonly rated to be the most difficult was the diencephalon (n = 38, 34.8%).

Table caption: Number of spotter stations and total marks available in each year, which changed in line with the evolving nature of the National Undergraduate Neuroanatomy Competition. Over the studied period, the total number of questions was 665, equating to 1,330 marks. In 2020, three questions were removed from the examination following post hoc review.

DISCUSSION
As anatomy assessments evolve to meet the needs of larger student numbers and align with revalidated curricula, it becomes increasingly important for educators to be reliably informed when selecting appropriate assessment approaches (Wass et al., 2001; Chakravarty et al., 2005; Gregory et al., 2009; Smith and Mathias, 2011; Samarasekera et al., 2015). Within modern medical curricula, relying upon the identification of structures alone is no longer considered the most effective way in which to evaluate students' anatomy knowledge (Wass et al., 2001; Samarasekera et al., 2015). However, most anatomists would probably agree that there are certain structures which will always remain on the "need to know" list (Smith and McManus, 2015). It is clear from the core undergraduate syllabi that the amount of knowledge a student should know on graduation has not significantly changed in the UK. Therefore, robust assessments are required throughout medical school so that students have sufficient anatomical knowledge by the time they reach clinical practice.

Assessment Context
In an age where many anatomists are expressing their concern for the level of knowledge held by graduating doctors, it is important that we understand how changing question structure in this way can impact upon the standards we set for our students. Although anatomy is commonly integrated into many assessment types, most medical schools still host a practical based anatomy examination of some kind held in a dissection room environment (Brenner et al., 2015). Due to a number of logistical pressures and constraints within anatomy departments, it is becoming common to have components of anatomy assessments that are delivered as SBAs rather than OEQs. This format eases the marking burden but may compromise the academic rigor of assessment strategies. From a cognitive perspective, there is evidence to suggest that answering an SBA question relies more on automatic passive processing, while an OEQ utilizes controlled active processing, which is associated with stronger comprehension (Badger and Thomas, 1991; Husain et al., 2012; Ozuru et al., 2013). However, it has also been argued that a carefully crafted, context rich, multiple-choice item also uses cognitive processes that closely parallel OEQ design (Palmer and Devitt, 2007; Hift, 2014). Although both question types are commonly used in anatomical science assessments, it has been suggested that short answer questions (SAQs) should be considered more suited to those disciplines that require spontaneous generation of the answer as an essential part of the stimulus (Schuwirth and van der Vleuten, 2004a). This aligns particularly well with bedside or placement style teaching in health science education, where students are expected to be fluent with medical and clinical terminology when engaging professionally with patients.
The UK national context has witnessed a recent General Medical Council drive to reduce the number of summative assessments in medical curricula and replace them with many more formative opportunities. The academic rigor will of course be influenced by the purpose of the assessment. Formative assessments utilize knowledge testing as a means to promote learning and benchmark progress, while summative papers are there to determine that only students with sufficient levels of knowledge progress within their program of study. The latter requires standard setting procedures which assist in the creation of a pass mark, which is frequently determined by the predicted performance of borderline students within the cohort (Brenner et al., 2015). As a self-selecting national competition there is no expectation of this with the NUNC, other than to follow best practice guidelines with question writing and to ensure anatomical and clinical accuracy. Despite this, standard setting procedures for each practical assessment were undertaken to monitor question level inflation and so that annual results could be meaningfully compared. The number of questions which are considered as part of a core curriculum and those that go beyond it have been quantified annually to maintain consistency (Hall et al., 2017).

Impact of Question Type on Performance
This issue has been explored across a number of disciplines in higher education (Badger and Thomas, 1991; Bridgeman, 1992; Wolfson et al., 2001; Ozuru et al., 2013; Wooton et al., 2014; Biria and Dehghan, 2016) and within medical or dental topics (Schuwirth and van der Vleuten, 2004a; Brenner et al., 2015; Thompson and O'Loughlin, 2015; Adamu et al., 2018; Hauer et al., 2020), providing mixed results when student performances were statistically compared or correlated with each other. The NUNC's practical element relies entirely on asking participants to identify brain structures, which are noncontextual and mostly not reflective of the full range of questions used in formal anatomy assessments. Despite this, it can be adapted as a tool to compare students' performance on SBAs and OEQs in a specific subject area without compromising the curriculum. The results of the current investigation demonstrate that students attending the NUNC over the eight-year period studied performed significantly better on the SBA questions compared to OEQs. Within anatomical science, our results compare favorably to the observations of Melovitz Vasan et al. (2018), but are not consistent with those of Adamu et al. (2018). Within the field of assessment research more broadly, the picture is somewhat contradictory, with a number of nonsignificant reports (Bridgeman, 1992; Wolfson et al., 2001). Given this variable history within the published literature, it is worth exploring some of the factors influencing outcomes in some of these settings. The work by Adamu et al. (2018) strongly suggests that performance in each test type is dependent on medical students' year of study, with more senior students performing better on SBA questions.
This theoretical perspective corresponds well with the views of Palmer and Devitt (2007), Schuwirth and van der Vleuten (2004b), and Hift (2014), who indicate that OEQs are not superior to SBAs, especially by the point of exit level summative assessments in medicine. The NUNC is separated into clinical and preclinical categories and so further analysis might well shed some light on this issue. Interestingly, the NUNC is intended for those possessing "high level" neuroanatomical knowledge, and there is some evidence to suggest that high levels of subject-specific knowledge are positively correlated with stronger SBA performance (Ozuru et al., 2013). The current findings appear to support this hypothesis and highlight the strong possibility of statistical interactions beyond the main effect of the outcome.

Perceptions versus Performance
The existing results show that medical students perform differently when an alternative test method is used for the same topic, and to this end, if used as part of a summative curricular process, may lead to a misleading impression of a student's knowledge in that area.
Unsurprisingly, the majority of medical students predicted that they would perform better on SBA questions than OEQs in our bespoke survey and this fits well with the existing evidence (Zeidner, 1987). This finding also resonates well with work exploring the self-assessment ability of medical students in practical anatomy assessments. Published studies have demonstrated a negative correlation between students' predicted grades and actual grades (Sawdon and Finn, 2014; Hall et al., 2016). This questions the ability of students to accurately self-assess their own performance in pressured environments, and in a vocational subject such as medicine, being aware of knowledge limitations and competency is important for patient safety.
Student attitudes are interesting because they may influence psychological sensitivity toward each of these approaches when used as part of a summative curricular assessment. The current body of evidence indicates that students alter their learning and revision strategy depending on the type of examination they anticipate undertaking (Simkin and Kuechler, 2005; Stanger-Hall, 2012). These studies further suggest that the variation of response formats will have an impact on anticipatory learning. Understanding why this occurs might be intuitive to some degree; when a student does not know the answer to an OEQ, it becomes challenging for them to self-generate one because it is reliant on active rather than passive cognitive processing (Ozuru et al., 2013). However, with the same level of uncertainty a student may be able to select the most correct option (utilizing calculated reasoning) from the SBA, which is known as the "cueing effect" (Stiggins and Chappuis, 2005). Over time, students can learn and develop the ability to differentiate between genuine relevant answers and "distractors" due to their common characteristics, something often referred to as "test wiseness" (Cahill and Leonard, 1999). In educational terms, this is considered to be a strategic learning approach and contributes to a variance in score that is attributable to the test method and not ability (Bridgeman, 1992; Sam et al., 2016). Despite this, it has also been reported that higher achieving students who engage with each question on a cognitive level are sometimes known to overthink questions (Hubbard et al., 2017) and this may be why previous evidence suggests that they prefer OEQs (Birenbaum and Feldman, 1998). Given the self-selecting nature of the NUNC, this is certainly something that could have affected this study. One approach to mitigating issues with reliability is to increase the overall number of SBA questions (Sam et al., 2018).
Although it is well known that poorly constructed SBA questions can lead to alphabetical pattern recognition or correct answers being identified through other characteristics such as being longer or more detailed than their distractors (Evans, 1984), writing good quality SBA questions is now considered somewhat esoteric. The skills and expertise to undertake the task effectively are acquired through training and experience (Evans, 1984; Case and Swanson, 2002). It is not uncommon for administrative teams to provide detailed post assessment analysis of student performance, using common performance metrics such as "facility" and the most commonly selected wrong answer, to inform educators on how to improve upon their bank of SBA questions.
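The two post assessment metrics mentioned above can be sketched in a few lines. This example assumes the usual definition of facility as the proportion of candidates who chose the correct option; the function name, the answer key, and the responses are all invented for illustration and do not come from the NUNC.

```python
from collections import Counter


def item_analysis(responses, correct_option):
    """Return (facility, most_common_wrong_option) for one SBA item.

    facility is the proportion of candidates choosing the correct option;
    most_common_wrong_option is None when nobody answered incorrectly.
    """
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    facility = counts[correct_option] / len(responses)
    wrong = [
        (option, n) for option, n in counts.most_common()
        if option != correct_option
    ]
    return facility, (wrong[0][0] if wrong else None)


# Invented responses to one four-option item whose key is "B".
responses = ["B", "B", "C", "B", "A", "C", "B", "C", "B", "D"]
facility, top_distractor = item_analysis(responses, "B")
```

Here half of the hypothetical candidates chose the key and option "C" attracted the most incorrect responses, which is the kind of pattern a question writer might review when revising distractors.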

Neuroanatomy Subtopics
The subtopic and question difficulty distribution of the OEQs in 2013-2016 and that of the SBAs in 2017-2020 were not significantly different. From a discipline-specific perspective, the subtopic frequently rated as the most difficult was the diencephalon, yet it generated one of the highest mean performance scores in the NUNC. Conversely, the vascular subtopic was rated as the easiest but produced one of the lower mean performance scores. There is an existing argument which implies that when a subject area is considered to be challenging it stimulates an increase in time devoted to it, which subsequently impacts upon the depth of knowledge (Thompson and O'Loughlin, 2015; Melovitz Vasan et al., 2018). In this particular study, the SBA results were compared to those from the OEQs (between two modules), and it was concluded that the improved performance in the OEQs had resulted from a change in learning approach. This would suggest that altering the assessment type can influence student approaches to learning, which has been historically supported by the literature (Biria and Dehghan, 2016).

Student Views on Modern Anatomy Assessments
The general consensus from participants in our study (and therefore from those participating in the NUNC) was that the timed spotter style assessment was the most effective way of assessing anatomical knowledge. Understandably, students voiced that this type of assessment method is more stressful than others in the curriculum, because in most cases each station is individually timed, and therefore they are unable to revisit questions in the same way that they could for either a written MCQ or a long structured essay style examination. There have been some examples of institutions transitioning the traditional anatomy "steeplechase" style assessment to online alternatives. In terms of student perceptions, removing the need to physically move around the laboratory and being seated instead reduced anxiety levels (Inuwa et al., 2011). The current evidence supports the view that there is no detrimental effect on outcomes with a shift in assessment modality of this nature, so educators could safely consider this option without fear of jeopardizing academic rigor (Meyer et al., 2016).
Despite the encouraging work on the efficacy of online assessment approaches, the NUNC is one of only a handful of UK-based opportunities where students can demonstrate their motivation and knowledge (above and beyond a standard curriculum syllabus) in a specific field outside of a formal curriculum context. It is also an opportunity to network and physically interact with other enthusiasts and professionals at varying levels of expertise, which is what ultimately makes the NUNC such a valuable learning experience. The combination of physical attendance and individual feedback (which is emailed to all students after each competition) is considered invaluable even in the absence of winning prizes.

Limitations of the Study
It is important to acknowledge that there is clearly a selection bias in this study, since students with a special interest in neuroscience would compete and devote more extracurricular time to winning this competition. There were significantly more males in the OEQ group than in the SBA group (66.4% vs. 55.6%, P = 0.01). While this may reflect an increasing female attendance over the years, which brings the current NUNC more in line with the female dominated medical student population, the early data may be less generalizable to the rest of the medical student population. It may also be less generalizable to other anatomy subject areas, since the specialist pathway of neurosurgery tends to be male dominated in the UK. It is also possible that the apparent effect of question type was influenced by differences in mean performance between the subtopics. For example, mean performance in all OEQ cerebrum questions was compared to that in all SBA cerebrum questions. Like-for-like questions were not compared in this study. Therefore, the possibility that question difficulty accounts for some of the differences reported cannot be ruled out when interpreting these results. We recognize that the assessments are designed to identify a winner, which does not reflect the purpose of assessments in the medical curriculum, and this may affect the generalizability of the data. However, given the limitations restricting changes to an assessment within the curriculum, we feel the present study uses a reasonable and comparable alternative to assess approaches to assessment within a similar population of students.

CONCLUSIONS
This study provides further evidence that a change in assessment modality can result in variable outcomes when used as part of a bespoke extracurricular assessment in clinically oriented neuroanatomy (using identify-only questions). However, it also endorses the view that the performance of students in assessments of this nature can be extremely sensitive to other factors such as familiarity with the test scenario, stage of training, and level of topic-specific knowledge.
It adds some further credence to the view that the use of OEQs enables students to demonstrate deeper understanding and, through the process of generating cognitive errors, can positively impact upon learning, particularly at the earlier stages of training. Therefore, it is suggested that, where possible, medical schools include this question type within assessment formats at some stage during their programs. However, it is also acknowledged that well-written OEQs have the ability to test higher cognitive functions but may be more suited to those with existing levels of integrated knowledge, such as senior students.
Given that clinical practice requires the spontaneous recall of anatomical knowledge in professional conversations, it might be more suitable to use OEQs as the preferred assessment type over SBAs (at least when requesting students to identify essential anatomical structures), despite the fact that context-free questions such as these can only ever determine superficial knowledge, which is not representative of the deeper levels of understanding required for clinical competency.
and exploring how and why neurophobia manifests itself within undergraduate medical students. He is a cofounder of the NUNC and is a principal investigator of all NUNC-related projects.
AHMAD ELMANSOURI, B.M.B.S., is an anatomy demonstrator at the Department of Medical Education at Brighton and Sussex Medical School in Brighton, United Kingdom. His research interest is in medical education, specifically in the impact of near-peer teaching and technology-enhanced learning. He is the founder of The Wessex Finals Revision Weekend and the Coordinator and presenter for the Healthcare Leadership Academy Podcast. He is a senior member of the NUNC committee and has led on data collection for all strands of research related to NUNC events.
ROB PARKER is a fifth-year medical student at the University of Southampton School of Medicine, Southampton, United Kingdom. His research interest is in medical education and immunopsychiatry. He led on anatomical dissections and devised questions for the spotter element of NUNC over a period of four years.
ALISTAIR D. ROBSON, M.Med.Sci., is a fifth-year medical student at the University of Southampton School of Medicine, Southampton, United Kingdom. He has a keen interest in technology-enhanced learning resources in neuroanatomy, a theme he developed for the completion of his integrated Masters project. He coordinated the 2020 National Undergraduate Neuroanatomy Competition, and has been involved in data collection and in contributing to all NUNC-related research themes since 2017.
OCTAVIA KURN, B.Med.Sci., is a fourth-year medical student at the University of Southampton School of Medicine, Southampton, United Kingdom. She has led the peer assisted teaching program in anatomy and is a core member of the NUNC committee. She has been involved in data collection and in contributing to all National Undergraduate Neuroanatomy Competition related research since 2017.
RACHEL PARROTT is an assistant prosector in the Department of Anatomy at St Andrews University, St Andrews, Scotland. She provides dissection training to technical staff and demonstrators as amended by the Human Tissue Authority Act 2006 (Scotland). She has a research interest in education, provides technical support to academic and technical staff and to first-, second-, and third-year medical students in the dissecting room, and performs various technical duties. She is a senior committee member of the National Undergraduate Neuroanatomy Competition, overseeing the organization and running of the laboratory-based activities of the competition.
KATE GEOGHEGAN, B.M.B.S., is a junior doctor working in the Department of Cardiology at Royal United Hospital in Bath, United Kingdom. Her research interest is in medical education, neurology, and global health. She is a core member of the University of Southampton's peer-led neuroanatomy teaching team and coordinated the 2019 National Undergraduate Neuroanatomy Competition.
CHARLOTTE H. HARRISON, B.M.B.S., M.Med.Sci., is a junior doctor working in the Department of Emergency Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom. She is an alumna of the University of Southampton and has been involved in the organization of neuroanatomy near-peer teaching at the University of Southampton School of Medicine. Her research interests range from medical education to neuropathology. She is the lead for assessment at each NUNC, coordinating and overseeing the entire marking team and process, while ensuring quality control and correct data entry procedures.
DEEPIKA ANBU, M.Med.Sci., is a fourth-year medical student at the University of Southampton School of Medicine, Southampton, United Kingdom. She has a keen interest in medical education and conducted her master's study on near-peer teaching and technology-enhanced learning resources in neuroanatomy. She is a core member of the peer-led neuroanatomy teaching team and coordinated the 2020 National Undergraduate Neuroanatomy Competition. She has been involved in data collection and in contributing to all NUNC-related research.
OLIVER DEAN, B.Med.Sci., is a fourth-year medical student at the University of Southampton School of Medicine, Southampton, United Kingdom. Alongside undertaking his academic studies, he teaches anatomy to second-year medical students and is on the committee of the National Undergraduate Neuroanatomy Competition. His research interest is in medical education. He was responsible for the MCQ element of the NUNC, leading a team of question writers, and compiling the MCQ paper.
SCOTT BORDER, B.Sc. (Hons), Ph.D., S.F.A.H.E., F.A.S., N.T.F., is a principal teaching fellow in anatomy in the Centre for Learning Anatomical Sciences at the University of Southampton, Southampton, United Kingdom and Chair of the Education Committee for the Anatomical Society in the United Kingdom. His pedagogical focus is on the scholarship of neuroanatomy and head and neck anatomy, working with students as partners in anatomical education, and his research interest is focused on how learning technologies can enhance the student experience and improve knowledge gain. He is a cofounder of NUNC and senior author of all NUNC-related research work, including managing all contributions to the current project.