The Effect of Stereoscopic Augmented Reality Visualization on Learning Anatomy and the Modifying Effect of Visual‐Spatial Abilities: A Double‐Center Randomized Controlled Trial

Monoscopically projected three‐dimensional (3D) visualization technology may have significant disadvantages for students with lower visual‐spatial abilities despite its overall effectiveness in teaching anatomy. Previous research suggests that stereopsis may facilitate a better comprehension of anatomical knowledge. This study evaluated the educational effectiveness of stereoscopic augmented reality (AR) visualization and the modifying effect of visual‐spatial abilities on learning. In a double‐center randomized controlled trial, first‐ and second‐year (bio)medical undergraduates studied lower limb anatomy with a stereoscopic 3D AR model (n = 20), a monoscopic 3D desktop model (n = 20), or a two‐dimensional (2D) anatomical atlas (n = 18). Visual‐spatial abilities were tested with the Mental Rotation Test (MRT), the Paper Folding Test (PFT), and the Mechanical Reasoning (MR) Test. Anatomical knowledge was assessed with a validated 30‐item paper posttest. The overall posttest scores in the stereoscopic 3D AR group (47.8%) were similar to those in the monoscopic 3D desktop group (38.5%; P = 0.240) and the 2D anatomical atlas group (50.9%; P = 1.00). When stratified by visual‐spatial abilities test scores, students with lower MRT scores achieved higher posttest scores in the stereoscopic 3D AR group (49.2%) than in the monoscopic 3D desktop group (33.4%; P = 0.015), and scores similar to those in the 2D group (46.4%; P = 0.99). Participants with higher MRT scores performed equally well in all conditions. It is instrumental to consider an aptitude–treatment interaction caused by visual‐spatial abilities when designing research into 3D learning. Further research is needed to identify contributing features and the most effective way of introducing this technology into current educational programs.

Anat Sci Educ 13: 558-567. © 2019 The Authors. Anatomical Sciences

INTRODUCTION
Anatomical knowledge among undergraduate medical students and recently graduated doctors has repeatedly been reported to be insufficient (McKeown et al., 2003; Prince et al., 2005; Spielmann and Oliver, 2005; Waterston and Stewart, 2005; Bergman et al., 2008). One of the main reasons is the decrease in anatomy teaching time in undergraduate education. Increasing costs, limited availability of cadavers, and time pressure on the curriculum have led to a decreased exposure to traditional cadaveric dissections (Pryde and Black, 2005; Waterston and Stewart, 2005; Azer and Eizenberg, 2007; Drake et al., 2009; Bergman et al., 2013). Although their educational value is being debated (Azer and Eizenberg, 2007), cadaveric dissections provide a complete visual and tactile learning experience of anatomy, which is three-dimensional (3D) by nature. Features such as stereopsis (visual sense of depth), dynamic exploration (the possibility to view the object of study from different angles), and haptic feedback (sense of touch) are crucial for the engagement in 3D anatomy (Klatzky and Lederman, 2011; Reid et al., 2018).
In search of additional educational resources, computer-assisted resources have been widely explored in anatomical education. A considerable number of studies have evaluated the effectiveness of digital 3D anatomical models that can be explored on the two-dimensional (2D) screen of a regular computer, smartphone, or tablet. In an extended meta-analysis of these studies, Yammine and Violato (2015) concluded that three-dimensional visualization technology (3DVT) is effective in improving factual (effect size of 0.50) and spatial (effect size of 0.30) anatomical knowledge. However, despite the overall positive effect on learning, 3DVT appears to have significant disadvantages for students with lower visual-spatial abilities (Garg et al., 1999a,b; 2002; Levinson et al., 2007; Naaz, 2012). These disadvantages are well known in the research field of 3D learning and were first described by Garg and colleagues (1999a,b; 2002). In these studies, visual-spatial abilities significantly affected the learning of spatial anatomy, with a great disadvantage for low-performing students. Viewing an unfamiliar 3D object from multiple angles is thought to be challenging for these students because, according to the available evidence, 3D objects are remembered as key view-based 2D images (Garg et al., 2002; Huk, 2006; Levinson et al., 2007; Khot et al., 2013).
However, when traditional digital 3D models are viewed stereoscopically by projecting a slightly shifted image to the left and right eye, the disadvantages for students with lower visual-spatial abilities seem to disappear. Cui and colleagues (2017) evaluated the effectiveness of a stereoscopic 3D view of the head and neck vascular anatomy in comparison to 2D representations of the same anatomical model. They reported a better performance of undergraduate medical students after learning anatomy with a stereoscopic 3D model. Most importantly, students with lower visual-spatial abilities improved their knowledge test scores to a level comparable to that of students with higher visual-spatial abilities. The role of stereopsis has also been evaluated by Luursema and colleagues (2006, 2008, 2017) within various 3D environments, such as virtual reality and stereoscopic projection on a computer with the use of 3D shutter glasses. Although the stereoscopic view of an anatomical model had a positive effect on only one of the two post-tasks, the interaction between visual-spatial abilities and the stereoscopic condition remained significant (Luursema et al., 2008). Overall, stereoscopic 3DVT appears to have a positive effect on learning, as recently demonstrated by Hackett and Proctor (2018). Their intervention concerned an autostereoscopic holographic visualization of a cardiac 3D model, which was compared to a monoscopic desktop view and 2D printed images of the model. Students in the intervention group scored significantly higher on the anatomical knowledge test and reported a significantly lower cognitive load in comparison to both control groups. However, a possible interaction between intervention and visual-spatial abilities was not evaluated.
The positive role of stereopsis has also been shown by Wainman and colleagues (2018), who compared a physical model of the pelvis to a monoscopic 3D model. The authors concluded that stereopsis, and not haptic feedback, primarily contributed to the improved knowledge scores when learning with a physical model.
With regard to these findings, two aspects come into play. First, the beneficial effects of stereopsis support the evidence that 3D mental representations depend on the nature of the input by activating different regions of the brain, and might contain spatial information instead of key view-based 2D images alone (Jolicoeur and Milliken, 1989; Kourtzi et al., 2003; Luursema et al., 2006; Verhoef et al., 2016). Stereopsis might therefore facilitate a better comprehension of anatomy, especially among students with lower visual-spatial abilities. Second, the reported differences in learning effect between students with lower and higher levels of visual-spatial abilities in various interventions possibly reflect an aptitude-treatment interaction. An aptitude-treatment interaction occurs when a student's attribute, e.g., visual-spatial abilities, predicts different outcomes for different treatments (Cook, 2005). Such an interaction is only detectable when the outcomes are stratified by the variable or when the variable is included in the regression analysis as an interaction term (variable × intervention), as demonstrated by Luursema and colleagues (2008) and Cui and colleagues (2017).

Augmented Reality in Anatomy Education
Augmented reality (AR) is a new generation of 3DVT that has been eagerly explored in anatomical education and research in recent years (Moro et al., 2017; Kuehn, 2018). It gained popularity due to its ability to combine 3D computer-generated virtual objects with the physical environment. This enables learners to interact with each other and with the digital environment using mobile devices, such as smartphones and tablets, or, more recently, head-mounted displays (HMDs) such as AR and VR devices. Whether the anatomy can be perceived in a real 3D plane depends on the type of device. On flat screens, 3D content is usually visualized monoscopically, with various interactive features added to the digital overlay provided by these devices (Küçük et al., 2016; Barmaki et al., 2019; Sugiura et al., 2019). HMDs can provide an interactive and stereoscopic way of 3D visualization (Supplementary Material 1). With AR technology such as the HoloLens®, the most distinguishing feature is the ability to perceive an anatomical model in a real 3D plane without losing the sense of the user's own environment. Dynamic exploration, an object-centered view, enables users to walk around the stereoscopic model and explore it from all possible angles. The use of this technology has been reported in the surgical field for preoperative planning and tumor localization (McJunkin et al., 2018). The educational effectiveness of this technology for teaching anatomy has not been evaluated yet. For the purpose of this study, an augmented reality application, DynamicAnatomy, was developed at the Department of Anatomy and Embryology at Leiden University Medical Center and the Centre for Innovation of Leiden University. This application provides a dynamic stereoscopic 3D view of the lower limb, including the musculoskeletal anatomy. Further specification of the application is provided in the Methods section.

Objectives and Aims
The primary aim of this study was to evaluate the learning effect of an anatomical stereoscopic 3D AR model of the lower leg among medical undergraduates when compared to a monoscopic 3D desktop model and a 2D anatomical atlas. The secondary objective was to evaluate whether visual-spatial abilities would modify the observed learning effect. In addition, the study aimed to evaluate the students' experience of learning anatomy in AR. The authors hypothesized that the stereoscopic 3D AR model is more effective in improving anatomical knowledge than the monoscopic 3D desktop model and the 2D anatomical atlas, and that students with lower levels of visual-spatial abilities benefit most from the stereoscopic 3D view of the model.

Study Design
A double-center randomized controlled trial was conducted at the Leiden University Medical Center (LUMC) and the Erasmus University Medical Center Rotterdam (EMC), The Netherlands, in the spring of 2018 (Figure 1). The study protocol was approved by the Institutional Review Board at Leiden University (registration no. CEP17-1215/420). Participation was voluntary and written consent was obtained from all participants.

Study Population
Participants were a volunteer sample of first- and second-year undergraduate students of Medicine and Biomedical Sciences at the LUMC and EMC, recruited through flyers and announcements during lectures. The study took place prior to the anatomy courses on the musculoskeletal system of the limbs, ensuring limited knowledge of the lower limb anatomy among all participants. Students who had already taken part in this course were excluded. Baseline knowledge was not assessed, to avoid an extra burden for students and a possible influence on learning during the intervention and on posttest performance (Cook and Beckman, 2010). Participation in the study did not interfere with the curriculum and the assessment results did not affect students' academic grades. Participants received a compensation of 15 euros at the completion of the experimental session.

Figure 1.
Flowchart of study design. LUMC, Leiden University Medical Center; EMC, Erasmus Medical Center Rotterdam; n, number of participants; AR, Augmented Reality; 3D, three-dimensional; 2D, two-dimensional.

anatomical atlas group. Students were assigned an identification number, and these were randomly allocated to the three groups using an Excel Random Group Generator. Blinding of participants was impossible since the intervention was apparent to the students.
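The allocation step described above can be illustrated with a minimal sketch. It is not the Excel Random Group Generator used in the study; it is a stand-in that shuffles identification numbers and deals them out in turn, assuming 60 participants and three equally sized groups. The function name `allocate` and the seed value are hypothetical.

```python
import random


def allocate(participant_ids, groups, seed=None):
    """Shuffle IDs, then deal them round-robin so group sizes differ by at most one.

    Hypothetical helper for illustration; the study itself used an Excel tool.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    allocation = {group: [] for group in groups}
    for index, pid in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(pid)
    return allocation


groups = ["stereoscopic 3D AR", "monoscopic 3D desktop", "2D anatomical atlas"]
allocation = allocate(range(1, 61), groups, seed=2018)  # seed is arbitrary
for group, ids in allocation.items():
    print(group, len(ids))
```

Fixing the seed makes the allocation reproducible, which is convenient for audit purposes; leaving it as `None` draws a fresh allocation on each run.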

Educational Interventions
For the purpose of this study, an augmented reality application, DynamicAnatomy (LUMC, 2019), for HoloLens®, version 1.0 (Microsoft Corp., Redmond, WA) was developed at the Department of Anatomy and Embryology at Leiden University Medical Center and the Centre for Innovation of Leiden University (www.mr4education.com). The application represented a dynamic and fully interactive stereoscopic 3D model of the lower leg. The model was presented as a 3D virtual object in the physical space (Supplementary Material 1A). The HoloLens® glasses are transparent, which enabled participants to stereoscopically interact with the model without losing the sense of their physical environment. A unique feature is the object-centered view, i.e., dynamic exploration, which enabled participants to walk around the 3D model and explore it from all possible angles. Participants navigated through the user interface and selected the desired functions by making specific hand gestures or giving a voice command. Active interaction included size adjustments, showing or hiding structures by group or individually, visual and auditory feedback on structures and anatomical layers, and animation of the ankle movements (Table 1). With the gaze function switched on, the text of the anatomical descriptions appeared next to the highlighted structure. The anatomical layers included the musculoskeletal, connective tissue, and neurovascular systems. During this experiment, study participants focused on the musculoskeletal system. Prior to the experiment, participants completed a 10-minute training module, without anatomical content, to become familiar with the use of the application and device.
For the intended comparison, a Windows desktop application was developed with all the features of DynamicAnatomy. The desktop application included the identical anatomical model of the lower limb which was now displayed monoscopically on a 2D computer screen. The model could be rotated along the Y-axis in both directions with a slide-bar using a computer mouse (Supplementary Material 1B). All other features such as voice control, auditory feedback, and scaling were unchanged (Table 1).
In the 2D anatomical atlas group, study material included selected handouts from an anatomy atlas (Putz and Pabst, 2006) and an anatomy textbook (Moore et al., 2013) covering anatomy of the musculoskeletal system. The selection consisted primarily of 2D images of bones and muscles of the lower leg and ankle movements with short descriptions. Each handout included an index for the ease of navigation. In all groups the anatomical descriptions were limited to the names of the structures. No additional textual descriptions were provided.

Learning Objectives and Instructional Activities
Participants received a handout with a description of the learning goals (identical for each group) and instructions for the learning session (specific to their group). Both were developed based on the constructive alignment theory to ensure the alignment between the intended learning outcomes, instructional activities, and knowledge assessment (Biggs, 1996) (Supplementary Material 2A, 2B). The learning goals were formulated and organized according to Bloom's Taxonomy of Learning Objectives (Bloom et al., 1956). An independent expert from outside the field of anatomy verified the alignment between the learning goals and the assessment according to the constructive alignment theory and Bloom's Taxonomy of Learning Objectives. Learning goals included memorization of the names of bones and muscles (factual knowledge), understanding the function of the muscles based on their origin and insertion (functional knowledge), and location and organization of these structures in relation to each other (spatial knowledge). Students were free to follow the provided instructions or to choose their own way of achieving the learning goals. The duration of the learning session was 45 minutes.

Visual-Spatial Abilities Assessment
Visual-spatial abilities were assessed prior to the start of the learning session. Mental visualization and rotation, as the main components of visual-spatial abilities, were assessed by the 24-item Mental Rotation Test (MRT) (Shepard and Metzler, 1971), previously validated by Vandenberg and Kuse (1978) and redrawn by Peters and colleagues (1995) (Supplementary Material 3A). This psychometric test is widely used in the assessment of visual-spatial abilities and has repeatedly shown a positive association with anatomy learning and assessment (Guillot et al., 2007;

Anatomy Knowledge Assessment
The learning effect was evaluated by a 30-item knowledge test. The test consisted of a combination of twenty extended matching questions and ten open-ended questions. The knowledge was assessed in factual (i.e., memorization/identification of the names of bones and muscles), functional (i.e., understanding the function of the muscles based on their course, origin, and insertion) and spatial (i.e., location and organization of structures in relation to each other) knowledge domains (Supplementary Material 4). Content validation was performed by two experts in the field of anatomy and plastic and reconstructive surgery. The test was then piloted among 12 medical students for item clarity. The post-hoc calculated level of internal consistency (Cronbach's alpha) was 0.78. Duration of the assessment was 30 minutes.
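For readers unfamiliar with the internal consistency measure reported above, the following minimal sketch computes Cronbach's alpha from an item-score matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The data are an invented toy example with four items and five respondents, not the study's test responses, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows = respondents, columns = test items.

    Uses sample variances throughout (statistics.variance, n - 1 denominator).
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # per-item score columns
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data (assumption): 1 = correct, 0 = incorrect.
toy_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))
```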

Evaluation of Learning Experience
Participants' learning experience was evaluated by a standardized self-reported questionnaire. The evaluation included items on study time, perceived representativeness of the test questions, perceived knowledge gain, usability of and satisfaction with the provided study materials. Response options ranged from "very dissatisfied" (1 point) to "very satisfied" (5 points) on a 5-point Likert scale.

Statistical Analysis
Participants' baseline characteristics were summarized using descriptive statistics. The differences in baseline measurements were assessed with a one-way ANOVA for differences in means and a χ² test for differences in proportions. The normality of distributions was assessed with the Shapiro-Wilk Test of Normality in combination with Normal Q-Q Plots. The differences in mean percentages of correct answers on the anatomy knowledge test between groups were assessed with a one-way ANOVA including mean percentage of correct answers as the dependent variable and intervention group as a fixed factor. In case of a significant difference, a post-hoc Bonferroni test was performed to identify the pairs of means that differed. The obtained P values were adjusted for multiple comparisons with a Bonferroni correction (P value × k, where k is the number of comparisons). The results were stratified by MRT, PFT, and MR test scores to evaluate a possible aptitude-treatment interaction between visual-spatial abilities and type of intervention. In addition, an ANCOVA was performed to evaluate the interaction in a linear regression analysis. Anatomy knowledge test score was included as the dependent variable, intervention group as a fixed factor, visual-spatial abilities test score as a covariate, and "visual-spatial abilities test score" × "intervention group" as an interaction term. The effect size (Cohen's d) of the differences in anatomy knowledge test scores between groups was calculated using the mean scores and standard deviations of the two groups (Cohen, 1988). All analyses were performed using SPSS statistical software package version 23.0 for Windows (IBM Corp., Armonk, NY). Statistical significance was determined at the level of P < 0.05.
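The three computations named above (the one-way ANOVA F statistic, the Bonferroni adjustment, and Cohen's d) can be sketched in a few lines. This is an illustrative stand-in for the SPSS procedures, not the study's analysis script. The function names are hypothetical, the toy scores are invented, and the pooled standard deviation weighted by group size is an assumption, since the text states only that means and standard deviations of the two groups were used.

```python
from math import sqrt
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_value, k):
    """Bonferroni adjustment: raw P value times k comparisons, capped at 1."""
    return min(1.0, p_value * k)


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d with a pooled SD weighted by group size (an assumption)."""
    pooled_sd = sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2)
                     / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd


# Toy scores (assumption), three groups of three.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
print(bonferroni(0.5, 3))               # capped at 1.0
print(cohens_d(10, 2, 10, 8, 2, 10))    # equal SDs of 2, mean difference of 2
```

The cap at 1 in `bonferroni` is why an adjusted P value can be reported as exactly 1.00. Converting the F statistic into a P value requires the F distribution, which is available in statistical packages but not in the Python standard library, so it is left out here.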

RESULTS
A total of 60 participants were included in the study. Two participants allocated to the 2D anatomical atlas group did not show up for the experiment. The 2D anatomical atlas group, therefore, consisted of 18 participants. Participants were not aware of their allocation to one of the three groups in advance but were informed prior to the start of the experiment. Table 2 shows baseline characteristics of the 58 participants.

Scores Stratified by Visual-Spatial Abilities
When total scores on the anatomy knowledge test were stratified by MRT, PFT, and MR test scores, only the MRT scores had a significant impact on the outcomes in all three conditions. Students who scored below the mean were assigned to the MRT-low group (n = 31) and students who scored above the mean were assigned to the MRT-high group (n = 26). As shown in Figure 3, the MRT-high group performed equally well in each of the three intervention groups (F(2,23) = 0.83, P = 0.448). However, among MRT-low participants significant differences were found between groups. The stereoscopic 3D AR group (49.2%, SD ± 9.5) significantly outperformed the monoscopic 3D desktop group (33.4%, SD ± 11.5; F(2,28) = 6.59, P = 0.015, Cohen's d = 1.54), and performed equally well as the 2D anatomical atlas group (46.4%, SD ± 14.5; F(2,28) = 6.59, P = 0.990, Cohen's d = 0.24). Although students achieved higher scores in the 2D anatomical atlas group than in the monoscopic 3D desktop group with a moderate effect size (Cohen's d = 1.00), the observed difference was not significant (P = 0.250). In the monoscopic 3D desktop group, MRT-low students performed significantly worse than MRT-high students (33.4%, SD ± 11.5 vs. 49.7%, SD ± 13.9; P = 0.015, Cohen's d = −1.3). However, they performed equally well in the stereoscopic 3D AR and 2D anatomical atlas groups. The observed differences strongly indicate an aptitude-treatment effect caused by visual-spatial abilities. This phenomenon occurs when the effect of an intervention is different in groups of subjects with different characteristics. Therefore, the observed interaction between the MRT scores and the intervention groups was additionally checked in a linear regression analysis. The interaction term "MRT score" × "intervention group" showed a marginal trend toward significance (F(2) = 3.04; P = 0.05). Including PFT and MR test scores as a covariate and an interaction term did not have any significant impact on the outcomes.
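The stratification described above (a split at the mean MRT score, followed by a comparison of posttest means within each stratum) can be sketched as follows. The records are invented toy values, not trial data, and the function name is hypothetical. Assigning a score exactly equal to the mean to the MRT-high stratum is an assumption, since the text specifies only "below" and "above" the mean.

```python
from statistics import mean


def stratified_means(records):
    """Mean posttest score per (stratum, intervention group) cell.

    records: list of (group, mrt_score, posttest_pct) tuples.
    Scores below the overall MRT mean go to "MRT-low"; all others go to
    "MRT-high" (ties at the mean are an assumption of this sketch).
    """
    cutoff = mean(mrt for _, mrt, _ in records)
    cells = {}
    for group, mrt, posttest in records:
        stratum = "MRT-low" if mrt < cutoff else "MRT-high"
        cells.setdefault((stratum, group), []).append(posttest)
    return {cell: mean(scores) for cell, scores in cells.items()}


# Toy records (assumption): (intervention group, MRT score, posttest %).
records = [
    ("AR", 8, 50), ("AR", 16, 52),
    ("desktop", 8, 34), ("desktop", 16, 50),
    ("atlas", 10, 46), ("atlas", 14, 51),
]
cell_means = stratified_means(records)
for cell in sorted(cell_means):
    print(cell, cell_means[cell])
```

In this toy example the three groups look alike within the MRT-high stratum, while the desktop group lags within the MRT-low stratum, a pattern that the overall group means alone would blur.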

Evaluation of Learning Experience
As shown in Table 3, participants in the stereoscopic 3D AR group enjoyed the learning session more than participants in the other two groups (4.8 ± 0.4 vs. 3.4 ± 0.8 vs. 2.4 ± 0.9; F(2,54) = 50.3, P = 0.003). Participants found the application easy and intuitive to use and would recommend it to their fellow students. In all three groups, participants reported that their knowledge of the anatomy of the lower leg had improved (4.3 ± 0.6 vs. 4.1 ± 0.9 vs. 4.1 ± 0.8; F(2,54) = 0.6, P = 0.574).

DISCUSSION
This study aimed to investigate the educational effectiveness of learning with stereoscopic AR visualization technology and to evaluate whether visual-spatial abilities would modify the learning effect.
First, the observed aptitude-treatment interaction caused by visual-spatial abilities needs to be addressed in more depth. The results showed significant differences in learning effect between interventions using 2D and 3D learning materials among participants with lower and higher visual-spatial abilities scores as measured by the MRT. These differences were detectable only after stratification of the overall results, pointing toward an aptitude-treatment interaction, also referred to as "effect measure modification" (Cook, 2005; Rothman et al., 2008; Corraini et al., 2017). This phenomenon occurs when the effect of an intervention is different in groups of subjects with different characteristics, and is different from the effect of a confounder. In the current analyses, when visual-spatial abilities were treated only as a confounder, in the absence of stratification, the differences between monoscopic and stereoscopic conditions for different levels of visual-spatial abilities were not evident. This means that an adjustment for this confounder by the study design (e.g., randomization) or statistical analysis (e.g., including it only as a covariate in the regression analysis) will still not be sufficient, and the results can still be misleading.
Second, the monoscopic 3D desktop model group showed a lower learning effect only among MRT-low students. These findings are supported by previous research on the effectiveness of monoscopic 3D visualization technologies, which reported disadvantages for students with lower visual-spatial abilities (Garg et al., 1999a,b; 2002; Levinson et al., 2007; Naaz, 2012). It has been hypothesized that 3D objects are memorized as key view-based 2D images (Bulthoff et al., 1995; Garg et al., 2002). Viewing an unfamiliar 3D object from multiple angles could therefore lead to an increase in extraneous cognitive load (Huk, 2006; Khot et al., 2013; Mayer, 2014). The beneficial effect of stereoscopic visualization of a 3D object could be explained by the fact that mental representations depend on the nature of the input (Jolicoeur and Milliken, 1989; Kourtzi et al., 2003; Luursema et al., 2006). In that case, mental representations do not primarily consist of key view-based 2D images, but they might also include spatial information. This is further supported by the evidence that disparity processing occurs in different visual pathways of the human brain (Verhoef et al., 2016). This means that while a monoscopic 3D desktop view and 2D anatomical atlas images would stimulate key view-based 2D mental images, a stereoscopic 3D model would stimulate structural 3D mental representations. Stereopsis might then avoid the increase in extraneous cognitive load and therefore facilitate a better comprehension of 3D anatomy in students with lower levels of visual-spatial abilities. As dynamic exploration was the second distinguishing feature of the stereoscopic 3D AR model, it may also have contributed to the positive learning effect. Being able to walk around the model with its own reference point can create an additional sense of depth.
Moreover, the object-centered view is different from the egocentric view, where the user moves the objects in their field with virtual tools, as was the case in the monoscopic 3D desktop group. The egocentric control can affect visual-spatial skills where the hands are involved in imagining the rotation of objects. Future research is needed to evaluate how these different types of view in a 3D environment affect spatial processing during learning. This should be performed in an identical environment using the same medium, configuration, and presentation (Cook, 2005). This eliminates all possible confounding effects of additional features, such as hand gestures, that can vary between different types of media.
Third, participants in the 2D anatomical atlas group achieved anatomy knowledge test scores similar to those in the stereoscopic 3D AR model group. Several hypothetical explanations can be offered for this unexpected effect. One is the 2D nature of the paper-and-pencil assessment, which in fact was more aligned with the studied material in the 2D anatomical atlas group. In a recent study on the effectiveness of a monoscopic 3D visualization technology versus the use of prosected cadaveric specimens, students performed best on the identification questions aligned with the respective study materials (Mitrousias et al., 2018, 2020). A similar effect has been reported by Henssen and colleagues (2020) with the use of cross sections. Therefore, participants in the 2D anatomical atlas group could have had an advantage over participants in the other two groups. More insight can be gained by future studies that include a combination of assessment methods aligning with each of the interventions.

Figure 2.
Differences in overall mean percentages of correct answers on the anatomy knowledge test between three educational interventions. a P < 0.05, analysis of variance with a Bonferroni correction for multiple comparisons. MRT, Mental Rotation Test; AR, augmented reality; 3D, three-dimensional; 2D, two-dimensional.

Figure 3.
Differences in overall mean percentages of correct answers on the anatomy knowledge test between three educational interventions stratified by Mental Rotation Test scores. A, Students who scored below the mean were assigned to the MRT-low group (n = 31) and B, students who scored above the mean were assigned to the MRT-high group (n = 26). a P < 0.05, analysis of variance with a Bonferroni correction for multiple comparisons. MRT, Mental Rotation Test; AR, augmented reality; 3D, three-dimensional; 2D, two-dimensional.

Another explanation is of a more theoretical nature, namely the unfamiliarity with a new type of 3D visualization technology and the meta-representational competence of students as part of their spatial intelligence. Hegarty (2010) has described this competence as the ability to choose the optimal external representation for a task and to make effective use of novel external representations, such as interactive visualizations. In their research, novice Navy weather forecasters tended to choose less effective interactive visualizations than experts by adding unnecessary visual information to a display in order to interpret a weather forecast (Smallman and Hegarty, 2007). In the current study, relevant 2D images were selected from the anatomical atlas, which made it easy for students to quickly identify the useful images. In the intervention group, however, students had to rely on their own choices of visual representations. In an interactive 3D environment, students with lower visual-spatial abilities could therefore be less effective in choosing the right representations of anatomical structures to learn from (e.g., exploring an anatomical structure in the presence of all other structures and/or menu options versus isolating a structure from all other anatomical layers and restricting the user interface to a minimal amount of visual information). In addition, students with lower visual-spatial abilities tend to use interactive presentations less effectively. These students, for example, had difficulties in rotating a digital 3D anatomical structure to a specified view (Stull et al., 2009; Hegarty, 2010). However, with the aid of orientation references, students were able to successfully manipulate and learn from the virtual model. The tendency of low-performing students to choose a less effective strategy has recently been demonstrated by Roach and colleagues (2017a,b) in performing a mental rotation task.
Students with higher visual-spatial abilities showed a different eye movement pattern in solving mental rotation tasks than low-performing students (Roach et al., 2017a). When low-performing students were instructed with a visual guidance protocol based on the eye movement pattern of high-performing students, they improved significantly in solving the mental rotation tasks (Roach et al., 2019). For the reasons stated above, these individual differences can potentially affect the learning strategies of students and are of great interest for further investigation.

Future Directions
The findings have implications for both research and education. The modifying effect of visual-spatial abilities should be taken into account when designing new research and analysis strategies, especially in the field of 3D technologies. For educational purposes, stereoscopic 3D AR models have great potential for effective use in small-group teaching settings, where studying synchronized anatomical 3D models can stimulate active learning and peer-to-peer interaction. In addition to traditional ways of teaching, this new teaching tool can be used in the context of personalized learning in order to meet the students' individual learning needs. In particular, the combination of stereoscopic 3D models and a 2D anatomical atlas is worth further research. A possible synergistic learning effect would be desirable since the level of anatomical knowledge among medical students still remains insufficient (McKeown et al., 2003; Prince et al., 2005; Spielmann and Oliver, 2005; Waterston and Stewart, 2005; Bergman et al., 2008). When designing new virtual reality (VR) and AR environments, one should carefully align the learning environment with the (learning) goals; for example, VR is better suited for individual learning experiences, whereas AR has many advantages for collaborative and embodied learning.
Response options on a five-point Likert scale ranged from 1 = very dissatisfied to 5 = very satisfied. Scores are expressed in means (±SD). a P < 0.05 analysis of variance with a Bonferroni correction for multiple comparison; b significant difference between all the three groups; c significant difference between (1) Stereoscopic 3D AR model and monoscopic 3D desktop model group; (2) Stereoscopic 3D AR model and 2D anatomical atlas group. n, number of participants; AR, augmented reality; 3D, three-dimensional; 2D, two-dimensional; SD, standard deviation.

Limitations of the Study
There are some methodological limitations in this study. First, due to the limited availability of hardware, the study was restricted to a maximum of 20 participants in each group. In addition, no distributional data on anatomy knowledge assessment was available beforehand. Therefore, an a priori sample size calculation could not be performed. For this reason only, a post-hoc power analysis was performed based on the observed effect sizes, and the power turned out to be sufficient. A second concern was the alignment between study materials and assessment. A different form of assessment that is closer to clinical practice and in line with the learning method (e.g., cadaveric/specimen or digital 3D assessment) should be considered to assess the acquired anatomical knowledge. If this is not possible, a combination of assessment methods aligning with each of the interventions should be considered. In addition, a long-term retention test would have been valuable to measure the actual retention of anatomical knowledge. Third, the participants were not tested for a lack of depth perception, which could be present in about 5% of the study population (Mather, 2006). Based on these statistics, 1-2 of the 20 participants in the stereoscopic 3D AR group could have perceived the model monoscopically, which could have unfairly lowered the total group score. Lastly, some of the features that were characteristic of the type of intervention, for example hand gestures in the stereoscopic 3D AR group and audio cues in both the stereoscopic 3D AR and monoscopic 3D desktop groups, could have introduced bias. To eliminate such differences between groups, it is desirable to conduct research within one level of instructional design when possible. In addition, this will decrease the chance of a Hawthorne effect, which can occur when learners tend to learn better or harder with a more popular tool or medium, as could have been the case in the current study.
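As a rough check on the depth-perception estimate above, the number of affected participants in a group of 20 can be sketched with a simple binomial model. This is an illustrative assumption and not an analysis performed in the study: it treats participants as independent and uses the approximate 5% prevalence cited above; the function and variable names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k affected participants out of n (binomial model)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_participants = 20  # size of the stereoscopic 3D AR group reported in the study
prevalence = 0.05    # approximate prevalence cited from Mather (2006); assumed exact here

# Expected number of participants lacking depth perception under the model.
expected_affected = n_participants * prevalence

# Probability that at least one, or at least two, participants are affected.
p_at_least_one = 1 - binomial_pmf(0, n_participants, prevalence)
p_at_least_two = p_at_least_one - binomial_pmf(1, n_participants, prevalence)

print(f"expected affected participants: {expected_affected:.1f}")
print(f"P(at least one affected): {p_at_least_one:.2f}")
print(f"P(at least two affected): {p_at_least_two:.2f}")
```

Under these assumptions, the expected count is one participant, with roughly a 64% chance of at least one and a 26% chance of at least two affected participants in the group, which is in line with the 1-2 participants mentioned above.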

CONCLUSIONS
Three-dimensional anatomical models that can be viewed stereoscopically in AR can help to optimize anatomical knowledge acquisition in students with lower visual-spatial abilities. Further research is needed to identify factors that contribute to the positive learning effect and the most effective way of combining this technology with current education.

ACKNOWLEDGMENTS
The authors sincerely thank Prof. Dr. Marco C. de Ruiter for content validation of the anatomical knowledge test and Renée A. Hendriks, M.Sc. for the verification of the alignment construct of the learning objectives and assessment. The authors have no conflicts of interest to disclose.

NOTES ON CONTRIBUTORS
KATERINA BOGOMOLOVA, M.D., is a graduate (Ph.D.) student in the Department of Surgery and Center for Innovation in Medical Education, Leiden University Medical Center, in Leiden, The Netherlands. She is investigating the role of three-dimensional visualization technologies in anatomical and surgical education in relation to learners' spatial abilities.
INEKE J.M. VAN DER HAM, Ph.D., is an associate professor of neuropsychology in the Department of Health, Medical and Neuropsychology at Leiden University, in Leiden, The Netherlands. She focuses her research on spatial cognition in healthy and neuropsychological populations and on methodological considerations in the use of virtual reality.
MARY E.W. DANKBAAR, Ph.D., is a program manager of e-learning and assistant professor at Erasmus University Medical Center Rotterdam in Rotterdam, The Netherlands. She has a background in educational science and has designed and implemented several blended educational programs for the medical curriculum. Her focus in research is on designing simulation and games for skills training.
WALTER W. VAN DEN BROEK, M.D., Ph.D., is the director of medical education at the Erasmus University Medical Center Rotterdam and the scientific director of the Institute for Medical Education Research Rotterdam in Rotterdam, The Netherlands. He is also director of residency training in psychiatry at the Erasmus MC, teaches psychiatry to medical students, and is a coach for bachelor medical students.
STEVEN E.R. HOVIUS, M.D., Ph.D., is an emeritus professor of plastic and reconstructive surgery and hand surgery at Erasmus University Medical Center in Rotterdam and Radboud University Medical Center in Nijmegen, The Netherlands. He teaches anatomy of upper and lower limbs to medical students and surgical residents.
JOS A. VAN DER HAGE, M.D., Ph.D., is a professor of intra-curricular education in surgery in the Department of Surgery, Leiden University Medical Center, Leiden, The Netherlands. He teaches surgery and anatomy to residents and medical students. One of his educational research topics is on 3D learning in anatomical and surgical education.
BEEREND P. HIERCK, Ph.D., is an assistant professor of anatomy in the Department of Anatomy and Embryology and researcher at the Center for Innovation in Medical Education, Leiden University Medical Center in Leiden, The Netherlands. He teaches anatomy, developmental biology and histology to (bio)medical students. His educational research focuses on 3D learning in the (bio)medical curriculum, with a special interest in the use of extended reality.