A Preliminary Study on Teaching Prerequisite of Visual Perspective Taking
Bijun Wang and Fanyu Lin*
ALSOLIFE, Beijing, China
Submission: July 15, 2021; Published: July 22, 2021
*Corresponding author: Fanyu Lin, ALSOLIFE, Beijing, China
How to cite this article: Bijun W, Fanyu L. A Preliminary Study on Teaching Prerequisite of Visual Perspective Taking. Psychol Behav Sci Int J. 2021; 17(3): 555961. DOI: 10.19080/PBSIJ.2021.17.555961.
Abstract
Many children with Autism Spectrum Disorders (ASD) have impairments in visual perspective taking (VPT). A prerequisite skill of VPT, or Level I VPT, is to identify what another person is looking at. Previous studies have attempted to teach Level I VPT to children with ASD directly in the natural environment. This study explored children’s VPT performance across three conditions: card, real person, and nature. Results indicated that the participating children performed best in the real person condition. Generalization of behavior occurred between the real person and nature conditions, while the card condition appeared to have little relation to the other two. In addition, joint attention may not be directly relevant to Level I VPT.
Keywords: Perspective taking; Card; Real person; Nature context; Generalization; Autism spectrum disorder
Introduction
Visual perspective taking (VPT) is the ability to infer what other people see from a different point of view. This skill is crucial for communication and social cognition [1]. Various studies have found that children with autism spectrum disorders (ASD) may have difficulties with VPT [2]. To further analyze this critical skill, a prerequisite skill or component of VPT, identifying what others can see, has been isolated and termed Level I VPT. Success with this prerequisite could contribute to joint attention, another fundamental skill for social interaction in young children [3]. Joint attention is the synchronizing of one’s attention with another person through eye contact, gaze following, or pointing [4]. In general, children may exercise joint attention in two forms, initiating and responding [5]. Initiating joint attention is the ability to seek another person’s attention through eye gaze direction or pointing. Responding joint attention, on the other hand, requires a person to react to these initiating cues. People process cues from eye gaze [6], body posture [7], or both, and judge what can be seen from another person’s view. For this reason, responding joint attention could be closely related to Level I VPT, which further underscores the importance of this VPT prerequisite.
Several studies have explored how to teach this skill to children with ASD. Gould and colleagues [8] taught this prerequisite to three children with autism aged 3 to 5, all with limited word utterances. They used picture cards of a person’s head and shoulders surrounded by four objects as teaching materials. The picture person turned toward the left or right, and the children were asked to tact the object on the corresponding side. The objects placed above and below served as distractors. The teaching procedure involved gesture prompts that traced from the direction indicated by the picture person’s eye gaze to the associated object. The results showed that the children were able to follow the picture person’s eye gaze toward the left or right through picture training; however, the skill did not generalize to natural environments with a real person for any direction.
Hahs [9] adapted a similar method when working with three boys with ASD aged 3, 9 and 13. The study replicated the results in two of the three participants. Only one child, who had a relatively stronger verbal behavior repertoire, was able to generalize this skill to the natural context. There might be two potential explanations for the lack of generalization across settings. First, the card condition limited the picture person’s eye gaze to the right or left, while a real person in the natural environment may look in all directions, which made the task more complex than the initial training [8]. Second, generalization may be related to the child’s verbal characteristics. A better verbal repertoire might lead to better generalization [8].
To further examine the generalization of Level I VPT to untrained people, a recent study attempted to teach this skill to children with ASD directly in the natural environment with a real person [10]. The results revealed that all three participants mastered the skill and generalized it to untrained people. Notably, all participants were performing at level three on the VB-MAPP assessment [11], which meant they had relatively more complex verbal skills than the participants in previous studies [8,9].
We draw three key findings regarding Level I VPT from previous studies: 1) children with a low verbal repertoire could acquire this skill in the card/picture condition; 2) children with a good verbal repertoire could master it in the nature condition; 3) mastering Level I VPT through cards did not imply generalization to the nature context. This leaves two questions: 1) whether performing Level I VPT via cards and in the nature context are separate tasks, such that teaching one does not produce generalization to the other; 2) if our ultimate goal is for children with ASD and a low verbal repertoire to practice Level I VPT in the nature condition, how should we teach it: with a picture person in the card condition or with a real person in the nature condition?
We adapted the card and nature context conditions from previous studies. Our intention was to have children with ASD, regardless of their verbal repertoire, consistently exercise Level I VPT in natural environments with real persons. The gap between the card and nature conditions (i.e., the differences in eye gaze directions, real persons, and environmental setups) might contribute to the limited generalization. Thus, we created an intermediate condition, the real person condition, with a setup similar to the card condition but with a real person, instead of a picture person, looking around. The purpose of our study was to explore 1) how children with various verbal repertoires performed Level I VPT across different conditions; 2) whether learning the skill in one condition led to generalization to another; 3) whether generalization to an untrained person occurred under the nature condition; 4) whether mastery of the skill led to maintenance after a month. Since our target task was Level I VPT in the nature condition, we completed the study when the child mastered that task, regardless of his performance in the other conditions.
Methods
Participants and settings
Five boys with ASD participated in this study. They were all enrolled in a private autism support center in Beijing, China, where eight 40-min ABA sessions ran throughout the day, including one-on-one (1v1), one-on-two (1v2), and group classes. Zuck was 4 years and 1 month old and communicated in full sentences. His curriculum objectives included discrimination between left and right, identifying an object from a different category, sentence-making and so on.
Zman was 4 years and 2 months old and communicated in full sentences. He was learning numeric comparison within ten, telling stories, and recognizing basic Chinese characters. Zuck and Zman had one 1v1 and one 1v2 class per day. They had been receiving ABA therapy at this school for one and a half years. Carl was 5 years and 5 months old and spoke in simple sentences. Carl was learning identification of missing objects, motor imitation from cards, discrimination between up and down, and telling three characteristics of an animal. He had been studying in this center for half a year and had two 1v1 classes and one 1v2 class per day. He met the mastery criteria in the baseline probes and was excluded from further treatment, but his baseline data were still included in the data analysis.
Walts was 6 years and 8 months old and spoke in phrases and simple sentences. He was learning rational counting and telling the total number within ten, matching Chinese characters to sample, appropriately refusing unwanted activities, and sequencing events. For the past one and a half years, Walts had two 1v1 classes per day in this center while attending a special education class in a public elementary school for the rest of the day.
Scott was 9 years and 1 month old and spoke in simple sentences. His curriculum included manding for preferred activities, discrimination between same and different, tacting the functions and/or characteristics of an object, and tacting an object according to its sound. He had been receiving services at this center for two years and currently took three 1v1 classes per day.
None of the boys had received formal instruction in Level I VPT prior to this study. We arranged a total of three conditions in response to our research questions: card, real person, and nature context (see Table 1 for the settings of each condition). All procedures were implemented in the autism support center by the participants’ teachers, who had a minimum of one year of experience working one-on-one with children with ASD. We also recruited another teacher to serve as the target person in the real person and nature conditions. With parental consent, the teachers ran one or two sessions throughout the child’s typical class schedule. Prior to the beginning of the study, the researchers provided the teachers with a series of training activities, including a procedure protocol, lesson plans, a pre-recorded one-hour training with video modeling, and a one-hour on-site training with feedback.
Materials
Cards and objects
A minimum of four cards and six objects were selected for each student from his previous tact curriculum. The teachers further conducted a probe to confirm that the child could tact all selected items proficiently prior to the baseline probes. Two sets of 4-picture persons (Set 1: Yilia, Set 2: Celia, each on an 11.7 cm × 11.7 cm card) were created for training and probing (see Figure 1). If items in all directions (i.e., right, left, up, down) had to be identified only in the nature context condition, task difficulty might increase and the likelihood of generalization might decrease. We therefore designed the picture persons to look in four directions instead of two, unlike the study of Gould et al. [8].
Joint attention scale
To measure the effect of Level I VPT on joint attention, we developed a Joint Attention Scale, adapted from the Joint Attention Task (JTAT) [12] and the Childhood Joint Attention Rating Scales (C-JARS) [13]. This five-point scale included 21 items (see Table 2). Scores of 0 to 4 represented “never”, “seldom”, “sometimes”, “often”, and “always”, respectively. Both the parent and the teacher of each child were asked to complete this scale before and after the intervention.
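The scoring of such a scale can be sketched as follows. This is a minimal illustration that assumes the total score is the plain sum of the 21 item ratings (so it ranges from 0 to 84) and that the pre/post change is the difference between the two totals; the text does not spell out the scoring rule, and the function names `total_score` and `change_score` are hypothetical.

```python
from typing import List

N_ITEMS = 21                 # number of items on the scale, per the text
MIN_RATING, MAX_RATING = 0, 4  # 0 = "never" ... 4 = "always"


def total_score(ratings: List[int]) -> int:
    """Sum the item ratings (assumed scoring rule, not confirmed by the text)."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    if any(not (MIN_RATING <= r <= MAX_RATING) for r in ratings):
        raise ValueError("each rating must be an integer from 0 to 4")
    return sum(ratings)


def change_score(pre: List[int], post: List[int]) -> int:
    """Post-intervention total minus pre-intervention total."""
    return total_score(post) - total_score(pre)


# Made-up ratings for illustration only; these are not study data.
example_pre = [1] * N_ITEMS
example_post = [2] * N_ITEMS
example_change = change_score(example_pre, example_post)
```

Under this assumed rule, a positive change score means the rater reported more frequent joint attention behaviors after the intervention, and a negative one means the rating regressed.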
Experimental design
A multiple baseline across behaviors design was implemented to evaluate the generalization effect across three different conditions.
Experimental conditions
Three experimental conditions, card, real person, and nature, were introduced in the study. In all conditions, the children were asked “What is (the name of the picture/target person) looking at?”. The correct response was defined as tacting the name of the item in the line of sight of the picture/target person within 2 seconds. For example, in the card condition, the teacher started the session by asking the child to identify Yilia/Celia. One card of a picture person was then placed in the center of the table, along with four other cards of objects or real objects, one in each direction (top, bottom, left, right), approximately 8 cm away from the middle card. After securing the child’s attention, the teacher gave the instruction “What is Yilia/Celia looking at?”. If the child gave the correct answer within 2 seconds, a reinforcer was given to him. If the child did not respond within 2 seconds of the instruction, the teacher pointed from the person’s eyes to the object he/she was looking at. We chose the gesture prompt because the participating children were familiar with this method and the teachers confirmed its effectiveness for these children. An error correction procedure was implemented following an incorrect response. First, the teacher gave feedback on the incorrect response by saying “no” or shaking his/her head. Then, he/she repeated the instruction and provided the gesture prompt immediately. After the child gave the correct answer under prompting, praise was given to him. Then the teacher presented another card of the picture person and began the next trial. The same procedure was used in the other two conditions, with two additional elements: (1) both conditions had a real person sitting at the table and looking around; (2) in the nature condition, six, instead of four, cards/objects were arranged randomly on the table, which also extended the four directions to six.
Baseline
The baseline probe consisted of 12 trials for each condition, covering four or six directions respectively. The children received no feedback regarding their responses. Teachers delivered reinforcers to the children for their engagement. See Table 3 for the children’s performance in each condition during baseline.
Treatment procedure
Children’s performance in the baseline probes determined the treatment procedures. Data were collected and calculated as the percentage of correct responses for all sessions. Any condition that yielded a minimum of 90% accuracy was removed from the treatment package, and instruction was no longer provided for it. In line with our goal of exercising Level I VPT in the nature context, if the child achieved a minimum of 90% in the nature condition (i.e., Carl), the skill was considered mastered and he did not receive any further instruction. If 90% accuracy was not met, we provided instruction in this sequence: card, real person, and nature condition, one at a time. This sequence was built on the assumption that the card condition was the simplest, with minimum distractions, followed by the real person condition, while the nature condition was the most complex. Later in the experiment, the teachers reported that the child was experiencing great difficulty with the card condition, so we updated the protocol to start with the condition that yielded the highest performance in the baseline probes. If two conditions yielded the same performance, we would start with the nature context, our ultimate goal. Each teaching session was composed of 12 trials. A time delay prompt procedure was used. The mastery criterion was set at 90% correct or higher in two consecutive sessions.
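The updated decision rules described above can be sketched in code. This is an illustrative reading of the protocol, not the authors’ implementation: the function names are hypothetical, the tie-break order beyond “nature first” is an assumption (the text only states that the nature context wins a tie), and the example percentages are made up rather than taken from Table 3.

```python
from typing import Dict, List, Optional

MASTERY_THRESHOLD = 90.0  # percent correct, as stated in the text
# Assumed tie-break order; the text only says that nature wins a tie.
TIE_BREAK_ORDER = ["nature", "real person", "card"]


def starting_condition(baseline: Dict[str, float]) -> Optional[str]:
    """Pick the first condition to teach under the updated protocol.

    Returns None when the nature condition is already at 90% or above,
    because the child is then considered to have mastered the target skill.
    """
    if baseline["nature"] >= MASTERY_THRESHOLD:
        return None
    # Conditions already at 90% or above are removed from the treatment package.
    candidates = {c: p for c, p in baseline.items() if p < MASTERY_THRESHOLD}
    best = max(candidates.values())
    tied = [c for c in TIE_BREAK_ORDER if candidates.get(c) == best]
    return tied[0]


def has_mastered(session_percentages: List[float]) -> bool:
    """True if any two consecutive sessions are both at 90% or above."""
    pairs = zip(session_percentages, session_percentages[1:])
    return any(a >= MASTERY_THRESHOLD and b >= MASTERY_THRESHOLD for a, b in pairs)


# Made-up baseline values for illustration only; these are not study data.
example_start = starting_condition({"card": 0.0, "real person": 25.0, "nature": 0.0})
```

With 12 trials per session, 11 of 12 correct (91.7%) is the lowest score that satisfies the 90% threshold, so mastery in practice means at least 11 correct independent responses in each of two consecutive sessions.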
Generalization and maintenance
Response generalization across conditions was probed throughout teaching in a given condition (e.g., probing the real person and nature conditions, for which baseline accuracy was below 90%, while teaching Level I VPT in the card condition). The procedure was identical to the baseline. Again, if the child achieved a minimum of 90% in the nature condition, we considered the skill mastered and no longer provided instruction in any condition.
Generalization to an untrained person was probed after the child met the mastery criterion in the nature condition, either through formal instruction or through generalization of behavior across conditions. This procedure was identical to the baseline probe except that another teacher served as the target person.
One month after the completion of the instruction, we conducted a maintenance probe in the nature condition. This procedure was identical to the previous probes, but the target person remained the same as the one in the teaching sessions.
Dependent variables
The target behavior was defined as saying the name of the discriminative stimulus within two seconds when asked “what does (name of the target/picture person) see?”. The percentage of correct responses was calculated by dividing the number of correct independent responses by 12 (total trials per session) and multiplying by 100%.
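The calculation above amounts to the following short sketch. The outcome labels (`"correct"`, `"prompted"`, `"incorrect"`, `"no_response"`) and the function name are hypothetical; the text only specifies that correct independent responses are counted and that each session has 12 trials.

```python
from typing import List

TRIALS_PER_SESSION = 12  # as stated in the text


def percent_correct(outcomes: List[str]) -> float:
    """Percentage of correct independent responses in one 12-trial session.

    Only trials labeled "correct" (independent, unprompted) are counted;
    prompted, incorrect, and no-response trials all count as not correct.
    """
    if len(outcomes) != TRIALS_PER_SESSION:
        raise ValueError(f"expected {TRIALS_PER_SESSION} trials, got {len(outcomes)}")
    n_correct = sum(1 for o in outcomes if o == "correct")
    return n_correct / TRIALS_PER_SESSION * 100


# Made-up session for illustration only: 11 independent correct, 1 prompted.
example_session = ["correct"] * 11 + ["prompted"]
example_pct = round(percent_correct(example_session), 1)
```

Because every session has 12 trials, the possible scores are multiples of 8.3%, which is why values such as 41.7%, 83.3% and 91.7% recur in the results.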
The teachers recorded the data immediately after each trial, while the first author independently recorded the responses by watching the video recordings. Interobserver agreement (IOA) was computed by dividing the number of agreements by 12 and multiplying by 100%. IOA was collected on 66.7% (2 of 3) of baseline sessions for all participants and on 33.3%, 50%, 66.7% and 50% of training sessions for Scott, Walts, Zman and Zuck, respectively. Mean IOA of baseline sessions was 100% for Scott and Zuck, 95.8% (range, 91.7%-100%) for Walts, and 91.7% (range, 83.3%-100%) for Zman.
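A trial-by-trial reading of this IOA formula can be sketched as below. It assumes that an agreement means both observers scored the same trial identically, which the text implies but does not define; the function name and the boolean coding of trials are hypothetical.

```python
from typing import List

TRIALS_PER_SESSION = 12  # as stated in the text


def interobserver_agreement(teacher: List[bool], observer: List[bool]) -> float:
    """Trial-by-trial IOA: agreements / 12 * 100.

    Each list holds one entry per trial, True if that observer scored the
    trial as a correct independent response and False otherwise.
    """
    if len(teacher) != TRIALS_PER_SESSION or len(observer) != TRIALS_PER_SESSION:
        raise ValueError(f"both records must contain {TRIALS_PER_SESSION} trials")
    agreements = sum(1 for t, o in zip(teacher, observer) if t == o)
    return agreements / TRIALS_PER_SESSION * 100


# Made-up records for illustration only: the observers disagree on one trial.
teacher_record = [True] * 10 + [False, True]
observer_record = [True] * 10 + [False, False]
example_ioa = round(interobserver_agreement(teacher_record, observer_record), 1)
```

Under this reading, the reported range limits of 91.7% and 83.3% correspond to one and two disagreements in a 12-trial session, respectively.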
Results
Baseline comparison between conditions
Baseline data (Table 3) revealed that all participants performed best in the real person condition. For Carl and Walts, the percentage correct was higher in the nature condition than in the card condition, while this pattern was reversed for Zuck and Zman. For Scott, both the card and nature conditions resulted in zero correct responses.
Treatment effect
Figure 2 displays the results for all children. Initially, we provided instruction based on a predetermined sequence (e.g., as shown in the first teaching session for Scott). We then adjusted the protocol according to the teachers’ feedback. Following this updated design, Walts, Zman and Zuck started with the nature condition, while Scott started with the real person condition. Zman and Zuck achieved mastery after only a few sessions, and both showed satisfactory maintenance and generalization to an untrained person. We did not further check response generalization across conditions for them because of their mastery-level performance in the other conditions at baseline. Walts reached 80% accuracy in the nature condition after the first session of real person. After four instructional sessions, he reached the mastery criterion and generalized this skill to an untrained person. We also observed generalization to the real person condition, which rose from 41.7% to 91.7%, but little generalization to the card condition, which remained at a low level of 16.7%. Scott’s performance in the nature condition improved from 0% to 41.7% after eight instructional sessions and eventually reached 100% after twelve sessions of the real person condition. His performance under the card condition, on the other hand, remained at 0% correct. Walts and Scott both maintained Level I VPT within the nature context one month later.
Social validity
To explore the influence of Level I VPT on joint attention, we compared the pre- and post-intervention scores on the Joint Attention Scale within each participant. As shown in Table 4, the total score improved for all participants except one (Zuck) on both the teachers’ and the parents’ assessments, while teachers tended to give higher post-intervention ratings. The regular teachers of Carl, Zuck, and Walts reported relatively large improvements in the children’s joint attention skills (total score increases of 15, 18 and 8, respectively). However, this pattern was not replicated in the parents’ data. Zuck had similar ratings from both parties prior to the study but very different post-intervention joint attention scores. The teacher gave him the largest improvement of all (i.e., +18), while the parent’s score regressed (i.e., -4). Carl’s pre- and post-intervention scores also showed a large gap between the teacher’s and the parent’s ratings (i.e., +15 vs. +5). Note that Carl did not actually receive additional instruction for this task, yet we still observed an improvement in his joint attention ratings. Zman’s and Scott’s total joint attention scores showed small improvements in both the teachers’ and the parents’ ratings.
The first author also conducted an informal interview with each of the teachers upon completion of this experiment. All teachers reported that this task was easy to embed into their daily instruction. They also acknowledged that Level I VPT was important for the development of social interaction in children with ASD. Overall, the teachers were strongly willing to add this task to their social skills curriculum.
Discussion
In this study, we conducted a preliminary experiment to explore the acquisition and generalization of Level I VPT across three conditions via tacting what others saw. Contrary to our presumption that Level I VPT was simplest in the card condition, the baseline data indicated that all participants performed best in the real person condition, while performance varied in the other conditions. Typically developing infants show advantages in processing information from real objects [14]. For young children, pictures function as symbolic representations, as words do, rather than as replacements for real objects [15]. In the real person condition, children could judge a person’s visual perspective by tracking the target person’s body posture, which was more concrete and visually richer than in the card condition. Despite its concreteness and richness, the nature condition required the children to trace six lines of sight that were relatively similar to each other. Therefore, it might create more challenges for children with ASD who experience difficulties with eye contact [16].
Generalization probes across conditions showed that the children quickly acquired Level I VPT in the real person condition after receiving instruction in the nature condition, and vice versa. However, the same generalization did not occur with the card condition, regardless of the condition in which training was provided. These results were consistent with previous findings [8,9] that learning Level I VPT with cards did not generalize to the natural environment. Research has suggested that children with ASD can acquire tacts for real objects and pictures (Partington, Sundberg, Newhouse, & Spengler, 1994), and that they also generalize learned tacts from pictures to real objects and the other way around [17]. However, this relationship did not apply to Level I VPT. The learning mechanisms for object recognition [18] and visual perspective taking [19] might be quite different. Further research is required to explore the nature of these tasks [20].
Analysis of the Joint Attention Scale ratings revealed an inconsistency between the teachers and the parents. Two possible explanations might account for these rating discrepancies. First, the study took place at school with teachers, which differs from home settings with caregivers. Extended generalization might take more time or even require additional instruction. Second, the Level I VPT task may not contribute directly to joint attention. Other basic learning skills, such as manding, making eye contact, and following directions, are critical parts of joint attention. It would potentially take another round of application and generalization to improve joint attention. Carl had the highest performance in Level I VPT in the nature condition prior to the study, which might have yielded improved responding to others’ initiation of attention without further instruction.
Several limitations of this study should be noted. First, generalization across behaviors was probed with only two of the five participants, who received instruction in different conditions. Replication with more children would help to determine this effect. Second, due to the children’s baseline performance, we did not complete the instruction in the card condition. It was unclear whether training in the card condition would generalize to the other conditions. Third, although the teachers expressed their preference for this simple instruction, a systematic and sensitive method is needed to verify its social validity.
Nevertheless, these findings shed some light on the instructional design of Level I VPT. Regardless of children’s language characteristics, having a real person in a relatively restricted environment might be a good starting point for teaching Level I VPT to children with ASD, due to its simplicity and potential generalization to the nature context. For children with a better verbal behavior repertoire, teaching directly in the natural environment could be a worthwhile option.
Acknowledgment
The authors would like to thank Ting Hu, Li Wang, Jiarun Shi, Rui Zhu, Jingwen Ma, Shengyu Liu and Zhuang Zhuo for their assistance with the study.
Data Availability Statement
The data used in this study can be accessed from the first author upon reasonable request.
References
- Reilly EL (2020) Visual Perspective Taking in Autism Spectrum Disorders (CUNY).
- Pearson A, Ropar D, de Hamilton AFC (2013) A review of visual perspective taking in autism spectrum disorder. Front Hum Neurosci 7: 1-10.
- Warreyn P, Roeyers H, Oelbrandt T, De Groote I (2005) What are you looking at? Joint attention and visual perspective taking in young children with autism spectrum disorder. Journal of Developmental and Physical Disabilities 17(1): 55-73.
- Moll H, Andrew M (2011) Perspective-Taking and its foundation in joint attention. In Perception, Causation, and Objectivity, pp. 286-304.
- Bruinsma Y, Koegel RL, Koegel LK (2004) Joint attention and children with autism: A review of the literature. Ment Retard Dev Disabil Res Rev 10(3): 169-175.
- Moll H, Meltzoff A (2011) Perspective-Taking and its foundation in joint attention. In Perception, Causation, and Objectivity, pp. 286-304.
- Pavlidou A, Gallagher M, Lopez C, Ferrè ER (2019) Let’s share our perspectives, but only if our body postures match. Cortex, 119: 575-579.
- Gould E, Tarbox J, OHora D, Noone S, Bergstrom R, et al. (2011) Teaching children with autism a basic component skill of perspective-taking. Behavioral Interventions 26(1): 50-66.
- Hahs AD (2015) Teaching Prerequisite Perspective-Taking Skills to Children with Autism. International Journal of Psychology and Behavioral Sciences 5(3): 115-120.
- Welsh F, Najdowski AC, Strauss D, Gallegos L, Fullen JA, et al. (2019) Teaching a perspective-taking component skill to children with autism in the natural environment. J Appl Behav Anal 52(2): 439-450.
- Sundberg ML (2008) VB-MAPP: Verbal Behavior Milestones Assessment and Placement Program. AVB Press, Concord, CA, USA.
- Bean JL, Eigsti IM (2012) Assessment of joint attention in school-age children and adolescents. Research in Autism Spectrum Disorders 6: 1304-1310.
- Mundy P, Novotny S, Swain-Lerro L, McIntyre N, Zajic M, et al. (2017) Joint-Attention and the social phenotype of school-aged children with ASD. J Autism Dev Disord 47(5): 1423-1435.
- Gerhard TM, Culham JC, Schwarzer G (2016) Distinct visual processing of real objects and pictures of those objects in 7- to 9-month-old infants. Front Psychol 7: 827.
- Preissler MA, Carey S (2004) Do both pictures and words function as symbols for 18- and 24-month-old children? Journal of Cognition and Development 5(2): 185-212.
- Gillespie-Lynch K, Elias R, Escudero P, Hutman T, Johnson SP, et al. (2013) Atypical gaze following in autism: a comparison of three potential mechanisms. J Autism Dev Disord 43(12): 2779–2792.
- Gómez LC (2015) Acquisition and Generalization of Tacts across Stimulus Modes in Children Diagnosed with Autism Spectrum Disorder. Graduate Theses and Dissertations.
- DiCarlo JJ, Zoccolan D, Rust NC (2012) How does the brain solve visual object recognition? Neuron 73(3): 415-434.
- Elisabetta M, Richard R, Massimiliano C, Antonia H (2013) Brain systems for visual perspective taking and action perception. Soc Neurosci 8(3): 248-267.
- Lang R, Machalicek W, Rispoli M, Regester A (2009) Training parents to implement communication interventions for children with autism spectrum disorders (ASD): A systematic review. Evidence-Based Communication Assessment and Intervention, 3(3): 174-190.