1. Introduction
According to the World Health Organization’s global report on road safety, there were 1.19 million road traffic deaths worldwide in 2021. Pedestrians, the second largest group of fatalities, accounted for 23% of these deaths, highlighting their vulnerability to traffic. Almost 95% of people across 48 countries are identified as pedestrians (Global Status Report on Road Safety 2023 2023). Seniors, children, and individuals with disabilities (whether visible or not) are considered particularly at risk of injury as pedestrians (Xiang et al. 2006; Organization 2019; Schwartz et al. 2022). For people with visible disabilities, such as those using a white walking cane to signal visual impairment, human drivers tend to exhibit more yielding behaviors (Andrew Harrell 1992; Guth et al. 2005). What about people with invisible disabilities? Individuals with cognitive or social impairments, who do not have easily recognizable signs, face additional stress and danger during daily road-crossing experiences, often without understanding or consideration from human drivers (“Going Our Own Way: Public Transit Accessibility for Neurodivergent People” 2023).
As one of the invisible disabilities, Autism Spectrum Disorder (ASD, or autism) is a lifelong developmental disorder that varies across individuals along a continuum of severity and is characterized by difficulties in social interaction and communication, repetitive behaviors, and intense focus on specific interests (Diagnostic and Statistical Manual of Mental Disorders: DSM-5 2013; Lord et al. 2018). In the US, about 1 in 36 children (2.8%) has been identified with autism (CDC 2024). The latest global reports indicate a markedly increased prevalence, estimating that at least 1 in 100 children worldwide has been diagnosed with autism (Zeidan et al. 2022; Talantseva et al. 2023). Although much of the existing literature focuses on individuals with ASD in traffic as drivers (D. Y. Chee et al. 2017; D. Y. T. Chee et al. 2019), few studies have examined their role as pedestrians (Earl et al. 2016; Earl et al. 2018; Wilmut and Purcell 2021). This gap is critical because pedestrian activity is more common and exposes individuals to greater vulnerability to traffic (Global Status Report on Road Safety 2023 2023).
The clinical characteristics of ASD include deficits in social interaction and non-verbal communication, yet social cues and non-verbal negotiation with the driver play an important role in road crossing when traffic lights are not available (Hamilton-Baillie 2004; Bishop, Biasini, and Stavrinos 2017). These social elements of road crossing exacerbate pedestrian vulnerability among autistic individuals. When a crossing has no traffic lights, effective driver-pedestrian interaction becomes paramount. Body language, eye contact, or even facial expressions can make a substantial difference in a negotiation that happens in the blink of an eye (Hamilton-Baillie 2004; D. Y. T. Chee et al. 2019; Curry et al. 2021).
In driver-pedestrian interactions, intent is often conveyed through subtle, non-verbal cues such as gaze and eye contact, creating an unspoken negotiation as both parties assess who will proceed first. The ability to recognize and interpret these cues is crucial for safe road crossings. This cognitive skill—variously referred to as theory of mind (Baron-Cohen, Leslie, and Frith 1985), intention attribution (Dennett 1971), or mentalizing (Frith 2001)—plays a fundamental role in predicting others’ behavior in social interactions.
Neuroimaging studies have identified the temporoparietal junction (TPJ) as a key brain region involved in reasoning about others’ intentions, particularly in tasks requiring an understanding of beliefs, desires, or goals (Saxe and Kanwisher 2003; Schurz et al. 2014). However, individuals with ASD often exhibit atypical TPJ activation during theory of mind tasks, which may contribute to difficulties in real-time social interactions (Lombardo et al. 2011; Nijhof et al. 2018). In traffic environments, where rapid intent attribution is essential for pedestrian safety, these challenges could pose significant obstacles when engaging with human drivers.
With the rise of autonomous vehicles (AVs), these traditional interaction dynamics are set to change. The AV market is projected to expand to $21 billion by 2035, leading to increasing integration with human road users (Rezwana and Lownes 2024). Unlike human drivers, AVs remove the element of social interaction from pedestrian interactions, significantly lowering the level of uncertainty (Wilmut and Purcell 2021; Rezwana and Lownes 2024). While this shift may create challenges for typically developed pedestrians, who rely heavily on social cues (Van Brummelen et al. 2018), it may offer a relative advantage for autistic individuals. Driving-related studies suggest that autistic drivers outperform their typically developed counterparts in rule-based tasks, such as using indicators at roundabouts and checking for cross-traffic at intersections (D. Y. Chee et al. 2017). However, AV-pedestrian interactions involve more than just recognizing patterns and following traffic rules; they introduce a broader cognitive question: Can humans accurately infer AV intentions based solely on movement and adjust their behavior accordingly? Research suggests that this is indeed possible, as humans are capable of detecting and classifying abstract agents’ action goals based solely on motion cues (Surden and Williams 2016; Mahadevan, Somanath, and Sharlin 2018).
In the current context, while typically developed pedestrians often rely on non-verbal cues from human drivers to assess crossing opportunities, autistic pedestrians may be less dependent on such social signals, even when interacting with human-driven vehicles (Earl et al. 2016). Instead, they may have already developed and refined a strategy that relies on analyzing the vehicle’s kinematic features, such as speed, trajectory, and indicator lights, to infer its intentions (Zach Noonan et al. 2023). This would suggest that, even before the advent of AVs, autistic pedestrians might have been accustomed to interpreting machine behavior analytically, rather than relying on human social cues.
A substantial body of recent research has focused on improving AV-to-human communication by embedding external visual cues—such as light signals, gaze simulations, or symbolic displays—into autonomous vehicles to make their intentions clearer to pedestrians (Driggs-Campbell and Bajcsy 2016; Rezwana and Lownes 2024). These studies implicitly assume that such social cues are both effective and necessary, particularly for neurotypical (NT) individuals, who rely heavily on socially conventional signals in interpreting intent. However, fewer studies have investigated how different individuals—especially those with neurodivergent profiles—perceive and interpret the behavior of autonomous vehicles. That is, while much attention has been devoted to how AVs can better express intent, less is known about how pedestrians vary in their ability to extract intention from AV motion, even in the absence of explicit signaling.
In a seminal visuomotor priming experiment, Pierno et al. (2008) found that ASD children outperformed their NT peers when asked to imitate a reach-to-grasp action previously executed by a robotic arm, but not when the action was performed by a human agent. This result suggests that the mechanisms supporting action understanding in ASD may be selectively enhanced in response to artificial, predictable systems.
More broadly, this finding aligns with a growing body of literature suggesting that ASD individuals often demonstrate a distinctive affinity with technology, which may allow them to engage more effectively with structured, predictable systems than with socially complex human counterparts. For example, enhanced performance has been observed in ASD individuals in tasks involving robot-mediated tutoring, computer-assisted learning, and collaborative problem-solving with virtual agents (Diehl et al. 2012; Scassellati, Admoni, and Matarić 2012; Dubois-Sage et al. 2024).
This privileged relationship with artificial systems may stem from their lower reliance on implicit social cues and greater sensitivity to systematic patterns, allowing for more efficient interaction with technologies that offer consistent and transparent input-output dynamics. In the context of autonomous vehicles, this raises the hypothesis that ASD pedestrians may strategically leverage motion-based parameters to infer AV intent, potentially outperforming NT individuals in tasks that rely less on socially framed communication and more on structural motion analysis.
Both NT and ASD individuals attribute intentions to robots and moving objects (+ref). The movement of these objects provides additional information about their intentions, which can be crucial for safe interaction in shared zones and at zebra crossings. Intention attribution involves understanding the purpose of an object’s movement, often linked to low-level motor intentions, such as moving forward or stopping. Individuals with ASD might, therefore, have a stronger capability in human-machine theory of mind (P. C. Pantelis et al. 2014), which would allow them to read kinematic social cues when interacting with machines (P. Pantelis et al. 2011).
Previous observations on autistic advantage in human-robot interaction suggest that autistic pedestrians may have an edge in interacting with AVs in the near future. By investigating the influence of automation within a social perception framework, especially in simulated real-world traffic scenarios, this study offers a unique opportunity to test the theoretical autistic advantage beyond predictability. In practice, such studies could pave the way to improve safety of pedestrians with autism and help design AVs that are more accessible and supportive for individuals with ASD, ultimately contributing to safer and more inclusive road environments.
Pedestrians account for nearly one-fourth of all traffic-related deaths worldwide, with individuals possessing invisible disabilities such as autism facing heightened risk. Despite increasing prevalence, the behaviors of autistic pedestrians in dynamic traffic contexts—particularly those involving AVs—remain under-explored. This study bridges that gap by investigating how autistic versus NT adults interpret and respond to AV movement cues in a simulated street-crossing task, using multimodal neurocognitive and behavioral metrics.
Functional near-infrared spectroscopy (fNIRS) (Villringer and Chance 1997) has been recognized as a promising tool for investigating social engagement in children with autism because of its portability and noninvasiveness (Liu et al. 2019; Zhang and Roeyers 2019). Given that road crossing is a highly motor-related task that requires ecological validity, we consider fNIRS to be an equally suitable approach for assessing neural activity in autistic and neurotypical adults in this context.
2. Methods
2.1 Participants
A total of 53 participants took part in the study, including 12 adults with autism spectrum disorder (ASD group) and 41 neurotypical adults (NT group). All participants had normal or corrected-to-normal vision and provided informed written consent prior to participation. The study was approved by the ethics committee of the Université de Lyon.
Participants in the ASD group (9 females, 3 males; mean age = 37.6 years, SD = 14.2) were recruited through autism-related communities and forums. All individuals self-identified as autistic and reported having received a formal diagnosis of autism spectrum disorder (ASD) from a healthcare professional, with no history of attention-deficit/hyperactivity disorder (ADHD). While clinical documentation was not independently verified, participants’ autistic characteristics were further assessed using the Aspie Quiz (https://rdos.net/fr/), a widely used self-report instrument designed to profile neurodivergent and neurotypical traits. The questionnaire provides separate scores for autistic and neurotypical dimensions (range: 0–200). All included participants showed high neurodivergent scores (mean = 136, SD = 20.4) and low neurotypical scores (mean = 67.4, SD = 21.9), in line with typical profiles observed in diagnosed autistic populations (Ekblad 2013).
Participants in the NT group (33 females, 7 males; mean age = 20.6 years, SD = 1.6) were undergraduate students recruited from the Université de Lyon. Although no autism-specific screening instrument was administered to this group, participants reported no history of developmental or psychiatric disorders. Prior to the experiment, all NT participants completed the Smart Tools Proneness Questionnaire (STP-Q; Navarro et al. 2022), a 32-item scale assessing the individual tendency to engage with smart tools. Participants were selected if their scores fell at least one standard deviation above or below the population mean, forming an NT group with extreme smart-tool proneness.
2.2 Apparatus and Procedure
Participants performed a road-crossing task in a virtual 3D pedestrian simulator presented on a PC. They used a gamepad for realistic movement control and could move and look around within a limited range (see Figure 1).
This task was designed to investigate participants’ behavioral strategies and physiological responses during interactions with autonomous vehicles (AVs). The experimental environment consisted of a gently curved road, providing a clear line-of-sight for participants, along which virtual AVs approached continuously. Traffic scenarios were presented in three distinct experimental conditions, each differing systematically in how the available physical crossing gap evolved over time, while maintaining an identical temporal crossing window of 3 seconds at the crossing point (illustrated in Figure 2).
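The relation between a fixed temporal window and a physical gap can be sketched as follows. This is only an illustration under a constant-speed assumption: the function names and the speeds are hypothetical, and the actual speed profiles of the three conditions are not specified in this section.

```python
# Hypothetical helpers: relate the physical gap between two vehicles to the
# temporal crossing window at a fixed point, assuming constant vehicle speed.

TEMPORAL_WINDOW_S = 3.0  # temporal crossing window stated in the text


def physical_gap_m(speed_mps: float, window_s: float = TEMPORAL_WINDOW_S) -> float:
    """Distance between vehicles that yields `window_s` seconds at the crossing point."""
    return speed_mps * window_s


def temporal_gap_s(gap_m: float, speed_mps: float) -> float:
    """Time available at the crossing point for a given physical gap and speed."""
    if speed_mps <= 0:
        raise ValueError("speed must be positive")
    return gap_m / speed_mps


# Illustrative speeds only (not taken from the study).
for speed in (5.0, 10.0, 15.0):
    print(f"{speed:4.1f} m/s -> {physical_gap_m(speed):5.1f} m gap for a 3 s window")
```

The same 3-second window therefore corresponds to different physical gaps depending on vehicle speed, which is the kind of dissociation the three conditions exploit.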
Having been explicitly informed that all vehicles were fully automated, participants encountered three sequential inter-group gaps (Gap #1 to Gap #3) in each trial, simulating recurring crossing opportunities. They were instructed to cross as soon as possible, reflecting real-world pedestrian decision-making behaviors. However, only those trials in which participants initiated crossing during the first gap (Gap #1) were retained for subsequent analyses. The sequence of traffic conditions was randomized between trials to mitigate potential learning effects or carryover biases.
Eye-tracking data were captured at a sampling rate of 1000 Hz using an Eyelink Portable Duo eye tracker in remote tracking mode. Recorded metrics included gaze positions, fixation duration, and pupil diameter changes, precisely synchronized with key traffic events (e.g., the onset of the initial inter-group gap).
Concurrently, cerebral hemodynamic responses were monitored using a functional near-infrared spectroscopy (fNIRS) system with a 12×10 optode array covering frontal, temporal and parietal cortical regions, sampling at approximately 5.5 Hz (see Figure 3). fNIRS signals were acquired using a 26-channel fNIRS system (Cortivision Photo Cap C20).
The placement of fNIRS optodes was determined using a systematic and refined procedure based on an automated meta-analytic approach (Yarkoni et al. 2011). Specifically, we conducted an automated meta-analysis via the Neurosynth database, targeting brain regions consistently activated during tasks involving decision-making and Theory of Mind (ToM). Using the “fNIRS Optodes’ Location Decider” (fOLD) toolbox (Zimeo Morais, Balardin, and Sato 2018), we mapped these identified activation clusters onto the international 10–20 EEG electrode positioning standard.
The simulation environment was developed using Unity 3D, with precise synchronization between the virtual environment, eye-tracking data, and fNIRS recordings achieved via the Lab Streaming Layer (LSL) protocol (“LabStreamingLayer’s Documentation Labstreaminglayer 1.13 Documentation,” n.d.). Prior to the experimental trials, each participant underwent individual calibration of both the eye-tracking and fNIRS systems, followed by a brief familiarization phase that introduced the vibration feedback delivered in the event of a collision. The entire experimental session, including calibration, data collection, and participant debriefing, lasted approximately 45 minutes. Each participant completed 6 trials per condition (the three experimental conditions plus the Baseline condition), for a total of 24 trials per participant.
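Once all streams carry timestamps on a shared clock, simulator events can be matched to samples in streams with very different rates (1000 Hz eye tracking, roughly 5.5 Hz fNIRS). The sketch below illustrates one common approach, nearest-timestamp lookup, in plain Python. It is not the authors' pipeline and does not use the LSL API; the function names and example values are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def nearest_index(stream_times: Sequence[float], t: float) -> int:
    """Index of the sample whose timestamp is closest to t.

    Assumes `stream_times` is non-empty and sorted in ascending order.
    """
    i = bisect_left(stream_times, t)
    if i == 0:
        return 0
    if i == len(stream_times):
        return len(stream_times) - 1
    before, after = stream_times[i - 1], stream_times[i]
    # Prefer the earlier sample on ties.
    return i if (after - t) < (t - before) else i - 1


def align_events(event_times: Sequence[float],
                 stream_times: Sequence[float]) -> List[int]:
    """Map each event time (e.g. gap onset) to the nearest sample index."""
    return [nearest_index(stream_times, t) for t in event_times]


# Hypothetical timestamps in seconds on a shared clock.
fnirs_times = [0.0, 0.18, 0.36, 0.55, 0.73, 0.91]
gap_onsets = [0.5, 0.2]
print(align_events(gap_onsets, fnirs_times))
```

With a slow stream such as fNIRS, the nearest sample can lie up to half a sampling period away from the event, which is worth keeping in mind when interpreting event-locked responses.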
2.3 Measures
3.1.1 Success Rate: Binary indicator of collision avoidance in each trial
3.1.2 Crossing Start Time: Time from trial onset to entry into the potential collision area
3.1.3 Crossing Duration: Time spent crossing the potential collision area
3.1.4 Current Deviation: Time difference between the pedestrian trajectory and the gap-center trajectory
3.1.5 Pedestrian Crossing Speed: Speed of the pedestrian avatar in meters per second
3.1.6 Temporal Error: Temporal offset from the ideal crossing moment
3.2.1 Gaze Preference: Standardized gaze position projected between gap-related vehicles
3.2.2 Pupil Dilation: Proxy for autonomic arousal
3.3.1 HbO in Brain ROI: Proxy for cortical activation
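Several of the behavioral measures above reduce to simple arithmetic on per-trial timestamps. The following sketch shows one possible way to compute them. The `Trial` fields are hypothetical, and taking the midpoint of zone entry and exit as the crossing moment is an assumption made for illustration, not a definition from the study.

```python
from dataclasses import dataclass
from typing import Sequence

GAP_DURATION_S = 3.0  # temporal crossing window stated in the text


@dataclass
class Trial:
    # All times in seconds from trial onset (hypothetical field names).
    gap_onset_s: float
    enter_zone_s: float
    exit_zone_s: float
    collided: bool


def crossing_start_time(trial: Trial) -> float:
    """Time from trial onset to entry into the potential collision area."""
    return trial.enter_zone_s


def crossing_duration(trial: Trial) -> float:
    """Time spent inside the potential collision area."""
    return trial.exit_zone_s - trial.enter_zone_s


def temporal_error(trial: Trial) -> float:
    """Offset from the ideal mid-gap moment; positive values mean crossing late."""
    ideal = trial.gap_onset_s + GAP_DURATION_S / 2
    # Assumption: the crossing moment is the midpoint of zone entry and exit.
    actual = (trial.enter_zone_s + trial.exit_zone_s) / 2
    return actual - ideal


def success_rate(trials: Sequence[Trial]) -> float:
    """Proportion of trials completed without a collision."""
    return sum(not t.collided for t in trials) / len(trials)


example = Trial(gap_onset_s=10.0, enter_zone_s=11.0, exit_zone_s=12.5, collided=False)
print(crossing_duration(example), temporal_error(example))
```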
2.4 Data Analysis
Linear mixed-effects models assessed group and condition effects across behavioral and physiological variables. Eye-tracking data were processed via dynamic dot-product alignment to AVs’ screen positions (see Figure 4).
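One plausible reading of the dot-product alignment is a scalar projection of the gaze point onto the screen-space segment joining the two vehicles that bound the gap. The sketch below illustrates that reading only; the function name, the coordinate convention, and the choice of lead and rear vehicle as segment endpoints are assumptions, not the authors' implementation.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # screen coordinates in pixels


def gaze_projection(gaze: Point, lead: Point, rear: Point) -> Optional[float]:
    """Normalized position of the gaze along the lead -> rear vehicle segment.

    Returns 0.0 when the gaze projects onto the lead vehicle and 1.0 when it
    projects onto the rear vehicle; values outside [0, 1] fall beyond the gap.
    Returns None if the two vehicles share the same screen position.
    """
    gx, gy = gaze[0] - lead[0], gaze[1] - lead[1]
    sx, sy = rear[0] - lead[0], rear[1] - lead[1]
    seg_len_sq = sx * sx + sy * sy
    if seg_len_sq == 0:
        return None
    # Dot product of the gaze vector with the segment, scaled by segment length squared.
    return (gx * sx + gy * sy) / seg_len_sq


# Hypothetical sample: gaze halfway between the two vehicles, slightly above the road.
print(gaze_projection((5.0, 3.0), (0.0, 0.0), (10.0, 0.0)))
```

Because vehicle positions change every frame, the projection would be recomputed per gaze sample, which is presumably what makes the alignment "dynamic".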
fNIRS data were preprocessed using the MNE package in Python. The raw intensity data were first converted to changes in optical density (OD), then corrected with temporal derivative distribution repair (TDDR; Fishburn et al. 2019) and bandpass filtered (0.01–0.09 Hz) to attenuate motion and other physiological artefacts. The OD data were then converted to hemodynamic responses (HDR) using the modified Beer-Lambert law (Kocsis, Herman, and Eke 2006). For each participant, the oxygenated hemoglobin (HbO) data from the six trials of the Baseline traffic condition were averaged and used as a steady-state control; the data were z-score normalized to remove the dependence on measurement units and to facilitate comparison between traffic conditions. Finally, each channel’s HbO amplitude was computed over the six trials of each participant in each of the three traffic conditions (Const, VarMinus, VarPlus) to compare the effects of condition and group (ASD, NT).
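The baseline normalization step can be illustrated with a minimal sketch. It assumes that each condition value is expressed in units of the Baseline condition's standard deviation, per participant and channel; whether the authors used the sample or population standard deviation, or a different reference, is not stated, so the details below are assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def baseline_zscore(condition_values: Sequence[float],
                    baseline_values: Sequence[float]) -> List[float]:
    """Express condition HbO values as z-scores relative to the Baseline trials.

    Assumption: sample standard deviation of the Baseline values is the scale.
    """
    mu = mean(baseline_values)
    sd = stdev(baseline_values)
    if sd == 0:
        raise ValueError("baseline has zero variance; z-score is undefined")
    return [(v - mu) / sd for v in condition_values]


# Hypothetical HbO amplitudes for one participant and one channel.
baseline = [1.0, 2.0, 3.0]
var_plus = [2.0, 4.0]
print(baseline_zscore(var_plus, baseline))
```

The resulting values are unitless, which is what allows amplitudes to be compared across channels, participants, and conditions.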
3. Results
3.1 Behavioral Outcomes
3.1.1 Success Rate:
A linear mixed-effect model showed that the ASD group demonstrated a significantly higher likelihood of collisions compared to the NT group, as indicated by a positive fixed effect estimate for the group comparison (β = 0.206, SE = 0.086, t(51) = 2.379, p = .021).
Notably, in the VarPlus condition, both groups showed a numerical reduction in risk-taking behaviors, such as crossing during intra-group gaps, compared to the Const and VarMinus conditions. However, this reduction did not reach statistical significance in either group (ASD: z = -1.29, p = .20; NT: z = -0.99, p = .32), so the data suggest at most a tendency toward more cautious behavior in the presence of physically expanding crossing gaps.
3.1.2 Crossing Start Time:
We examined the effects of traffic condition and group on pedestrian crossing start time using a two-way mixed-design ANOVA with Condition as a within-subjects factor and Group as a between-subjects factor.
Crossing start time showed a significant main effect of Condition (F(2,96) = 9085.91, p < .001, ηp2 = .995), indicating that crossing decisions varied strongly across traffic conditions. Post-hoc tests on Condition showed that this difference was driven mainly by VarPlus (see Figure 6).
The Group effect was also significant (F(1,48) = 5.997, p = .018, ηp2 = .111), showing that overall, ASD participants started to cross later than NT participants across conditions.
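As a consistency check, partial eta squared can be recovered from an F statistic and its degrees of freedom with the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the two effects reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Condition effect: F(2, 96) = 9085.91
print(round(partial_eta_squared(9085.91, 2, 96), 3))
# Group effect: F(1, 48) = 5.997
print(round(partial_eta_squared(5.997, 1, 48), 3))
```

Both values round to the effect sizes reported in the text (.995 and .111).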
3.1.3 Crossing Duration:
To evaluate potential group differences in motor execution during the road-crossing task, we also analyzed crossing duration, that is, the time participants spent within the potential collision zone (the central area of the crossing, which overlaps with the trajectory of the approaching vehicles).
We found no significant group difference in crossing duration (F(1, 669) = 0, p = .99), suggesting that motor execution was intact in the ASD group once crossing had started.
3.1.4 Current Deviation:
We examined current deviation (a behavioral measure reflecting whether pedestrians were adjusting their speed appropriately relative to vehicle positions) using a mixed-design ANOVA with Group as a between-subjects factor and Condition and Time (–5 s to 0 s relative to crossing midpoint) as within-subjects factors. The results revealed no significant main effect of Group (F(1, 6) = .007, p = .934), no significant main effect of Condition (F(2, 12) = .693, p = .519), and no significant Group × Condition interaction (F(2, 12) = .133, p = .877).
Descriptively, some group- and condition-level differences were observed (see Figure 7), but these did not reach statistical significance. One potential reason for the null results lies in data quality: the current deviation metric is calculated from the instantaneous spatial relation between pedestrians and approaching vehicles. In our simulation, vehicle position data were updated at 4-meter intervals, which introduced temporal imprecision and resulted in sparse or missing current deviation estimates in many trials, especially in the ASD group, where the number of participants was already limited.
To circumvent this limitation, we further analyzed pedestrian crossing speed, which does not depend on vehicle position data and thus offers a more continuous behavioral time course.
3.1.5 Pedestrian Crossing Speed:
The pedestrian crossing speed can be interpreted as the slope in Figure 7. A mixed-design ANOVA on pedestrian speed revealed a significant main effect of Condition, F(2, 94) = 3.43, p = .036, ηp2 = .068, suggesting that, despite the limitations of the current deviation measure, time-resolved modulations of movement speed captured condition-dependent strategies in crossing decisions. However, follow-up pairwise comparisons did not yield statistically significant differences. This discrepancy may be attributed to the violation of the sphericity assumption (as indicated by Mauchly’s test), which required corrections (e.g., Greenhouse-Geisser), thereby reducing the sensitivity of subsequent post-hoc tests.
3.1.6 Temporal Error:
The temporal error represents the deviation from the ideal mid-gap moment. For our 3-second gaps, the ideal mid-gap moment is at 1.5 seconds. A mixed-design ANOVA revealed significant main effects of Condition (F(2,96) = 19.323, p < .001, ηp2 = .287) and of Group (F(1,48) = 6.226, p = .016, ηp2 = .115). Figure 8 shows that ASD participants crossed the mid-point significantly later than NT participants. This result is consistent with the Crossing Start Time finding that ASD participants started crossing later.
3.2 Eye-Tracking and Pupillometry
3.2.1 Gaze Preference:
Since the gaze data were sparse, we used a mixed linear model on gaze projections. With Subject as the random effect, Group, Condition, and Time as fixed-effect variables, we found a main effect of Condition (F(2,40.33) = 9.297, p < .01) as shown in Figure 10.
3.2.2 Pupil Dilation:
Pupillometry (see Figure 11) has been widely used to assess cognitive load and emotional arousal, making it a suitable tool to examine how the three traffic conditions affected participants’ workload in each group.
Pupil dilation response showed that ASD participants were hypo-sensitive across all conditions, supporting hypo-arousal theory (Zhao, Liu, and Wei 2022). However, under VarPlus conditions, ASD participants showed a significant increase in pupil size, suggesting heightened cognitive demand under uncertainty.
3.3 fNIRS Findings
3.3.1 HbO in Brain ROI:
3.3.2 ERP Waveform:
Distinct activation patterns were observed in frontal cortical areas, aligning with behavioral findings.
Within the ASD group, the VarPlus condition was associated with decreased activation in the left parietal cortex. This pattern may suggest a reduced demand for social inference or conflict monitoring, potentially reflecting the predictability or perceived cooperativeness of the autonomous agent. In contrast, the VarMinus condition elicited stronger activation in the ASD group, possibly due to increased vigilance or uncertainty.
4. Discussion
Findings highlight the need to consider neurodiverse cognitive strategies in AV-pedestrian communication. Autistic pedestrians showed greater hesitation and sensitivity to speed variability, consistent with predictive coding models of autism. Technology design should avoid relying solely on social cues (e.g., gestures) and incorporate dynamic, explicit, and low-ambiguity signals that better match autistic processing preferences.
This study also raises concerns about the generalizability of AV safety assessments if neurodiverse behaviors are not accounted for. Integrating multimodal physiological signals into AV learning algorithms may improve safety and inclusivity.
Limitations: statistical power was limited in the ASD group; the groups were not matched on age; and NT participants were pre-selected for extreme smart-tool proneness scores.
5. Conclusion
Autistic individuals exhibit distinct behavioral and cognitive patterns in crossing decisions involving AVs. Their challenges do not stem from motor execution but from upstream decision processes tied to sensory processing and arousal. Designing AV systems that accommodate such differences can promote safer, more inclusive mobility ecosystems.
Declaration of Competing Interest
The authors declare no competing interests.
Funding
This research was supported by public funding from the Délégation à la sécurité routière (DSR) (Grant no. xxxxx) awarded to Jordan Navarro.
Declaration of Generative AI in the Writing Process
During the preparation of this article, ChatGPT was used to assist with language refinement and formatting. The final content was reviewed and verified by the authors.
Data Availability
Data underlying this article will be made available upon request due to privacy constraints regarding sensitive participant information.
References
Citation
@online{huang2025,
author = {Huang, Wenjie and Gaujoux, Vivien and Fournel, Arnaud and
Cegarra, Julien and Reynaud, Emanuelle and Navarro, Jordan},
title = {Crossing {Differently:} {Divergent} {Cognitive} and {Visual}
{Strategies} in {Autistic} and {Neurotypical} {Pedestrians} {Facing}
{Autonomous} {Vehicles}},
volume = {x},
number = {x},
date = {2025-05-05},
doi = {xxxxx},
langid = {en},
abstract = {The rise of autonomous vehicles (AVs) necessitates a
deeper understanding of how diverse populations, particularly those
with atypical cognitive profiles, perceive and interact with these
systems. This study examined how autistic and neurotypical (NT)
adults approached street crossing in a 3D simulated AV environment.
Behavioral data, eye-tracking, and functional near-infrared
spectroscopy (fNIRS) were integrated to evaluate decision-making,
visual attention, and neural activation. Results revealed
significantly different success rates and crossing strategies, with
autistic individuals exhibiting delayed decision initiation and
reduced autonomic arousal as indexed by pupil dilation. Despite
these differences, crossing duration post-initiation remained
similar. Neuroimaging data supported these findings, indicating
distinct cognitive processing during crossing judgments. This study
underscores the need for inclusive urban mobility design and
highlights potential avenues for tailoring AV communication to
accommodate neurodiversity.}
}










