Kiki van der Heijden

Assistant Professor - Marie-Curie Research Fellow Columbia University

In my research, I investigate the computational mechanisms underlying neural encoding of sound location in naturalistic spatial hearing in normal and hearing-impaired listeners. To achieve this, I use an interdisciplinary approach combining cognitive neuroscience and computational modeling: I develop neurobiological-inspired deep neural network models of processing of sound location in subcortical auditory structures and the auditory cortex. Building on the resulting insights, I aim to optimize signal processing algorithms in cochlear implants to boost neural spatial auditory processing in cochlear implant users.

Heijden, K., & Mehrkanoon, S. (2022). Goal-driven, neurobiological-inspired convolutional neural network models of human spatial hearing. Neurocomputing, 470, 432-442

Abstract taken from Google Scholar:

The human brain effortlessly solves the complex computational task of sound localization using a mixture of spatial cues. How the brain performs this task in naturalistic listening environments (e.g. with reverberation) is not well understood. In the present paper, we build on the success of deep neural networks at solving complex and high-dimensional problems [1] to develop goal-driven, neurobiological-inspired convolutional neural network (CNN) models of human spatial hearing. After training, we visualize and quantify feature representations in intermediate layers to gain insights into the representational mechanisms underlying sound location encoding in CNNs. Our results show that neurobiological-inspired CNN models trained on real-life sounds spatialized with human binaural hearing characteristics can accurately predict sound location in the horizontal plane. CNN localization acuity across the azimuth …

Go to article

Patel, P., Heijden, K., Bickel, S., Herrero, J., Mehta, A., & Mesgarani, N. (2022). Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception. Current Biology,

Abstract taken from Google Scholar:

How the human auditory cortex represents spatially separated simultaneous talkers and how talkers’ locations and voices modulate the neural representations of attended and unattended speech are unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl’s gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker’s location changed the mean response level, whereas the talker’s spectrotemporal features altered the variation of response around response's baseline. These components were differentially modulated by the attended talker’s voice or …

Go to article

Kuang, S., Heijden, K., & Mehrkanoon, S. (2022). BAST: Binaural Audio Spectrogram Transformer for Binaural Sound Localization. arXiv preprint arXiv:2207.03927,

Abstract taken from Google Scholar:

Accurate sound localization in a reverberation environment is essential for human auditory perception. Recently, Convolutional Neural Networks (CNNs) have been utilized to model the binaural human auditory pathway. However, CNN shows barriers in capturing the global acoustic features. To address this issue, we propose a novel end-to-end Binaural Audio Spectrogram Transformer (BAST) model to predict the sound azimuth in both anechoic and reverberation environments. Two modes of implementation, i.e. BAST-SP and BAST-NSP corresponding to BAST model with shared and non-shared parameters respectively, are explored. Our model with subtraction interaural integration and hybrid loss achieves an angular distance of 1.29 degrees and a Mean Square Error of 1e-3 at all azimuths, significantly surpassing CNN based model. The exploratory analysis of the BAST's performance on the left-right hemifields and anechoic and reverberation environments shows its generalization ability as well as the feasibility of binaural Transformers in sound localization. Furthermore, the analysis of the attention maps is provided to give additional insights on the interpretation of the localization process in a natural reverberant environment.

Go to article

Heijden, K., Formisano, E., Valente, G., Zhan, M., Kupers, R., & Gelder, B. (2020). Reorganization of sound location processing in the auditory cortex of blind humans. Cerebral Cortex, 30(3), 1103-1116

Abstract taken from Google Scholar:

Auditory spatial tasks induce functional activation in the occipital—visual—cortex of early blind humans. Less is known about the effects of blindness on auditory spatial processing in the temporal—auditory—cortex. Here, we investigated spatial (azimuth) processing in congenitally and early blind humans with a phase-encoding functional magnetic resonance imaging (fMRI) paradigm. Our results show that functional activation in response to sounds in general—independent of sound location—was stronger in the occipital cortex but reduced in the medial temporal cortex of blind participants in comparison with sighted participants. Additionally, activation patterns for binaural spatial processing were different for sighted and blind participants in planum temporale. Finally, fMRI responses in the auditory cortex of blind individuals carried less information on sound azimuth position than those in sighted individuals …

Go to article

Lambriks, L., Hoof, M., Debruyne, J., Janssen, M., Chalupper, J., Heijden, K., Hof, J., Hellingman, C., George, E., & Devocht, E. (2020). Evaluating hearing performance with cochlear implants within the same patient using daily randomization and imaging-based fitting-The ELEPHANT study. Trials, 21(1), 1-14

Abstract taken from Google Scholar:

Prospective research in the field of cochlear implants is hampered by methodological issues and small sample sizes. The ELEPHANT study presents an alternative clinical trial design with a daily randomized approach evaluating individualized tonotopical fitting of a cochlear implant (CI). A single-blinded, daily-randomized clinical trial will be implemented to evaluate a new imaging-based CI mapping strategy. A minimum of 20 participants will be included from the start of the rehabilitation process with a 1-year follow-up period. Based on a post-operative cone beam CT scan (CBCT), mapping of electrical input will be aligned to natural place-pitch arrangement in the individual cochlea. The CI’s frequency allocation table will be adjusted to match the electrical stimulation of frequencies as closely as possible to corresponding acoustic locations in the cochlea. A randomization scheme will be implemented whereby the participant, blinded to the intervention allocation, crosses over between the experimental and standard fitting program on a daily basis, and thus effectively acts as his own control, followed by a period of free choice between both maps to incorporate patient preference. With this new approach the occurrence of a first-order carryover effect and a limited sample size is addressed. The experimental fitting strategy is thought to give rise to a steeper learning curve, result in better performance in challenging listening situations, improve sound quality, better complement residual acoustic hearing in the contralateral ear and be preferred by recipients of a CI. Concurrently, the suitability of the novel trial design will be considered in investigating …

Go to article

Heijden, K., & Mehrkanoon, S. (2020). Modelling human sound localization with deep neural networks. Proceedings of the Twenty-Eight European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN).,

Abstract taken from Google Scholar:

How the brain transforms binaural, real-life sounds into a neural representation of sound location is unclear. This paper introduces a deep learning approach to address these neurocomputational mechanisms: We develop a biological-inspired deep neural network model of sound azimuth encoding operating on auditory nerve representations of real-life sounds. We explore two types of loss functions: Euclidean distance and angular distance. Our results show that a network resembling the early stages of the human auditory pathway can predict sound azimuth location. The type of loss function modulates spatial acuity in different ways. Finally, learning is independent of environment-specific acoustic properties.

Go to article

Heijden, K., Rauschecker, J., Gelder, B., & Formisano, E. (2019). Cortical mechanisms of spatial hearing. Nature Publishing Group

Abstract taken from Google Scholar:

Humans and other animals use spatial hearing to rapidly localize events in the environment. However, neural encoding of sound location is a complex process involving the computation and integration of multiple spatial cues that are not represented directly in the sensory organ (the cochlea). Our understanding of these mechanisms has increased enormously in the past few years. Current research is focused on the contribution of animal models for understanding human spatial audition, the effects of behavioural demands on neural sound location encoding, the emergence of a cue-independent location representation in the auditory cortex, and the relationship between single-source and concurrent location encoding in complex auditory scenes. Furthermore, computational modelling seeks to unravel how neural representations of sound source locations are derived from the complex binaural waveforms of real …

Go to article

Vaessen, M., Heijden, K., & Gelder, B. (2019). Decoding of emotion expression in the face, body and voice reveals sensory modality specific representations.. bioRxiv, , 869578

Abstract taken from Google Scholar:

A central issue in affective science is whether the brain represents the emotional expressions of faces, bodies and voices as abstract categories in which auditory and visual information converge in higher order conceptual and amodal representations. This study explores an alternative theory based on the hypothesis that under naturalistic conditions where affective signals are acted upon that rather than reflected upon, major emotion signals (face, body, voice) have sensory specific brain representations. During fMRI recordings, participants were presented naturalistic dynamic stimuli of emotions expressed in videos of either the face or the whole body, or voice fragments. To focus on automatic emotion processing and bypass explicit emotion cognition relying on conceptual processes, participants performed an unrelated target detection task presented in a different modality than the stimulus. Using multivariate analysis to asses neural activity patterns in response to emotion expressions in the different stimuli types, we show a distributed brain organization of affective signals in which distinct emotion signals are closely tied to the sensory origin. Our findings are consistent with the notion that under ecological conditions the various sensory emotion expressions have different functional roles, even when from an abstract conceptual vantage point they all exemplify the same emotion category.

Go to article

Heijden, K., Rauschecker, J., Formisano, E., Valente, G., & Gelder, B. (2018). Active sound localization sharpens spatial tuning in human primary auditory cortex. Journal of Neuroscience, 38(40), 8574-8587

Abstract taken from Google Scholar:

Spatial hearing sensitivity in humans is dynamic and task-dependent, but the mechanisms in human auditory cortex that enable dynamic sound location encoding remain unclear. Using functional magnetic resonance imaging (fMRI), we assessed how active behavior affects encoding of sound location (azimuth) in primary auditory cortical areas and planum temporale (PT). According to the hierarchical model of auditory processing and cortical functional specialization, PT is implicated in sound location (“where”) processing. Yet, our results show that spatial tuning profiles in primary auditory cortical areas (left primary core and right caudo-medial belt) sharpened during a sound localization (“where”) task compared with a sound identification (“what”) task. In contrast, spatial tuning in PT was sharp but did not vary with task performance. We further applied a population pattern decoder to the measured fMRI activity …

Go to article

Derey, K., Rauschecker, J., Formisano, E., Valente, G., & Gelder, B. (2017). Localization of complex sounds is modulated by behavioral relevance and sound category. The Journal of the Acoustical Society of America, 142(4), 1757-1773

Abstract taken from Google Scholar:

Meaningful sounds represent the majority of sounds that humans hear and process in everyday life. Yet studies of human sound localization mainly use artificial stimuli such as clicks, pure tones, and noise bursts. The present study investigated the influence of behavioral relevance, sound category, and acoustic properties on the localization of complex, meaningful sounds in the horizontal plane. Participants localized vocalizations and traffic sounds with two levels of behavioral relevance (low and high) within each category, as well as amplitude-modulated tones. Results showed a small but significant effect of behavioral relevance: localization acuity was higher for complex sounds with a high level of behavioral relevance at several target locations. The data also showed category-specific effects: localization biases were lower, and localization precision higher, for vocalizations than for traffic sounds in central space …

Go to article