Research Priorities

Our auditory sense is important for extracting information about the state of objects, humans, and other beings in the world around us. In an iterative perception-and-action loop, we use the knowledge gained from a scene to interact meaningfully with our environment. In this way, humans can perform various perceptual and cognitive tasks such as object recognition and localization, scene analysis, speech understanding, focusing attention, learning, communication, or physical interaction with specific objects. Importantly, in daily life, acoustic information is often accompanied by visual information that complements performance in many of these perceptual and cognitive tasks.
Much of our understanding of auditory perception and cognition has been obtained in experiments with very basic stimuli and experimental paradigms. Such "reduced-reality" lab-based research has greatly supported our understanding of the basic principles involved in auditory perception and cognition. Nevertheless, the acoustic environments of daily life differ substantially from those employed in many of these basic studies. They differ, for example, with respect to the use of individual vs. non-individual Head-Related Transfer Functions (HRTFs), the inclusion of realistic room acoustics, the ability to interact with the environment through the rendering of head movements and interactions beyond them, and the presence of a more realistic visual environment. Although various studies have included some of these aspects, very few have been conducted with the purpose of testing and extending the basic understanding and theories developed in the field of auditory perception and cognition and of applying this knowledge to more realistic daily-life situations and complex scenes.
At the same time, VR technology has matured significantly over the last decade thanks to new technological and methodological insights and the availability of affordable high-performance hardware. This is true both for the visual rendering of rich virtual reality, for example in a CAVE-like environment or via Head-Mounted Displays, and for the rendering of acoustic information from multiple moving sources, including head and person tracking with real-time adaptation of the scene.

The priority program AUDICTIVE focuses on the research efforts of scholars from three different scientific disciplines. To foster the interdisciplinary work that is one of the core motivations of this priority program, projects should include at least two different disciplines in tandem projects. Since acoustics - in the form of acoustic scene capture, processing, rendering, presentation, perception, and evaluation - is central to all these efforts, this discipline will need to be included in each of the tandem projects (a more detailed description of the structure is given in the "Call for Proposals").
The scope of research questions that could benefit from realistic audiovisual environments is considerable, ranging from basic perceptual issues regarding spatial perception to quality assessment of interactive audiovisual virtual environments. All of these will require knowledge of acoustics and auditory perception to analyze the acoustic stimuli that are utilized. To achieve benefits and synergies among the projects running within the priority program AUDICTIVE, the scope of research questions is therefore limited.
To exploit this potential, the priority program AUDICTIVE will provide a framework targeting three main research priorities:

  1. Auditory cognition

    Recent theories of auditory cognition and extant empirical findings were predominantly derived from studies with “simple” audio reproduction techniques (such as mono/dichotic presentation without spatial cues). Projects in AUDICTIVE shall investigate to what extent these empirical findings and the corresponding cognitive psychological models (e.g., on attention, short-term memory, communication, and scene analysis) hold true - and/or can be validated - in close(r)-to-real-life audiovisual virtual environments. Research questions need to be identified that will benefit from or require the use of interactive VR in experiments on auditory cognition.

      Specific topics integrating acoustics with (a) "auditory cognition":

    • Scene Analysis

      In research on auditory scene analysis, more complex stimuli have often been used to study, for example, spatial release from masking in cocktail-party settings. Mostly, these studies have been conducted with non-individualized Head-Related Transfer Functions (HRTFs) or Binaural Room Impulse Responses (BRIRs) in binaural re-synthesis via headphones and without head tracking (a minimal sketch of such a static binaural rendering is given after this topic list). Due to the use of non-individualized HRTFs/BRIRs as well as the lack of appropriate visual information and head tracking, sound sources are in most cases not perceived as externalized. Since auditory scene analysis deals with the ability of humans to focus attention on one source and to process acoustic information about that source only, it is expected that a more comprehensive provision of audiovisual spatial cues for the scene elements (within a user-centered frame of reference) will affect the ability to perform research on auditory scene analysis in more real-life-like, multimodal settings. With current virtual environments, it is possible to provide auditory and visual cues that are more realistic and better able to elicit externalization. In AUDICTIVE, research in this area will focus on the effect of audiovisual spatialization cues, enabling a better understanding of auditory scene analysis in more daily-life settings.
    • Basic cognitive functions and VR as a validated research tool

      Experimental findings show that an individualized and interactive presentation of both relevant and irrelevant auditory stimuli is decisive for precise and valid statements on auditory selective attention. There is strong evidence that the same can be assumed for short-term memory performance during the mandatory processing of certain background sounds. Other basic cognitive functions (e.g., reasoning, executive control) have not yet been considered or tested for any dependency on simple vs. advanced acoustic reproduction methods during auditory stimulus presentation. Here, AUDICTIVE can contribute to the theoretical modelling of basic cognitive functions and of the role that certain characteristics of the auditory input play. In doing so, critical knowledge is also gained about the sound features and auditory cues that need to be reproduced in interactive acoustic VR for it to serve as a validated research tool in cognitive psychology for the basic cognitive functions under consideration.
    • Complex cognitive performances, e.g., communication

      Communication is a complex performance that requires the successful interaction of different processes and functions. For example, the words and sentences of spoken messages need to be perceived and cognitively processed, affective (emotional) connotations need to be extracted, and the intended messages of the communication partners and one's own goals may need to be compared and considered at a social level, as may the personal characteristics and social status of the interaction partners. These aspects, and more, accumulate in speech-based communication among multiple persons, for example in a cocktail-party setting, a business meeting, or an audio conference. In AUDICTIVE, research might focus on the effect of auditory and audiovisual conditions on communication behavior, effectiveness, and efficiency as well as on subjective experience. Interactive auditory VR allows the effects of head orientation, room acoustics, and background noise to be explored. As an example, the influence of the interaction between the environment and one's own voice production and behavior (e.g., the Lombard effect, language register) on speech intelligibility in noisy scenarios might be investigated.
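    As an illustration of the binaural re-synthesis referred to under "Scene Analysis" above, the following minimal Python sketch renders a mono source for headphone playback by convolving it with a single, fixed (and typically non-individualized) HRIR pair; because the pair is never updated from head tracking, the rendering cannot respond to head movements, which is one reason why externalization often suffers. The file names and the chosen HRIR data are hypothetical placeholders, not material prescribed by the program.

        import numpy as np
        from scipy.signal import fftconvolve
        import soundfile as sf  # assumed to be available for WAV input/output

        def render_binaural_static(mono, hrir_left, hrir_right):
            """Convolve a mono signal with one fixed HRIR pair (no head tracking)."""
            left = fftconvolve(mono, hrir_left)
            right = fftconvolve(mono, hrir_right)
            out = np.stack([left, right], axis=-1)    # shape: (samples, 2)
            return out / np.max(np.abs(out))          # normalize to avoid clipping

        # hypothetical mono source and a non-individualized HRIR pair for one direction
        mono, fs = sf.read("speech_mono.wav")
        hrir_l, _ = sf.read("hrir_az30_el0_left.wav")
        hrir_r, _ = sf.read("hrir_az30_el0_right.wav")

        binaural = render_binaural_static(mono, hrir_l, hrir_r)
        sf.write("speech_binaural_static.wav", binaural, fs)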
  2. Interactive virtual environments

    AUDICTIVE projects shall investigate how the realism and vibrancy of virtual environments can be significantly improved, both in terms of audiovisual representation and user interaction, by means of innovative acoustic or audiovisual technology and findings from research on auditory perception and cognition. Projects in AUDICTIVE are expected to raise research on the design, development, and evaluation of audiovisual virtual spaces to a significantly higher level when inspired and validated by auditory cognition science. In particular, projects are welcome that investigate how user experience, behavior, and performance in virtual environments are affected by advanced audiovisual cues (e.g., ultimately involving conversational agents) and by results from research on auditory cognition.

      Specific topics integrating acoustics with (b) "interactive virtual environments":

    • Spatial and timbral perception

      One of the principal qualities of virtual environments is that they can provide spatial information about object locations (for both the auditory and the visual sense), presenting auditory and visual cues in a close-to-natural manner. Coloration is another important feature, as most virtual acoustics systems produce coloration that depends on the virtual source position, which may have a severe impact on perceived quality (a minimal sketch of such a coloration measure is given after this topic list). Auditory spatial perception has been researched intensively over many decades, but less so in virtual environments with more realistic auditory and visual cues. Here, knowledge of how to connect basic theories of spatial hearing with spatial/timbral perception in complex and more realistic environments is still lacking, and as a result, truly realistic virtual acoustics cannot yet be provided. Hence, the integration of VR-based auditory cognition research is expected to improve basic knowledge of spatial/timbral perception in more realistic environments.
    • Navigation and interaction in audiovisual environments

      Prior research on acoustic virtual reality has led to technical innovations and progress such as dynamic binaural analysis and synthesis techniques, sound field synthesis, and real-time room acoustics rendering, among others, and to the integration of these techniques into virtual reality systems and applications that have so far focused mainly on gaming or military scenarios. Only sporadic, mostly isolated research has been carried out to explore the influence of these techniques and their quality on users' interaction and navigation performance, e.g., in terms of search tasks, object manipulation tasks, or spatial awareness. AUDICTIVE will include projects that complement computer science perspectives with expertise in technical acoustics and cognitive psychology, with the goal of systematically enhancing spatial user interfaces in audiovisual virtual environments and of thoroughly evaluating these in terms of cognitive performances.
    • Enhancing social virtual reality

      Social presence refers to the subjective experience of social interaction with real, living humans while in fact communicating (verbally and by means of body language) with virtual agents or avatars. Due to recent progress in the real-time simulation of virtual humans, the development of embodied conversational agents that communicate with users via speech has come within reach. Here, advanced audio techniques, combined with findings from cognitive psychology, promise to contribute significantly to audiovisual communication and interaction with embodied conversational agents, explicitly considering the interplay of speech, facial expressions, and gestures. In particular, the effect of the audiovisually induced subjective presence of other characters in a scene on different cognitive performances in interactive scenarios needs to be investigated, as do the audiovisual cues that are most relevant for social presence. Ultimately, this will lead to virtual environments and VR applications with a level of realism and illusion of social presence that has not been feasible before, with applications in teleconferencing, psychological therapy, social training, and more.
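    As an illustration of the position-dependent coloration mentioned under "Spatial and timbral perception" above, the following minimal Python sketch expresses coloration as the deviation of octave-band spectral levels of a rendered signal from those of a reference signal. The band layout and file names are hypothetical assumptions, not a measure prescribed by the program.

        import numpy as np
        import soundfile as sf  # assumed to be available for WAV input

        def band_levels_db(signal, fs, band_edges):
            """Average spectral magnitude (dB) of a mono signal in each frequency band."""
            spectrum = np.abs(np.fft.rfft(signal))
            freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
            levels = []
            for lo, hi in band_edges:
                band = spectrum[(freqs >= lo) & (freqs < hi)]
                levels.append(20 * np.log10(np.mean(band) + 1e-12))
            return np.array(levels)

        # octave-wide bands around standard center frequencies (illustrative choice)
        centers = [125, 250, 500, 1000, 2000, 4000, 8000]
        edges = [(c / np.sqrt(2), c * np.sqrt(2)) for c in centers]

        reference, fs = sf.read("source_reference.wav")   # hypothetical mono files
        rendered, _ = sf.read("source_rendered_az90.wav")

        coloration = band_levels_db(rendered, fs, edges) - band_levels_db(reference, fs, edges)
        for c, d in zip(centers, coloration):
            print(f"{c:5d} Hz band: {d:+.1f} dB deviation from reference")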
  3. Quality evaluation methods

    Reliable and valid methods will be used to validate (a) the results obtained in the area of auditory cognition and (b) the suitability of perception- and cognition-optimized VR systems for addressing certain research questions. Proposed work may adapt or extend existing quality-related measures and develop test methods to evaluate VR systems applied in auditory cognition research. Projects in AUDICTIVE will investigate how methods from auditory cognition research can be applied to assess the quality of audiovisual VR in a more holistic manner. Furthermore, AUDICTIVE aims to develop methods to quantify the added value of interactive VR for auditory cognition research.

      Specific topics integrating acoustics and VR with (c) "quality evaluation methods":

    • Methods to assess quality of interactive virtual environments for auditory cognition

      Various, partly interacting VR-system, environment, human, and context factors affect the cognitive performances investigated in this priority program. To account for this, the priority program will develop test methods that allow validated research on auditory cognition within interactive virtual environments. Such test methods are needed to ensure that the employed audiovisual VR technology is adequate for the targeted auditory cognition research, addressing complex virtual-reality scenarios that are closer to everyday situations. In particular, the evaluation methods used in AUDICTIVE's research must be able to accurately address cognitive performances in complex environments, involving different cognitive functions and processes including short- and longer-term effects (scene analysis, attention, memory, etc.), and must ensure that VR technology can serve as a controllable research tool to achieve this aim. Here, existing research paradigms from classical perception and auditory cognition research will need to be transformed into methods for the systematic evaluation of virtual environments. As a consequence, besides methods for VR evaluation and qualification for auditory cognition research, this work will provide insights into the ways in which certain factors of VR systems interact with auditory cognition. Such information is of high value to the VR community.
    • Quality predictors of VR system suitability

      To complement or even partially replace QoE-related tests with human subjects, a number of media-technology research activities have been conducted to develop instrumental methods for quality and QoE assessment. As a result, different models for audio, speech, image, and video quality assessment are available today. However, even for spatial audio, current models fall short of accurately predicting human perception or quality evaluation. For VR, developing prediction models is an even more challenging task, and currently no instrumental model is available that has been validated for assessing specific aspects of VR systems. The AUDICTIVE program may provide relevant knowledge and instrumental models or metrics to help validate VR systems. In particular, research on the relation of model predictions or metrics to cognitive performances (a minimal sketch of such a correlation analysis is given after this topic list) will provide significant advancement in terms of automatic and technology-aided system evaluation from a perception-and-cognition-oriented perspective and will contribute to translating knowledge from auditory cognition research into instrumental models.
    • Quality requirements of virtual environments

      The AUDICTIVE program will provide knowledge on VR-system limitations with regard to empirical research on different cognitive performances and on the ways in which such systems need to be designed to actually enable the respective cognition research. Systematic, thorough research on which acoustical factors contribute to the overall quality of experience, presence, and interaction performance in virtual environments, and how they do so, is still missing. Collectively, the studies of the AUDICTIVE program on cognition-centric VR evaluation will contribute to a sustainable taxonomy of audiovisual VR. This will enable VR systems to be used as validated research components and, at the same time, lead to perceptually and cognitively validated VR technology. The results will be used to create guidelines for VR system design and configuration to validly address specific research tasks.
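    As an illustration of relating instrumental model output to perceptual or cognitive test results, as discussed under "Quality predictors of VR system suitability" above, the following minimal Python sketch computes the Pearson correlation and, after a first-order mapping, the RMSE between predicted and observed scores. All numbers are hypothetical placeholders, not results of the program.

        import numpy as np

        # hypothetical per-condition data: instrumental model predictions and mean
        # subjective (or cognitive-performance) scores for eight VR test conditions
        predicted = np.array([1.8, 2.4, 2.9, 3.3, 3.6, 4.0, 4.3, 4.6])
        observed = np.array([2.0, 2.2, 3.1, 3.0, 3.8, 3.9, 4.5, 4.4])

        # Pearson correlation between predictions and observations
        r = np.corrcoef(predicted, observed)[0, 1]

        # map predictions onto the rating scale with a first-order fit, then compute RMSE
        slope, intercept = np.polyfit(predicted, observed, deg=1)
        rmse = np.sqrt(np.mean((slope * predicted + intercept - observed) ** 2))

        print(f"Pearson r = {r:.2f}, RMSE after linear mapping = {rmse:.2f}")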