Ageing is associated with elevated pure-tone thresholds, accompanied by increased difficulty in understanding speech in noise. While amplification provides important but insufficient support, auditory-cognitive training (ACT) might offer additional benefit. However, generalized training effects have been scarce, highlighting the need for training designs that target improvements in real-world listening. We addressed this issue by designing a short-term ACT in a purely auditory and a virtual multisensory environment, targeting the integration of sensory and cognitive processing of natural speech. Forty healthy older participants with varying hearing and cognitive capacities were exposed to both training conditions (cross-over design), while speech-in-noise perception was measured before and after each session. Immersive ACT exposure resulted in improved speech-in-noise perception, particularly for individuals with more pronounced hearing loss or reduced auditory working memory capacity. These results demonstrate that training the integration of sensory and cognitive demands, particularly in a multisensory environment, has the potential to improve real-world listening.