To what extent do brain dynamics during music listening support contextual representations that serve continuous prediction? Deep neural networks offer neuroscience a tool to investigate how higher-order representations emerge from complex sensory data under simple prediction objectives.
Previous models of musical expectation focus on state transitions between discrete, symbolic musical events. This limits corpus diversity, the range of musical features, and the possibility of probing higher-order representational structure.
Here, we demonstrate that meaningful high-level representations emerge in deep generative models trained to capture musical statistics from raw audio. These representations become structured across distinct rhythmic timescales, suggesting that multiple levels of contextual integration serve next-moment prediction.
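As a minimal sketch of this modeling idea (not the authors' architecture): a toy recurrent network trained under a next-frame prediction objective over audio features, whose hidden states play the role of the contextual representations that can later be probed for timescale structure. The GRU stand-in, names, and shapes below are all assumptions.

```python
import torch
import torch.nn as nn

# Toy stand-in for a deep generative audio model: a GRU trained to
# predict the next frame of an audio feature sequence (hypothetical
# architecture, not the paper's model).
class NextFramePredictor(nn.Module):
    def __init__(self, n_features=64, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):
        h, _ = self.rnn(x)       # h: contextual representation at each frame
        return self.head(h), h   # next-frame prediction + hidden states

model = NextFramePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy audio features (batch, time, features); real work would use
# spectrogram frames or raw-audio embeddings.
x = torch.randn(8, 500, 64)

pred, hidden = model(x)
loss = nn.functional.mse_loss(pred[:, :-1], x[:, 1:])  # next-moment objective
opt.zero_grad()
loss.backward()
opt.step()

# `hidden` now holds learned contextual representations that can be
# examined for structure across rhythmic timescales (e.g., via their
# autocorrelation or spectral content).
```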
The brain appears to distribute similar predictive representations across nested timescales of activity, as evidenced by brain-model alignment at multiple harmonics of the musical beat. Given that rhythmic alignment between model and brain increases with longer context sizes, prior information may contribute to neural representations in delta-theta activity during naturalistic music listening, over and above acoustic tracking.
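A hedged sketch of one way such a brain-model alignment analysis could look: a cross-validated ridge encoding model relating model activations, extracted under short versus long context, to delta-theta band-limited neural activity. The band edges (1-8 Hz), the 100 Hz sampling rate, and the placeholder data are assumptions, not the paper's pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

fs = 100  # Hz; assumed sampling rate of the neural signal after downsampling
rng = np.random.default_rng(0)

# Band-limit a (placeholder) single-channel neural signal to delta-theta;
# 1-8 Hz edges are an assumption.
b, a = butter(4, [1, 8], btype="bandpass", fs=fs)
neural = filtfilt(b, a, rng.standard_normal(6000))

def alignment_score(model_feats, neural):
    """Cross-validated encoding-model fit: how well model activations
    predict neural activity (mean R^2 across folds)."""
    ridge = RidgeCV(alphas=np.logspace(-2, 4, 13))
    return cross_val_score(ridge, model_feats, neural, cv=5).mean()

# Placeholder activations; in practice these would come from the trained
# model run with shorter vs. longer context windows.
feats_short = rng.standard_normal((6000, 128))
feats_long = rng.standard_normal((6000, 128))

print("short context:", alignment_score(feats_short, neural))
print("long context: ", alignment_score(feats_long, neural))
```

Under the paper's claim, the long-context features would yield the higher cross-validated fit in the delta-theta band; with the random placeholders above, both scores hover near chance.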
By Arun Asthagiri and Psyche Loui