Language Reconstruction with Brain Predictive Coding from fMRI Data
arXiv:2405.11597v2 Announce Type: replace
Abstract: Many recent studies have shown that the perception of speech can be decoded from brain signals and subsequently reconstructed as continuous language. However, there is a lack of neurological basis for how the semantic information embedded within brain signals can be used more effectively to guide language reconstruction. Predictive coding theory suggests the human brain naturally engages in continuously predicting future words that span multiple timescales. This implies that the decoding of brain signals could potentially be associated with a predictable future. To explore the predictive coding theory within the context of language reconstruction, this paper proposes \textsc{PredFT}~(\textbf{F}MRI-to-\textbf{T}ext decoding with \textbf{Pred}ictive coding). \textsc{PredFT} consists of a main network and a side network. The side network obtains brain predictive representation from related regions of interest~(ROIs) with a self-attention module. The representation is then fused into the main network for continuous language decoding. Experiments on two naturalistic language comprehension fMRI datasets show that \textsc{PredFT} outperforms current decoding models on several evaluation metrics.