Abstract
Recent studies on visual neural encoding and decoding have made significant progress, benefiting from the latest advances in deep neural networks with powerful representations. However, two challenges remain. First, current decoding algorithms based on deep generative models often suffer from information loss, which can lead to blurry reconstructions. Second, most studies model the neural encoding and decoding processes separately, neglecting the inherent dual relationship between the two tasks. In this article, we propose a novel neural encoding and decoding method with a two-stage flow-based invertible generative (FLIG) model to tackle these issues. First, a convolutional autoencoder (CAE) is trained to bridge the stimulus space and the feature space. Second, an adversarial cross-modal normalizing flow is trained to build a bijective transformation between image features and neural signals, with local and global constraints imposed on the latent space to achieve cross-modal alignment. Combining the flow-based generator with the autoencoder, the method achieves bidirectional generation of visual stimuli and neural responses. The FLIG model can minimize information loss and unify neural encoding and decoding in a single framework. Experimental results on different neural signals, including spike signals and functional magnetic resonance imaging (fMRI), demonstrate that our model achieves the best comprehensive performance among the compared models.
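The bijective mapping mentioned in the abstract can be illustrated with a minimal sketch of an affine coupling layer, a common building block of normalizing flows. This is not the FLIG architecture itself: the abstract does not say which flow layers the authors use, and the names `coupling_forward`, `coupling_inverse`, `toy_scale`, and `toy_shift` are hypothetical stand-ins (the latter two replace what would be learned networks). The sketch only shows why such a layer can be run in both directions without losing information.

```python
import math

def coupling_forward(x, scale_fn, shift_fn):
    """Affine coupling: pass the first half through, transform the second half."""
    half = len(x) // 2  # assumes an even-length input
    x_a, x_b = x[:half], x[half:]
    s, t = scale_fn(x_a), shift_fn(x_a)
    y_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    return x_a + y_b

def coupling_inverse(y, scale_fn, shift_fn):
    """Exact inverse: the untouched half lets us recompute the same scale and shift."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    s, t = scale_fn(y_a), shift_fn(y_a)
    x_b = [(yb - ti) * math.exp(-si) for yb, si, ti in zip(y_b, s, t)]
    return y_a + x_b

# Hypothetical stand-ins for learned scale and shift networks (assumption).
def toy_scale(h):
    return [math.tanh(v) for v in h]

def toy_shift(h):
    return [0.5 * v for v in h]

features = [0.2, -1.0, 0.7, 1.5]  # toy "image feature" vector
signal = coupling_forward(features, toy_scale, toy_shift)    # one direction
recovered = coupling_inverse(signal, toy_scale, toy_shift)   # the other direction
```

Because the inverse is exact up to floating-point error, `recovered` matches `features`. This is the property that lets a single flow serve both directions of a mapping, in contrast to separately trained encoder and decoder models.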
Original language | English |
---|---|
Pages (from-to) | 724-736 |
Number of pages | 13 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 15 |
Issue number | 2 |
Early online date | 23 May 2022 |
Publication status | Published - Jun 2023 |
Bibliographical note
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant 61976209 and Grant 61906188; in part by the CAS International Collaboration Key Project under Grant 173211KYSB20190024; and in part by the Strategic Priority Research Program of CAS under Grant XDB32040000.
Keywords
- Cross-modal generation
- neural decoding
- neural encoding
- normalizing flow