Abstract
A central challenge facing natural human-computer interaction is understanding how neural circuits process visual perceptual information so that users can operate effectively in complex tasks. Visual coding models exploit the biological characteristics of retinal ganglion cells to provide quantitative predictions of responses to a range of visual stimuli. However, existing visual coding models lack adaptability in natural, complex scenes. This paper therefore proposes an enhanced visual coding model based on collaborative perception. The model first extracts multi-modal spatiotemporal features of the input video to adaptively simulate the response characteristics of the retina. It then uses basis functions to compile the input stimulus into a multi-modal stimulus matrix. Finally, upstream and downstream filters transform the stimulus matrix to generate the spike sequence. Experiments show that the proposed model reproduces the physiological characteristics of ganglion cells in the biological retina, achieving higher accuracy, better adaptability, and stronger biological interpretability than competing models.
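As a rough illustration of the pipeline the abstract outlines (compiling the stimulus into a matrix via basis functions, filtering it, and generating spikes), the sketch below implements a generic linear-nonlinear-Poisson encoder with a raised-cosine temporal basis, a standard formulation in ganglion-cell modeling. This is not the paper's actual model: the functions `raised_cosine_basis`, `compile_stimulus`, and `simulate_spikes`, the softplus nonlinearity, and all parameter values are hypothetical stand-ins chosen only to make the stages of the pipeline concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def raised_cosine_basis(n_basis, n_lags):
    """Raised-cosine temporal basis, a common choice for encoding
    stimulus history (hypothetical stand-in for the paper's basis
    functions). Returns an array of shape (n_basis, n_lags)."""
    t = np.arange(n_lags)
    centers = np.linspace(0, n_lags - 1, n_basis)
    width = (centers[1] - centers[0]) * 2
    phi = np.pi * (t[None, :] - centers[:, None]) / width
    return 0.5 * (1 + np.cos(np.clip(phi, -np.pi, np.pi)))

def compile_stimulus(stimulus, basis):
    """Convolve the stimulus with each basis function, producing a
    compact 'stimulus matrix' of shape (T, n_basis)."""
    n_basis, _ = basis.shape
    T = len(stimulus)
    X = np.zeros((T, n_basis))
    for k in range(n_basis):
        X[:, k] = np.convolve(stimulus, basis[k], mode="full")[:T]
    return X

def simulate_spikes(X, weights, dt=0.001):
    """Linear filter -> softplus nonlinearity -> Poisson spiking."""
    drive = X @ weights
    rate = np.log1p(np.exp(drive))   # nonnegative firing rate (Hz)
    return rng.poisson(rate * dt)    # spike counts per time bin

# Toy usage: white-noise stimulus, random filter weights.
stim = rng.standard_normal(2000)
basis = raised_cosine_basis(n_basis=6, n_lags=50)
X = compile_stimulus(stim, basis)
w = rng.standard_normal(6) * 0.5
spikes = simulate_spikes(X, w)
print(f"{spikes.sum()} spikes over {len(stim)} bins")
```

In a model of the kind the abstract describes, the single random filter `w` would be replaced by learned upstream and downstream filters acting on a multi-modal stimulus matrix; this sketch only shows the shared structure of basis-function compilation followed by filtering and stochastic spike generation.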
| Original language | English |
|---|---|
| Journal | IEEE Transactions on Cognitive and Developmental Systems |
| Early online date | 1 Sept 2022 |
| DOIs | |
| Publication status | E-pub ahead of print - 1 Sept 2022 |
Bibliographical note
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 62072355 and 62072354), the Key Research and Development Program of Shaanxi Province of China (Grant No. 2022KWZ-10), and the Natural Science Foundation of Guangdong Province of China (Grant No. 2022A1515011424).
Keywords
- Visual Coding
- Multi-modal Stimulus
- Feature Compilation
- Nonlinearity