Abstract
Single-stage multi-person human pose estimation (MPPE) methods have shown great performance improvements, but existing methods fail to disentangle features of individual instances in crowded scenes. In this paper, we propose a bounding box-level instance representation learning method called BoIR, which simultaneously solves the instance detection, instance disentanglement and instance-keypoint association problems. Our new instance embedding loss provides a learning signal over the entire area of the image using bounding box annotations, achieving a globally consistent and disentangled instance representation. Our method exploits multi-task learning of bottom-up keypoint estimation, bounding box regression and contrastive instance embedding learning, without additional computational cost during inference. We demonstrate that BoIR outperforms state-of-the-art methods by 0.5 AP on COCO, 4.9 AP on CrowdPose and 3.5 AP on OCHuman.
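To make the idea of a bounding box-supervised contrastive embedding loss more concrete, the following is a minimal NumPy sketch. It is not the loss defined in the paper: the function name `box_contrastive_loss`, the use of mean embeddings as per-instance prototypes, the treatment of background as its own class, the skipping of pixels covered by more than one box, and the `temperature` parameter are all assumptions made for illustration. It only shows how box annotations alone could supply a training signal for every pixel of an embedding map.

```python
import numpy as np

def box_contrastive_loss(emb, boxes, temperature=0.1):
    """Illustrative (hypothetical) contrastive loss over a pixel embedding map.

    emb:   (H, W, D) float array of per-pixel embeddings.
    boxes: list of (x0, y0, x1, y1) integer boxes, upper bounds exclusive.
    """
    h, w, _ = emb.shape
    # L2-normalise so that dot products are cosine similarities.
    emb = emb / (np.linalg.norm(emb, axis=-1, keepdims=True) + 1e-8)

    # Label map: 0 = background, k = k-th box, -1 = overlap (ignored).
    labels = np.zeros((h, w), dtype=np.int64)
    for k, (x0, y0, x1, y1) in enumerate(boxes, start=1):
        region = labels[y0:y1, x0:x1]
        region[...] = np.where(region == 0, k, -1)

    valid = labels >= 0
    feats = emb[valid]        # (N, D) embeddings of unambiguous pixels
    targets = labels[valid]   # (N,) class of each pixel

    # One prototype per class: the normalised mean embedding of its pixels.
    classes = np.unique(targets)
    protos = np.stack([feats[targets == c].mean(axis=0) for c in classes])
    protos /= np.linalg.norm(protos, axis=-1, keepdims=True) + 1e-8

    # Softmax cross-entropy of each pixel against all prototypes.
    logits = feats @ protos.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.searchsorted(classes, targets)
    return float(-log_prob[np.arange(len(targets)), idx].mean())
```

Under these assumptions, the loss is close to zero when pixels inside each box share an embedding that is distinct from the other boxes and from the background, and it grows as the embeddings of different instances become entangled.
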
Original language | English |
---|---|
Title of host publication | The 34th British Machine Vision Conference Proceedings |
Publisher | British Machine Vision Association |
Publication status | Accepted/In press - 25 Aug 2023 |
Event | The 34th British Machine Vision Conference - Aberdeen, United Kingdom Duration: 20 Nov 2023 → 24 Nov 2023 |
Conference
Conference | The 34th British Machine Vision Conference |
---|---|
Country/Territory | United Kingdom |
City | Aberdeen |
Period | 20/11/23 → 24/11/23 |