Geometry Constrained Weakly Supervised Object Localization

Weizeng Lu; Xi Jia; Weicheng Xie; Linlin Shen; Yicong Zhou; Jinming Duan

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator and a classifier. The detector predicts the object location defined by a set of coefficients describing a geometric shape (i.e. ellipse or rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks for the object and background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts object location without any post-processing (e.g. thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
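The two ideas named in the abstract, a mask generated from shape coefficients and a loss that combines area, cross-entropy and negative entropy, can be illustrated with a small NumPy sketch. This is only one plausible reading of the abstract and not the authors' implementation (see the linked repository for that). The function names `ellipse_mask` and `multitask_loss`, the coefficient layout `(cx, cy, a, b, theta)`, the sigmoid `sharpness`, the weight `w_area`, and the choice to apply cross-entropy to the object branch and negative entropy to the background branch are all assumptions made for illustration.

```python
import numpy as np

def ellipse_mask(h, w, cx, cy, a, b, theta, sharpness=20.0):
    """Soft mask in (0, 1): near 1 inside the ellipse, near 0 outside.

    Assumed coefficient layout (hypothetical): centre (cx, cy) and
    semi-axes (a, b) in normalised [0, 1] image coordinates, rotation
    theta in radians.
    """
    ys, xs = np.mgrid[0:h, 0:w]
    # Pixel centres in normalised coordinates, shifted to the ellipse centre.
    x = (xs + 0.5) / w - cx
    y = (ys + 0.5) / h - cy
    # Rotate into the ellipse's own axes.
    xr = np.cos(theta) * x + np.sin(theta) * y
    yr = -np.sin(theta) * x + np.cos(theta) * y
    # d < 1 inside the ellipse, d > 1 outside.
    d = (xr / a) ** 2 + (yr / b) ** 2
    # Smooth step around d = 1; clipping avoids overflow in exp.
    z = np.clip(sharpness * (1.0 - d), -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def multitask_loss(mask, obj_logits, bg_logits, label, w_area=1.0):
    """Area + cross-entropy + negative entropy (illustrative weighting)."""
    eps = 1e-12
    # Area term: fraction of the image covered, pushes the mask to be compact.
    area = mask.mean()
    # Cross-entropy on the object-masked image: the object must stay recognisable.
    ce = -np.log(softmax(obj_logits)[label] + eps)
    # Negative entropy on the background-masked image: minimised when the
    # background prediction is uniform, i.e. no object evidence is left outside.
    p_bg = softmax(bg_logits)
    neg_entropy = np.sum(p_bg * np.log(p_bg + eps))
    return w_area * area + ce + neg_entropy

# Hypothetical usage: a 64x64 mask and made-up 4-class logits.
mask = ellipse_mask(64, 64, cx=0.5, cy=0.5, a=0.3, b=0.2, theta=0.0)
loss = multitask_loss(mask, np.array([4.0, 0.0, 0.0, 0.0]), np.zeros(4), label=0)
```

Because the mask is a smooth function of the five coefficients, a localization can be read directly from them without thresholding a heat map, which is the property the abstract emphasises.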

Original language	English
Title of host publication	16th European Conference Computer Vision – ECCV 2020
Publisher	Springer
Pages	481-496
Publication status	Published - 19 Jul 2020

Bibliographical note

This paper (ID 5424) is accepted to ECCV 2020

Keywords

cs.CV

Access to Document

2007.09727v1

Cite this

@inproceedings{6a5bc95716354684ab8994592f05b1cc,
title = "Geometry Constrained Weakly Supervised Object Localization",
abstract = "We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator and a classifier. The detector predicts the object location defined by a set of coefficients describing a geometric shape (i.e. ellipse or rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks for the object and background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts object location without any post-processing (e.g. thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.",
keywords = "cs.CV",
author = "Weizeng Lu and Xi Jia and Weicheng Xie and Linlin Shen and Yicong Zhou and Jinming Duan",
note = "This paper (ID 5424) is accepted to ECCV 2020",
year = "2020",
month = jul,
day = "19",
language = "English",
pages = "481--496",
booktitle = "16th European Conference Computer Vision – ECCV 2020",
publisher = "Springer",
}

TY - GEN
T1 - Geometry Constrained Weakly Supervised Object Localization
AU - Lu, Weizeng
AU - Jia, Xi
AU - Xie, Weicheng
AU - Shen, Linlin
AU - Zhou, Yicong
AU - Duan, Jinming
N1 - This paper (ID 5424) is accepted to ECCV 2020
PY - 2020/7/19
Y1 - 2020/7/19
N2 - We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator and a classifier. The detector predicts the object location defined by a set of coefficients describing a geometric shape (i.e. ellipse or rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks for the object and background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts object location without any post-processing (e.g. thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
AB - We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator and a classifier. The detector predicts the object location defined by a set of coefficients describing a geometric shape (i.e. ellipse or rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks for the object and background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts object location without any post-processing (e.g. thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
KW - cs.CV
M3 - Conference contribution
SP - 481
EP - 496
BT - 16th European Conference Computer Vision – ECCV 2020
PB - Springer
ER -
