Quantifying the use of domain randomization

Mohammad Ani; Hector Basevi; Ales Leonardis

Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Synthetic image generation can efficiently produce large quantities of labeled data, which addresses both the data volume requirements of state-of-the-art vision systems and the expense of manually labeling data. However, systems trained on synthetic data typically underperform systems trained on realistic data because of the mismatch between the synthetic and realistic data distributions. Domain Randomization (DR) broadens a synthetic data distribution so that it encompasses a realistic data distribution, and it provides better performance when the exact characteristics of the realistic data distribution are unknown or cannot be simulated. However, there is no consensus in the literature on the best method of performing DR. We propose a novel method of ranking DR methods by directly measuring the difference between realistic and DR data distributions. This avoids the need to measure task-specific performance and the associated expense of training and evaluation. We compare different methods for measuring distribution differences, including the Wasserstein and Fréchet Inception distances. We also examine the effect of performing this evaluation directly on images and on features generated by an image classification backbone. Finally, we show that the ranking generated by our method is reflected in actual task performance.
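The ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes feature vectors have already been extracted (random arrays stand in for backbone features here), it uses a Fréchet distance between Gaussians fitted to those features as the distribution measure, and the names `frechet_distance`, `rank_dr_methods`, `dr_method_a`, and `dr_method_b` are hypothetical.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr(sqrt(cov_a cov_b)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def rank_dr_methods(real_feats, dr_feats_by_method):
    """Sort DR methods by distance to the realistic features, closest first."""
    scores = {
        name: frechet_distance(real_feats, feats)
        for name, feats in dr_feats_by_method.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


# Assumption: random stand-ins for features of realistic and DR images.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))
candidates = {
    "dr_method_a": rng.normal(0.1, 1.0, size=(500, 8)),  # close to the real data
    "dr_method_b": rng.normal(2.0, 1.5, size=(500, 8)),  # shifted and wider
}

ranking = rank_dr_methods(real, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A lower score means the DR distribution sits closer to the realistic one under this measure, so no task-specific model needs to be trained to produce the ordering. Whether such an ordering matches task performance is the question the paper evaluates.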

Original language	English
Title of host publication	25th International Conference on Pattern Recognition (ICPR 2020)
Publication status	Accepted/In press - 10 Dec 2020
Event	25th International Conference on Pattern Recognition, Virtual, Milan, Italy. Duration: 10 Jan 2021 → 15 Jan 2021. https://www.micc.unifi.it/icpr2020/

Conference

Conference	25th International Conference on Pattern Recognition
Abbreviated title	ICPR 2020
Country/Territory	Italy
City	Milan
Period	10/01/21 → 15/01/21
Internet address	https://www.micc.unifi.it/icpr2020/

Access to Document

Quantifying_the_Use_of_Domain_Randomization-Preprint (Accepted author manuscript, 8.54 MB). Licence: None: All rights reserved

Cite this

@inproceedings{c289cf4310c747598e208a909b306db4,

title = "Quantifying the use of domain randomization",

author = "Mohammad Ani and Hector Basevi and Ales Leonardis",

year = "2020",

month = dec,

day = "10",

language = "English",

booktitle = "25th International Conference on Pattern Recognition (ICPR 2020)",

note = "25th International Conference on Pattern Recognition, ICPR 2020 ; Conference date: 10-01-2021 Through 15-01-2021",

url = "https://www.micc.unifi.it/icpr2020/",

}

TY - GEN

T1 - Quantifying the use of domain randomization

AU - Ani, Mohammad

AU - Basevi, Hector

AU - Leonardis, Ales

PY - 2020/12/10

Y1 - 2020/12/10

AB - Synthetic image generation provides the ability to efficiently produce large quantities of labeled data, which addresses both the data volume requirements of state-of-the-art vision systems and the expense of manually labeling data. However, systems trained on synthetic data typically under-perform systems trained on realistic data due to mismatch between the synthetic and realistic data distributions. Domain Randomization (DR) is a method of broadening a synthetic data distribution to encompass a realistic data distribution and provides better performance when the exact characteristics of the realistic data distribution are not known or cannot be simulated. However, there is no consensus in the literature on the best method of performing DR. We propose a novel method of ranking DR methods by directly measuring the difference between realistic and DR data distributions. This avoids the need to measure task-specific performance and the associated expense of training and evaluation. We compare different methods for measuring distribution differences, including the Wasserstein and Fréchet Inception distances. We also examine the effect of performing this evaluation directly on images and features generated by an image classification backbone. Finally, we show that the ranking generated by our method is reflected in actual task performance.

M3 - Conference contribution

BT - 25th International Conference on Pattern Recognition (ICPR 2020)

T2 - 25th International Conference on Pattern Recognition

Y2 - 10 January 2021 through 15 January 2021

ER -
