Quantifying the use of domain randomization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Synthetic image generation makes it possible to produce large quantities of labeled data efficiently, which addresses both the data volume requirements of state-of-the-art vision systems and the expense of manually labeling data. However, systems trained on synthetic data typically underperform systems trained on realistic data because of a mismatch between the synthetic and realistic data distributions. Domain Randomization (DR) is a method of broadening a synthetic data distribution so that it encompasses a realistic data distribution, and it provides better performance when the exact characteristics of the realistic data distribution are not known or cannot be simulated. However, there is no consensus in the literature on the best method of performing DR. We propose a novel method of ranking DR methods by directly measuring the difference between the realistic and DR data distributions. This avoids the need to measure task-specific performance and the associated expense of training and evaluation. We compare different methods for measuring distribution differences, including the Wasserstein and Fréchet Inception distances. We also examine the effect of performing this evaluation directly on images and on features generated by an image classification backbone. Finally, we show that the ranking generated by our method is reflected in actual task performance.
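
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each image has already been reduced to a single scalar feature, it uses only the one-dimensional empirical Wasserstein-1 distance, and the names `wasserstein_1d`, `rank_methods`, `narrow_dr`, and `broad_dr` are hypothetical, as is the Gaussian toy data.

```python
import random
import statistics


def wasserstein_1d(xs, ys):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension with equal-size samples, the optimal transport plan
    # pairs the i-th smallest value of one sample with the i-th of the other.
    return statistics.fmean(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))


def rank_methods(real, candidates):
    """Return (name, distance) pairs, closest to the real sample first."""
    scored = [(name, wasserstein_1d(real, sample))
              for name, sample in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data (assumption): scalar "features" drawn from Gaussians.
rng = random.Random(0)
n = 2000
real = [rng.gauss(0.0, 1.0) for _ in range(n)]
candidates = {
    # Hypothetical DR variant that is too narrow and offset from the real data.
    "narrow_dr": [rng.gauss(1.5, 0.3) for _ in range(n)],
    # Hypothetical DR variant that is broad enough to cover the real data.
    "broad_dr": [rng.gauss(0.2, 1.2) for _ in range(n)],
}

ranking = rank_methods(real, candidates)
for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

In this toy setting the broader variant ranks closer to the real sample, and no downstream model has to be trained to obtain the ordering. For image data or multi-dimensional backbone features, a multivariate distance such as the Fréchet Inception distance would take the place of the 1-D computation.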
Original language: English
Title of host publication: 25th International Conference on Pattern Recognition (ICPR 2020)
Publication status: Accepted/In press - 10 Dec 2020
Event: 25th International Conference on Pattern Recognition - Virtual, Milan, Italy
Duration: 10 Jan 2021 to 15 Jan 2021


Conference: 25th International Conference on Pattern Recognition
Abbreviated title: ICPR 2020

