TY - CPAPER
T1 - Information-theoretic analysis of epistemic uncertainty in Bayesian meta-learning
AU - Jose, Sharu
AU - Park, Sangwoo
AU - Simeone, Osvaldo
PY - 2022/5/3
Y1 - 2022/5/3
AB - The overall predictive uncertainty of a trained predictor can be decomposed into separate contributions from epistemic and aleatoric uncertainty. Under a Bayesian formulation with a well-specified model, the two contributions can be exactly expressed (for the log-loss) or bounded (for more general losses) in terms of information-theoretic quantities (Xu and Raginsky [2020]). This paper studies epistemic uncertainty within an information-theoretic framework in the broader setting of Bayesian meta-learning. A general hierarchical Bayesian model is assumed in which hyperparameters determine the per-task priors of the model parameters. Exact characterizations (for the log-loss) and bounds (for more general losses) are derived for the epistemic uncertainty, quantified by the minimum excess meta-risk (MEMR), of optimal meta-learning rules. This characterization is leveraged to provide insights into the dependence of the epistemic uncertainty on the number of tasks and on the amount of per-task training data. Experiments use the proposed information-theoretic bounds, evaluated via neural mutual information estimators, to compare the performance of conventional learning and meta-learning as the number of meta-learning tasks increases.
UR - http://aistats.org/aistats2022/
UR - https://www.scopus.com/pages/publications/85163117364
M3 - Conference contribution
T3 - Proceedings of Machine Learning Research
SP - 9758
EP - 9775
BT - International Conference on Artificial Intelligence and Statistics, 28-30 March 2022, A Virtual Conference
A2 - Camps-Valls, Gustau
A2 - Ruiz, Francisco J. R.
A2 - Valera, Isabel
PB - PMLR
T2 - The 25th International Conference on Artificial Intelligence and Statistics
Y2 - 28 March 2022 through 30 March 2022
ER -