Abstract
Reproducibility package for the paper "A Generic Methodology for the Statistically Reproducible Evaluation of Automated Trading Platform Components".
Introduction. Although machine learning approaches have been applied widely and successfully in finance, they remain bespoke to specific investigations and opaque in terms of explainability and reproducibility.
Objectives. The primary objective of this research is to shed light on this field by providing a generic methodology that is investigation agnostic and interpretable to a human trader, thus reducing barriers to entry and increasing the reproducibility of experiments.
Methods. We propose a generic methodology that is usable across markets. It relies on hypothesis testing, which is widely applied in other social and scientific disciplines, to evaluate concrete results beyond simple classification accuracy.
To demonstrate the application of the methodology, we propose feature space and trading pattern design methods and assess their statistical value.
Hypotheses were designed to evaluate, firstly, whether the selected trading pattern is a meaningful indicator of potential profitability and, secondly, whether the selected feature space is sufficiently descriptive of the underlying market. By blending traditional statistics with a machine learning approach, we obtain comparable and reproducible results beyond what is usually obtainable, although we also look at traditional metrics such as precision and profitability. Explainability comparisons are also conducted using Shapley values, a recent advancement in explainable AI that can assess how each feature has contributed to the model prediction on a per-entry basis and provide valuable insights to tune and understand model performance.
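To make the per-entry attribution idea concrete, the sketch below computes exact Shapley values for a single entry of a toy model by enumerating all feature coalitions. It is only an illustration: `shapley_values`, `toy_model`, the entry, and the all-zero baseline are hypothetical names and values chosen here, not code or data from the reproducibility package, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one entry `x` relative to a `baseline` entry.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch, one of several common conventions).
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Coalition members take their value from x, the rest from baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(f):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * f[0] + f[1] * f[2]


entry = [1.0, 2.0, 3.0]      # made-up feature values for one entry
baseline = [0.0, 0.0, 0.0]   # made-up reference entry
phi = shapley_values(toy_model, entry, baseline)
print(phi)
```

The contributions sum to the difference between the model output for the entry and for the baseline, which is the property that makes the values readable as a per-feature breakdown of a single prediction.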
Results. Experiments were conducted across 10 contracts, 3 feature spaces, and 3 rebound configurations (for feature extraction), resulting in 90 experiments. Across the experiments, we find that the selected trading pattern is a meaningful indicator of potential profitability, but with insignificant effect sizes (Rebound 7 - 0.64 \(\pm\) 1.02, Rebound 11 0.38 \(\pm\) 0.98, and Rebound 15 - 1.05 \(\pm\) 1.16), indicating a lack of generalisability of the results for the considered trading pattern to unseen data. This holds even though statistical tests lead to the rejection of the null hypothesis (Rebound 7 - p<0.001, Rebound 11 p=0.053, and Rebound 15 p=0.049). While the 2-step feature extraction looks better at first sight, the statistics do not support this, which validates the usefulness of the proposed methodology. Shapley value analyses are provided, showing which features contributed most to profits and providing insights for adjustments to the strategy.
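As a rough illustration of the experiment grid and of why an effect size is reported next to a p-value, the sketch below enumerates the 10 x 3 x 3 combinations and computes a standardised mean difference and a sign-flip permutation p-value. Everything in it is an assumption made for illustration: the contract and feature-space names, the `differences` list, and the choice of test and effect size measure are not taken from the package, which may use different procedures.

```python
import random
from itertools import product
from statistics import mean, stdev

# Hypothetical labels; only the counts and the rebound values come from the abstract.
contracts = [f"contract_{i}" for i in range(10)]
feature_spaces = ["feature_space_a", "feature_space_b", "feature_space_c"]
rebounds = [7, 11, 15]
experiments = list(product(contracts, feature_spaces, rebounds))
print(len(experiments))  # 10 * 3 * 3 combinations


def standardised_mean(sample):
    """One-sample Cohen's d against zero: mean divided by sample std."""
    return mean(sample) / stdev(sample)


def sign_flip_p_value(sample, n_resamples=5000, seed=0):
    """Two-sided sign-flip permutation test of H0: the mean is zero."""
    rng = random.Random(seed)
    observed = abs(mean(sample))
    hits = 0
    for _ in range(n_resamples):
        # Under H0 each value is equally likely to carry either sign.
        flipped = [v if rng.random() < 0.5 else -v for v in sample]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up per-contract differences (e.g. strategy minus baseline), not real results.
differences = [0.5, -0.2, 0.9, 0.1, 0.4, -0.3, 0.7, 0.2, 0.6, -0.1]
effect = standardised_mean(differences)
p_value = sign_flip_p_value(differences)
print(effect, p_value)
```

Reporting both numbers matters because a small p-value only says the difference is unlikely under the null hypothesis, while the effect size and its spread indicate whether the difference is large and stable enough to carry over to unseen data.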
Conclusion. We showcase the generic methodology on S&P500 instruments and provide evidence that through this methodology we can easily obtain informative metrics of performance beyond the more traditional precision and profitability metrics. The interpretability of these results allows the everyday practitioner to construct more effective automated trading pipelines by analysing their strategies with an intuitive and statistically grounded methodology. This work is one of the first to apply this rigorous, statistically backed approach to the field of financial markets, and we hope it may be a springboard for more research using this approach. A full reproducibility package is shared. Please follow the instructions in README.md.
Original language | English |
---|---|
Type | Reproducibility Package |
Media of output | Python and Dataset |
Publisher | Zenodo |
DOIs | |
Publication status | Published - 20 Aug 2022 |