Abstract
We live in the era of large data analysis, where processing vast datasets has become essential for uncovering valuable insights across many domains of our lives. Machine learning (ML) algorithms offer powerful tools for processing and analyzing this abundance of information. However, training ML models demands considerable time and computational resources, which poses significant challenges, especially within cascade schemes, because of the iterative nature of training algorithms, the complexity of feature extraction and transformation, and the large size of the datasets involved. This paper proposes a modification to the existing ML-based cascade scheme for analyzing large biomedical datasets by incorporating principal component analysis (PCA) at each level of the cascade. The number of principal components that replace the initial inputs was chosen to retain 95% of the variance. Furthermore, we enhanced the training and application algorithms and demonstrated the effectiveness of the modified cascade scheme through comparative analysis, which showed a significant reduction in training time together with improved generalization and higher accuracy of the large data analysis. The improved generalization stemmed from the reduction in nonsignificant independent attributes in the dataset, which further enhanced the scheme's performance in intelligent large data analysis.
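The 95% variance criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain SVD-based PCA in NumPy; the function names `fit_pca_95` and `apply_pca` are hypothetical and are not taken from the paper. The cascade itself (how each level is trained and how outputs are passed on) is not described in the abstract and is not sketched here.

```python
import numpy as np


def fit_pca_95(X, retain=0.95):
    """Fit PCA on X (n_samples, n_features) and keep the fewest components
    whose cumulative explained-variance ratio reaches `retain`.

    Hypothetical helper for illustration, not the authors' implementation.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2  # proportional to the variance along each component
    ratio = np.cumsum(var) / var.sum()
    # First index where the cumulative ratio reaches `retain`, as a count.
    k = int(np.searchsorted(ratio, retain)) + 1
    k = min(k, len(s))
    return mean, Vt[:k], k


def apply_pca(X, mean, components):
    """Project X onto the retained principal components."""
    return (X - mean) @ components.T


# Toy data: three uncorrelated features with very different spreads.
X = np.array([
    [3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
    [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
    [0.0, 0.0, 0.1], [0.0, 0.0, -0.1],
])
mean, components, k = fit_pca_95(X)
Z = apply_pca(X, mean, components)  # reduced inputs, shape (6, k)
```

In the toy data the first component carries roughly 90% of the variance, so a second component is needed to pass the 95% threshold and the third, nearly constant feature is dropped. In a cascade, the same fit-then-apply step would presumably be repeated on the inputs of each level, with the projection fitted on training data only.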
| Original language | English |
|---|---|
| Article number | 4762 |
| Number of pages | 14 |
| Journal | Sensors |
| Volume | 24 |
| Issue number | 15 |
| Early online date | 23 Jul 2024 |
| DOIs | |
| Publication status | Published - Aug 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2024 by the authors.
Keywords
- cascade scheme
- Kolmogorov–Gabor polynomial
- large data analysis
- machine learning
- PCA
- training time
ASJC Scopus subject areas
- Analytical Chemistry
- Information Systems
- Atomic and Molecular Physics, and Optics
- Biochemistry
- Instrumentation
- Electrical and Electronic Engineering
Fingerprint
Dive into the research topics of 'A Method for Reducing Training Time of ML-Based Cascade Scheme for Large-Volume Data Analysis'. Together they form a unique fingerprint.
-
ZEBAI: Innovative methodologies to design Zero-Emission and cost-effective Buildings based on Artificial Intelligence
Mitoulis, S. (Principal Investigator), Faramarzi, A. (Co-Investigator), Freer, M. (Co-Investigator), Fraga, B. (Co-Investigator) & Sharifi, S. (Co-Investigator)
UKRI Horizon Europe Underwriting Innovate UK, European Commission
1/01/24 → 31/12/28
Project: Research
-
Massive Open Online Course (MOOC) – Sustainability, resilience and digitalisation in critical interdependent infrastructure
Mitoulis, S. (Principal Investigator) & Ninic, J. (Researcher)
1/01/23 → 31/12/26
Project: EU
-
ReCharged - Climate-aware Resilience for Sustainable Critical and interdependent Infrastructure Systems enhanced by emerging Digital Technologies
Mitoulis, S. (Principal Investigator) & Ninic, J. (Co-Investigator)
UKRI Horizon Europe Underwriting EPSRC
1/01/23 → 31/12/26
Project: Research Councils
-
RISKADAPT - Asset Level Modelling of RISKs In the Face of Climate Induced Extreme Events and ADAPTtation
Mitoulis, S. (Principal Investigator)
UKRI Horizon Europe Underwriting Innovate UK
1/01/23 → 31/12/25
Project: Other Government Departments