Data Collection and Staging Process Automation: Precision, Speed and Scalability for Machine Learning Modelling of Algorithmic Trading Stocks-Price Prediction
Time & Date: 9:00am – 11:00am, Wednesday, April 22, 2026
Location: Room E-301 and via Zoom (see email and registration information), Computer Science Department, Okanagan College
Registration is open now: https://events.vtools.ieee.org/m/555692
This project brings together the work of three teams (Data Collection, Data Warehouse, and Machine Learning) in a unified, end-to-end system for high-frequency stock price prediction.
Learn how we designed a scalable pipeline using distributed computing and XGBoost. The talk covers system architecture, data engineering, and real-world ML applications in algorithmic trading.
This work also establishes a foundation for ongoing research and extended large-scale evaluation.
Open to students, faculty, and anyone interested in machine learning, data systems, or fintech.
Abstract:
This presentation discusses an automated data collection and staging pipeline for high-frequency stock price prediction using machine learning. The system integrates scalable ELT processes, data deduplication, and distributed training with XGBoost on high-performance computing infrastructure. Designed for precision, speed, and scalability, the framework enables efficient handling of large financial time-series datasets while maintaining robust predictive performance and optimized resource utilization.
Contributors:
This project was developed through a collaborative effort across three specialized teams:
Data Collection Team
Responsible for sourcing, aggregating, and preprocessing raw financial and market data.
- Andrew Johnson
- Emilio Iturbide
- Reilly Mager
- Lian Heckrodt
- Cade Dempsey
- Kristina Cormier
Data Warehouse Team
Designed and implemented the data storage architecture, ETL/ELT pipelines, and database systems.
- Alex Anthony
- Hayden Nikkel
- Daemon Lewis
- John Cortez
- Jackson Rosco
Machine Learning Team (XGBoost)
Developed, trained, and evaluated machine learning models for predictive analytics and trading strategies.
- Harsh Saw
- Zane Tessmer
- Kavaljeet Singh
- Dante Bertolutti
- Guntash Brar
- Parag Jindal
Acknowledgements
We thank all contributors and collaborators who supported the development, testing, and deployment of this system.
For further information, please contact Youry Khmelevsky (email: Youry at IEEE.org).
Refreshments will be provided.
