Data Collection and Staging Process Automation for Machine Learning in Algorithmic Trading – Student Presentation

Data Collection and Staging Process Automation: Precision, Speed, and Scalability for Machine Learning Modelling of Stock-Price Prediction in Algorithmic Trading

Time & Date: 9:00am – 11:00am, Wednesday, April 22, 2026
Location: Room E-301, Computer Science Department, Okanagan College, and via Zoom (see email and registration information)
Registration is open now: https://events.vtools.ieee.org/m/555692

This project brings together three teams (Data Collection, Data Warehousing, and Machine Learning) to build a unified, end-to-end system for high-frequency stock price prediction.
Learn how we designed a scalable pipeline using distributed computing and XGBoost. The presentation covers system architecture, data engineering, and real-world ML applications in algorithmic trading.
This work also establishes a foundation for ongoing research and extended large-scale evaluation.
Open to students, faculty, and anyone interested in machine learning, data systems, or fintech.

Abstract:

This presentation discusses an automated data collection and staging pipeline for high-frequency stock price prediction using machine learning. The system integrates scalable ELT processes, data deduplication, and distributed training with XGBoost on high-performance computing infrastructure. Designed for precision, speed, and scalability, the framework enables efficient handling of large financial time-series datasets while maintaining robust predictive performance and optimized resource utilization.

Contributors:

This project was developed through a collaborative effort across three specialized teams:

Data Collection Team

Responsible for sourcing, aggregating, and preprocessing raw financial and market data.

  • Andrew Johnson
  • Emilio Iturbide
  • Reilly Mager
  • Lian Heckrodt
  • Cade Dempsey
  • Kristina Cormier

Data Warehouse Team

Designed and implemented the data storage architecture, ETL/ELT pipelines, and database systems.

  • Alex Anthony
  • Hayden Nikkel
  • Daemon Lewis
  • John Cortez
  • Jackson Rosco

Machine Learning Team (XGBoost)

Developed, trained, and evaluated machine learning models for predictive analytics and trading strategies.

  • Harsh Saw
  • Zane Tessmer
  • Kavaljeet Singh
  • Dante Bertolutti
  • Guntash Brar
  • Parag Jindal

Acknowledgements

We thank all contributors and collaborators who supported the development, testing, and deployment of this system.

For further information, please contact Youry Khmelevsky (email: Youry at IEEE.org).
Refreshments will be provided.