Proceedings of the International Conference on Digital Manufacturing –
Volume 1
Following the above stages, the models were implemented on
Google Colab using Python, incorporating libraries such as Scikit-
learn, TensorFlow, and Pandas. Iterative hyperparameter tuning
and feature engineering were employed to refine model
performance. XGBoost achieved an initial accuracy of 95% on the
Kaggle dataset after multiple optimisations, whereas Random Forest
adapted better to the industry dataset, achieving a final R² of
0.98. This result corroborates the findings of Shaik and Agrawal
(2022) on Random Forest’s resilience to noise and data
irregularities. The outcomes validate the hypothesis that machine
learning significantly enhances demand forecasting accuracy,
enabling real-time, data-driven supply chain optimisation. Future
research should integrate real-time IoT data streams and explore
reinforcement learning frameworks to further elevate
responsiveness and predictive intelligence (Wang et al., 2023).
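
The tuning-and-evaluation loop described above can be illustrated with a minimal sketch. It assumes Scikit-learn's GridSearchCV as the tuning method and uses synthetic data; the feature names (price, promo, week), the parameter grid, and the data-generating formula are all hypothetical stand-ins, not the study's datasets or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for a demand dataset (assumption: NOT the study's data).
rng = np.random.default_rng(seed=0)
n_samples = 400
price = rng.uniform(5.0, 20.0, n_samples)   # hypothetical unit price
promo = rng.integers(0, 2, n_samples)       # hypothetical promotion flag (0/1)
week = rng.integers(1, 53, n_samples)       # week of year, 1..52
demand = (
    200.0
    - 6.0 * price
    + 40.0 * promo
    + 15.0 * np.sin(2 * np.pi * week / 52)
    + rng.normal(0.0, 5.0, n_samples)       # observation noise
)

X = np.column_stack([price, promo, week])
X_train, X_test, y_train, y_test = train_test_split(
    X, demand, test_size=0.2, random_state=0
)

# Small illustrative grid; a real study would search a wider space.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 8]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="r2",
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on data it has not seen.
test_r2 = r2_score(y_test, search.predict(X_test))
print(f"best params: {search.best_params_}")
print(f"held-out R^2: {test_r2:.3f}")
```

Reporting R² on a held-out split, rather than on the training data, is what makes a figure such as 0.98 meaningful as a measure of generalisation.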
DATA PROCESSING AND MODEL DEVELOPMENT
In this study, data were collected from two primary sources to
support a comprehensive analysis of supply chain optimisation.
The first source was a publicly available dataset from Kaggle,
specifically the Unilever Supply Chain Analysis dataset, which
includes data on sales trends, inventory levels, order processing
times, and logistics operations. The second source was proprietary
operational data from ABC Industry, a real-world company. This
dataset includes
transactional data from the company’s Enterprise Resource
Planning (ERP) system, SAP, offering detailed information about
orders, shipments, and inventory. By combining Kaggle’s
publicly available data with the real-world data from ABC, this
study explores supply chain optimisation from multiple angles,
providing a solid foundation for developing machine learning-
based solutions.
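
One way to work with two such sources side by side is to map each onto a shared schema and tag every row with its origin. The sketch below assumes Pandas and uses small in-memory tables; every column name and value is hypothetical, since the actual Kaggle and ERP field names are not given here.

```python
import pandas as pd

# Hypothetical extracts; column names and values are illustrative only.
kaggle_df = pd.DataFrame({
    "Order Date": ["2023-01-05", "2023-01-06"],
    "Units Sold": [120, 95],
    "Stock Level": [400, 380],
})
erp_df = pd.DataFrame({
    "order_dt": ["2023-01-05", "2023-01-07"],
    "qty": [60, 75],
    "inventory_qty": [210, 190],
})


def to_common_schema(df, column_map, source):
    """Rename columns to the shared schema and record the data source."""
    out = df.rename(columns=column_map)[list(column_map.values())].copy()
    out["order_date"] = pd.to_datetime(out["order_date"])
    out["source"] = source
    return out


kaggle_std = to_common_schema(
    kaggle_df,
    {"Order Date": "order_date", "Units Sold": "units", "Stock Level": "inventory"},
    source="kaggle",
)
erp_std = to_common_schema(
    erp_df,
    {"order_dt": "order_date", "qty": "units", "inventory_qty": "inventory"},
    source="erp",
)

# Stack both sources; the "source" column keeps them separable for modelling.
combined = pd.concat([kaggle_std, erp_std], ignore_index=True)
print(combined)
```

Keeping the source column allows models to be trained and evaluated per dataset as well as on the combined table.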

