Anomaly Detection in Walking Data Using Isolation Forest: An Unsupervised Learning Approach
Abstract
Detecting anomalies in walking data is important for ensuring data quality in wearable devices and for understanding irregular physical activity patterns. Traditional methods often rely on labeled data, which is scarce in real-world applications. This study presents an unsupervised learning approach that uses Isolation Forest to detect anomalies in walking datasets. The data, comprising features such as step count, distance, and time, was preprocessed and analyzed to identify patterns and deviations. Isolation Forest was chosen for its efficiency in handling high-dimensional data and its ability to separate anomalies without prior labeling. The model flagged 5 data points in the dataset as anomalous; anomaly scores ranged from -0.15 to 0.2. These outliers corresponded to extreme walking patterns, such as unusually high step counts with disproportionate time and distance. Visualization of the anomaly scores and statistical evaluation supported the model's effectiveness, showing a clear distinction between normal and abnormal data. The proposed approach highlights the potential of Isolation Forest for improving data quality and enabling real-time anomaly detection in fitness tracking applications. This work contributes to the broader field of unsupervised anomaly detection by demonstrating a scalable and effective method for handling real-world activity data.
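To make the isolation idea concrete, the following is a minimal pure-Python sketch of an isolation forest in the spirit of the original formulation by Liu, Ting, and Zhou (2008). It is not the authors' code. The walking records are synthetic, the feature layout (steps, distance in km, minutes), the injected outliers, and all function names are assumptions made for illustration, and the tree builder is simplified. Scores here follow the original convention, lying between 0 and 1 with higher values meaning more anomalous, which differs from the signed scale reported in the abstract.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively partition rows with random axis-parallel splits."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        # Simplification: stop here rather than retrying with another feature.
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node, depth=0):
    """Edges traversed to reach a leaf, plus a correction for unsplit leaf points."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if row[node["feature"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the rows."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(row, trees, sample_size):
    """Score in (0, 1): short average paths (easy isolation) give high scores."""
    mean_depth = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Synthetic walking records (hypothetical): steps, distance in km, duration in minutes.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    steps = data_rng.uniform(3000, 9000)
    distance_km = steps * 0.00075 * data_rng.uniform(0.9, 1.1)
    minutes = steps / 100.0 * data_rng.uniform(0.9, 1.1)
    rows.append((steps, distance_km, minutes))

# Injected outliers: very high step counts with implausibly small distance and time.
injected = [(20000, 0.8, 6), (26000, 0.5, 4), (32000, 1.0, 8),
            (38000, 0.3, 3), (45000, 0.6, 5)]
anomaly_idx = set(range(len(rows), len(rows) + len(injected)))
rows.extend(injected)

trees, psi = fit_forest(rows)
scores = [anomaly_score(r, trees, psi) for r in rows]

# Show the five highest-scoring records.
top5 = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)[:5]
for i in top5:
    steps, km, minutes = rows[i]
    print(f"row {i}: steps={steps:.0f} km={km:.2f} min={minutes:.1f} score={scores[i]:.3f}")
```

Because each split is drawn uniformly between a feature's minimum and maximum within the node, the method does not depend on how each feature is scaled, so steps, kilometres, and minutes can be used without normalization in this sketch. In practice a library implementation such as scikit-learn's `IsolationForest` would normally be used; the source does not say which implementation the authors chose.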