Enhancing Human Activity Recognition with Attention-Based Stacked Sparse Autoencoders
Abstract
This study presents the development of an intelligent system for the classification of respiratory diseases using lung sound visualizations and deep learning. A hybrid Convolutional Neural Network and Bidirectional Long Short-Term Memory (CNN–BiLSTM) model was designed to classify four conditions: asthma, bronchitis, tuberculosis, and normal (healthy). Lung sound recordings were converted into time-frequency representations (e.g., mel-spectrograms), enabling joint spatiotemporal feature extraction. The system achieved an overall classification accuracy of 99.5%, with per-class F1-scores above 0.93. The confusion matrix revealed minimal misclassifications, occurring primarily between asthma and bronchitis. These results suggest that the proposed model can effectively support real-time, non-invasive respiratory screening, particularly in telemedicine environments. Future work includes clinical validation, integration of patient metadata, and adoption of transformer-based models to further enhance diagnostic performance.
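The first stage of the pipeline described above (raw waveform to mel-spectrogram) can be sketched in plain NumPy. This is an illustrative reimplementation, not the authors' code: the sample rate, FFT size, hop length, and number of mel bands are assumptions chosen for short lung-sound-like audio, not parameters reported in the study.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(signal, sr=4000, n_fft=256, hop=128, n_mels=32):
    """Compute a log-mel spectrogram of a 1-D audio signal.

    All parameters are illustrative defaults, not values from the paper.
    """
    # Frame the signal and apply a Hann window to each frame
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Power spectrum per frame: rfft yields n_fft//2 + 1 frequency bins
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular mel filter bank spanning 0 .. sr/2, evenly spaced in mel
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bin_pts[m - 1], bin_pts[m], bin_pts[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    # Apply the filter bank and log-compress; result shape: (n_mels, n_frames)
    return np.log(power @ fb.T + 1e-10).T

# Example: one second of a synthetic 400 Hz tone sampled at 4 kHz
sr = 4000
t = np.arange(sr) / sr
mel = mel_spectrogram(np.sin(2 * np.pi * 400 * t), sr=sr)
# mel.shape == (32, 30): 32 mel bands over 30 frames
```

In practice, library routines such as `librosa.feature.melspectrogram` perform this conversion with more options (padding, normalization, dB scaling); the resulting 2-D array is what a CNN front end would consume as an image-like input.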
References
Anguita, D., Ghio, A., & … L. O. (2013). A public domain dataset for human activity recognition using smartphones. UPCommons. https://upcommons.upc.edu/handle/2117/20897
Balaha, H. M., & Hassan, A. E. S. (2025). Advances in human activity recognition: Harnessing machine learning and deep learning with topological data analysis. Brain-Computer Interfaces, 1–30. https://doi.org/10.1016/B978-0-323-95439-6.00005-3
Buffelli, D., & Vandin, F. (2021). Attention-based deep learning framework for human activity recognition with user adaptation. IEEE Sensors Journal, 21(12), 13474–13483. https://doi.org/10.1109/JSEN.2021.3067690
Gjoreski, H., Luštrek, M., & Gams, M. (2011). Accelerometer placement for posture recognition and fall detection. Proceedings - 2011 7th International Conference on Intelligent Environments, IE 2011, 47–54. https://doi.org/10.1109/IE.2011.11
Ha, S., & Choi, S. (2016). Convolutional neural networks for human activity recognition using multiple accelerometer and gyroscope sensors. Proceedings of the International Joint Conference on Neural Networks, 2016-October, 381–388. https://doi.org/10.1109/IJCNN.2016.7727224
Hammerla, N. Y., Halloran, S., & Plötz, T. (2016). Deep, Convolutional, and Recurrent Models for Human Activity Recognition using Wearables. IJCAI International Joint Conference on Artificial Intelligence, 2016-January, 1533–1540. https://arxiv.org/pdf/1604.08880
Kwapisz, J. R., Weiss, G. M., & Moore, S. A. (2011). Activity recognition using cell phone accelerometers. ACM SIGKDD Explorations Newsletter, 12(2), 74–82. https://doi.org/10.1145/1964897.1964918
Lara, Ó. D., & Labrador, M. A. (2013). A survey on human activity recognition using wearable sensors. IEEE Communications Surveys and Tutorials, 15(3), 1192–1209. https://doi.org/10.1109/SURV.2012.110112.00192
Morales, F. J. O., & Roggen, D. (2016). Deep convolutional feature transfer across mobile activity recognition domains, sensor modalities and locations. International Symposium on Wearable Computers, Digest of Papers, 12-16-September-2016, 92–99. https://doi.org/10.1145/2971763.2971764
Pang, H., Zheng, L., & Fang, H. (2024). Cross-Attention Enhanced Pyramid Multi-Scale Networks for Sensor-Based Human Activity Recognition. IEEE Journal of Biomedical and Health Informatics, 28(5), 2733–2744. https://doi.org/10.1109/JBHI.2024.3377353
Ramanujam, E., Perumal, T., & Padmavathi, S. (2021). Human activity recognition with smartphone and wearable sensors using deep learning techniques: A review. IEEE Sensors Journal, 21(12), 13029–13040. https://doi.org/10.1109/JSEN.2021.3069927
Reiss, A., & Stricker, D. (2012). Creating and benchmarking a new dataset for physical activity monitoring. ACM International Conference Proceeding Series. https://doi.org/10.1145/2413097.2413148
Ripoll, V. R., Romero, E., Ruiz-Rodríguez, J. C., & Vellido, A. (2013). A public domain dataset for human activity recognition using smartphones. ESANN 2013 Proceedings, 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 437–442. https://arpi.unipi.it/handle/11568/962613
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Wang, J., Chen, Y., Hao, S., Peng, X., & Hu, L. (2019). Deep learning for sensor-based activity recognition: A survey. Pattern Recognition Letters, 119, 3–11. https://doi.org/10.1016/J.PATREC.2018.02.010
Yan, S., Smith, J. S., Lu, W., & Zhang, B. (2018). Hierarchical Multi-scale Attention Networks for action recognition. Signal Processing: Image Communication, 61, 73–84. https://doi.org/10.1016/J.IMAGE.2017.11.005
Zhang, S., Li, Y., Zhang, S., Shahabi, F., Xia, S., Deng, Y., & Alshurafa, N. (2022). Deep learning in human activity recognition with wearable sensors: A review on advances. Sensors, 22(4), 1476. https://doi.org/10.3390/s22041476