Voice-Operated UAV System Using AI with Live Two directional Communication

Sharvi Aggarwal¹, Sneha Desai², Saarthak Kamra³, Lalit Agarwal⁴

¹,²,⁴Department of Electronics and Communication Engineering, Maharaja Agrasen Institute of Technology, New Delhi, India

³Department of Computer Science and Engineering, Maharaja Agrasen Institute of Technology, New Delhi, India

Abstract: Modern drones are rapidly evolving into intelligent, interactive aerial systems capable of communicating in real time and making autonomous decisions. This paper investigates the mechanisms behind voice command recognition, two-way communication, and virtual assistant integration as ways to extend UAV capabilities. Applying Hidden Markov Models (HMM) and Maximum Likelihood Linear Regression (MLLR) to speech recognition allows drones to understand and process voice instructions. Two-way voice communication is provided by a WebRTC-based system running over LTE, so that drones can serve as flying communication stations. In addition, a Raspberry Pi system delivers real-time telemetry, weather information, and flight diagnostics.

Keywords: Voice-controlled UAVs, two-way communication systems, virtual assistant implementation, WebRTC technology, Drone voice recognition, intelligent aerial systems.
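
As a rough illustration of how HMM-based command recognition can work, the sketch below scores a sequence of quantised acoustic symbols against one small HMM per command using the forward algorithm and picks the most likely command. It is a toy example under stated assumptions, not the system described in the paper: the command names, the two-state models, and every probability value are hypothetical, and a practical recogniser would typically use continuous-density HMMs over acoustic features, with MLLR adapting the model means to the speaker.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    n_states = len(start)
    # Initialise: start probability times the emission of the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Recurse: propagate through the transitions, then weight by the emission.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    # Terminate: total probability over all final states.
    return sum(alpha)


def recognise(obs, models):
    """Return the command whose HMM gives the observations the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical models: (start, transition, emission) over a toy alphabet {0, 1}.
# All values are made up for illustration only.
MODELS = {
    "takeoff": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "land": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognise([0, 0, 1], MODELS))
    print(recognise([1, 1, 0], MODELS))
```

Keeping one model per command and comparing likelihoods suits a small, fixed command vocabulary; for long sequences the probabilities should be computed in the log domain to avoid numerical underflow.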
