Published: 2025-08-01
Hands-Free Video Player: Enhancing Accessibility with Voice-Controlled Navigation
DOI: 10.35870/ijsecs.v5i2.4100
V. Karthika, A. Siva Ganesh
- V. Karthika: Mepco Schlenk Engineering College, India
- A. Siva Ganesh: Mepco Schlenk Engineering College
Abstract
This research develops a voice-controlled navigation system for Over-The-Top (OTT) services on Smart TVs, combining speech recognition, video analysis, and natural language processing. The system uses TransNetV2 for AI-based scene boundary detection, Porcupine for hotword detection, and the Automatic Speech Recognition (ASR) engines Vosk, Whisper, and DeepSpeech for real-time speech-to-text conversion. A Natural Language Processing (NLP) layer built on BERT and spaCy interprets user intent and temporal commands from spoken instructions. FFmpeg and OpenCV handle frame manipulation and visualization, while YOLO and ResNet provide content classification and scene understanding. The platform combines a Flutter front end for cross-platform deployment on smart devices with a Python Flask backend that integrates the modules. Testing shows that the system can execute real-time, hands-free media control while delivering an intuitive and accessible user experience for contemporary OTT applications.
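As a rough illustration of the step between speech-to-text output and player control, the sketch below maps a transcribed utterance to an intent and a target playback position. It is not the authors' implementation: a small regular-expression grammar stands in for the BERT and spaCy models, and a sorted list of timestamps stands in for TransNetV2 scene boundaries. The names `parse_command` and `resolve_position`, the intent labels, and the command phrasing are all hypothetical.

```python
import bisect
import re

# Hypothetical command grammar (assumption); the abstract's BERT/spaCy
# intent models are replaced here by simple pattern matching.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
_DURATION = re.compile(r"(\d+)\s*(second|minute|hour)s?")
_DIRECTION = re.compile(r"\b(forward|ahead|back|backward|rewind)\b")


def parse_command(text):
    """Map a transcribed utterance to an (intent, seconds) pair."""
    text = text.lower()
    words = set(re.findall(r"[a-z]+", text))
    if "next scene" in text:
        return ("next_scene", None)
    if "previous scene" in text:
        return ("previous_scene", None)
    # Sum every "<number> <unit>" fragment, e.g. "1 minute 15 seconds" -> 75.
    total = sum(int(n) * _UNIT_SECONDS[u] for n, u in _DURATION.findall(text))
    direction = _DIRECTION.search(text)
    if total and direction:
        sign = 1 if direction.group(1) in ("forward", "ahead") else -1
        return ("seek_relative", sign * total)
    if total:
        return ("seek_absolute", total)
    if "pause" in words:
        return ("pause", None)
    if "play" in words or "resume" in words:
        return ("play", None)
    return ("unknown", None)


def resolve_position(intent, value, position, boundaries, duration):
    """Return the new playback position in seconds.

    `boundaries` is a sorted list of scene start times (assumed to come
    from a shot detector such as TransNetV2); `duration` is the video length.
    """
    if intent == "seek_relative":
        return max(0.0, min(duration, position + value))
    if intent == "seek_absolute":
        return max(0.0, min(duration, float(value)))
    if intent == "next_scene":
        i = bisect.bisect_right(boundaries, position)
        return boundaries[i] if i < len(boundaries) else position
    if intent == "previous_scene":
        i = bisect.bisect_left(boundaries, position) - 1
        return boundaries[i] if i >= 0 else 0.0
    return position  # pause, play, and unknown leave the position unchanged
```

For example, `parse_command("skip forward 30 seconds")` yields a relative seek of 30 seconds, which `resolve_position` then clamps to the video's duration. The sketch assumes the transcript contains digits; an ASR engine that emits number words ("thirty") would need a normalization step first.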
Keywords
Voice Control ; Speech Recognition ; Natural Language Processing ; Hands-Free Video Playback ; AI-Powered Navigation ; Smart TV Interaction
Article Metadata
Peer Review Process
This article has undergone a double-blind peer review process to ensure quality and impartiality.
Article Information
This article has been peer-reviewed and published in the International Journal Software Engineering and Computer Science (IJSECS). The content is available under the terms of the Creative Commons Attribution 4.0 International License.
- Issue: Vol. 5 No. 2 (2025)
- Section: Articles
- Published: 2025-08-01
- License: CC BY 4.0
- Copyright: © 2025 Authors
- DOI: 10.35870/ijsecs.v5i2.4100

This work is licensed under a Creative Commons Attribution 4.0 International License.