Hands-Free Video Player: Enhancing Accessibility with Voice-Controlled Navigation

V. Karthika; A. Siva Ganesh

doi:10.35870/ijsecs.v5i2.4100

Published: 2025-08-01

Affiliation Details

V. Karthika: Mepco Schlenk Engineering College, India
A. Siva Ganesh: Mepco Schlenk Engineering College, India

Abstract

This research develops a voice-controlled navigation system that enhances Over-The-Top (OTT) services on Smart TVs by combining speech recognition, video analysis, and natural language processing. The system uses TransNetV2 for AI-based scene boundary detection, Porcupine for hotword detection, and the Automatic Speech Recognition (ASR) engines Vosk, Whisper, and DeepSpeech for real-time speech-to-text conversion. The Natural Language Processing (NLP) stage employs BERT and spaCy to interpret user intent and temporal commands from spoken instructions. Video content is processed with FFmpeg and OpenCV for frame manipulation and visualization, while YOLO and ResNet provide content classification and scene understanding. The platform combines a Flutter front end, for cross-platform deployment across smart devices, with a Python Flask backend that integrates the modules. Testing results demonstrate that the system can execute real-time, hands-free media control while delivering an intuitive and accessible user experience for contemporary OTT applications.
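To make the idea of interpreting "user intent and temporal commands" concrete, the following is a minimal sketch of how a transcribed utterance could be turned into a playback action. It is not the authors' method: the abstract names BERT and spaCy for this step, whereas this sketch swaps in plain regular expressions so that it runs with the Python standard library alone. The names `Command`, `parse_command`, and `UNIT_SECONDS`, the intent labels, and the trigger words are all hypothetical, and the sketch assumes the ASR engine emits numbers as digits ("30") rather than words ("thirty").

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical unit table for illustration; not taken from the paper.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}

# Matches pairs such as "30 seconds" or "2 minutes" (digits only, by assumption).
_DURATION = re.compile(r"(\d+)\s*(second|minute|hour)s?")


@dataclass
class Command:
    intent: str                    # "play", "pause", "seek_relative", "seek_absolute", "unknown"
    seconds: Optional[int] = None  # signed offset or absolute position, when present


def _total_seconds(text: str) -> Optional[int]:
    """Sum every '<number> <unit>' pair found in the text, or None if there is none."""
    parts = _DURATION.findall(text)
    if not parts:
        return None
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def parse_command(transcript: str) -> Command:
    """Map an ASR transcript to a playback command using simple keyword rules."""
    text = transcript.lower().strip()
    seconds = _total_seconds(text)

    if seconds is not None:
        # Backward cues are checked first so "skip back 10 seconds" rewinds.
        if re.search(r"\b(back|backward|rewind)\b", text):
            return Command("seek_relative", -seconds)
        if re.search(r"\b(forward|ahead|skip)\b", text):
            return Command("seek_relative", seconds)
        if re.search(r"\b(go to|jump to)\b", text):
            return Command("seek_absolute", seconds)

    if re.search(r"\b(pause|stop)\b", text):
        return Command("pause")
    if re.search(r"\b(play|resume)\b", text):
        return Command("play")
    return Command("unknown")
```

In a pipeline like the one described, a hotword detector would gate the ASR engine, and the resulting transcript would be passed to a function of this kind before the player acts on it. A learned intent model would replace the keyword rules where utterances vary more than fixed patterns can cover.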

Keywords

Voice Control; Speech Recognition; Natural Language Processing; Hands-Free Video Playback; AI-Powered Navigation; Smart TV Interaction

Peer Review Process

This article has undergone a double-blind peer review process to ensure quality and impartiality.

How to Cite

Karthika, V., & Siva Ganesh, A. (2025). Hands-Free Video Player: Enhancing Accessibility with Voice-Controlled Navigation. International Journal Software Engineering and Computer Science (IJSECS), 5(2), 706-714. https://doi.org/10.35870/ijsecs.v5i2.4100

Article Information

This article has been peer-reviewed and published in the International Journal Software Engineering and Computer Science (IJSECS). The content is available under the terms of the Creative Commons Attribution 4.0 International License.

Issue: Vol. 5 No. 2 (2025)
Section: Articles
Published: 2025-08-01

License: CC BY 4.0
Copyright: © 2025 Authors
DOI: 10.35870/ijsecs.v5i2.4100

Author Biographies

V. Karthika

Department of MCA, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India

A. Siva Ganesh

Department of MCA, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India

License & Copyright

This work is licensed under a Creative Commons Attribution 4.0 International License.
