A professional desktop application that automatically transcribes audio and video content using OpenAI Whisper. Built entirely in Python with a modern CustomTkinter GUI and integrated with FFmpeg for media processing. Features a real-time progress bar, a splash screen, and dark/light themes with smooth transitions.

Note: This is the initial release of the software, designed to work efficiently with small and medium-sized audio or video files. The size limitation comes from CPU-only processing without GPU acceleration. A cloud-based version is already in development to handle larger files and improve overall performance.
Distributed as a standalone Windows package built using Inno Setup. Users can install and run the app instantly without extra setup. The latest stable release is available for download below!
A structured workflow combining design thinking and software engineering best practices.
Initial design phase focusing on UI/UX with Figma and CustomTkinter layout mapping. Established color palette, splash screen, and overall application flow.
Integrated OpenAI Whisper for high-accuracy transcription of multilingual audio and video files. Tuned model usage to run efficiently on CPU-only desktop hardware.
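A minimal sketch of what CPU-only transcription with the `openai-whisper` package can look like. The function names, the `"base"` model size, and the timestamped output format are illustrative assumptions, not the application's actual code. `whisper` is imported lazily so the formatting helpers work without the package installed.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments: List[Dict]) -> List[str]:
    """Turn Whisper-style segments (start, end, text) into readable lines."""
    return [
        f"[{format_timestamp(s['start'])} -> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_file(path: str, model_name: str = "base") -> List[str]:
    """Transcribe one media file on the CPU (hypothetical helper)."""
    # Lazy import: needs the openai-whisper package and an ffmpeg binary on PATH.
    import whisper

    model = whisper.load_model(model_name, device="cpu")
    # fp16=False keeps decoding in 32-bit floats, the mode used on CPU.
    result = model.transcribe(path, fp16=False)
    return segments_to_lines(result["segments"])
```

Smaller models trade accuracy for speed, which matters on CPU-only machines; the model size is left as a parameter for that reason.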
Implemented a fully responsive interface using CustomTkinter, with a progress bar synchronized to the transcription task, dark/light themes, and graceful error handling.
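One common way to keep a progress bar in step with a long-running job is to run the job in a worker thread, push progress fractions onto a queue, and poll that queue from the UI thread. The sketch below assumes that pattern; `fake_transcription`, `drain_latest`, and `run_app` are hypothetical names, and the worker only simulates work with `time.sleep`.

```python
import queue
import threading
import time
from typing import Optional


def drain_latest(updates: "queue.Queue[float]") -> Optional[float]:
    """Empty the queue and return the newest value, or None if it was empty."""
    latest = None
    while True:
        try:
            latest = updates.get_nowait()
        except queue.Empty:
            return latest


def fake_transcription(updates: "queue.Queue[float]", steps: int = 20) -> None:
    """Stand-in for the real job: reports a fraction in (0, 1] after each step."""
    for i in range(1, steps + 1):
        time.sleep(0.1)
        updates.put(i / steps)


def run_app() -> None:
    """Open a small window whose progress bar follows the worker thread."""
    # Lazy import so the helpers above can be used without a display.
    import customtkinter as ctk

    ctk.set_appearance_mode("dark")
    app = ctk.CTk()
    app.title("Progress demo")
    bar = ctk.CTkProgressBar(app, width=300)
    bar.set(0)
    bar.pack(padx=20, pady=20)

    updates: "queue.Queue[float]" = queue.Queue()
    threading.Thread(
        target=fake_transcription, args=(updates,), daemon=True
    ).start()

    def poll() -> None:
        # Runs on the UI thread, so touching the widget here is safe.
        latest = drain_latest(updates)
        if latest is not None:
            bar.set(latest)
        if latest is None or latest < 1.0:
            app.after(100, poll)

    poll()
    app.mainloop()
```

Calling `run_app()` shows the bar filling over about two seconds. Only the UI thread updates the widget; the worker communicates solely through the queue, which keeps the window responsive while work is in progress.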
Packaged the project into an executable using PyInstaller, embedding FFmpeg and resource assets for standalone deployment.
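Embedded files in a PyInstaller build are unpacked to a directory exposed as `sys._MEIPASS`, so bundled assets and the FFmpeg binary have to be located relative to it at runtime. The following sketch shows one way to do that; the helper names and the `ffmpeg` folder layout are assumptions, not the project's actual structure.

```python
import os
import sys
from pathlib import Path


def resource_path(relative: str) -> Path:
    """Resolve a bundled resource in both frozen and development runs."""
    # PyInstaller sets sys._MEIPASS in a bundled app; fall back to the
    # current working directory when running from source (an assumption).
    base = getattr(sys, "_MEIPASS", None)
    root = Path(base) if base else Path.cwd()
    return root / relative


def add_to_path(directory: Path) -> None:
    """Put a directory at the front of PATH so its executables are found."""
    current = os.environ.get("PATH", "")
    entries = current.split(os.pathsep) if current else []
    if str(directory) not in entries:
        os.environ["PATH"] = os.pathsep.join([str(directory)] + entries)


# Hypothetical layout: the ffmpeg executable is bundled under "ffmpeg/".
# add_to_path(resource_path("ffmpeg"))
```

Whisper invokes the `ffmpeg` command to decode media, so placing the bundled folder on `PATH` before the first transcription is one way to avoid requiring a separate FFmpeg install.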
Conducted extensive debugging across multiple file formats (MP3, WAV, MP4, AVI). Ensured system stability and consistent performance across Windows versions.
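Format handling of this kind is often guarded by a small input check plus a normalization step. The sketch below is illustrative only: the extension set mirrors the formats listed above, and the FFmpeg command converts any input to 16 kHz mono WAV, the sample rate Whisper works with internally. The function names are hypothetical.

```python
from pathlib import Path
from typing import List

# Formats named in the testing phase above.
SUPPORTED = {".mp3", ".wav", ".mp4", ".avi"}


def is_supported(path: str) -> bool:
    """Check the file extension case-insensitively."""
    return Path(path).suffix.lower() in SUPPORTED


def ffmpeg_to_wav_command(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that extracts 16 kHz mono audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", src,       # input media file
        "-vn",           # drop any video stream
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        dst,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building the command separately from running it makes the logic easy to test without FFmpeg installed.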
Created the final installer using Inno Setup, including versioning, custom icons, and auto-cleanup. Deployed as a ready-to-use installer for end users.