From 8eb83cdd989d84b09ccfa6f7b292d00e85340fb1 Mon Sep 17 00:00:00 2001
From: Alex Liang
Date: Thu, 11 Jun 2026 13:58:40 +0800
Subject: [PATCH 1/2] feat: v3.0.2 - i18n, audio-only flow, task control, UX polish
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Bumps version to 3.0.2.

Major themes:

i18n
- Browser language auto-detection (Accept-Language) on first load, with a
  top-right language selector that overrides the choice per session.
- Removes the duplicate language selector from the sidebar.
- Routes display_language through query params + session_state, with
  config.yaml as a fallback only.
- Adds normalize_language_code() to map zh / zh-CN / zh-HK / zh-Hant and
  similar variants to the supported set.
- Translates previously hard-coded UI strings: WhisperX runtime, TTS engine
  names, Voice / 302ai API / ElevenLabs API labels, "Star on GitHub" button,
  YouTube resolution "Best".
- Fixes the 'here' link text staying in English in the zh-CN / zh-HK
  welcome string.
- Adds a CSS overlay for the file_uploader internals (Streamlit has no
  official i18n for these) covering the "Drag and drop file here",
  "Limit ... per file" and "Browse files" labels.
- Hides the Streamlit developer toolbar (client.toolbarMode = "viewer") and
  disables the file watcher (server.fileWatcherType = "none") so the English
  "File change / Rerun / Always rerun" prompts no longer appear.
- Fills missing translation keys across all seven languages:
  en / zh-CN / zh-HK / es / fr / ja / ru.

Audio-only input flow
- Adds output/input_manifest.json, written by the upload / YouTube download
  path, recording the original media type. find_media_file() now reads the
  manifest first, so generated artefacts (dub.mp3, normalized_dub.wav) no
  longer poison detection.
- find_audio_files() now skips generated audio names.
- find_media_file() distinguishes "no media" from "multiple media" errors
  instead of silently falling back.
- Sidebar no longer persists burn_subtitles = false when the input is audio;
  the toggle is only disabled in the UI.
- Main pipeline now flows download -> subtitles -> (optional) dubbing, and
  shows the dubbing section only after subtitles are done AND the input is
  not audio.
- Adds prepare_audio_for_asr() so audio-only inputs are normalized to
  16k / mono / mp3 without going through video conversion.
- Removes the obsolete convert_audio_to_video() placeholder path.

Task control (pause / resume / stop)
- TaskRunner gains a class-level _current pointer and
  TaskRunner.check_cancel() so long-running core loops can cancel
  cooperatively.
- Adds a core.utils.check_cancel() wrapper, imported via the existing
  `from core.utils import *` pattern.
- Inserts check_cancel() into the hot loops: ASR segment loop, translate
  parallel loop, TTS warmup / parallel collection / chunk merge, audio
  segment merge, translate_lines entry. On stop, parallel loops also cancel
  futures that have not started yet.

Done markers and completion detection
- Adds output/.subtitle_done and output/.dubbing_done markers, written by
  TaskRunner as the last step of each stage.
- text_done / audio_done now prefer the marker, falling back to a check that
  all final outputs are present, so half-failed runs are no longer mistaken
  for completion.
- Surfaces subtitle length tuning controls (max_split_length,
  subtitle.max_length) in an expandable section above "Start Processing
  Subtitles", with suggested ranges and a "Restore defaults" button.

Robustness fixes
- ElevenLabs ASR: elev2whisper() now always emits word-level timestamps;
  process_transcription() tolerates segments without `words` by synthesizing
  one from the segment text, which avoids a KeyError.
- download_video_section now surfaces detection errors (e.g. multiple media
  files in output/) with a clear message and a "Clear output and reselect"
  button, instead of silently falling back to the upload view.
- Re-upload of the same file is detected via session_state, avoiding an
  infinite rerun loop.
- give_star_button rewritten to a plain string template; the previous
  f-string broke on the literal `{` in the embedded CSS.

Tooling and config
- OneKeyStart.bat consolidated: auto-detects .venv (uv install) or falls
  back to the legacy Conda env "videolingo"; OneKeyStart_uv.bat removed.
- Logs now go to logs/videolingo_.log instead of the project root.
- .streamlit/config.toml: client.toolbarMode = "viewer",
  server.fileWatcherType = "none", server.maxUploadSize preserved.
- .gitignore: ignores logs/, videolingo_*.log, AGENTS.md, pr-body.md.
- setup.py + config.yaml header bumped to 3.0.2.

No new dependencies. No CLI behavior changes.
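
The language normalization listed under i18n can be illustrated with a minimal sketch. It is not the code from translations/translations.py: the names `normalize_language_code_sketch`, `pick_from_accept_language`, `SUPPORTED` and `_TRADITIONAL_HINTS` are hypothetical, and routing every Traditional-script or HK/TW/MO variant to zh-HK is an assumption made for illustration.

```python
# Hypothetical sketch; the real normalize_language_code() may differ.
SUPPORTED = ("en", "zh-CN", "zh-HK", "es", "fr", "ja", "ru")

# Assumed subtags that indicate Traditional Chinese (illustrative only).
_TRADITIONAL_HINTS = {"hk", "tw", "mo", "hant"}


def normalize_language_code_sketch(code, default="en"):
    """Map a browser-style language tag to one of SUPPORTED, else default."""
    if not code:
        return default
    # Drop an Accept-Language quality value such as ";q=0.8".
    tag = code.split(";")[0].strip().replace("_", "-").lower()
    parts = tag.split("-")
    primary = parts[0]
    if primary == "zh":
        # Traditional-script or HK/TW/MO subtags go to zh-HK, the rest to zh-CN.
        if _TRADITIONAL_HINTS.intersection(parts[1:]):
            return "zh-HK"
        return "zh-CN"
    for lang in SUPPORTED:
        if lang.lower() == primary:
            return lang
    return default


def pick_from_accept_language(header, default="en"):
    """Normalize the first entry of an Accept-Language header value."""
    first = header.split(",")[0] if header else ""
    return normalize_language_code_sketch(first, default)
```

For example, `pick_from_accept_language("zh-HK,zh;q=0.9,en;q=0.8")` normalizes only the first listed tag; a fuller version would probably sort entries by their q-values before choosing.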
---
 .gitignore                              |   5 +-
 .streamlit/config.toml                  |   6 +-
 OneKeyStart_uv.bat                      |  19 ---
 config.yaml                             |  10 +-
 core/_10_gen_audio.py                   |  24 ++--
 core/_11_merge_audio.py                 |   1 +
 core/_12_dub_to_vid.py                  |   5 +
 core/_1_ytdlp.py                        |  72 ++++++++++
 core/_2_asr.py                          |  16 ++-
 core/_4_2_translate.py                  |  12 +-
 core/_7_sub_into_vid.py                 |   7 +-
 core/asr_backend/audio_preprocess.py    |  41 +++++-
 core/asr_backend/elevenlabs_asr.py      |   3 +-
 core/st_utils/download_video_section.py | 169 +++++++++++++++++-------
 core/st_utils/imports_and_utils.py      |  10 +-
 core/st_utils/sidebar_setting.py        |  69 ++++++----
 core/st_utils/task_runner.py            |  29 ++++
 core/translate_lines.py                 |   1 +
 core/utils/__init__.py                  |  25 +++-
 core/utils/models.py                    |  11 +-
 core/utils/onekeycleanup.py             |  11 +-
 setup.py                                |   2 +-
 st.py                                   | 154 +++++++++++++++++++--
 translations/en.json                    |  39 +++++-
 translations/es.json                    |  32 ++++-
 translations/fr.json                    |  32 ++++-
 translations/ja.json                    |  32 ++++-
 translations/ru.json                    |  32 ++++-
 translations/translations.py            |  90 ++++++++++++-
 translations/zh-CN.json                 |  41 +++++-
 translations/zh-HK.json                 |  41 +++++-
 31 files changed, 882 insertions(+), 159 deletions(-)
 delete mode 100644 OneKeyStart_uv.bat

diff --git a/.gitignore b/.gitignore
index b646718e..de46b8a4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -172,4 +172,7 @@ config.backup.yaml
 runtime/
 dev/
 installer_files/
-logs/
\ No newline at end of file
+
+# Streamlit runtime logs from OneKeyStart.bat
+logs/
+videolingo_*.log
diff --git a/.streamlit/config.toml b/.streamlit/config.toml
index 8c550bea..38567ef9 100644
--- a/.streamlit/config.toml
+++ b/.streamlit/config.toml
@@ -1,2 +1,6 @@
 [server]
-maxUploadSize = 4096
\ No newline at end of file
+maxUploadSize = 4096
+fileWatcherType = "none"
+
+[client]
+toolbarMode = "viewer"
diff --git a/OneKeyStart_uv.bat b/OneKeyStart_uv.bat
deleted file mode 100644
index 8cb2c70c..00000000
--- a/OneKeyStart_uv.bat
+++ /dev/null
@@ -1,19 +0,0 @@
-@echo off
-cd /D "%~dp0"
-
-:: Log file with timestamp
-for /f "tokens=2
delims==" %%I in ('wmic os get localdatetime /value') do set dt=%%I -set LOGFILE=videolingo_%dt:~0,8%_%dt:~8,6%.log - -echo [%date% %time%] VideoLingo starting... > "%LOGFILE%" -echo Log file: %LOGFILE% - -if exist ".venv\Scripts\streamlit.exe" ( - .venv\Scripts\streamlit run st.py 2>&1 | powershell -Command "$input | Tee-Object -FilePath '%LOGFILE%' -Append" -) else if exist ".venv\Scripts\python.exe" ( - .venv\Scripts\python -m streamlit run st.py 2>&1 | powershell -Command "$input | Tee-Object -FilePath '%LOGFILE%' -Append" -) else ( - echo ERROR: .venv not found. Please run setup first: | powershell -Command "$input | Tee-Object -FilePath '%LOGFILE%' -Append" - echo python setup_env.py -) -pause diff --git a/config.yaml b/config.yaml index c4b98cb2..a91bacd0 100644 --- a/config.yaml +++ b/config.yaml @@ -1,7 +1,7 @@ # * Settings marked with * are advanced settings that won't appear in the Streamlit page and can only be modified manually in config.py # recommend to set in streamlit page # ------------------- -# version: "3.0.0" +# version: "3.0.2" # author: "Huanshere" # ------------------- @@ -11,9 +11,9 @@ display_language: "zh-CN" # API settings api: - key: 'your-api-key' + key: 'YOUR_API_KEY' base_url: 'https://yunwu.ai' - model: '' + model: 'gpt-5.5' llm_support_json: false # *Number of LLM multi-threaded accesses, set to 1 if using local LLM max_workers: 4 @@ -22,7 +22,7 @@ max_workers: 4 target_language: '简体中文' # Whether to use Demucs for vocal separation before transcription -demucs: true +demucs: false whisper: # ["large-v3", "large-v3-turbo"]. 
Note: for zh model will force to use Belle/large-v3 @@ -38,7 +38,7 @@ whisper: elevenlabs_api_key: 'your_elevenlabs_api_key' # Whether to burn subtitles into the video -burn_subtitles: true +burn_subtitles: false ## ======================== Advanced Settings ======================== ## # *🔬 h264_nvenc GPU acceleration for ffmpeg, make sure your GPU supports it diff --git a/core/_10_gen_audio.py b/core/_10_gen_audio.py index 94e52454..a80a650e 100644 --- a/core/_10_gen_audio.py +++ b/core/_10_gen_audio.py @@ -85,6 +85,7 @@ def generate_tts_audio(tasks_df: pd.DataFrame) -> pd.DataFrame: warmup_size = min(WARMUP_SIZE, len(tasks_df)) for _, row in tasks_df.head(warmup_size).iterrows(): try: + check_cancel() number, real_dur = process_row(row, tasks_df) tasks_df.loc[tasks_df['number'] == number, 'real_dur'] = real_dur progress.advance(task) @@ -103,14 +104,20 @@ def generate_tts_audio(tasks_df: pd.DataFrame) -> pd.DataFrame: for _, row in remaining_tasks.iterrows() ] - for future in as_completed(futures): - try: - number, real_dur = future.result() - tasks_df.loc[tasks_df['number'] == number, 'real_dur'] = real_dur - progress.advance(task) - except Exception as e: - rprint(f"[red]❌ Error: {str(e)}[/red]") - raise e + try: + for future in as_completed(futures): + check_cancel() + try: + number, real_dur = future.result() + tasks_df.loc[tasks_df['number'] == number, 'real_dur'] = real_dur + progress.advance(task) + except Exception as e: + rprint(f"[red]❌ Error: {str(e)}[/red]") + raise e + except BaseException: + for f in futures: + f.cancel() + raise rprint("[bold green]✨ TTS audio generation completed![/bold green]") return tasks_df @@ -149,6 +156,7 @@ def merge_chunks(tasks_df: pd.DataFrame) -> pd.DataFrame: for index, row in tasks_df.iterrows(): if row['cut_off'] == 1: + check_cancel() chunk_df = tasks_df.iloc[chunk_start:index+1].reset_index(drop=True) speed_factor, keep_gaps = process_chunk(chunk_df, accept, min_speed) diff --git a/core/_11_merge_audio.py 
b/core/_11_merge_audio.py index 41c8ac16..014cb952 100644 --- a/core/_11_merge_audio.py +++ b/core/_11_merge_audio.py @@ -58,6 +58,7 @@ def merge_audio_segments(audios, new_sub_times, sample_rate): merge_task = progress.add_task("🎵 Merging audio segments...", total=len(audios)) for i, (audio_file, time_range) in enumerate(zip(audios, new_sub_times)): + check_cancel() if not os.path.exists(audio_file): console.print(f"[bold yellow]⚠️ Warning: File {audio_file} does not exist, skipping...[/bold yellow]") progress.advance(merge_task) diff --git a/core/_12_dub_to_vid.py b/core/_12_dub_to_vid.py index da7b2895..4bbdbe58 100644 --- a/core/_12_dub_to_vid.py +++ b/core/_12_dub_to_vid.py @@ -30,6 +30,11 @@ def merge_video_audio(): """Merge video and audio, and reduce video volume""" + from core._1_ytdlp import is_audio_only_input + if is_audio_only_input(): + rprint("[bold green]🎵 Audio-only input: skipping dubbing video merge. Dubbed audio is in the `output` directory.[/bold green]") + return + VIDEO_FILE = find_video_files() background_file = _BACKGROUND_AUDIO_FILE diff --git a/core/_1_ytdlp.py b/core/_1_ytdlp.py index 6064ae27..c0511029 100644 --- a/core/_1_ytdlp.py +++ b/core/_1_ytdlp.py @@ -1,9 +1,14 @@ import os,sys import glob +import json import re import subprocess from core.utils import * +OUTPUT_DIR = "output" +INPUT_MANIFEST = "input_manifest.json" +GENERATED_AUDIO_NAMES = {"dub.mp3", "normalized_dub.wav"} + def sanitize_filename(filename): # Remove or replace illegal characters filename = re.sub(r'[<>:"/\\|?*]', '', filename) @@ -51,6 +56,27 @@ def download_video_ytdlp(url, save_path='output', resolution='1080'): new_filename = sanitize_filename(filename) if new_filename != filename: os.rename(os.path.join(save_path, file), os.path.join(save_path, new_filename + ext)) + media_file = find_video_files(save_path) + write_input_manifest(media_file, "video", save_path) + +def write_input_manifest(media_file: str, media_type: str, save_path='output'): + 
os.makedirs(save_path, exist_ok=True) + manifest_path = os.path.join(save_path, INPUT_MANIFEST) + media_path = media_file.replace("\\", "/") if sys.platform.startswith('win') else media_file + with open(manifest_path, "w", encoding="utf-8") as f: + json.dump({"path": media_path, "type": media_type}, f, ensure_ascii=False, indent=2) + +def _read_input_manifest(save_path='output'): + manifest_path = os.path.join(save_path, INPUT_MANIFEST) + if not os.path.exists(manifest_path): + return None + with open(manifest_path, "r", encoding="utf-8") as f: + data = json.load(f) + media_file = data.get("path") + media_type = data.get("type") + if media_type not in {"video", "audio"} or not media_file or not os.path.exists(media_file): + return None + return media_file.replace("\\", "/") if sys.platform.startswith('win') else media_file, media_type def find_video_files(save_path='output'): video_files = [file for file in glob.glob(save_path + "/*") if os.path.splitext(file)[1][1:].lower() in load_key("allowed_video_formats")] @@ -62,6 +88,52 @@ def find_video_files(save_path='output'): raise ValueError(f"Number of videos found {len(video_files)} is not unique. Please check.") return video_files[0] +def find_audio_files(save_path='output'): + audio_files = [file for file in glob.glob(save_path + "/*") if os.path.splitext(file)[1][1:].lower() in load_key("allowed_audio_formats")] + if sys.platform.startswith('win'): + audio_files = [file.replace("\\", "/") for file in audio_files] + audio_files = [file for file in audio_files if os.path.basename(file) not in GENERATED_AUDIO_NAMES] + if len(audio_files) != 1: + raise ValueError(f"Number of audio files found {len(audio_files)} is not unique. 
Please check.") + return audio_files[0] + +def _safe_find_video_file(save_path='output'): + try: + return find_video_files(save_path) + except ValueError as e: + if "found 0" in str(e): + return None + raise + +def _safe_find_audio_file(save_path='output'): + try: + return find_audio_files(save_path) + except ValueError as e: + if "found 0" in str(e): + return None + raise + +def find_media_file(save_path='output'): + manifest = _read_input_manifest(save_path) + if manifest: + return manifest + video_file = _safe_find_video_file(save_path) + if video_file: + return video_file, "video" + audio_file = _safe_find_audio_file(save_path) + if audio_file: + return audio_file, "audio" + raise ValueError("No media file found. Please download or upload a media file first.") + +def is_audio_only_input(save_path='output'): + # True when the input is a standalone audio file (no video present). + # In this case VideoLingo only produces subtitle files; no video output. + try: + _, media_type = find_media_file(save_path) + return media_type == "audio" + except Exception: + return False + if __name__ == '__main__': # Example usage url = input('Please enter the URL of the video you want to download: ') diff --git a/core/_2_asr.py b/core/_2_asr.py index f54e8b10..24e071ec 100644 --- a/core/_2_asr.py +++ b/core/_2_asr.py @@ -1,14 +1,17 @@ from core.utils import * from core.asr_backend.demucs_vl import demucs_audio -from core.asr_backend.audio_preprocess import process_transcription, convert_video_to_audio, split_audio, save_results, normalize_audio_volume -from core._1_ytdlp import find_video_files +from core.asr_backend.audio_preprocess import process_transcription, convert_video_to_audio, prepare_audio_for_asr, split_audio, save_results, normalize_audio_volume +from core._1_ytdlp import find_media_file from core.utils.models import * @check_file_exists(_2_CLEANED_CHUNKS) def transcribe(): - # 1. 
video to audio - video_file = find_video_files() - convert_video_to_audio(video_file) + # 1. prepare audio + media_file, media_type = find_media_file() + if media_type == "video": + convert_video_to_audio(media_file) + else: + prepare_audio_for_asr(media_file) # 2. Demucs vocal separation: if load_key("demucs"): @@ -34,6 +37,7 @@ def transcribe(): rprint("[cyan]🎤 Transcribing audio with ElevenLabs API...[/cyan]") for start, end in segments: + check_cancel() result = ts(_RAW_AUDIO_FILE, vocal_audio, start, end) all_results.append(result) @@ -47,4 +51,4 @@ def transcribe(): save_results(df) if __name__ == "__main__": - transcribe() \ No newline at end of file + transcribe() diff --git a/core/_4_2_translate.py b/core/_4_2_translate.py index 1376f88f..dca7ffcb 100644 --- a/core/_4_2_translate.py +++ b/core/_4_2_translate.py @@ -67,9 +67,15 @@ def translate_all(): future = executor.submit(translate_chunk, chunk, chunks, theme_prompt, i) futures.append(future) results = [] - for future in concurrent.futures.as_completed(futures): - results.append(future.result()) - progress.update(task, advance=1) + try: + for future in concurrent.futures.as_completed(futures): + check_cancel() + results.append(future.result()) + progress.update(task, advance=1) + except BaseException: + for f in futures: + f.cancel() + raise results.sort(key=lambda x: x[0]) # Sort results based on original order diff --git a/core/_7_sub_into_vid.py b/core/_7_sub_into_vid.py index 7a2e253f..239c10b4 100644 --- a/core/_7_sub_into_vid.py +++ b/core/_7_sub_into_vid.py @@ -41,6 +41,11 @@ def check_gpu_available(): return False def merge_subtitles_to_video(): + from core._1_ytdlp import is_audio_only_input + if is_audio_only_input(): + rprint("[bold green]🎵 Audio-only input: skipping video merge. 
Subtitle files are ready in the `output` directory.[/bold green]") + return + video_file = find_video_files() os.makedirs(os.path.dirname(OUTPUT_VIDEO), exist_ok=True) @@ -103,4 +108,4 @@ def merge_subtitles_to_video(): process.kill() if __name__ == "__main__": - merge_subtitles_to_video() \ No newline at end of file + merge_subtitles_to_video() diff --git a/core/asr_backend/audio_preprocess.py b/core/asr_backend/audio_preprocess.py index 0d0db2ff..19738d83 100644 --- a/core/asr_backend/audio_preprocess.py +++ b/core/asr_backend/audio_preprocess.py @@ -52,6 +52,27 @@ def convert_video_to_audio(video_file: str): subprocess.run(cmd, check=True, stderr=subprocess.PIPE) rprint(f"[green]🎬➡️🎵 Converted <{video_file}> to <{_RAW_AUDIO_FILE}> with FFmpeg\n[/green]") +def prepare_audio_for_asr(audio_file: str): + os.makedirs(_AUDIO_DIR, exist_ok=True) + if not os.path.exists(_RAW_AUDIO_FILE): + rprint(f"[blue]🎵 Preparing uploaded audio for ASR with FFmpeg ......[/blue]") + if _ffmpeg_has_encoder('libmp3lame'): + cmd = [ + 'ffmpeg', '-y', '-i', audio_file, '-vn', + '-c:a', 'libmp3lame', '-b:a', '32k', + '-ar', '16000', '-ac', '1', + '-metadata', 'encoding=UTF-8', _RAW_AUDIO_FILE + ] + else: + rprint("[yellow]⚠️ libmp3lame not found in ffmpeg, falling back to WAV (PCM) encoding[/yellow]") + cmd = [ + 'ffmpeg', '-y', '-i', audio_file, '-vn', + '-c:a', 'pcm_s16le', '-ar', '16000', '-ac', '1', + '-f', 'wav', _RAW_AUDIO_FILE + ] + subprocess.run(cmd, check=True, stderr=subprocess.PIPE) + rprint(f"[green]🎵 Prepared <{audio_file}> as <{_RAW_AUDIO_FILE}>\n[/green]") + def get_audio_duration(audio_file: str) -> float: """Get the duration of an audio file using ffmpeg.""" cmd = ['ffmpeg', '-i', audio_file] @@ -111,8 +132,22 @@ def process_transcription(result: Dict) -> pd.DataFrame: for segment in result['segments']: # Get speaker_id, if not exists, set to None speaker_id = segment.get('speaker_id', None) - - for word in segment['words']: + + words = segment.get('words') + if not 
words: + # Some ASR backends (e.g. ElevenLabs without word-level timestamps) + # return segments without per-word entries. Synthesize a single + # word from the segment text so downstream alignment still works. + seg_text = (segment.get('text') or '').strip() + if not seg_text: + continue + words = [{ + 'word': seg_text, + 'start': segment.get('start'), + 'end': segment.get('end'), + }] + + for word in words: # Check word length if len(word["word"]) > 30: rprint(f"[yellow]⚠️ Warning: Detected word longer than 30 characters, skipping: {word['word']}[/yellow]") @@ -178,4 +213,4 @@ def save_results(df: pd.DataFrame): rprint(f"[green]📊 Excel file saved to {_2_CLEANED_CHUNKS}[/green]") def save_language(language: str): - update_key("whisper.detected_language", language) \ No newline at end of file + update_key("whisper.detected_language", language) diff --git a/core/asr_backend/elevenlabs_asr.py b/core/asr_backend/elevenlabs_asr.py index 5a4c5dda..af151b63 100644 --- a/core/asr_backend/elevenlabs_asr.py +++ b/core/asr_backend/elevenlabs_asr.py @@ -124,7 +124,8 @@ def transcribe_audio_elevenlabs(raw_audio_path, vocal_audio_path, start = None, word['end'] += start rprint(f"[green]✓ Transcription completed in {time.time() - start_time:.2f} seconds[/green]") - parsed_result = elev2whisper(result) + # Keep word-level timestamps so downstream process_transcription has `words`. 
+ parsed_result = elev2whisper(result, word_level_timestamp=True) os.makedirs(os.path.dirname(LOG_FILE), exist_ok=True) with open(LOG_FILE, "w", encoding="utf-8") as f: json.dump(parsed_result, f, indent=4, ensure_ascii=False) diff --git a/core/st_utils/download_video_section.py b/core/st_utils/download_video_section.py index 5f3023e5..c31ac909 100644 --- a/core/st_utils/download_video_section.py +++ b/core/st_utils/download_video_section.py @@ -1,76 +1,145 @@ import os import re import shutil -import subprocess from time import sleep import streamlit as st -from core._1_ytdlp import download_video_ytdlp, find_video_files +from core._1_ytdlp import download_video_ytdlp, find_media_file, write_input_manifest from core.utils import * from translations.translations import translate as t OUTPUT_DIR = "output" + +def _css_text(value): + return str(value).replace("\\", "\\\\").replace('"', '\\"') + + +def _inject_file_uploader_i18n(): + # Streamlit does not expose official i18n for file_uploader internals. + # Streamlit 1.49 DOM: + # div[data-testid="stFileUploaderDropzoneInstructions"] + # > span (cloud icon, must keep) + # > div (column flex) + # > span (1st: "Drag and drop ... here") + # > span (2nd: "Limit ... · MP4, MOV ...") + # So we target ONLY the two direct child spans of the inner div, leaving + # the icon and other elements untouched. + drag_text = _css_text(t("Drag and drop file here")) + limit_text = _css_text(t("Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A")) + browse_text = _css_text(t("Browse files")) + st.markdown( + f""" + + """, + unsafe_allow_html=True, + ) + def download_video_section(): st.header(t("a. 
Download or Upload Video")) with st.container(border=True): try: - video_file = find_video_files() - st.video(video_file) + media_file, media_type = find_media_file() + if media_type == "video": + st.video(media_file) + else: + st.audio(media_file) if st.button(t("Delete and Reselect"), key="delete_video_button"): - os.remove(video_file) + os.remove(media_file) if os.path.exists(OUTPUT_DIR): shutil.rmtree(OUTPUT_DIR) + st.session_state.pop("_processed_upload_id", None) sleep(1) st.rerun() return True - except: - col1, col2 = st.columns([3, 1]) - with col1: - url = st.text_input(t("Enter YouTube link:")) - with col2: - res_dict = { - "360p": "360", - "1080p": "1080", - "Best": "best" - } - target_res = load_key("ytb_resolution") - res_options = list(res_dict.keys()) - default_idx = list(res_dict.values()).index(target_res) if target_res in res_dict.values() else 0 - res_display = st.selectbox(t("Resolution"), options=res_options, index=default_idx) - res = res_dict[res_display] - if st.button(t("Download Video"), key="download_button", width="stretch"): - if url: - with st.spinner("Downloading video..."): - download_video_ytdlp(url, resolution=res) + except ValueError as e: + if "No media file found" not in str(e): + st.error(t("Media file detection failed: {error}").replace("{error}", str(e))) + if st.button(t("Clear output and reselect"), key="clear_output_button"): + if os.path.exists(OUTPUT_DIR): + shutil.rmtree(OUTPUT_DIR) + st.session_state.pop("_processed_upload_id", None) st.rerun() + return False + except Exception: + pass - uploaded_file = st.file_uploader(t("Or upload video"), type=load_key("allowed_video_formats") + load_key("allowed_audio_formats")) - if uploaded_file: - if os.path.exists(OUTPUT_DIR): - shutil.rmtree(OUTPUT_DIR) - os.makedirs(OUTPUT_DIR, exist_ok=True) + col1, col2 = st.columns([3, 1]) + with col1: + url = st.text_input(t("Enter YouTube link:")) + with col2: + res_dict = { + "360p": "360", + "1080p": "1080", + t("Best"): "best" + } + 
target_res = load_key("ytb_resolution") + res_options = list(res_dict.keys()) + default_idx = list(res_dict.values()).index(target_res) if target_res in res_dict.values() else 0 + res_display = st.selectbox(t("Resolution"), options=res_options, index=default_idx) + res = res_dict[res_display] + if st.button(t("Download Video"), key="download_button", width="stretch"): + if url: + with st.spinner(t("Downloading video...")): + download_video_ytdlp(url, resolution=res) + st.rerun() + + _inject_file_uploader_i18n() + uploaded_file = st.file_uploader(t("Upload local media file"), type=load_key("allowed_video_formats") + load_key("allowed_audio_formats")) + if uploaded_file: + upload_id = f"{uploaded_file.name}:{uploaded_file.size}" + if st.session_state.get("_processed_upload_id") == upload_id: + try: + find_media_file() + st.warning(t("Upload was already processed. Delete and reselect to upload again.")) + return False + except Exception: + st.session_state.pop("_processed_upload_id", None) + + if os.path.exists(OUTPUT_DIR): + shutil.rmtree(OUTPUT_DIR) + os.makedirs(OUTPUT_DIR, exist_ok=True) + + raw_name = uploaded_file.name.replace(' ', '_') + name, ext = os.path.splitext(raw_name) + clean_name = re.sub(r'[^\w\-_\.]', '', name) + ext.lower() - raw_name = uploaded_file.name.replace(' ', '_') - name, ext = os.path.splitext(raw_name) - clean_name = re.sub(r'[^\w\-_\.]', '', name) + ext.lower() - - with open(os.path.join(OUTPUT_DIR, clean_name), "wb") as f: - f.write(uploaded_file.getbuffer()) + with open(os.path.join(OUTPUT_DIR, clean_name), "wb") as f: + f.write(uploaded_file.getbuffer()) - if ext.lower() in load_key("allowed_audio_formats"): - convert_audio_to_video(os.path.join(OUTPUT_DIR, clean_name)) - st.rerun() - else: - return False + media_path = os.path.join(OUTPUT_DIR, clean_name) + media_ext = ext.lower().lstrip(".") + media_type = "video" if media_ext in load_key("allowed_video_formats") else "audio" + write_input_manifest(media_path, media_type) -def 
convert_audio_to_video(audio_file: str) -> str: - output_video = os.path.join(OUTPUT_DIR, 'black_screen.mp4') - if not os.path.exists(output_video): - print(f"🎵➡️🎬 Converting audio to video with FFmpeg ......") - ffmpeg_cmd = ['ffmpeg', '-y', '-f', 'lavfi', '-i', 'color=c=black:s=640x360', '-i', audio_file, '-shortest', '-c:v', 'libx264', '-c:a', 'aac', '-pix_fmt', 'yuv420p', output_video] - subprocess.run(ffmpeg_cmd, check=True, capture_output=True, text=True, encoding='utf-8') - print(f"🎵➡️🎬 Converted <{audio_file}> to <{output_video}> with FFmpeg\n") - # delete audio file - os.remove(audio_file) - return output_video + st.session_state["_processed_upload_id"] = upload_id + st.rerun() + else: + return False diff --git a/core/st_utils/imports_and_utils.py b/core/st_utils/imports_and_utils.py index c6929094..23569e92 100644 --- a/core/st_utils/imports_and_utils.py +++ b/core/st_utils/imports_and_utils.py @@ -26,7 +26,7 @@ def download_subtitle_zip_button(text: str): ) # st.markdown -give_star_button = """ +_GIVE_STAR_BUTTON_TEMPLATE = """
- Star on GitHub 🌟 + __STAR_LABEL__
""" + +def give_star_button(): + return _GIVE_STAR_BUTTON_TEMPLATE.replace("__STAR_LABEL__", t("Star on GitHub 🌟")) + button_style = """ -""" \ No newline at end of file +""" diff --git a/core/st_utils/sidebar_setting.py b/core/st_utils/sidebar_setting.py index a300de5b..df462cce 100644 --- a/core/st_utils/sidebar_setting.py +++ b/core/st_utils/sidebar_setting.py @@ -1,7 +1,6 @@ import streamlit as st import requests from translations.translations import translate as t -from translations.translations import DISPLAY_LANGUAGES from core.utils import * @@ -52,15 +51,6 @@ def page_setting(): unsafe_allow_html=True, ) - display_language = st.selectbox( - "Display Language 🌐", - options=list(DISPLAY_LANGUAGES.keys()), - index=list(DISPLAY_LANGUAGES.values()).index(load_key("display_language")), - ) - if DISPLAY_LANGUAGES[display_language] != load_key("display_language"): - update_key("display_language", DISPLAY_LANGUAGES[display_language]) - st.rerun() - # with st.expander(t("Youtube Settings"), expanded=True): # config_input(t("Cookies Path"), "youtube.cookies_path") @@ -181,6 +171,11 @@ def page_setting(): t("WhisperX Runtime"), options=["local", "cloud", "elevenlabs"], index=["local", "cloud", "elevenlabs"].index(load_key("whisper.runtime")), + format_func=lambda x: { + "local": t("Local"), + "cloud": t("Cloud"), + "elevenlabs": t("ElevenLabs"), + }[x], help=t( "Local runtime requires >8GB GPU, cloud runtime requires 302ai API key, elevenlabs runtime requires ElevenLabs API key" ), @@ -191,7 +186,7 @@ def page_setting(): if runtime == "cloud": config_input(t("WhisperX 302ai API"), "whisper.whisperX_302_api_key") if runtime == "elevenlabs": - config_input(("ElevenLabs API"), "whisper.elevenlabs_api_key") + config_input(t("ElevenLabs API"), "whisper.elevenlabs_api_key") with c2: target_language = st.text_input( @@ -216,16 +211,26 @@ def page_setting(): update_key("demucs", demucs) st.rerun() - burn_subtitles = st.toggle( - t("Burn-in Subtitles"), - 
value=load_key("burn_subtitles"), - help=t( - "Whether to burn subtitles into the video, will increase processing time" - ), - ) - if burn_subtitles != load_key("burn_subtitles"): - update_key("burn_subtitles", burn_subtitles) - st.rerun() + from core._1_ytdlp import is_audio_only_input + audio_only = is_audio_only_input() + if audio_only: + st.toggle( + t("Burn-in Subtitles"), + value=False, + disabled=True, + help=t("Audio-only input produces subtitle files only; no video is generated."), + ) + else: + burn_subtitles = st.toggle( + t("Burn-in Subtitles"), + value=load_key("burn_subtitles"), + help=t( + "Whether to burn subtitles into the video, will increase processing time" + ), + ) + if burn_subtitles != load_key("burn_subtitles"): + update_key("burn_subtitles", burn_subtitles) + st.rerun() with st.expander(t("Dubbing Settings"), expanded=True): tts_methods = [ "azure_tts", @@ -238,10 +243,22 @@ def page_setting(): "sf_cosyvoice2", "f5tts", ] + tts_method_labels = { + "azure_tts": t("Azure TTS"), + "openai_tts": t("OpenAI TTS"), + "fish_tts": t("Fish TTS"), + "sf_fish_tts": t("SiliconFlow Fish TTS"), + "edge_tts": t("Edge TTS"), + "gpt_sovits": t("GPT-SoVITS"), + "custom_tts": t("Custom TTS"), + "sf_cosyvoice2": t("SiliconFlow CosyVoice2"), + "f5tts": t("F5-TTS"), + } select_tts = st.selectbox( t("TTS Method"), options=tts_methods, index=tts_methods.index(load_key("tts_method")), + format_func=lambda x: tts_method_labels[x], ) if select_tts != load_key("tts_method"): update_key("tts_method", select_tts) @@ -269,14 +286,14 @@ def page_setting(): update_key("sf_fish_tts.mode", selected_mode) st.rerun() if selected_mode == "preset": - config_input("Voice", "sf_fish_tts.voice") + config_input(t("Voice"), "sf_fish_tts.voice") elif select_tts == "openai_tts": - config_input("302ai API", "openai_tts.api_key") + config_input(t("302ai API"), "openai_tts.api_key") config_input(t("OpenAI Voice"), "openai_tts.voice") elif select_tts == "fish_tts": - config_input("302ai 
API", "fish_tts.api_key") + config_input(t("302ai API"), "fish_tts.api_key") fish_tts_character = st.selectbox( t("Fish TTS Character"), options=list(load_key("fish_tts.character_id_dict").keys()), @@ -289,7 +306,7 @@ def page_setting(): st.rerun() elif select_tts == "azure_tts": - config_input("302ai API", "azure_tts.api_key") + config_input(t("302ai API"), "azure_tts.api_key") config_input(t("Azure Voice"), "azure_tts.voice") elif select_tts == "gpt_sovits": @@ -321,7 +338,7 @@ def page_setting(): config_input(t("SiliconFlow API Key"), "sf_cosyvoice2.api_key") elif select_tts == "f5tts": - config_input("302ai API", "f5tts.302_api") + config_input(t("302ai API"), "f5tts.302_api") def check_api(): diff --git a/core/st_utils/task_runner.py b/core/st_utils/task_runner.py index 8a38248f..dba8f301 100644 --- a/core/st_utils/task_runner.py +++ b/core/st_utils/task_runner.py @@ -40,9 +40,32 @@ class TaskRunner: _thread: threading.Thread | None = None _steps: list = field(default_factory=list) + # Class-level pointer to the currently executing runner so that long-running + # core functions can call ``TaskRunner.check_cancel()`` without needing a + # direct reference. The pointer is only meaningful inside the background + # thread that ``start()`` launches. + _current: "TaskRunner | None" = None + def __post_init__(self): self._pause_event.set() # not paused initially + # ------ Cancellation helpers (called from core code) ------ + + @classmethod + def check_cancel(cls) -> None: + """Block while paused and raise :class:`StopTask` if a stop was requested. + + Safe to call from any thread; becomes a no-op when no runner is active + (e.g. when core scripts are invoked from the CLI). + """ + runner = cls._current + if runner is None: + return + # Block while paused so long loops freeze on pause too. 
+ runner._pause_event.wait() + if runner._stop_event.is_set(): + raise StopTask() + # ------ Singleton per session_state ------ @staticmethod def get(session_state, key: str = "_task_runner") -> "TaskRunner": @@ -121,6 +144,7 @@ def progress(self) -> float: def _run(self): """Execute steps sequentially in background thread.""" + type(self)._current = self try: for i, (label, func) in enumerate(self._steps): # Check stop before each step @@ -141,7 +165,12 @@ def _run(self): func() self.state = "completed" + except StopTask: + self.state = "stopped" except Exception as e: self.error_msg = str(e) self.state = "error" traceback.print_exc() + finally: + if type(self)._current is self: + type(self)._current = None diff --git a/core/translate_lines.py b/core/translate_lines.py index fd03da4c..11c0fe63 100644 --- a/core/translate_lines.py +++ b/core/translate_lines.py @@ -19,6 +19,7 @@ def valid_translate_result(result: dict, required_keys: list, required_sub_keys: return {"status": "success", "message": "Translation completed"} def translate_lines(lines, previous_content_prompt, after_cotent_prompt, things_to_note_prompt, summary_prompt, index = 0): + check_cancel() shared_prompt = generate_shared_prompt(previous_content_prompt, after_cotent_prompt, summary_prompt, things_to_note_prompt) # Retry translation if the length of the original text and the translated text are not the same, or if the specified key is missing diff --git a/core/utils/__init__.py b/core/utils/__init__.py index b864f4e4..2dc062eb 100644 --- a/core/utils/__init__.py +++ b/core/utils/__init__.py @@ -7,4 +7,27 @@ except ImportError: pass -__all__ = ["ask_gpt", "except_handler", "check_file_exists", "load_key", "update_key", "rprint", "get_joiner"] \ No newline at end of file + +def check_cancel(): + """Cooperative cancellation hook for long-running core loops. + + Imports lazily to avoid coupling core scripts to the Streamlit-side + TaskRunner. Becomes a no-op when no runner is active (CLI usage). 
+ """ + try: + from core.st_utils.task_runner import TaskRunner + except Exception: + return + TaskRunner.check_cancel() + + +__all__ = [ + "ask_gpt", + "except_handler", + "check_file_exists", + "load_key", + "update_key", + "rprint", + "get_joiner", + "check_cancel", +] \ No newline at end of file diff --git a/core/utils/models.py b/core/utils/models.py index 8e15c829..73a358d5 100644 --- a/core/utils/models.py +++ b/core/utils/models.py @@ -25,6 +25,13 @@ _AUDIO_SEGS_DIR = "output/audio/segs" _AUDIO_TMP_DIR = "output/audio/tmp" +# ------------------------------------------ +# Done markers (written by st.py task runner after a stage finishes +# cleanly; absence implies the stage did not complete). +# ------------------------------------------ +_TEXT_DONE_MARKER = "output/.subtitle_done" +_AUDIO_DONE_MARKER = "output/.dubbing_done" + # ------------------------------------------ # 导出 # ------------------------------------------ @@ -45,5 +52,7 @@ "_BACKGROUND_AUDIO_FILE", "_AUDIO_REFERS_DIR", "_AUDIO_SEGS_DIR", - "_AUDIO_TMP_DIR" + "_AUDIO_TMP_DIR", + "_TEXT_DONE_MARKER", + "_AUDIO_DONE_MARKER", ] diff --git a/core/utils/onekeycleanup.py b/core/utils/onekeycleanup.py index 286a8e76..f5199bcc 100644 --- a/core/utils/onekeycleanup.py +++ b/core/utils/onekeycleanup.py @@ -1,13 +1,12 @@ import os import glob -from core._1_ytdlp import find_video_files +from core._1_ytdlp import find_media_file import shutil def cleanup(history_dir="history"): - # Get video file name - video_file = find_video_files() - video_name = video_file.split("/")[1] - video_name = os.path.splitext(video_name)[0] + # Get input media file name + media_file, _ = find_media_file() + video_name = os.path.splitext(os.path.basename(media_file))[0] video_name = sanitize_filename(video_name) # Create required folders @@ -77,4 +76,4 @@ def sanitize_filename(filename): return filename if __name__ == "__main__": - cleanup() \ No newline at end of file + cleanup() diff --git a/setup.py b/setup.py index 
07bad53e..b1021893 100644 --- a/setup.py +++ b/setup.py @@ -1,7 +1,7 @@ from setuptools import setup, find_packages NAME = 'VideoLingo' -VERSION = '3.0.0' +VERSION = '3.0.2' with open('requirements.txt', encoding='utf-8') as f: requirements = f.read().splitlines() diff --git a/st.py b/st.py index b8486e94..992534c2 100644 --- a/st.py +++ b/st.py @@ -14,6 +14,8 @@ def _configure_utf8_console(): from core.st_utils.imports_and_utils import * from core.st_utils.task_runner import TaskRunner from core import * +from translations.translations import DISPLAY_LANGUAGES, init_display_language, set_display_language +from core.utils.models import _TEXT_DONE_MARKER, _AUDIO_DONE_MARKER # SET PATH current_dir = os.path.dirname(os.path.abspath(__file__)) @@ -103,6 +105,20 @@ def _task_control_panel(runner_key: str): # ─── Text processing ─── +def _touch(path): + os.makedirs(os.path.dirname(path) or ".", exist_ok=True) + with open(path, "w", encoding="utf-8") as f: + f.write("") + + +def _clear_path(path): + if os.path.exists(path): + try: + os.remove(path) + except OSError: + pass + + def _get_text_steps(): """Return the subtitle processing steps as (label, callable) list.""" steps = [ @@ -129,15 +145,90 @@ def _get_text_steps(): t("Merging subtitles into the video"), _7_sub_into_vid.merge_subtitles_to_video, ), + ( + t("Finalize subtitle outputs"), + lambda: _touch(_TEXT_DONE_MARKER), + ), ] return steps +def _subtitle_length_controls(): + """Render inline controls for the two subtitle-length tunables. + + Both values live in config.yaml and are read by: + - max_split_length → core/_3_2_split_meaning.py (first pass NLP cut) + - subtitle.max_length → core/_5_split_sub.py (final subtitle line) + """ + DEFAULT_MAX_SPLIT_LENGTH = 20 + DEFAULT_MAX_LENGTH = 75 + MAX_LENGTH_KEY = "subtitle.max_length" + + with st.expander(t("Subtitle length tuning"), expanded=False): + st.caption( + t( + "These two values control how subtitles are cut. " + "Smaller = more, shorter lines. 
Larger = fewer, longer lines." + ) + ) + + new_max_split = st.number_input( + t("max_split_length (rough cut, words/tokens per chunk)"), + min_value=8, + max_value=60, + value=int(load_key("max_split_length")), + step=1, + help=t( + "Suggested: 18-25. Below 18 cuts too finely and hurts translation; " + "above 25 makes downstream subtitle splitting hard to align." + ), + key="cfg_max_split_length", + width=220, + ) + new_max_length = st.number_input( + t("max_length (max characters per subtitle line)"), + min_value=20, + max_value=200, + value=int(load_key(MAX_LENGTH_KEY)), + step=1, + help=t( + "Suggested: 50-90. Lower if a subtitle line looks crowded on screen; " + "raise if lines are split too aggressively." + ), + key="cfg_max_length", + width=220, + ) + + changed = False + if new_max_split != load_key("max_split_length"): + update_key("max_split_length", int(new_max_split)) + changed = True + if new_max_length != load_key(MAX_LENGTH_KEY): + update_key(MAX_LENGTH_KEY, int(new_max_length)) + changed = True + + if st.button( + t("Restore defaults ({split}/{length})") + .replace("{split}", str(DEFAULT_MAX_SPLIT_LENGTH)) + .replace("{length}", str(DEFAULT_MAX_LENGTH)), + key="restore_subtitle_length_defaults", + ): + update_key("max_split_length", DEFAULT_MAX_SPLIT_LENGTH) + update_key(MAX_LENGTH_KEY, DEFAULT_MAX_LENGTH) + st.rerun() + + if changed: + st.rerun() + + def text_processing_section(): st.header(t("b. Translate and Generate Subtitles")) runner = TaskRunner.get(st.session_state, "_text_runner") + from core._1_ytdlp import is_audio_only_input + audio_only = is_audio_only_input() with st.container(border=True): + final_text_step = t("Generate subtitle files") if audio_only else t("Merging subtitles into the video") st.markdown( f"""

@@ -147,26 +238,33 @@ def text_processing_section(): 2. {t("Sentence segmentation using NLP and LLM")}<br>
3. {t("Summarization and multi-step translation")}<br>
4. {t("Cutting and aligning long subtitles")}<br>
- 5. {t("Generating timeline and subtitles")}<br>
- 6. {t("Merging subtitles into the video")} + 5. {final_text_step} """, unsafe_allow_html=True, ) - if not os.path.exists(SUB_VIDEO): + text_done = os.path.exists(_TEXT_DONE_MARKER) or ( + os.path.exists("output/trans.srt") + and os.path.exists("output/src.srt") + and (audio_only or os.path.exists(SUB_VIDEO)) + ) + + if not text_done: if runner.is_active: _task_control_panel("_text_runner") elif runner.is_done: _task_control_panel("_text_runner") else: + _subtitle_length_controls() if st.button( t("Start Processing Subtitles"), key="text_processing_button" ): + _clear_path(_TEXT_DONE_MARKER) steps = _get_text_steps() runner.start(steps) st.rerun() else: - if load_key("burn_subtitles"): + if not audio_only and load_key("burn_subtitles") and os.path.exists(SUB_VIDEO): st.video(SUB_VIDEO) download_subtitle_zip_button(text=t("Download All Srt Files")) @@ -193,11 +291,17 @@ def _get_audio_steps(): (t("Generate and merge audio files"), _10_gen_audio.gen_audio), (t("Merge full audio"), _11_merge_audio.merge_full_audio), (t("Merge final audio into video"), _12_dub_to_vid.merge_video_audio), + ( + t("Finalize dubbing outputs"), + lambda: _touch(_AUDIO_DONE_MARKER), + ), ] return steps def audio_processing_section(): + from core._1_ytdlp import is_audio_only_input + audio_only = is_audio_only_input() st.header(t("c. Dubbing")) runner = TaskRunner.get(st.session_state, "_audio_runner") @@ -210,12 +314,17 @@ def audio_processing_section(): 1. {t("Generate audio tasks and chunks")}<br>
2. {t("Extract reference audio")}<br>
3. {t("Generate and merge audio files")}<br>
- 4. {t("Merge final audio into video")} + 4. {t("Merge full audio")}<br>
+ 5. {t("Merge final audio into video")} """, unsafe_allow_html=True, ) - if not os.path.exists(DUB_VIDEO): + audio_done = os.path.exists(_AUDIO_DONE_MARKER) or ( + os.path.exists("output/dub.mp3") + and (audio_only or os.path.exists(DUB_VIDEO)) + ) + if not audio_done: if runner.is_active: _task_control_panel("_audio_runner") elif runner.is_done: @@ -224,6 +333,7 @@ def audio_processing_section(): if st.button( t("Start Audio Processing"), key="audio_processing_button" ): + _clear_path(_AUDIO_DONE_MARKER) steps = _get_audio_steps() runner.start(steps) st.rerun() @@ -233,9 +343,10 @@ def audio_processing_section(): "Audio processing is complete! You can check the audio files in the `output` folder." ) ) - if load_key("burn_subtitles"): + if not audio_only and load_key("burn_subtitles") and os.path.exists(DUB_VIDEO): st.video(DUB_VIDEO) if st.button(t("Delete dubbing files"), key="delete_dubbing_files"): + _clear_path(_AUDIO_DONE_MARKER) delete_dubbing_files() st.rerun() if st.button(t("Archive to 'history'"), key="cleanup_in_audio_processing"): @@ -247,9 +358,26 @@ def audio_processing_section(): def main(): - logo_col, _ = st.columns([1, 1]) + init_display_language() + st.set_option("client.toolbarMode", "viewer") + + logo_col, lang_col = st.columns([3, 1]) with logo_col: st.image("docs/logo.png", width="stretch") + with lang_col: + language_values = list(DISPLAY_LANGUAGES.values()) + current_language = init_display_language() + selected_language = st.selectbox( + t("Display Language 🌐"), + options=list(DISPLAY_LANGUAGES.keys()), + index=language_values.index(current_language) if current_language in language_values else 0, + key="display_language_selector", + ) + new_language = DISPLAY_LANGUAGES[selected_language] + if new_language != current_language: + set_display_language(new_language) + st.rerun() + st.markdown(button_style, unsafe_allow_html=True) welcome_text = t( 'Hello, welcome to VideoLingo. 
If you encounter any issues, feel free to get instant answers with our Free QA Agent here! You can also try out our SaaS website at videolingo.io for free!' @@ -261,10 +389,12 @@ def main(): # add settings with st.sidebar: page_setting() - st.markdown(give_star_button, unsafe_allow_html=True) - download_video_section() - text_processing_section() - audio_processing_section() + st.markdown(give_star_button(), unsafe_allow_html=True) + if download_video_section(): + text_done = text_processing_section() + from core._1_ytdlp import is_audio_only_input + if text_done and not is_audio_only_input(): + audio_processing_section() if __name__ == "__main__": diff --git a/translations/en.json b/translations/en.json index 3a99c0de..21a90b62 100644 --- a/translations/en.json +++ b/translations/en.json @@ -122,5 +122,42 @@ "Task completed!": "Task completed!", "Task stopped": "Task stopped", "Task error": "Task error", - "OK": "OK" + "OK": "OK", + "Display Language 🌐": "Display Language 🌐", + "Downloading video...": "Downloading video...", + "Best": "Best", + "Upload local media file": "Upload local media file", + "Upload was already processed. Delete and reselect to upload again.": "Upload was already processed. 
Delete and reselect to upload again.", + "Audio-only input produces subtitle files only; no video is generated.": "Audio-only input produces subtitle files only; no video is generated.", + "Local": "Local", + "Cloud": "Cloud", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "ElevenLabs API", + "Voice": "Voice", + "302ai API": "302ai API", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "SiliconFlow Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "Custom TTS", + "SiliconFlow CosyVoice2": "SiliconFlow CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "Star on GitHub 🌟", + "Generate subtitle files": "Generate subtitle files", + "Drag and drop file here": "Drag and drop file here", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A", + "Browse files": "Browse files", + "Media file detection failed: {error}": "Media file detection failed: {error}", + "Clear output and reselect": "Clear output and reselect", + "Finalize subtitle outputs": "Finalize subtitle outputs", + "Finalize dubbing outputs": "Finalize dubbing outputs", + "Subtitle length tuning": "Subtitle length tuning", + "These two values control how subtitles are cut. Smaller = more, shorter lines. Larger = fewer, longer lines.": "These two values control how subtitles are cut. Smaller = more, shorter lines. Larger = fewer, longer lines.", + "max_split_length (rough cut, words/tokens per chunk)": "max_split_length (rough cut, words/tokens per chunk)", + "Suggested: 18-25. Below 18 cuts too finely and hurts translation; above 25 makes downstream subtitle splitting hard to align.": "Suggested: 18-25. 
Below 18 cuts too finely and hurts translation; above 25 makes downstream subtitle splitting hard to align.", + "max_length (max characters per subtitle line)": "max_length (max characters per subtitle line)", + "Suggested: 50-90. Lower if a subtitle line looks crowded on screen; raise if lines are split too aggressively.": "Suggested: 50-90. Lower if a subtitle line looks crowded on screen; raise if lines are split too aggressively.", + "Restore defaults ({split}/{length})": "Restore defaults ({split}/{length})" } diff --git a/translations/es.json b/translations/es.json index 469fba84..6d79e172 100644 --- a/translations/es.json +++ b/translations/es.json @@ -122,5 +122,35 @@ "Task completed!": "¡Tarea completada!", "Task stopped": "Tarea detenida", "Task error": "Error en la tarea", - "OK": "OK" + "OK": "OK", + "Display Language 🌐": "Idioma de visualización 🌐", + "Downloading video...": "Descargando video...", + "Best": "Mejor", + "Upload local media file": "Subir archivo local de audio o video", + "Upload was already processed. Delete and reselect to upload again.": "La subida ya fue procesada. 
Elimine y vuelva a seleccionar para subir de nuevo.", + "Audio-only input produces subtitle files only; no video is generated.": "La entrada solo de audio produce únicamente archivos de subtítulos; no se genera video.", + "Local": "Local", + "Cloud": "Nube", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "API de ElevenLabs", + "Voice": "Voz", + "302ai API": "API de 302ai", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "SiliconFlow Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "TTS personalizado", + "SiliconFlow CosyVoice2": "SiliconFlow CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "Dar estrella en GitHub 🌟", + "Generate subtitle files": "Generar archivos de subtítulos", + "Drag and drop file here": "Arrastre y suelte el archivo aquí", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "Límite 4GB por archivo · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A", + "Browse files": "Buscar archivos", + "Media file detection failed: {error}": "Error al detectar el archivo multimedia: {error}", + "Clear output and reselect": "Limpiar salida y volver a seleccionar", + "Finalize subtitle outputs": "Finalizar resultados de subtítulos", + "Finalize dubbing outputs": "Finalizar resultados de doblaje" } diff --git a/translations/fr.json b/translations/fr.json index dcdac7af..d05f48c6 100644 --- a/translations/fr.json +++ b/translations/fr.json @@ -122,5 +122,35 @@ "Task completed!": "Tâche terminée !", "Task stopped": "Tâche arrêtée", "Task error": "Erreur de tâche", - "OK": "OK" + "OK": "OK", + "Display Language 🌐": "Langue d'affichage 🌐", + "Downloading video...": "Téléchargement de la vidéo...", + "Best": "Meilleure", + "Upload local media file": "Importer un fichier audio ou vidéo local", + "Upload was already processed. Delete and reselect to upload again.": "Ce fichier importé a déjà été traité. 
Supprimez-le et sélectionnez-le à nouveau pour le réimporter.", + "Audio-only input produces subtitle files only; no video is generated.": "Une entrée audio seule produit uniquement des fichiers de sous-titres ; aucune vidéo n'est générée.", + "Local": "Local", + "Cloud": "Cloud", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "API ElevenLabs", + "Voice": "Voix", + "302ai API": "API 302ai", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "SiliconFlow Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "TTS personnalisé", + "SiliconFlow CosyVoice2": "SiliconFlow CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "Mettre une étoile sur GitHub 🌟", + "Generate subtitle files": "Générer les fichiers de sous-titres", + "Drag and drop file here": "Glissez-déposez le fichier ici", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "Limite de 4GB par fichier · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A", + "Browse files": "Parcourir les fichiers", + "Media file detection failed: {error}": "Échec de la détection du fichier média : {error}", + "Clear output and reselect": "Effacer la sortie et resélectionner", + "Finalize subtitle outputs": "Finaliser les sorties de sous-titres", + "Finalize dubbing outputs": "Finaliser les sorties de doublage" } diff --git a/translations/ja.json b/translations/ja.json index 4595ba97..8afc3d5a 100644 --- a/translations/ja.json +++ b/translations/ja.json @@ -122,5 +122,35 @@ "Task completed!": "タスク完了!", "Task stopped": "タスクが停止されました", "Task error": "タスクエラー", - "OK": "OK" + "OK": "OK", + "Display Language 🌐": "表示言語 🌐", + "Downloading video...": "動画をダウンロード中...", + "Best": "最高品質", + "Upload local media file": "ローカルの音声/動画ファイルをアップロード", + "Upload was already processed. 
Delete and reselect to upload again.": "このアップロードはすでに処理済みです。再アップロードするには削除して選択し直してください。", + "Audio-only input produces subtitle files only; no video is generated.": "音声のみの入力では字幕ファイルだけを生成し、動画は生成しません。", + "Local": "ローカル", + "Cloud": "クラウド", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "ElevenLabs API", + "Voice": "ボイス", + "302ai API": "302ai API", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "SiliconFlow Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "カスタム TTS", + "SiliconFlow CosyVoice2": "SiliconFlow CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "GitHubでスター 🌟", + "Generate subtitle files": "字幕ファイルを生成", + "Drag and drop file here": "ここにファイルをドラッグ&ドロップ", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "1ファイル4GBまで · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A", + "Browse files": "ファイルを選択", + "Media file detection failed: {error}": "メディアファイルの検出に失敗しました: {error}", + "Clear output and reselect": "出力をクリアして再選択", + "Finalize subtitle outputs": "字幕出力を仕上げる", + "Finalize dubbing outputs": "吹き替え出力を仕上げる" } diff --git a/translations/ru.json b/translations/ru.json index ec7e97fe..bb27367a 100644 --- a/translations/ru.json +++ b/translations/ru.json @@ -122,5 +122,35 @@ "Task completed!": "Задача выполнена!", "Task stopped": "Задача остановлена", "Task error": "Ошибка задачи", - "OK": "OK" + "OK": "OK", + "Display Language 🌐": "Язык интерфейса 🌐", + "Downloading video...": "Загрузка видео...", + "Best": "Лучшее", + "Upload local media file": "Загрузить локальный аудио- или видеофайл", + "Upload was already processed. Delete and reselect to upload again.": "Этот загруженный файл уже обработан. 
Удалите и выберите его снова для повторной загрузки.", + "Audio-only input produces subtitle files only; no video is generated.": "При аудио-вводе создаются только файлы субтитров; видео не создается.", + "Local": "Локально", + "Cloud": "Облако", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "API ElevenLabs", + "Voice": "Голос", + "302ai API": "API 302ai", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "SiliconFlow Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "Пользовательский TTS", + "SiliconFlow CosyVoice2": "SiliconFlow CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "Поставить звезду на GitHub 🌟", + "Generate subtitle files": "Создать файлы субтитров", + "Drag and drop file here": "Перетащите файл сюда", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "Лимит 4GB на файл · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A", + "Browse files": "Выбрать файлы", + "Media file detection failed: {error}": "Не удалось определить медиафайл: {error}", + "Clear output and reselect": "Очистить output и выбрать заново", + "Finalize subtitle outputs": "Финализировать вывод субтитров", + "Finalize dubbing outputs": "Финализировать вывод дубляжа" } diff --git a/translations/translations.py b/translations/translations.py index 01e10186..e3538fff 100644 --- a/translations/translations.py +++ b/translations/translations.py @@ -10,6 +10,93 @@ "🇫🇷 Français": "fr", } +SUPPORTED_LANGUAGES = set(DISPLAY_LANGUAGES.values()) + + +def normalize_language_code(language): + if not language: + return None + + code = str(language).replace("_", "-").lower() + if code in {"zh", "zh-cn", "zh-hans", "zh-sg"}: + return "zh-CN" + if code in {"zh-hk", "zh-tw", "zh-mo", "zh-hant"}: + return "zh-HK" + + base_code = code.split("-")[0] + if base_code in SUPPORTED_LANGUAGES: + return base_code + return None + + +def 
_language_from_accept_language(header): + for item in (header or "").split(","): + code = normalize_language_code(item.split(";")[0].strip()) + if code: + return code + return None + + +def _config_language(): + try: + from core.utils.config_utils import load_key + + return normalize_language_code(load_key("display_language")) + except Exception: + return None + + +def _streamlit_language(): + try: + import streamlit as st + + if "_display_language" in st.session_state: + return normalize_language_code(st.session_state["_display_language"]) + + query_language = st.query_params.get("lang") + if isinstance(query_language, list): + query_language = query_language[0] if query_language else None + query_language = normalize_language_code(query_language) + if query_language: + return query_language + + return _language_from_accept_language(st.context.headers.get("accept-language", "")) + except Exception: + return None + + +def get_current_language(default="en"): + return _streamlit_language() or _config_language() or default + + +def init_display_language(): + language = get_current_language(default="en") + try: + import streamlit as st + + st.session_state.setdefault("_display_language", language) + except Exception: + pass + return language + + +def set_display_language(language): + language = normalize_language_code(language) or "en" + try: + import streamlit as st + + st.session_state["_display_language"] = language + st.query_params["lang"] = language + except Exception: + pass + try: + from core.utils.config_utils import update_key + + update_key("display_language", language) + except Exception: + pass + return language + # Load the language file based on user selection def load_translations(language="en"): with open(f'translations/{language}.json', 'r', encoding='utf-8') as file: @@ -17,9 +104,8 @@ def load_translations(language="en"): # Function to fetch the translation def translate(key): - from core.utils.config_utils import load_key try: - display_language = 
load_key("display_language") + display_language = get_current_language() translations = load_translations(display_language) translation = translations.get(key) if translation is None: diff --git a/translations/zh-CN.json b/translations/zh-CN.json index 052ab034..df77f89c 100644 --- a/translations/zh-CN.json +++ b/translations/zh-CN.json @@ -75,7 +75,7 @@ "Merge full audio": "合并完整音频", "Merge dubbing to the video": "将配音合并到视频中", "Audio processing complete! 🎇": "音频处理完成! 🎇", - "Hello, welcome to VideoLingo. If you encounter any issues, feel free to get instant answers with our Free QA Agent here! You can also try out our SaaS website at videolingo.io for free!": "欢迎来到VideoLingo。如果遇到任何问题,随时可以通过我们的免费问答助手 here 获取即时解答!还可以免费试用我们的SaaS网站 videolingo.io!", + "Hello, welcome to VideoLingo. If you encounter any issues, feel free to get instant answers with our Free QA Agent here! You can also try out our SaaS website at videolingo.io for free!": "欢迎来到VideoLingo。如果遇到任何问题,随时可以通过我们的免费问答助手 这里 获取即时解答!还可以免费试用我们的SaaS网站 videolingo.io!", "WhisperX Runtime": "WhisperX 运行环境", "Local runtime requires >8GB GPU, cloud runtime requires 302ai API key, elevenlabs runtime requires ElevenLabs API key": "本地运行需要>8GB显存GPU,云端运行需要302ai API密钥,elevenlabs运行需要ElevenLabs API密钥", "WhisperX 302ai API": "WhisperX 302ai API密钥", @@ -122,5 +122,42 @@ "Task completed!": "任务完成!", "Task stopped": "任务已停止", "Task error": "任务出错", - "OK": "确定" + "OK": "确定", + "Display Language 🌐": "显示语言 🌐", + "Downloading video...": "正在下载视频...", + "Best": "最佳", + "Upload local media file": "上传本地音视频文件", + "Upload was already processed. 
Delete and reselect to upload again.": "这个上传文件已经处理过。如需重新上传,请先删除并重新选择。", + "Audio-only input produces subtitle files only; no video is generated.": "纯音频输入只生成字幕文件,不会生成视频。", + "Local": "本地", + "Cloud": "云端", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "ElevenLabs API密钥", + "Voice": "声音", + "302ai API": "302ai API密钥", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "硅基流动 Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "自定义 TTS", + "SiliconFlow CosyVoice2": "硅基流动 CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "在 GitHub 点星 🌟", + "Generate subtitle files": "生成字幕文件", + "Drag and drop file here": "将文件拖到这里", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "单个文件限制 4GB · MP4、MOV、AVI、MKV、FLV、WMV、WEBM、WAV、MP3、FLAC、M4A", + "Browse files": "浏览文件", + "Media file detection failed: {error}": "媒体文件识别失败:{error}", + "Clear output and reselect": "清空输出并重新选择", + "Finalize subtitle outputs": "完成字幕产出收尾", + "Finalize dubbing outputs": "完成配音产出收尾", + "Subtitle length tuning": "字幕长度微调", + "These two values control how subtitles are cut. Smaller = more, shorter lines. Larger = fewer, longer lines.": "这两个值控制字幕怎么切。值越小,断句越多、每条越短;值越大,断句越少、每条越长。", + "max_split_length (rough cut, words/tokens per chunk)": "max_split_length(粗切,每段词/Token 数)", + "Suggested: 18-25. Below 18 cuts too finely and hurts translation; above 25 makes downstream subtitle splitting hard to align.": "建议 18-25。小于 18 会切得太碎,影响翻译质量;大于 25 会让后续字幕拆分难对齐。", + "max_length (max characters per subtitle line)": "max_length(每行字幕字符数上限)", + "Suggested: 50-90. 
Lower if a subtitle line looks crowded on screen; raise if lines are split too aggressively.": "建议 50-90。如果一行字幕看着挤就调小;如果一句话被拆得太碎就调大。", + "Restore defaults ({split}/{length})": "恢复默认值({split}/{length})" } diff --git a/translations/zh-HK.json b/translations/zh-HK.json index cdb4e012..56d2ea50 100644 --- a/translations/zh-HK.json +++ b/translations/zh-HK.json @@ -75,7 +75,7 @@ "Merge full audio": "合併完整音頻", "Merge dubbing to the video": "將配音合併到影片中", "Audio processing complete! 🎇": "音頻處理完成! 🎇", - "Hello, welcome to VideoLingo. If you encounter any issues, feel free to get instant answers with our Free QA Agent here! You can also try out our SaaS website at videolingo.io for free!": "歡迎來到VideoLingo。如果遇到任何問題,隨時可以透過我們的免費問答助手 here 獲取即時解答!還可以免費試用我們的SaaS網站 videolingo.io!", + "Hello, welcome to VideoLingo. If you encounter any issues, feel free to get instant answers with our Free QA Agent here! You can also try out our SaaS website at videolingo.io for free!": "歡迎來到VideoLingo。如果遇到任何問題,隨時可以透過我們的免費問答助手 這裡 獲取即時解答!還可以免費試用我們的SaaS網站 videolingo.io!", "WhisperX Runtime": "WhisperX 運行環境", "Local runtime requires >8GB GPU, cloud runtime requires 302ai API key, elevenlabs runtime requires ElevenLabs API key": "本地運行需要>8GB顯存GPU,雲端運行需要302ai API金鑰,elevenlabs運行需要ElevenLabs API金鑰", "WhisperX 302ai API": "WhisperX 302ai API金鑰", @@ -122,5 +122,42 @@ "Task completed!": "任務完成!", "Task stopped": "任務已停止", "Task error": "任務出錯", - "OK": "確定" + "OK": "確定", + "Display Language 🌐": "顯示語言 🌐", + "Downloading video...": "正在下載影片...", + "Best": "最佳", + "Upload local media file": "上傳本地音影片檔案", + "Upload was already processed. 
Delete and reselect to upload again.": "這個上傳檔案已經處理過。如需重新上傳,請先刪除並重新選擇。", + "Audio-only input produces subtitle files only; no video is generated.": "純音頻輸入只生成字幕檔案,不會生成影片。", + "Local": "本地", + "Cloud": "雲端", + "ElevenLabs": "ElevenLabs", + "ElevenLabs API": "ElevenLabs API金鑰", + "Voice": "聲音", + "302ai API": "302ai API金鑰", + "Azure TTS": "Azure TTS", + "OpenAI TTS": "OpenAI TTS", + "Fish TTS": "Fish TTS", + "SiliconFlow Fish TTS": "矽基流動 Fish TTS", + "Edge TTS": "Edge TTS", + "GPT-SoVITS": "GPT-SoVITS", + "Custom TTS": "自訂 TTS", + "SiliconFlow CosyVoice2": "矽基流動 CosyVoice2", + "F5-TTS": "F5-TTS", + "Star on GitHub 🌟": "在 GitHub 點星 🌟", + "Generate subtitle files": "生成字幕檔案", + "Drag and drop file here": "將檔案拖到這裡", + "Limit 4GB per file · MP4, MOV, AVI, MKV, FLV, WMV, WEBM, WAV, MP3, FLAC, M4A": "單個檔案限制 4GB · MP4、MOV、AVI、MKV、FLV、WMV、WEBM、WAV、MP3、FLAC、M4A", + "Browse files": "瀏覽檔案", + "Media file detection failed: {error}": "媒體檔案識別失敗:{error}", + "Clear output and reselect": "清空輸出並重新選擇", + "Finalize subtitle outputs": "完成字幕產出收尾", + "Finalize dubbing outputs": "完成配音產出收尾", + "Subtitle length tuning": "字幕長度微調", + "These two values control how subtitles are cut. Smaller = more, shorter lines. Larger = fewer, longer lines.": "這兩個值控制字幕怎麼切。值越小,斷句越多、每條越短;值越大,斷句越少、每條越長。", + "max_split_length (rough cut, words/tokens per chunk)": "max_split_length(粗切,每段詞/Token 數)", + "Suggested: 18-25. Below 18 cuts too finely and hurts translation; above 25 makes downstream subtitle splitting hard to align.": "建議 18-25。小於 18 會切得太碎,影響翻譯品質;大於 25 會讓後續字幕拆分難對齊。", + "max_length (max characters per subtitle line)": "max_length(每行字幕字元數上限)", + "Suggested: 50-90. 
Lower if a subtitle line looks crowded on screen; raise if lines are split too aggressively.": "建議 50-90。如果一行字幕看著擠就調小;如果一句話被拆得太碎就調大。", + "Restore defaults ({split}/{length})": "恢復預設值({split}/{length})" } From e0f39bdca56fcd556adc5d4b0b5c70051b01136e Mon Sep 17 00:00:00 2001 From: Alex Liang Date: Thu, 11 Jun 2026 15:38:50 +0800 Subject: [PATCH 2/2] feat: v3.0.3 installer refactor and shared env MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit English: - Add stage-based installer.py with resumable dependency setup and health checks - Support shared uv venv via setup_env.py and OneKeyStart.bat - Make OneKeyStart.bat run pre-launch checks and repair incomplete environments - Relax spaCy patch pin and make Demucs installation more resilient - Fall back to global HuggingFace cache when project WhisperX cache is incomplete - Bump version metadata to v3.0.3 中文: - 新增分阶段 installer.py,支持可重复执行的依赖安装和环境健康检查 - setup_env.py 与 OneKeyStart.bat 支持共享 uv venv - OneKeyStart.bat 启动前自动检查环境,缺失或过期时自动修复 - 放宽 spaCy patch 版本约束,提升 Demucs 安装容错 - 项目内 WhisperX 缓存不完整时自动回退到全局 HuggingFace 缓存 - 版本元数据升级到 v3.0.3 --- OneKeyStart.bat | 90 ++++++- config.yaml | 2 +- core/asr_backend/whisperX_local.py | 42 +++- install.py | 266 +------------------- installer.py | 389 +++++++++++++++++++++++++++++ requirements.txt | 4 +- setup.py | 2 +- setup_env.py | 288 +++++++++------------ 8 files changed, 651 insertions(+), 432 deletions(-) create mode 100644 installer.py diff --git a/OneKeyStart.bat b/OneKeyStart.bat index f58aff66..ab35955f 100644 --- a/OneKeyStart.bat +++ b/OneKeyStart.bat @@ -1,11 +1,85 @@ @echo off -chcp 65001 >nul 2>&1 -call conda activate videolingo 2>nul -set PYTHONWARNINGS=ignore -python "%~dp0launch.py" -if %errorlevel% neq 0 ( - echo. - echo Pre-flight checks or Streamlit failed. See logs\ for details. - echo. 
+setlocal EnableExtensions +cd /D "%~dp0" + +for /F "tokens=1,2 delims=#" %%A in ('"prompt #$H#$E# & echo on & for %%B in (1) do rem"') do set "ESC=%%B" +set "C_RESET=%ESC%[0m" +set "C_GREEN=%ESC%[32m" +set "C_YELLOW=%ESC%[33m" +set "C_RED=%ESC%[31m" +set "C_CYAN=%ESC%[36m" +set "C_BOLD=%ESC%[1m" + +if not exist "logs" mkdir "logs" +for /f "delims=" %%I in ('powershell -NoProfile -Command "Get-Date -Format yyyyMMddHHmmss"') do set "dt=%%I" +set "LOGFILE=logs\videolingo_%dt:~0,8%_%dt:~8,6%.log" +set "CHECK_ONLY=" +if /I "%~1"=="--check-only" set "CHECK_ONLY=1" + +echo [%date% %time%] VideoLingo starting... > "%LOGFILE%" +echo %C_CYAN%Log file:%C_RESET% %LOGFILE% + +set "VENV_LABEL=" +set "VENV_PY=" + +set "SHARED_VENV=%USERPROFILE%\.venvs\videolingo" +if exist "%SHARED_VENV%\Scripts\python.exe" ( + set "VENV_LABEL=shared venv" + set "VENV_PY=%SHARED_VENV%\Scripts\python.exe" + goto venv_found ) + +if exist ".venv\Scripts\python.exe" ( + set "VENV_LABEL=project .venv" + set "VENV_PY=.venv\Scripts\python.exe" + goto venv_found +) + +where conda >nul 2>nul +if %errorlevel%==0 ( + echo %C_YELLOW%No uv venv found, falling back to Conda env "videolingo"...%C_RESET% + call conda activate videolingo + python installer.py --check --quiet +if errorlevel 1 ( + echo %C_YELLOW%Conda env is incomplete or outdated. Repairing...%C_RESET% + python installer.py --yes + if errorlevel 1 goto install_failed +) + if defined CHECK_ONLY ( + echo %C_GREEN%Environment check passed.
--check-only set, not starting Streamlit.%C_RESET% + goto end + ) + echo %C_GREEN%Starting VideoLingo with Conda...%C_RESET% + python -m streamlit run st.py 2>&1 | powershell -NoProfile -Command "$input | Tee-Object -FilePath '%LOGFILE%' -Append" + goto end +) + +echo %C_RED%ERROR: No usable VideoLingo environment found.%C_RESET% +echo Run one of these first: +echo python setup_env.py --shared +echo python setup_env.py +goto end + +:venv_found +echo %C_GREEN%Detected %VENV_LABEL%:%C_RESET% %VENV_PY% +"%VENV_PY%" installer.py --check --quiet +if errorlevel 1 ( + echo %C_YELLOW%Environment is incomplete or outdated. Repairing with installer.py...%C_RESET% + "%VENV_PY%" installer.py --yes + if errorlevel 1 goto install_failed +) + +if defined CHECK_ONLY ( + echo %C_GREEN%Environment check passed. --check-only set, not starting Streamlit.%C_RESET% + goto end +) + +echo %C_GREEN%Starting VideoLingo with %VENV_LABEL%...%C_RESET% +"%VENV_PY%" -m streamlit run st.py 2>&1 | powershell -NoProfile -Command "$input | Tee-Object -FilePath '%LOGFILE%' -Append" +goto end + +:install_failed +echo %C_RED%Install/repair failed. 
Check the messages above and the log file.%C_RESET% + +:end pause diff --git a/config.yaml b/config.yaml index a91bacd0..e9d30c44 100644 --- a/config.yaml +++ b/config.yaml @@ -1,7 +1,7 @@ # * Settings marked with * are advanced settings that won't appear in the Streamlit page and can only be modified manually in config.py # recommend to set in streamlit page # ------------------- -# version: "3.0.2" +# version: "3.0.3" # author: "Huanshere" # ------------------- diff --git a/core/asr_backend/whisperX_local.py b/core/asr_backend/whisperX_local.py index da96c7b8..2caab063 100644 --- a/core/asr_backend/whisperX_local.py +++ b/core/asr_backend/whisperX_local.py @@ -4,6 +4,7 @@ import subprocess import torch import functools +from pathlib import Path warnings.filterwarnings("ignore") @@ -34,6 +35,22 @@ def _patched_torch_load(*args, **kwargs): from core.utils import * MODEL_DIR = load_key("model_dir") + +def _hf_cache_dir_for_repo(cache_root, repo_id): + return Path(cache_root) / f"models--{repo_id.replace('/', '--')}" + + +def _has_complete_hf_snapshot(cache_root, repo_id): + repo_dir = _hf_cache_dir_for_repo(cache_root, repo_id) + snapshots = repo_dir / "snapshots" + if not snapshots.exists(): + return False + required_files = {"config.json", "model.bin", "tokenizer.json"} + for snapshot in snapshots.iterdir(): + if snapshot.is_dir() and all((snapshot / name).exists() for name in required_files): + return True + return False + @except_handler("failed to check hf mirror", default_return=None) def check_hf_mirror(): mirrors = {'Official': 'huggingface.co', 'Mirror': 'hf-mirror.com'} @@ -76,6 +93,7 @@ def transcribe_audio(raw_audio_file, vocal_audio_file, start, end): rprint(f"[cyan]📦 Batch size:[/cyan] {batch_size}, [cyan]⚙️ Compute type:[/cyan] {compute_type}") rprint(f"[green]▶️ Starting WhisperX for segment {start:.2f}s to {end:.2f}s...[/green]") + download_root = MODEL_DIR if WHISPER_LANGUAGE == 'zh': model_name = 
"Huan69/Belle-whisper-large-v3-zh-punct-fasterwhisper" local_model = os.path.join(MODEL_DIR, "Belle-whisper-large-v3-zh-punct-fasterwhisper") @@ -86,14 +104,34 @@ def transcribe_audio(raw_audio_file, vocal_audio_file, start, end): if os.path.exists(local_model): rprint(f"[green]📥 Loading local WHISPER model:[/green] {local_model} ...") model_name = local_model + download_root = None else: rprint(f"[green]📥 Using WHISPER model from HuggingFace:[/green] {model_name} ...") + # If the project-local cache is missing or only partially downloaded, + # let HuggingFace use the default global cache. This avoids getting + # stuck on a half-created ./_model_cache after a network interruption. + repo_id = model_name if "/" in model_name else f"Systran/faster-whisper-{model_name}" + if not _has_complete_hf_snapshot(MODEL_DIR, repo_id): + rprint( + "[yellow]⚠️ Project model cache is incomplete; " + "falling back to the global HuggingFace cache.[/yellow]" + ) + download_root = None vad_options = {"vad_onset": 0.500,"vad_offset": 0.363} asr_options = {"temperatures": [0],"initial_prompt": "",} whisper_language = None if 'auto' in WHISPER_LANGUAGE else WHISPER_LANGUAGE rprint("[bold yellow] You can ignore warning of `Model was trained with torch 1.10.0+cu102, yours is 2.0.0+cu118...`[/bold yellow]") - model = whisperx.load_model(model_name, device, compute_type=compute_type, language=whisper_language, vad_options=vad_options, asr_options=asr_options, download_root=MODEL_DIR) + load_kwargs = dict( + device=device, + compute_type=compute_type, + language=whisper_language, + vad_options=vad_options, + asr_options=asr_options, + ) + if download_root: + load_kwargs["download_root"] = download_root + model = whisperx.load_model(model_name, **load_kwargs) def load_audio_segment(audio_file, start, end): # Use whisperx's ffmpeg-based loader instead of librosa.load() which @@ -147,4 +185,4 @@ def load_audio_segment(audio_file, start, end): word['start'] += start if 'end' in word: word['end'] 
+= start - return result \ No newline at end of file + return result diff --git a/install.py b/install.py index 9e38f035..c0e87953 100644 --- a/install.py +++ b/install.py @@ -1,263 +1,19 @@ -import os, sys -import platform -import subprocess -sys.path.append(os.path.dirname(os.path.abspath(__file__))) +"""Compatibility wrapper for the stage-based installer. -ascii_logo = """ -__ ___ _ _ _ -\ \ / (_) __| | ___ ___ | | (_)_ __ __ _ ___ - \ \ / /| |/ _` |/ _ \/ _ \| | | | '_ \ / _` |/ _ \ - \ V / | | (_| | __/ (_) | |___| | | | | (_| | (_) | - \_/ |_|\__,_|\___|\___/|_____|_|_| |_|\__, |\___/ - |___/ +Historically users ran ``python install.py`` and the app launched at the end. +Keep that behavior here while moving the real installation logic to +``installer.py`` so setup_env.py and launchers can reuse it safely. """ -def install_package(*packages): - subprocess.check_call([sys.executable, "-m", "pip", "install", *packages]) +from __future__ import annotations -def check_nvidia_gpu(): - install_package("nvidia-ml-py") - import pynvml - from translations.translations import translate as t - initialized = False - try: - pynvml.nvmlInit() - initialized = True - device_count = pynvml.nvmlDeviceGetCount() - if device_count > 0: - print(t("Detected NVIDIA GPU(s)")) - for i in range(device_count): - handle = pynvml.nvmlDeviceGetHandleByIndex(i) - name = pynvml.nvmlDeviceGetName(handle) - print(f"GPU {i}: {name}") - return True - else: - print(t("No NVIDIA GPU detected")) - return False - except pynvml.NVMLError: - print(t("No NVIDIA GPU detected or NVIDIA drivers not properly installed")) - return False - finally: - if initialized: - pynvml.nvmlShutdown() +import sys -def check_ffmpeg(): - from rich.console import Console - from rich.panel import Panel - from translations.translations import translate as t - console = Console() +from installer import main - try: - # Check if ffmpeg is installed - subprocess.run(['ffmpeg', '-version'], stdout=subprocess.PIPE, 
stderr=subprocess.PIPE, check=True) - console.print(Panel(t("✅ FFmpeg is already installed"), style="green")) - except (subprocess.CalledProcessError, FileNotFoundError): - system = platform.system() - install_cmd = "" - - if system == "Windows": - install_cmd = "choco install ffmpeg" - extra_note = t("Install Chocolatey first (https://chocolatey.org/)") - elif system == "Darwin": - install_cmd = "brew install ffmpeg" - extra_note = t("Install Homebrew first (https://brew.sh/)") - elif system == "Linux": - install_cmd = "sudo apt install ffmpeg # Ubuntu/Debian\nsudo yum install ffmpeg # CentOS/RHEL" - extra_note = t("Use your distribution's package manager") - - console.print(Panel.fit( - t("❌ FFmpeg not found\n\n") + - f"{t('🛠️ Install using:')}\n[bold cyan]{install_cmd}[/bold cyan]\n\n" + - f"{t('💡 Note:')}\n{extra_note}\n\n" + - f"{t('🔄 After installing FFmpeg, please run this installer again:')}\n[bold cyan]python install.py[/bold cyan]", - style="red" - )) - raise SystemExit(t("FFmpeg is required. 
Please install it and run the installer again.")) - - # Warn if ffmpeg lacks libmp3lame (common with conda-forge builds) - try: - result = subprocess.run(['ffmpeg', '-encoders'], capture_output=True, text=True, timeout=10) - if 'libmp3lame' not in result.stdout: - console.print(Panel.fit( - "⚠️ Your ffmpeg does not include [bold]libmp3lame[/bold] (MP3 encoder).\n" - "This is common with conda-forge ffmpeg builds.\n\n" - "VideoLingo will fall back to WAV encoding automatically, but for\n" - "smaller intermediate files, consider installing a full ffmpeg:\n\n" - "[bold cyan]" + ( - "winget install Gyan.FFmpeg" if platform.system() == "Windows" - else "brew install ffmpeg" if platform.system() == "Darwin" - else "sudo apt install ffmpeg" - ) + "[/bold cyan]", - style="yellow" - )) - except Exception: - pass - -def _detect_cuda_version_from_smi(): - """Detect CUDA version from nvidia-smi output (driver's CUDA capability).""" - import re - try: - result = subprocess.run( - ["nvidia-smi"], capture_output=True, text=True, timeout=10 - ) - m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", result.stdout) - if m: - return (int(m.group(1)), int(m.group(2))) - except Exception: - pass - return None - - -def _detect_cuda_index(): - """Detect the CUDA version and return the best PyTorch wheel index URL. - Falls back to cu126 when detection fails. - - For RTX 50 series (Blackwell architecture, compute capability 10.0+), - we need PyTorch wheels compiled with CUDA 12.8+ that include sm_100 kernels. - - We prefer nvidia-smi (driver CUDA version) over nvcc (toolkit version) because: - - Driver version determines what CUDA features the GPU can run at runtime - - Toolkit version is for compilation, not runtime compatibility - - Blackwell GPUs need cu129+ wheels even if user has older CUDA toolkit installed - """ - cuda_version = _detect_cuda_version_from_smi() - - # Map CUDA major.minor to PyTorch wheel index. 
- # For CUDA 13.x (RTX 50 series / Blackwell), use cu129 which includes sm_100 kernels. - INDEX = "https://download.pytorch.org/whl" - CU_TAGS = [ - ((13, 0), "cu129"), # CUDA 13.x (Blackwell / RTX 50 series) - ((12, 9), "cu129"), # CUDA 12.9+ - ((12, 8), "cu128"), # CUDA 12.8+ - ((12, 6), "cu126"), # CUDA 12.6+ - ] - - if cuda_version: - for min_ver, tag in CU_TAGS: - if cuda_version >= min_ver: - return f"{INDEX}/{tag}" - - # Default: cu126 is the broadest CUDA 12 index for PyTorch 2.8 - return f"{INDEX}/cu126" - -def main(): - install_package("requests", "rich", "ruamel.yaml", "InquirerPy") - from rich.console import Console - from rich.panel import Panel - from rich.box import DOUBLE - from InquirerPy import inquirer - from translations.translations import translate as t - from translations.translations import DISPLAY_LANGUAGES - from core.utils.config_utils import load_key, update_key - from core.utils.decorator import except_handler - - console = Console() - - width = max(len(line) for line in ascii_logo.splitlines()) + 4 - welcome_panel = Panel( - ascii_logo, - width=width, - box=DOUBLE, - title="[bold green]🌏[/bold green]", - border_style="bright_blue" - ) - console.print(welcome_panel) - # Language selection - current_language = load_key("display_language") - # Find the display name for current language code - current_display = next((k for k, v in DISPLAY_LANGUAGES.items() if v == current_language), "🇬🇧 English") - selected_language = DISPLAY_LANGUAGES[inquirer.select( - message="Select language / 选择语言 / 選擇語言 / 言語を選択 / Seleccionar idioma / Sélectionner la langue / Выберите язык:", - choices=list(DISPLAY_LANGUAGES.keys()), - default=current_display - ).execute()] - update_key("display_language", selected_language) - - console.print(Panel.fit(t("🚀 Starting Installation"), style="bold magenta")) - - # Configure mirrors - # add a check to ask user if they want to configure mirrors - if inquirer.confirm( - message=t("Do you need to auto-configure PyPI mirrors? 
(Recommended if you have difficulty accessing pypi.org)"), - default=True - ).execute(): - from core.utils.pypi_autochoose import main as choose_mirror - choose_mirror() - - # Detect system and GPU - has_gpu = platform.system() != 'Darwin' and check_nvidia_gpu() - - @except_handler("Failed to install PyTorch", retry=1, delay=5) - def install_pytorch(): - if has_gpu: - console.print(Panel(t("🎮 NVIDIA GPU detected, installing CUDA version of PyTorch..."), style="cyan")) - cuda_index = _detect_cuda_index() - console.print(f"[cyan]📦 Using PyTorch index:[/cyan] {cuda_index}") - subprocess.check_call([sys.executable, "-m", "pip", "install", "torch==2.8.0", "torchaudio==2.8.0", "--index-url", cuda_index]) - else: - system_name = "🍎 MacOS" if platform.system() == 'Darwin' else "💻 No NVIDIA GPU" - console.print(Panel(t(f"{system_name} detected, installing CPU version of PyTorch... Note: it might be slow during whisperX transcription."), style="cyan")) - subprocess.check_call([sys.executable, "-m", "pip", "install", "torch==2.8.0", "torchaudio==2.8.0"]) - - @except_handler("Failed to install project", retry=1, delay=5) - def install_requirements(): - # Install demucs separately with --no-deps to avoid its outdated - # torchaudio<2.2 constraint conflicting with whisperx's torchaudio>=2.5.1. - # demucs works fine with torchaudio 2.6.0 at runtime. - console.print(Panel(t("Installing demucs (--no-deps to avoid torchaudio conflict)..."), style="cyan")) - subprocess.check_call([sys.executable, "-m", "pip", "install", "--no-deps", "demucs[dev]@git+https://github.com/adefossez/demucs"]) - # demucs --no-deps skips its own dependencies; install the ones it - # actually needs at runtime that aren't already pulled in elsewhere. 
- console.print(Panel(t("Installing demucs runtime dependencies..."), style="cyan")) - subprocess.check_call([sys.executable, "-m", "pip", "install", "dora-search", "openunmix", "lameenc"]) - - console.print(Panel(t("Installing project in editable mode using `pip install -e .`"), style="cyan")) - subprocess.check_call([sys.executable, "-m", "pip", "install", "-e", "."], env={**os.environ, "PYTHONIOENCODING": "utf-8"}) - - @except_handler("Failed to install Noto fonts") - def install_noto_font(): - # Detect Linux distribution type - if os.path.exists('/etc/debian_version'): - # Debian/Ubuntu systems - cmd = ['sudo', 'apt-get', 'install', '-y', 'fonts-noto'] - pkg_manager = "apt-get" - elif os.path.exists('/etc/redhat-release'): - # RHEL/CentOS/Fedora systems - cmd = ['sudo', 'yum', 'install', '-y', 'google-noto*'] - pkg_manager = "yum" - else: - console.print("Warning: Unrecognized Linux distribution, please install Noto fonts manually", style="yellow") - return - - subprocess.run(cmd, check=True) - console.print(f"✅ Successfully installed Noto fonts using {pkg_manager}", style="green") - - if platform.system() == 'Linux': - install_noto_font() - - install_pytorch() - install_requirements() - check_ffmpeg() - - # First panel with installation complete and startup command - panel1_text = ( - t("Installation completed") + "\n\n" + - t("Now I will run this command to start the application:") + "\n" + - "[bold]streamlit run st.py[/bold]\n" + - t("Note: First startup may take up to 1 minute") - ) - console.print(Panel(panel1_text, style="bold green")) - - # Second panel with troubleshooting tips - panel2_text = ( - t("If the application fails to start:") + "\n" + - "1. " + t("Check your network connection") + "\n" + - "2. 
" + t("Re-run the installer: [bold]python install.py[/bold]") - ) - console.print(Panel(panel2_text, style="yellow")) - - # start the application - subprocess.Popen([sys.executable, "-m", "streamlit", "run", "st.py"]) if __name__ == "__main__": - main() + args = sys.argv[1:] + if "--check" not in args and "--launch" not in args and "--no-launch" not in args: + args.append("--launch") + raise SystemExit(main(args)) diff --git a/installer.py b/installer.py new file mode 100644 index 00000000..aa58e235 --- /dev/null +++ b/installer.py @@ -0,0 +1,389 @@ +"""Resumable VideoLingo installer and environment checker. + +This script is intentionally split from setup_env.py: +- setup_env.py creates/selects the venv. +- installer.py installs packages inside the selected venv. +- OneKeyStart.bat starts the app and can call ``installer.py --check``. + +The installer is stage-based and safe to rerun. Network-sensitive optional +packages (Demucs, spaCy model downloads) warn instead of breaking the whole +installation. 
+""" + +from __future__ import annotations + +import argparse +import hashlib +import importlib +import importlib.metadata as metadata +import json +import os +import platform +import re +import shutil +import subprocess +import sys +import time +from pathlib import Path + + +ROOT = Path(__file__).resolve().parent +STATE_FILE = Path(sys.prefix) / ".videolingo-install.json" +REQUIREMENTS = ROOT / "requirements.txt" + +TORCH_VERSION = "2.8.0" +TORCH_INDEX = "https://download.pytorch.org/whl" +BOOTSTRAP_PACKAGES = ["requests", "rich", "ruamel.yaml", "InquirerPy", "packaging"] +FILTERED_REQUIREMENTS = {"spacy", "whisperx"} +DEMUX_GIT = "demucs[dev]@git+https://github.com/adefossez/demucs@b9ab48cad45976ba42b2ff17b229c071f0df9390" + + +def run(cmd: list[str], retries: int = 0, env: dict[str, str] | None = None) -> None: + for attempt in range(retries + 1): + print(" > " + " ".join(str(x) for x in cmd), flush=True) + proc = subprocess.run(cmd, cwd=ROOT, env=env) + if proc.returncode == 0: + return + if attempt < retries: + delay = min(20, 3 * (attempt + 1)) + print(f" Command failed, retrying in {delay}s ({attempt + 1}/{retries})...") + time.sleep(delay) + raise subprocess.CalledProcessError(proc.returncode, cmd) + + +def pip_install(packages: list[str], retries: int = 2, extra_args: list[str] | None = None) -> None: + if not packages: + return + cmd = [ + sys.executable, + "-m", + "pip", + "install", + "--disable-pip-version-check", + "--prefer-binary", + "--retries", + "5", + "--timeout", + "120", + ] + if extra_args: + cmd.extend(extra_args) + cmd.extend(packages) + env = os.environ.copy() + env.setdefault("PIP_NO_INPUT", "1") + run(cmd, retries=retries, env=env) + + +def soft_pip_install(packages: list[str], retries: int = 1, extra_args: list[str] | None = None) -> bool: + try: + pip_install(packages, retries=retries, extra_args=extra_args) + return True + except Exception as exc: + print(f" Warning: optional install failed: {exc}") + return False + + +def 
package_version(name: str) -> str | None: + try: + return metadata.version(name) + except metadata.PackageNotFoundError: + return None + + +def package_ok(name: str, prefix: str | None = None) -> bool: + version = package_version(name) + if version is None: + return False + return prefix is None or version.split("+")[0].startswith(prefix) + + +def import_ok(module: str) -> bool: + try: + importlib.import_module(module) + return True + except Exception: + return False + + +def requirements_hash() -> str: + h = hashlib.sha256() + h.update(REQUIREMENTS.read_bytes()) + h.update(f"torch={TORCH_VERSION}\n".encode()) + h.update(DEMUX_GIT.encode()) + return h.hexdigest() + + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {} + try: + return json.loads(STATE_FILE.read_text(encoding="utf-8")) + except Exception: + return {} + + +def save_state() -> None: + data = { + "requirements_hash": requirements_hash(), + "python": sys.version.split()[0], + "torch": package_version("torch"), + "torchaudio": package_version("torchaudio"), + "spacy": package_version("spacy"), + "whisperx": package_version("whisperx"), + "demucs": package_version("demucs"), + "updated_at": time.strftime("%Y-%m-%d %H:%M:%S"), + } + STATE_FILE.write_text(json.dumps(data, indent=2), encoding="utf-8") + + +def requirement_name(line: str) -> str | None: + line = line.strip() + if not line or line.startswith("#") or line.startswith("-"): + return None + line = line.split(";", 1)[0].strip() + name = re.split(r"\s*(?:==|>=|<=|~=|!=|>|<|\[)", line, maxsplit=1)[0] + return name.strip().lower().replace("_", "-") or None + + +def read_base_requirements() -> list[str]: + reqs: list[str] = [] + for raw in REQUIREMENTS.read_text(encoding="utf-8").splitlines(): + name = requirement_name(raw) + if not name or name in FILTERED_REQUIREMENTS: + continue + reqs.append(raw.strip()) + return reqs + + +def detect_nvidia_gpu() -> bool: + if platform.system() == "Darwin": + return False + try: + result = 
subprocess.run(["nvidia-smi"], capture_output=True, text=True, timeout=10) + return result.returncode == 0 + except Exception: + return False + + +def detect_cuda_version_from_smi() -> tuple[int, int] | None: + try: + result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, timeout=10) + match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", result.stdout) + if match: + return int(match.group(1)), int(match.group(2)) + except Exception: + pass + return None + + +def detect_torch_index() -> str: + cuda_version = detect_cuda_version_from_smi() + tags = [ + ((13, 0), "cu129"), + ((12, 9), "cu129"), + ((12, 8), "cu128"), + ((12, 6), "cu126"), + ] + if cuda_version: + for minimum, tag in tags: + if cuda_version >= minimum: + return f"{TORCH_INDEX}/{tag}" + return f"{TORCH_INDEX}/cu126" + + +def install_bootstrap() -> None: + print("\n[1/7] Bootstrap installer packages") + missing = [pkg for pkg in BOOTSTRAP_PACKAGES if package_version(pkg) is None] + if missing: + pip_install(missing) + else: + print(" Bootstrap packages already installed.") + + +def maybe_configure_mirror(auto_mirror: bool) -> None: + if not auto_mirror: + return + print("\n[2/7] Configure PyPI mirror") + try: + from core.utils.pypi_autochoose import main as choose_mirror + + choose_mirror() + except Exception as exc: + print(f" Warning: mirror auto-config failed: {exc}") + + +def install_torch(force: bool = False) -> None: + print("\n[3/7] Install PyTorch / torchaudio") + if not force and package_ok("torch", TORCH_VERSION) and package_ok("torchaudio", TORCH_VERSION): + print(f" torch {package_version('torch')} and torchaudio {package_version('torchaudio')} already installed.") + return + packages = [f"torch=={TORCH_VERSION}", f"torchaudio=={TORCH_VERSION}"] + if detect_nvidia_gpu(): + index = detect_torch_index() + print(f" NVIDIA GPU detected. Using PyTorch index: {index}") + pip_install(packages, retries=3, extra_args=["--index-url", index]) + else: + print(" No NVIDIA GPU detected. 
Installing CPU PyTorch wheels.") + pip_install(packages, retries=3) + + +def install_base_requirements(force: bool = False) -> None: + print("\n[4/7] Install base requirements") + state = load_state() + current_hash = requirements_hash() + previous_hash = state.get("requirements_hash") + if not force and previous_hash == current_hash and health_check(quiet=True, require_demucs=False, check_state=False) == 0: + print(" Environment already matches requirements hash; skipping base install.") + return + if not force and previous_hash is None and health_check(quiet=True, require_demucs=False, check_state=False) == 0: + print(" Packages are already healthy; writing fresh install state later.") + return + if previous_hash and previous_hash != current_hash: + print(" requirements.txt changed; syncing base requirements.") + pip_install(read_base_requirements(), retries=3) + + +def install_spacy(force: bool = False) -> None: + print("\n[5/7] Install spaCy") + if not force and package_ok("spacy", "3.8."): + print(f" spacy {package_version('spacy')} already installed.") + return + # Keep this flexible. Exact spaCy patch releases can disappear for a Python + # minor version, which made plain `pip install -r requirements.txt` brittle. 
+ pip_install(["spacy>=3.8.7,<3.9"], retries=3) + + +def install_whisperx(force: bool = False) -> None: + print("\n[6/7] Install WhisperX") + if not force and package_version("whisperx") is not None: + print(f" whisperx {package_version('whisperx')} already installed.") + return + pip_install(["whisperx>=3.8.1"], retries=3) + + +def install_demucs(force: bool = False, require: bool = False) -> None: + print("\n[7/7] Install Demucs (optional)") + if not force and package_version("demucs") is not None and import_ok("demucs.api"): + print(f" demucs {package_version('demucs')} already installed.") + return + (pip_install if require else soft_pip_install)(["dora-search", "openunmix", "lameenc"], retries=3) # hard-fail only with --require-demucs + if soft_pip_install([DEMUX_GIT], retries=2, extra_args=["--no-deps"]): + return + print(" Falling back to PyPI demucs. Demucs is optional; install can continue if this fails.") + ok = soft_pip_install(["demucs==4.0.1"], retries=2, extra_args=["--no-deps"]) + if require and not ok: + raise RuntimeError("Demucs installation failed") + + +def install_project_metadata() -> None: + print("\n[post] Register project metadata (no dependency resolution)") + soft_pip_install(["-e", str(ROOT)], retries=1, extra_args=["--no-deps"]) + + +def check_ffmpeg() -> bool: + if not shutil.which("ffmpeg"): + print(" ERROR: ffmpeg not found in PATH.") + if platform.system() == "Windows": + print(" Install with: winget install Gyan.FFmpeg") + elif platform.system() == "Darwin": + print(" Install with: brew install ffmpeg") + else: + print(" Install with your distribution package manager, e.g.
sudo apt install ffmpeg") + return False + return True + + +def health_check(quiet: bool = False, require_demucs: bool = False, check_state: bool = True) -> int: + errors: list[str] = [] + warnings: list[str] = [] + state = load_state() + if check_state: + if state.get("requirements_hash") and state.get("requirements_hash") != requirements_hash(): + errors.append("requirements changed since the last install; rerun installer.py") + elif not state.get("requirements_hash"): + errors.append("install state file is missing; rerun installer.py once to enable change detection") + required = { + "streamlit": None, + "openai": None, + "pandas": None, + "torch": TORCH_VERSION, + "torchaudio": TORCH_VERSION, + "spacy": "3.8.", + "whisperx": None, + } + for package, prefix in required.items(): + version = package_version(package) + if version is None: + errors.append(f"missing package: {package}") + elif prefix and not version.split("+")[0].startswith(prefix): + errors.append(f"{package} version {version} does not match expected {prefix}*") + if require_demucs and package_version("demucs") is None: + errors.append("missing optional package required by flag: demucs") + elif package_version("demucs") is None: + warnings.append("demucs is not installed; vocal separation will be unavailable") + if not shutil.which("ffmpeg"): + errors.append("ffmpeg not found in PATH") + if not quiet: + print("\nEnvironment check") + for package in ["streamlit", "torch", "torchaudio", "spacy", "whisperx", "demucs"]: + print(f" {package}: {package_version(package) or 'missing'}") + for warning in warnings: + print(f" WARN: {warning}") + for error in errors: + print(f" ERROR: {error}") + return 1 if errors else 0 + + +def launch_streamlit() -> int: + env = os.environ.copy() + env["PYTHONWARNINGS"] = "ignore" + return subprocess.run([sys.executable, "-m", "streamlit", "run", "st.py"], cwd=ROOT, env=env).returncode + + +def install_all(args: argparse.Namespace) -> int: + install_bootstrap() + 
maybe_configure_mirror(args.auto_mirror) + install_torch(force=args.force) + install_base_requirements(force=args.force) + install_spacy(force=args.force) + install_whisperx(force=args.force) + if not args.skip_demucs: + install_demucs(force=args.force, require=args.require_demucs) + install_project_metadata() + ffmpeg_ok = check_ffmpeg() + save_state() + status = health_check(require_demucs=args.require_demucs) + if not ffmpeg_ok or status != 0: + return 1 + if args.launch: + return launch_streamlit() + print("\nInstall complete. Start with OneKeyStart.bat or: python -m streamlit run st.py") + return 0 + + +def build_parser() -> argparse.ArgumentParser: + parser = argparse.ArgumentParser(description="Install or check VideoLingo dependencies") + parser.add_argument("--check", action="store_true", help="check environment health only") + parser.add_argument("--quiet", action="store_true", help="quiet check output") + parser.add_argument("--force", action="store_true", help="force reinstall staged packages") + parser.add_argument("--auto-mirror", action="store_true", help="auto-select and configure a PyPI mirror") + parser.add_argument("--skip-demucs", action="store_true", help="skip optional Demucs install") + parser.add_argument("--require-demucs", action="store_true", help="fail if Demucs cannot be installed") + parser.add_argument("--launch", action="store_true", help="launch Streamlit after a successful install") + parser.add_argument("--yes", action="store_true", help="accepted for non-interactive wrappers") + parser.add_argument("--no-launch", action="store_true", help="compatibility alias; launching is opt-in") + return parser + + +def main(argv: list[str] | None = None) -> int: + parser = build_parser() + args = parser.parse_args(argv) + if args.no_launch: + args.launch = False + if args.check: + return health_check(quiet=args.quiet, require_demucs=args.require_demucs) + return install_all(args) + + +if __name__ == "__main__": + raise SystemExit(main()) diff 
--git a/requirements.txt b/requirements.txt
index 2050d02c..cd30fb20 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -13,7 +13,9 @@ PyYAML==6.0.3
 replicate==0.33.0
 requests==2.32.5
 resampy==0.4.3
-spacy==3.8.11
+# Keep spaCy on the 3.8 line. Exact patch pins can be unavailable for a
+# specific Python minor version on some mirrors, which makes setup brittle.
+spacy>=3.8.7,<3.9
 streamlit==1.49.1
 streamlit-searchbox
 yt-dlp
diff --git a/setup.py b/setup.py
index b1021893..f122e3e7 100644
--- a/setup.py
+++ b/setup.py
@@ -1,7 +1,7 @@
 from setuptools import setup, find_packages
 
 NAME = 'VideoLingo'
-VERSION = '3.0.2'
+VERSION = '3.0.3'
 
 with open('requirements.txt', encoding='utf-8') as f:
     requirements = f.read().splitlines()
diff --git a/setup_env.py b/setup_env.py
index da097336..657ab900 100644
--- a/setup_env.py
+++ b/setup_env.py
@@ -1,221 +1,181 @@
-"""
-VideoLingo Environment Setup (No Anaconda Required)
-
-This script provides a conda-free installation path using `uv` (by Astral).
-It automatically:
-  1. Installs uv if not found
-  2. Creates a .venv with Python 3.10
-  3. Runs install.py inside the venv
+"""Create a VideoLingo Python environment, then run the stage-based installer.
-Usage:
-    python setup_env.py # Full setup (any system Python 3.x works)
-    python setup_env.py --skip-install # Only create venv, don't run install.py
+Default behavior creates a project-local ``.venv``. Use ``--shared`` to create
+or reuse ``~/.venvs/videolingo`` so multiple VideoLingo checkouts share the same
+heavy dependencies (PyTorch, WhisperX, Demucs, etc.).
""" +from __future__ import annotations + +import argparse import os -import sys +import platform import shutil import subprocess -import platform +import sys +from pathlib import Path + PYTHON_VERSION = "3.10" -VENV_DIR = ".venv" -SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) +SCRIPT_DIR = Path(__file__).resolve().parent +LOCAL_VENV = SCRIPT_DIR / ".venv" +SHARED_VENV = Path.home() / ".venvs" / "videolingo" -def run(cmd, check=True, **kwargs): - """Run a command and return the CompletedProcess.""" - print(f" > {' '.join(cmd) if isinstance(cmd, list) else cmd}") +def run(cmd: list[str], check: bool = True, **kwargs) -> subprocess.CompletedProcess: + print(" > " + " ".join(str(x) for x in cmd)) return subprocess.run(cmd, check=check, **kwargs) -def is_uv_installed(): - """Check if uv is available on PATH.""" +def is_uv_installed() -> bool: return shutil.which("uv") is not None -def install_uv(): - """Install uv using platform-appropriate method with fallbacks.""" - print("\n[1/3] Installing uv...") - +def install_uv() -> None: + print("\n[1/3] Checking uv") if is_uv_installed(): - ver = subprocess.run( - ["uv", "--version"], capture_output=True, text=True - ).stdout.strip() + ver = subprocess.run(["uv", "--version"], capture_output=True, text=True).stdout.strip() print(f" uv is already installed: {ver}") return - system = platform.system() - if system == "Windows": - _install_uv_windows() + if platform.system() == "Windows": + methods = [ + ["winget", "install", "astral-sh.uv", "--accept-package-agreements", "--accept-source-agreements"], + ["powershell", "-ExecutionPolicy", "ByPass", "-c", "irm https://astral.sh/uv/install.ps1 | iex"], + [sys.executable, "-m", "pip", "install", "uv"], + ] else: - # macOS / Linux - try: - run(["sh", "-c", "curl -LsSf https://astral.sh/uv/install.sh | sh"]) - except subprocess.CalledProcessError: - print(" curl installer failed, trying pip...") - run([sys.executable, "-m", "pip", "install", "uv"]) - - # After installation, 
uv may not be on PATH in the current session.
-    if not is_uv_installed():
-        _add_uv_to_path()
-
-    if not is_uv_installed():
-        print(
-            "\n*** ERROR: uv was installed but not found on PATH. ***\n"
-            "Please restart your terminal and run this script again.\n"
-            "Or install uv manually: https://docs.astral.sh/uv/getting-started/installation/"
-        )
-        sys.exit(1)
-
-    ver = subprocess.run(
-        ["uv", "--version"], capture_output=True, text=True
-    ).stdout.strip()
-    print(f" uv installed successfully: {ver}")
-
-
-def _install_uv_windows():
-    """Try multiple methods to install uv on Windows."""
-    methods = [
-        ("winget", ["winget", "install", "astral-sh.uv",
-                    "--accept-package-agreements", "--accept-source-agreements"]),
-        ("PowerShell installer", [
-            "powershell", "-ExecutionPolicy", "ByPass", "-c",
-            "irm https://astral.sh/uv/install.ps1 | iex"
-        ]),
-        ("pip", [sys.executable, "-m", "pip", "install", "uv"]),
-    ]
+        methods = [
+            ["sh", "-c", "curl -LsSf https://astral.sh/uv/install.sh | sh"],
+            [sys.executable, "-m", "pip", "install", "uv"],
+        ]
-    for name, cmd in methods:
+    for cmd in methods:
         try:
-            print(f" Trying {name}...")
             run(cmd)
-            # Check if PATH needs updating after install
-            if not is_uv_installed():
-                _add_uv_to_path()
+            add_uv_to_path()
             if is_uv_installed():
+                print(" uv installed successfully")
                 return
         except (subprocess.CalledProcessError, FileNotFoundError):
-            print(f" {name} failed, trying next method...")
-            continue
+            print(" install method failed, trying next method...")
-    print(" All installation methods failed.")
+    raise SystemExit("ERROR: uv could not be installed.
Install it manually: https://docs.astral.sh/uv/")
-def _add_uv_to_path():
-    """Try to add uv's default install location to PATH for this session."""
-    home = os.path.expanduser("~")
+def add_uv_to_path() -> None:
     candidates = [
-        os.path.join(home, ".local", "bin"),
-        os.path.join(home, ".cargo", "bin"),
-        os.path.join(os.environ.get("LOCALAPPDATA", ""), "uv", "bin"),
-        os.path.join(os.environ.get("LOCALAPPDATA", ""), "Programs", "uv"),
+        Path.home() / ".local" / "bin",
+        Path.home() / ".cargo" / "bin",
+        Path(os.environ.get("LOCALAPPDATA", "")) / "uv" / "bin",
+        Path(os.environ.get("LOCALAPPDATA", "")) / "Programs" / "uv",
+        Path(os.environ.get("LOCALAPPDATA", "")) / "Microsoft" / "WinGet" / "Links",
     ]
-    for p in candidates:
-        if not os.path.isdir(p):
-            continue
-        uv_name = "uv.exe" if platform.system() == "Windows" else "uv"
-        if os.path.isfile(os.path.join(p, uv_name)):
-            os.environ["PATH"] = p + os.pathsep + os.environ["PATH"]
+    name = "uv.exe" if platform.system() == "Windows" else "uv"
+    for path in candidates:
+        if (path / name).is_file():
+            os.environ["PATH"] = str(path) + os.pathsep + os.environ.get("PATH", "")
             return
-def create_venv():
-    """Create a virtual environment with Python 3.10 using uv."""
-    print(f"\n[2/3] Creating virtual environment with Python {PYTHON_VERSION}...")
-
-    venv_path = os.path.join(SCRIPT_DIR, VENV_DIR)
-
-    if os.path.exists(venv_path):
-        # Check if existing venv has the right Python version
-        python_exe = _get_venv_python(venv_path)
-        if python_exe and os.path.isfile(python_exe):
-            result = subprocess.run(
-                [python_exe, "--version"], capture_output=True, text=True
-            )
-            ver = result.stdout.strip()
-            if "3.10" in ver:
-                print(f" .venv already exists with {ver}, reusing it.")
-                return python_exe
-
-        print(" Removing existing .venv (wrong Python version)...")
-        shutil.rmtree(venv_path, ignore_errors=True)
-
-    # uv venv will auto-download Python 3.10 if not present
-    # --seed installs pip/setuptools into the venv (install.py needs
pip)
-    run(["uv", "venv", "--seed", "--python", PYTHON_VERSION, VENV_DIR], cwd=SCRIPT_DIR)
-
-    python_exe = _get_venv_python(venv_path)
-    if not python_exe or not os.path.isfile(python_exe):
-        print("*** ERROR: Failed to create virtual environment. ***")
-        sys.exit(1)
-
-    result = subprocess.run(
-        [python_exe, "--version"], capture_output=True, text=True
-    )
-    print(f" Virtual environment created: {result.stdout.strip()}")
-    return python_exe
-
-
-def _get_venv_python(venv_path):
-    """Get the Python executable path inside the venv."""
+def venv_python(venv_path: Path) -> Path:
     if platform.system() == "Windows":
-        return os.path.join(venv_path, "Scripts", "python.exe")
-    else:
-        return os.path.join(venv_path, "bin", "python")
-
+        return venv_path / "Scripts" / "python.exe"
+    return venv_path / "bin" / "python"
-def run_install(python_exe):
-    """Run install.py using the venv's Python."""
-    print("\n[3/3] Running install.py...")
-    install_script = os.path.join(SCRIPT_DIR, "install.py")
-    # Prepare env for install.py subprocess:
-    env = os.environ.copy()
-    # 1. Avoid pip cache permission errors (common on Windows when cache dir
-    # is locked or has restrictive ACLs from a previous Python install)
-    env["PIP_NO_CACHE_DIR"] = "1"
-    # 2. Put venv Scripts/bin on PATH so install.py can find streamlit etc.
-    venv_path = os.path.join(SCRIPT_DIR, VENV_DIR)
+def venv_bin(venv_path: Path) -> Path:
     if platform.system() == "Windows":
-        venv_bin = os.path.join(venv_path, "Scripts")
-    else:
-        venv_bin = os.path.join(venv_path, "bin")
-    env["PATH"] = venv_bin + os.pathsep + env.get("PATH", "")
+        return venv_path / "Scripts"
+    return venv_path / "bin"
+
+
+def python_version_ok(python_exe: Path) -> bool:
+    if not python_exe.is_file():
+        return False
+    result = subprocess.run([str(python_exe), "--version"], capture_output=True, text=True)
+    return "3.10" in (result.stdout or result.stderr)
+
+
+def create_venv(path: Path, yes: bool = False) -> Path:
+    print(f"\n[2/3] Creating/reusing virtual environment: {path}")
+    python_exe = venv_python(path)
+    if python_version_ok(python_exe):
+        result = subprocess.run([str(python_exe), "--version"], capture_output=True, text=True)
+        print(f" Reusing existing venv: {result.stdout.strip() or result.stderr.strip()}")
+        return python_exe
+
+    if path.exists():
+        if not yes:
+            answer = input(f" Existing venv at {path} is not Python 3.10. Remove and recreate it?
[y/N] ").strip().lower()
+            if answer != "y":
+                raise SystemExit("Cancelled.")
+        shutil.rmtree(path, ignore_errors=True)
+
+    path.parent.mkdir(parents=True, exist_ok=True)
+    run(["uv", "venv", "--seed", "--python", PYTHON_VERSION, str(path)], cwd=SCRIPT_DIR)
+    if not python_version_ok(python_exe):
+        raise SystemExit("ERROR: failed to create a Python 3.10 virtual environment")
+    return python_exe
-    run([python_exe, install_script], cwd=SCRIPT_DIR, env=env)
+def run_installer(python_exe: Path, args: argparse.Namespace) -> None:
+    print("\n[3/3] Installing VideoLingo dependencies")
+    env = os.environ.copy()
+    env["PATH"] = str(venv_bin(python_exe.parent.parent)) + os.pathsep + env.get("PATH", "")
+    cmd = [str(python_exe), str(SCRIPT_DIR / "installer.py"), "--yes"]
+    if args.auto_mirror:
+        cmd.append("--auto-mirror")
+    if args.force:
+        cmd.append("--force")
+    if args.skip_demucs:
+        cmd.append("--skip-demucs")
+    if args.require_demucs:
+        cmd.append("--require-demucs")
+    run(cmd, cwd=SCRIPT_DIR, env=env)
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(description="Create and install a VideoLingo environment")
+    parser.add_argument("--shared", action="store_true", help=f"use shared venv at {SHARED_VENV}")
+    parser.add_argument("--path", help="custom venv path; implies --shared-style external venv")
+    parser.add_argument("--skip-install", action="store_true", help="only create/reuse the venv")
+    parser.add_argument("--auto-mirror", action="store_true", help="auto-select a PyPI mirror before install")
+    parser.add_argument("--skip-demucs", action="store_true", help="skip optional Demucs install")
+    parser.add_argument("--require-demucs", action="store_true", help="fail if Demucs cannot be installed")
+    parser.add_argument("--force", action="store_true", help="force reinstall staged packages")
+    parser.add_argument("--yes", action="store_true", help="non-interactive; recreate wrong-version venvs")
+    return parser
+
+
+def main() -> None:
+    args = build_parser().parse_args()
+    target = Path(args.path).expanduser() if args.path else (SHARED_VENV if args.shared else LOCAL_VENV)
-def main():
     print("=" * 60)
-    print(" VideoLingo Environment Setup (conda-free)")
+    print(" VideoLingo Environment Setup")
     print("=" * 60)
-    print(f"\n Project dir : {SCRIPT_DIR}")
+    print(f" Project dir : {SCRIPT_DIR}")
     print(f" Python ver : {PYTHON_VERSION}")
-    print(f" Venv dir : {VENV_DIR}")
-
-    skip_install = "--skip-install" in sys.argv
+    print(f" Venv path : {target}")
     install_uv()
-    python_exe = create_venv()
+    python_exe = create_venv(target, yes=args.yes)
-    if skip_install:
-        print(f"\n --skip-install: Skipping install.py")
-        print(f"\n To install dependencies manually:")
-        print(f" {python_exe} install.py")
+    if args.skip_install:
+        print("\n --skip-install: dependencies were not installed")
+        print(f" To install later: {python_exe} {SCRIPT_DIR / 'installer.py'} --yes")
     else:
-        run_install(python_exe)
+        run_installer(python_exe, args)
     print("\n" + "=" * 60)
-    print(" Setup complete!")
+    print(" Setup complete")
     print("=" * 60)
-    print(f"\n To start VideoLingo:")
     if platform.system() == "Windows":
-        print(f" .venv\\Scripts\\streamlit run st.py")
-        print(f" (or double-click OneKeyStart_uv.bat)")
+        print(" Start with: OneKeyStart.bat")
     else:
-        print(f" .venv/bin/streamlit run st.py")
-        print()
+        streamlit = venv_bin(target) / "streamlit"
+        print(f" Start with: {streamlit} run st.py")
 if __name__ == "__main__":