[Xiaomu Learns Python] Speech Recognition in Python (Whisper)

Date: 2025-01-03

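Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.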
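The tasks listed above can be sketched with the `openai-whisper` Python package. This is a minimal sketch, assuming the package is installed (`pip install -U openai-whisper`) and FFmpeg is on the PATH. The helper names `build_options` and `transcribe_file` and the file name `audio.mp3` are hypothetical and only serve the illustration.

```python
def build_options(task="transcribe", language=None):
    """Build the decoding options passed on to Whisper's transcribe().

    task: "transcribe" keeps the spoken language, "translate" outputs English.
    language: optional language code such as "zh"; when omitted, Whisper
    detects the language itself.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    options = {"task": task}
    if language is not None:
        options["language"] = language
    return options


def transcribe_file(audio_path, model_name="base", task="transcribe", language=None):
    """Load a Whisper model and return the recognized text for one audio file."""
    # Imported here so the helper above stays usable without the package.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **build_options(task, language))
    # result is a dict; besides "text" it also holds "segments" and "language".
    return result["text"]


# Example usage (hypothetical file name):
# print(transcribe_file("audio.mp3"))                    # speech recognition
# print(transcribe_file("audio.mp3", task="translate"))  # translate to English
```

The model is downloaded on first use, so the first call can take noticeably longer than later ones.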

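The table below lists the available models with their approximate memory requirements and their inference speed relative to the large model. Actual speed may vary with many factors, including the available hardware.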

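| Size   | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|--------|------------|--------------------|--------------------|---------------|----------------|
| tiny   | 39 M       | tiny.en            | tiny               | ~1 GB         | ~32x           |
| base   | 74 M       | base.en            | base               | ~1 GB         | ~16x           |
| small  | 244 M      | small.en           | small              | ~2 GB         | ~6x            |
| medium | 769 M      | medium.en          | medium             | ~5 GB         | ~2x            |
| large  | 1550 M     | N/A                | large              | ~10 GB        | 1x             |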
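One way to use these figures is to pick a model from the memory available. The sketch below hard-codes the approximate VRAM and speed values from the model list and returns the largest model that fits. The function name `pick_model` and the "largest that fits" policy are assumptions made for illustration, not part of Whisper itself.

```python
# (model name, approximate VRAM in GB, relative speed), copied from the model list
MODELS = [
    ("tiny", 1, 32),
    ("base", 1, 16),
    ("small", 2, 6),
    ("medium", 5, 2),
    ("large", 10, 1),
]


def pick_model(vram_gb, english_only=False):
    """Return the name of the largest model whose approximate VRAM need fits.

    english_only: append ".en" where an English-only variant exists
    (every size except "large").
    """
    fitting = [name for name, need_gb, _speed in MODELS if need_gb <= vram_gb]
    if not fitting:
        raise ValueError(f"no model fits in {vram_gb} GB of VRAM")
    name = fitting[-1]  # MODELS is ordered from smallest to largest
    if english_only and name != "large":
        name += ".en"
    return name


# Example usage:
# model_name = pick_model(4)   # "small" on a 4 GB GPU
# model = whisper.load_model(model_name)
```

Because the memory figures are approximate, leaving some headroom (for example, passing a value slightly below the card's nominal VRAM) is a reasonable precaution.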

https://github.com/Const-me/Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model.
This project is a Windows port of the whisper.cpp implementation, which in turn is a C++ port of OpenAI's Whisper automatic speech recognition (ASR) model.

https://github.com/chidiwilliams/buzz

Buzz transcribes and translates audio offline on your personal computer. It is powered by OpenAI's Whisper.

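Installation:

(1) PyPI:

(2) Windows:

It is best, however, to download the model files in advance and place them in the designated location.

Note that Buzz runs inference on the CPU and does not yet support GPU acceleration.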

https://github.com/jhj0517/Whisper-WebUI
A Gradio-based browser interface for Whisper. You can use it as a simple subtitle generator!

git : https://git-scm.com/downloads
python : https://www.python.org/downloads/
FFmpeg : https://ffmpeg.org/download.html
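Assuming the three tools listed above are what Whisper-WebUI needs before it can run, a short check like the following can confirm that they are reachable from the command line. The function name `missing_tools` is hypothetical, and the executable names (`git`, `python`, `ffmpeg`) are assumptions that may differ per system, for example `python3` on some Linux distributions.

```python
import shutil

# Executable names assumed for the three tools linked above.
REQUIRED_TOOLS = ["git", "python", "ffmpeg"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("git, python and ffmpeg are all available.")
```

A tool reported as missing may still be installed but not added to the PATH, which is a common situation with FFmpeg on Windows.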
