[Xiaomu Learns Python] Speech Recognition in Python (Whisper)

2025-01-03 12:16

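Whisper is a general-purpose speech recognition model. It was trained on a large dataset of diverse audio, and it is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.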
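As a rough illustration of those tasks, here is a minimal sketch using the `openai-whisper` Python package (installed with `pip install -U openai-whisper`, with FFmpeg available on the system). The helper name `run_whisper` and the file name in the usage note are assumptions made for this example, not part of the original article.

```python
VALID_TASKS = ("transcribe", "translate")


def run_whisper(audio_path, model_name="base", task="transcribe", language=None):
    """Transcribe one audio file, or translate its speech into English.

    run_whisper is a hypothetical helper name used only for this sketch.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")

    # Imported lazily so this module still loads where openai-whisper is absent.
    import whisper

    # Downloads the model weights on first use, then loads them.
    model = whisper.load_model(model_name)

    # language=None lets Whisper identify the spoken language by itself.
    result = model.transcribe(audio_path, task=task, language=language)

    # The result dict also carries per-segment timings under "segments".
    return {"language": result["language"], "text": result["text"].strip()}
```

Calling `run_whisper("audio.mp3")` (a placeholder file name) would return the detected language code and the transcript. Passing `task="translate"` asks for an English translation instead.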

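The table below lists the available models, with their approximate memory requirements and their inference speed relative to the large model. Actual speed may vary with many factors, including the available hardware.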

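| Size   | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|--------|------------|--------------------|--------------------|---------------|----------------|
| tiny   | 39 M       | tiny.en            | tiny               | ~1 GB         | ~32x           |
| base   | 74 M       | base.en            | base               | ~1 GB         | ~16x           |
| small  | 244 M      | small.en           | small              | ~2 GB         | ~6x            |
| medium | 769 M      | medium.en          | medium             | ~5 GB         | ~2x            |
| large  | 1550 M     | N/A                | large              | ~10 GB        | 1x             |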
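One way to use these figures is to pick the largest model that fits a given amount of GPU memory. The sketch below hard-codes the approximate values from the table. The function name `pick_model` and the "largest model that fits" rule are assumptions made for illustration, not an official recommendation.

```python
# (name, parameters in millions, approx. required VRAM in GB, relative speed),
# copied from the model table; the VRAM figures are approximate.
MODELS = [
    ("tiny", 39, 1, 32),
    ("base", 74, 1, 16),
    ("small", 244, 2, 6),
    ("medium", 769, 5, 2),
    ("large", 1550, 10, 1),
]


def pick_model(vram_gb, english_only=False):
    """Return the name of the largest model whose VRAM estimate fits vram_gb."""
    fitting = [m for m in MODELS if m[2] <= vram_gb]
    if not fitting:
        raise ValueError(f"no model fits in {vram_gb} GB of VRAM")

    # Largest parameter count among the models that fit.
    name = max(fitting, key=lambda m: m[1])[0]

    # The large model has no English-only variant.
    if english_only and name != "large":
        name += ".en"
    return name


print(pick_model(4))                     # small
print(pick_model(6, english_only=True))  # medium.en
```

The trade-off is visible in the last column: a larger model costs more memory and runs slower, so a smaller model may still be the better choice when throughput matters.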

https://github.com/Const-me/Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model.
This project is a Windows port of the whisper.cpp implementation, which in turn is a C++ port of OpenAI's Whisper automatic speech recognition (ASR) model.

https://github.com/chidiwilliams/buzz

Buzz transcribes and translates audio offline on your personal computer. It is powered by OpenAI's Whisper.

Installation:

(1) PyPI:

(2) Windows:

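It is best, however, to download the model files in advance and place them in the designated location.

Note that Buzz runs inference in software on the CPU. GPU acceleration is not supported yet.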

https://github.com/jhj0517/Whisper-WebUI
A Gradio-based browser interface for Whisper. You can use it as a simple subtitle generator!

git : https://git-scm.com/downloads
python : https://www.python.org/downloads/
FFmpeg : https://ffmpeg.org/download.html
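To make the "subtitle generator" idea concrete, the following sketch turns Whisper-style segments (dicts with `start`, `end`, and `text`, as found under the `"segments"` key of the `openai-whisper` transcription result) into SubRip (.srt) text. This is not Whisper-WebUI's own code. The function names `srt_timestamp` and `segments_to_srt` and the sample segments are assumptions made for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of {"start", "end", "text"} dicts."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


# Hand-written sample segments, standing in for real transcription output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 4.0, "text": " world"},
]
print(segments_to_srt(sample))
```

Writing the returned string to a file with an `.srt` extension should give a subtitle file that common video players can load alongside the original media.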
