Voice2Text

Voice2Text is a speech-to-text application based on OpenAI Whisper. It transcribes audio files, video files, and real-time speech.

How to Use

Option 1

  1. Download the package Voice2Text-pkg.rar and extract it.
  2. Go to the model repository, download the faster-whisper-large-v2 model, and place it in the models folder.
  3. On Windows, double-click run.bat; on Linux/Mac, run run.sh.
  4. To use a GPU, download and install CUDA 12 yourself (the same applies to Option 2).
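
Before installing CUDA 12 for step 4, it can help to check whether the machine appears to have an NVIDIA driver at all. The following is a minimal sketch, not part of Voice2Text: the helper name `pick_device` is hypothetical, and looking for `nvidia-smi` on the PATH is only a heuristic that does not prove the CUDA 12 libraries are usable. The returned `float16` and `int8` values are assumed compute types, chosen because they are common settings for faster-whisper models on GPU and CPU respectively.

```python
import shutil
from typing import Callable, Optional


def pick_device(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> tuple[str, str]:
    """Return a (device, compute_type) pair for loading a Whisper model.

    Heuristic only: finding `nvidia-smi` on PATH suggests that an NVIDIA
    driver is installed. It does not confirm that CUDA 12 itself works.
    """
    if which("nvidia-smi") is not None:
        # Assumed GPU setting: half precision.
        return "cuda", "float16"
    # Assumed CPU fallback: 8-bit quantised inference.
    return "cpu", "int8"


if __name__ == "__main__":
    device, compute_type = pick_device()
    print(f"device={device} compute_type={compute_type}")
```

If the script reports `cpu` on a machine that does have an NVIDIA GPU, the driver is probably missing or not on the PATH, and installing CUDA 12 alone may not be enough.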

Option 2

  1. Clone the repository.

    git clone https://github.com/caiwuu/Voice2Text
    cd ./Voice2Text
  2. Create a Python virtual environment.

    conda create -p ./env python==3.11.9
    conda activate ./env
  3. Install the dependencies.

    pip install -r requirements.txt
  4. Download a model from https://huggingface.co/Systran and place it in the models folder. The default model is faster-whisper-large-v2; to use a different model, change the model name in the code.

  5. Start the application.

    python webUI.py

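
The model download in step 4 can also be scripted instead of done by hand. The sketch below makes several assumptions that the steps above do not confirm: that the `huggingface_hub` package is installed, that the repository id is `Systran/faster-whisper-large-v2`, that the application expects the files under `models/faster-whisper-large-v2`, and that `config.json` and `model.bin` are the files to look for. The function names are hypothetical and not part of Voice2Text.

```python
from pathlib import Path

# Assumed values, see the notes above; adjust them to match your setup.
REPO_ID = "Systran/faster-whisper-large-v2"
REQUIRED_FILES = ("config.json", "model.bin")


def model_dir(models_root: str = "models", repo_id: str = REPO_ID) -> Path:
    """Return the local folder for a model, e.g. models/faster-whisper-large-v2."""
    return Path(models_root) / repo_id.split("/")[-1]


def missing_files(target: Path) -> list[str]:
    """List the expected model files that are not present in `target`."""
    return [name for name in REQUIRED_FILES if not (target / name).is_file()]


def download_model(models_root: str = "models", repo_id: str = REPO_ID) -> Path:
    """Download the whole model repository into the models folder."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = model_dir(models_root, repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


if __name__ == "__main__":
    target = model_dir()
    if missing_files(target):
        download_model()
    print(f"{target}: missing files = {missing_files(target)}")
```

Run it from the repository root before `python webUI.py`. If you change the model name in the code, change `REPO_ID` to match.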