Convert Model to ONNX¶
Save the model according to the instructions below and add config.json:
{
"model_type": "nemo-conformer-rnnt", // See "Supported model types"
"features_size": 80, // Size of preprocessor features for Whisper or Nemo models, supported 80 and 128
"subsampling_factor": 8, // Subsampling factor - 4 for conformer models and 8 for fastconformer and parakeet models
"max_tokens_per_step": 10 // Max tokens per step for RNN-T decoder
}
Then you can upload the model into Hugging Face and use load_model to download it.
Nvidia NeMo Conformer/FastConformer/Parakeet¶
Install NeMo Toolkit:
pip install nemo_toolkit['asr']
Download model and export to ONNX format:
import nemo.collections.asr as nemo_asr
from pathlib import Path
model = nemo_asr.models.ASRModel.from_pretrained("nvidia/stt_ru_fastconformer_hybrid_large_pc")
# To export Hybrid models with CTC decoder
# model.set_export_config({"decoder_type": "ctc"})
onnx_dir = Path("nemo-onnx")
onnx_dir.mkdir(exist_ok=True)
model.export(str(Path(onnx_dir, "model.onnx")))
with Path(onnx_dir, "vocab.txt").open("wt") as f:
for i, token in enumerate([*model.tokenizer.vocab, "<blk>"]):
f.write(f"{token} {i}\n")
GigaChat GigaAM v2/v3¶
Install GigaAM:
git clone https://github.com/salute-developers/GigaAM.git
pip install ./GigaAM --extra-index-url https://download.pytorch.org/whl/cpu
Download model and export to ONNX format:
import gigaam
from pathlib import Path
onnx_dir = "gigaam-onnx"
model_type = "rnnt" # or "ctc"
model = gigaam.load_model(
model_type,
fp16_encoder=False, # only fp32 tensors
use_flash=False, # disable flash attention
)
model.to_onnx(dir_path=onnx_dir)
with Path(onnx_dir, "v2_vocab.txt").open("wt") as f:
for i, token in enumerate(["\u2581", *(chr(ord("а") + i) for i in range(32)), "<blk>"]):
f.write(f"{token} {i}\n")
OpenAI Whisper (with onnxruntime export)¶
Read the onnxruntime instruction to convert Whisper to ONNX.
Download model and export with Beam Search and Forced Decoder Input Ids:
python3 -m onnxruntime.transformers.models.whisper.convert_to_onnx -m openai/whisper-base --output ./whisper-onnx --use_forced_decoder_ids --optimize_onnx --precision fp32
Save the tokenizer config:
from transformers import WhisperTokenizer
processor = WhisperTokenizer.from_pretrained("openai/whisper-base")
processor.save_pretrained("whisper-onnx")
OpenAI Whisper (with optimum export)¶
Export model to ONNX with Hugging Face optimum-cli:
optimum-cli export onnx --model openai/whisper-base ./whisper-onnx