A Practical Approach to Making Sherpa-ONNX Production-Ready on Windows
Jan 28, 2026
ASR & Sherpa-ONNX
Automatic Speech Recognition (ASR) is a core technology behind modern voice-driven applications such as virtual assistants, real-time transcription systems, and speech-based user interfaces. While model accuracy and inference speed get most of the attention, production deployments often fail or stall due to more mundane issues such as build systems, runtime libraries, and platform-specific constraints.
Sherpa-ONNX is a powerful and flexible open-source toolkit for deploying ASR models using the ONNX Runtime. However, it is built against the MT (static CRT) runtime by default, which clashes with the MD (dynamic CRT) runtime used by most production-grade Windows applications.
This blog explains why this issue occurs, shows how it surfaces in real-world projects, and offers a practical engineering approach for building Sherpa-ONNX in MD mode, allowing seamless integration into production environments without disrupting established build configurations.
MT vs MD Runtime Library Modes in Visual Studio
On Windows, Visual Studio provides two runtime library options that control how the C/C++ Runtime (CRT) is linked:
- MD (Multi-Threaded Dynamic runtime): links against the CRT dynamically via shared DLLs.
- MT (Multi-Threaded Static runtime): statically links the CRT into each binary.
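For reference, modern CMake (3.15 and later, with policy CMP0091 set to NEW) exposes this choice through a single variable instead of raw compiler flags. A minimal sketch of a project that explicitly selects the dynamic CRT:

```cmake
# Requires CMake >= 3.15 so that policy CMP0091 (MSVC runtime
# selection via CMAKE_MSVC_RUNTIME_LIBRARY) is available.
cmake_minimum_required(VERSION 3.15)
project(runtime_demo CXX)

# Select the dynamic CRT: /MD in Release, /MDd in Debug.
# Use "MultiThreaded$<$<CONFIG:Debug>:Debug>" (no DLL suffix)
# to select the static CRT (/MT, /MTd) instead.
set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>DLL")
```

This variable only has an effect with the MSVC toolchain; on other platforms it is ignored.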
Comparison of MD vs. MT

| Aspect | MD | MT |
| --- | --- | --- |
| CRT instance | Shared across all EXEs and DLLs | Each EXE/DLL has its own private CRT |
| Cross-DLL memory management | malloc/free and new/delete across DLLs are safe | Memory must be freed in the same module |
| Third-party library compatibility | High | Often requires rebuilding dependencies |
| Binary size | Smaller | Larger |
Based on this comparison, MD mode is the standard choice in real-world production environments, especially for projects involving DLLs, plugins, or third-party libraries.
Sherpa-ONNX Uses MT Mode by Default
By default, Sherpa-ONNX is built in MT mode, and its build system does not provide a configuration option to change this behavior.
This can be observed in the following code snippet from the official repository: https://github.com/k2-fsa/sherpa-onnx/blob/master/CMakeLists.txt#L137
if(MSVC)
add_compile_options(
$<$<CONFIG:>:/MT> #---------|
$<$<CONFIG:Debug>:/MTd> #---|-- Statically link the runtime libraries
$<$<CONFIG:Release>:/MT> #--|
$<$<CONFIG:RelWithDebInfo>:/MT>
$<$<CONFIG:MinSizeRel>:/MT>
)
endif()
As shown above, Sherpa-ONNX enforces the MT runtime for all targets, including libraries and sample applications. As a result, any downstream project linking against Sherpa-ONNX must also be built in MT mode.
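For comparison, a hypothetical variant of the same block that selects the dynamic runtime instead would look like the following. In practice, simply removing the block has the same effect on MSVC, because CMake's default compiler flags already use the dynamic CRT:

```cmake
if(MSVC)
  add_compile_options(
    $<$<CONFIG:Debug>:/MDd>          # Dynamically link the debug runtime
    $<$<CONFIG:Release>:/MD>         # Dynamically link the release runtime
    $<$<CONFIG:RelWithDebInfo>:/MD>
    $<$<CONFIG:MinSizeRel>:/MD>
  )
endif()
```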
If Sherpa-ONNX is built using its default configuration and then integrated into a customer’s project, the customer is forced to change the runtime library linking mode of their existing build. In real-world production environments, such changes are rarely acceptable. Runtime library settings are often standardized and tightly controlled, and switching away from the default or recommended configuration introduces unnecessary risk and potential instability. Consequently, this requirement becomes a significant barrier to adopting Sherpa-ONNX in production systems.
Real-World Example: Using Sherpa-ONNX in a Custom Project
To illustrate the issue and its practical impact, we walk through a concrete example.
1) Prepare the Model
Download and extract the punctuation model used in this example to some directory:
# wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2
# tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2
2) Build Sherpa-ONNX on Windows
Clone the repository and build Sherpa-ONNX by following the official guide: https://k2-fsa.github.io/sherpa/onnx/install/windows.html?highlight=windows
Example build steps:
# git clone https://github.com/k2-fsa/sherpa-onnx
# cd sherpa-onnx
# cmake -G "Visual Studio 17 2022" -A x64 -B build . \
-DCMAKE_BUILD_TYPE=Release \
-DSHERPA_ONNX_ENABLE_BINARY=OFF \
-DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \
-DSHERPA_ONNX_ENABLE_TTS=OFF
# cd build
# cmake --build . --config release -j 16
The generated library files can be found in:
build\lib\Release
3) Create a Sample Project
Project structure:
include\
sherpa-onnx\
csrc\
offline-punctuation.h
offline-punctuation-model-config.h
parse-options.h
src\
test_punctuation.cpp
lib\
sherpa-onnx-core.lib
onnxruntime.lib
onnxruntime.dll
CMakeLists.txt
Preparation steps:
- Copy the three header files from the Sherpa-ONNX source into the project’s include\ directory.
- Copy sherpa-onnx-core.lib from sherpa-onnx\build\lib\Release\ into the lib\ directory.
- Install Ryzen AI Software following the official instructions: https://ryzenai.docs.amd.com/en/latest/inst.html
- Copy onnxruntime.dll from C:\Program Files\RyzenAI\1.7.0\onnxruntime\bin into the lib\ directory.
- Copy onnxruntime.lib from C:\Program Files\RyzenAI\1.7.0\onnxruntime\lib into the lib\ directory.
Sample CMakeLists.txt:
cmake_minimum_required(VERSION 3.18.1)
project(test_punctuation)
set(CMAKE_CXX_STANDARD 20)
add_compile_options($<$<CONFIG:>:/MT> $<$<CONFIG:Release>:/MT>)
add_executable(test_punctuation ${CMAKE_SOURCE_DIR}/src/test_punctuation.cpp)
target_link_directories(test_punctuation PRIVATE ${CMAKE_SOURCE_DIR}/lib )
target_include_directories(test_punctuation PRIVATE ${CMAKE_SOURCE_DIR}/include )
target_link_libraries(test_punctuation PRIVATE sherpa-onnx-core onnxruntime)
Sample code of test_punctuation.cpp:
#include <iostream>
#include <memory>
#include <string>
#include "sherpa-onnx/csrc/offline-punctuation.h"

int main(int argc, char *argv[]) {
  if (argc < 2) {
    std::cerr << "Usage: test_punctuation <model.int8.onnx>\n";
    return 1;
  }
  sherpa_onnx::OfflinePunctuationConfig cfg;
  cfg.model.ct_transformer = argv[1];
  cfg.model.num_threads = 1;
  cfg.model.debug = false;
  cfg.model.provider = "cpu";
  std::string stringv = "你好吗how are you Fantasitic 谢谢我很好你怎么样呢";
  auto punct = std::make_unique<sherpa_onnx::OfflinePunctuation>(cfg);
  auto text_with_punct = punct->AddPunctuation(stringv);
  std::cout << text_with_punct << "\n";
  return 0;
}
4) Build and Run
# cmake -G "Visual Studio 17 2022" -A x64 -S . -B build
# cd build
# cmake --build . --config release
# copy ..\lib\onnxruntime.dll .
# release\test_punctuation.exe some_dir\model.int8.onnx
Output:
你好吗?how are you Fantasitic?谢谢我很好,你怎么样呢?
The program runs successfully only because the project is explicitly forced to use MT mode.
The MT/MD Mismatch Problem
Remove the following line from the CMakeLists.txt of the sample project:
add_compile_options($<$<CONFIG:>:/MT> $<$<CONFIG:Release>:/MT>)
Rebuild the project in the default MD mode; the following linker error occurs:
sherpa-onnx-core.lib(offline-punctuation.obj) : error LNK2038: mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MD_DynamicRelease' in test_punctuation.obj [test_punctuation.vcxproj]
This error clearly indicates a runtime library mismatch between the application and sherpa-onnx-core.lib.
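When diagnosing such mismatches, it can also help to check which CRT a prebuilt static library expects before linking against it. MSVC embeds /DEFAULTLIB directives in each object file: LIBCMT indicates /MT, while MSVCRT indicates /MD. An illustrative command transcript (Windows-only tools, run from a "x64 Native Tools" command prompt):

```text
# Inspect the CRT directives embedded in the library
dumpbin /directives sherpa-onnx-core.lib | findstr /i defaultlib
#   /DEFAULTLIB:LIBCMT  -> library was built with /MT
#   /DEFAULTLIB:MSVCRT  -> library was built with /MD
```

This lets you confirm a library's runtime mode without waiting for an LNK2038 error at the end of a full build.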
Making Sherpa-ONNX Use MD Mode
Searching for “$<$<CONFIG:>:/MT>” within the sherpa-onnx\build directory shows that it also appears in sherpa-onnx\build\_deps\kaldi_decoder-src\CMakeLists.txt.
In other words, the kaldi-decoder dependency is itself built in MT mode, which in turn forces Sherpa-ONNX to use MT mode.
To resolve this, MD mode must be enabled in the kaldi-decoder build configuration as well.
In sherpa-onnx\cmake\kaldi-decoder.cmake, the following configuration is defined:
set(kaldi_decoder_URL "https://github.com/k2-fsa/kaldi-decoder/archive/refs/tags/v0.2.10.tar.gz")
set(kaldi_decoder_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/kaldi-decoder-0.2.10.tar.gz")
set(kaldi_decoder_HASH "SHA256=a3d602edc1f422acfe663153faf3f0a716305ec1f95b8fcf9d28d301d6827309")
This configuration indicates that Sherpa-ONNX depends on kaldi-decoder version v0.2.10.
After building Sherpa-ONNX, the corresponding source archive is cached locally at
`sherpa-onnx\build\_deps\kaldi_decoder-subbuild\kaldi_decoder-populate-prefix\src\v0.2.10.tar.gz`.
Modification steps:
1. Transfer the archive to a Linux environment (for example, via SSH) and extract it:
# tar zxvf v0.2.10.tar.gz
2. In the extracted directory, edit kaldi-decoder-0.2.10\CMakeLists.txt and comment out the following lines (starting at approximately line 59) to disable the MT runtime:
# add_compile_options(
# $<$<CONFIG:>:/MT> #---------|
# $<$<CONFIG:Debug>:/MTd> #---|-- Statically link the runtime libraries
# $<$<CONFIG:Release>:/MT> #--|
# )
3. Repackage the modified source directory:
# tar zcvf kaldi-decoder-0.2.10.tar.gz ./kaldi-decoder-0.2.10
4. Compute the SHA256 checksum of the updated archive:
# sha256sum kaldi-decoder-0.2.10.tar.gz
Record the newly generated SHA256 value.
5. Transfer the updated archive back to the Windows environment and place it in the Sherpa-ONNX root directory. This forces the Sherpa-ONNX build to use the modified kaldi-decoder package instead of downloading it from the internet.
6. Update the SHA256 value in sherpa-onnx\cmake\kaldi-decoder.cmake:
set(kaldi_decoder_HASH "SHA256=NEW_SHA256_VALUE_FROM_SHA256SUM")
Replace the original checksum with the value obtained in the previous step.
7. Modify sherpa-onnx\CMakeLists.txt and comment out the following block to allow Sherpa-ONNX itself to be built in MD mode:
# if(MSVC)
# add_compile_options(
# $<$<CONFIG:>:/MT> #---------|
# $<$<CONFIG:Debug>:/MTd> #---|-- Statically link the runtime libraries
# $<$<CONFIG:Release>:/MT> #--|
# $<$<CONFIG:RelWithDebInfo>:/MT>
# $<$<CONFIG:MinSizeRel>:/MT>
# )
# endif()
8. Delete the sherpa-onnx\build directory and rebuild Sherpa-ONNX from scratch.
9. Copy the newly generated sherpa-onnx-core.lib into the lib directory of the example project.
10. Rebuild the example project. The build should now complete successfully without any MT/MD runtime mismatch errors.
11. Run the project again. The output should be identical to the result obtained when using MT mode.
Summary
Integrating ASR frameworks into production systems often exposes challenges that go beyond model design or inference speed. Runtime library configuration, especially on Windows, can play a crucial role in determining whether a technology is usable at all.
In this blog, we have:
- Explained what Sherpa-ONNX is and why runtime library configuration matters in production
- Demonstrated a real-world MT/MD runtime mismatch issue encountered in production scenarios
- Presented a practical, engineering-oriented approach for building Sherpa-ONNX in MD mode
- Enabled Sherpa-ONNX to be integrated into production environments without requiring changes to existing build configurations
This approach allows Sherpa-ONNX to work closely with large language and speech models on the AMD Ryzen™ AI platform, ensuring stable and reliable deployment in real-world production systems.