WeSearch

TensorSharp: Open-Source Local LLM Inference Engine

·42 min read · 0 reactions · 0 comments · 2 views
#technology#software#open-source#TensorSharp#Ollama#OpenAI#Gemma#Qwen#Mistral
TensorSharp: Open-Source Local LLM Inference Engine
⚡ TL;DR · AI summary

TensorSharp is an open-source C# inference engine designed for running large language models locally. It supports various model architectures and provides multiple interfaces for programmatic access. The engine features optimized backends for CPU and GPU, enabling efficient multimodal inference.

Key facts
Original article
GitHub
Read full at GitHub →
Opening excerpt (first ~120 words) tap to expand

TensorSharp English | 中文 A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. Documentation Map Start here Use this when you want to... Quick build and usage Build the solution, compile the native GGML bridge, and run the CLI or server Supported model architectures Check which GGUF architecture keys, modalities, thinking mode, and tool calling paths are implemented Compute backends Choose between pure C# CPU, direct CUDA/cuBLAS, MLX Metal, GGML CPU, GGML Metal, and GGML CUDA HTTP APIs Use the Ollama-compatible, OpenAI-compatible, or Web UI SSE endpoints Per-model architecture cards Read end-to-end documentation…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.

Anonymous · no account needed
Share 𝕏 Facebook Reddit LinkedIn Threads WhatsApp Bluesky Mastodon Email

Discussion

0 comments

More from GitHub