AVTR-1: A free, open-source, open-weights real-time avatar model

May 26, 2026 · 3:59 PM UTC · 5 min read

#technology #open-source #artificial-intelligence

TL;DR · WeSearch summary

AVTR-1 is a new open-source, open-weights model for real-time avatar dialogue. Given a portrait image and dual-stream audio, it renders lip-synced speech and active listening at 25 frames per second on a single GPU. Model weights, inference code, and an interactive streaming demo are included; a technical report and a production-ready back-end are listed as coming soon.

Key facts

▪AVTR-1 is a flow-matching-based autoregressive model for live dialogue.
▪It can render lip-synced speech and active listening using a portrait image and dual-stream audio.
▪The model is built for production deployment, with TensorRT-accelerated inference and a live-session backend available as an API or fully self-hosted.
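
To make the 25 fps figure concrete, here is a minimal sketch of the per-frame timing budget that real-time rendering implies. It is plain arithmetic and does not use AVTR-1's code. The 16 kHz audio sample rate and the names `FrameBudget` and `is_real_time` are assumptions for illustration only; the project's actual audio format is not stated in the excerpt.

```python
from dataclasses import dataclass

FPS = 25  # frame rate stated in the summary


@dataclass
class FrameBudget:
    """Timing budget for one video frame (hypothetical helper)."""
    fps: int
    sample_rate_hz: int  # assumed value, not taken from the AVTR-1 repository

    @property
    def frame_ms(self) -> float:
        # Wall-clock time available to produce one frame.
        return 1000.0 / self.fps

    @property
    def samples_per_frame(self) -> int:
        # Audio samples (per stream) that span one video frame.
        return self.sample_rate_hz // self.fps


def is_real_time(render_ms_per_frame: float, fps: int = FPS) -> bool:
    """True if a frame is rendered within its display interval."""
    return render_ms_per_frame <= 1000.0 / fps


budget = FrameBudget(fps=FPS, sample_rate_hz=16_000)
print(budget.frame_ms)           # 40.0
print(budget.samples_per_frame)  # 640
```

At 25 fps each frame must be ready within 40 ms, so any end-to-end render time above that would fall behind live audio.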

Original article

GitHub

Opening excerpt (first ~120 words)

AVTR-1

AVTR-1 is a flow-matching-based autoregressive model for live dialogue. Given a portrait image and dual-stream audio, it renders lip-synced speech and active listening at 25 fps on a single GPU.

Built for production deployment: model weights, TensorRT-accelerated inference, and the live-session backend - available as an API or fully self-hosted

[Video: trailer_720p_small.mp4]

📑 What's included

▪Model weights
▪Inference code
▪Interactive streaming demo
▪Technical report (Coming soon)
▪Production-ready back-end (Coming soon)

Table of Contents

▪Quick Start
▪Performance
▪Troubleshooting

1.

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
