
AVTR-1: a free, open-source, open-weights real-time avatar model

⚡ TL;DR · AI summary

AVTR-1 is an open-source, open-weights model for real-time avatar dialogue. Given a portrait image and dual-stream audio, it renders lip-synced speech and active listening at 25 frames per second on a single GPU. Model weights, inference code, and an interactive streaming demo are available now; a technical report and a production-ready backend are listed as coming soon.
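To make the 25 fps figure concrete, the short sketch below converts a frame rate into a per-frame time budget. It is plain arithmetic and does not use any AVTR-1 code; the function names frame_budget_ms and keeps_up are hypothetical, chosen only for this illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up(per_frame_ms: float, fps: float = 25.0) -> bool:
    """Return True when a measured per-frame time fits within the budget.

    per_frame_ms is a number you would measure yourself; nothing here
    is taken from the AVTR-1 repository.
    """
    return per_frame_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    print(frame_budget_ms(25.0))  # 40.0
```

At 25 fps each frame has to be ready within 40 ms on average, so any pipeline that takes longer per frame falls behind real time.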

Opening excerpt (first ~120 words)

AVTR-1

AVTR-1 is a flow-matching-based autoregressive model for live dialogue. Given a portrait image and dual-stream audio, it renders lip-synced speech and active listening at 25 fps on a single GPU. Built for production deployment: model weights, TensorRT-accelerated inference, and the live-session backend, available as an API or fully self-hosted.

📑 What's included
Model weights
Inference code
Interactive streaming demo
Technical report (Coming soon)
Production-ready back-end (Coming soon)

Table of Contents
Quick Start
Performance
Troubleshooting

1.

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
