We are excited to open-source NAMO-SSLM, a small yet powerful real-time multi-modal model. The AI landscape is shifting from massive, resource-intensive models to lightweight, optimized small models, and for good reason: small models like NAMO-SSLM offer a compelling mix of efficiency, speed, and cost-effectiveness, making them the smarter choice for real-world applications.
Key features include:
- Runs on CPU: Runs in real time on consumer CPU devices.
- Multimodal (voice + vision): Native support for real-time speech, vision, and OCR capabilities.
- Low Latency, Real-Time Processing: Real-time streaming support with end-to-end latency as low as 80 ms (see the streaming sketch after this list).
- Multilingual Support: Supports multiple languages and hybrid-language speech such as Hinglish.
- Multi-turn RAG: Supports multi-turn retrieval-augmented generation to pull in rich context while keeping the conversation real-time (sketched below).
- Voiced + Silent Function / Tool Calling: Function-calling support with both voiced output and silent, text-only tool calls (sketched below).
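The CPU and latency bullets come down to token-level streaming. Below is a minimal, self-contained Python sketch of that pattern; `fake_model_stream` is a stub standing in for the real NAMO-SSLM decoder (the actual API is not shown here) and the timings are simulated, but the loop shape is what lets a caller start rendering or voicing output before the full reply is decoded.

```python
import time

def fake_model_stream(prompt):
    """Stub standing in for the real decoder (hypothetical); yields
    tokens one at a time, the way a streaming runtime would."""
    for tok in ["Hello", ",", " how", " can", " I", " help", "?"]:
        time.sleep(0.02)  # simulated per-token decode time on a CPU
        yield tok

def stream_reply(prompt):
    """Consume tokens as they arrive so output can start flowing
    before the full reply has been decoded."""
    start = time.perf_counter()
    first_token_ms = None
    pieces = []
    for tok in fake_model_stream(prompt):
        if first_token_ms is None:
            first_token_ms = (time.perf_counter() - start) * 1000
        pieces.append(tok)
        print(tok, end="", flush=True)  # render/voice incrementally
    print(f"\n[first token after {first_token_ms:.0f} ms]")
    return "".join(pieces)

stream_reply("What can you do?")
```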
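The multi-turn RAG bullet can be read as: retrieval is re-run on every turn, with the query blended with recent conversation so follow-up questions still hit the right documents. The sketch below makes that concrete with a toy keyword-overlap scorer and an in-memory `DOCS` list; both are illustrative stand-ins, not the actual retrieval stack.

```python
# Toy in-memory corpus; a real deployment would use a vector index.
DOCS = [
    "NAMO-SSLM runs in real time on consumer CPU devices.",
    "End-to-end streaming latency can be as low as 80 ms.",
    "Function calls can be voiced or executed silently.",
]

def retrieve(query, history, k=1):
    """Score documents by word overlap with the current query *and* the
    previous turn, so elliptical follow-ups still retrieve useful context."""
    terms = set((query + " " + (history[-1] if history else "")).lower().split())
    scored = sorted(DOCS, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:k]

history = []
for turn in ["How fast is the streaming latency?", "And does it need a GPU?"]:
    print(f"Q: {turn}\n  context: {retrieve(turn, history)}")
    history.append(turn)
```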
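For the voiced + silent tool-calling bullet, one plausible dispatch shape is sketched below: text events are voiced, while tool calls are executed silently and their results fed back to the model. The event format, the `TOOLS` table, and `handle_event` are all hypothetical illustrations, not the model's confirmed interface.

```python
import json

# Hypothetical event shapes: the runtime emits either plain text (to be
# voiced) or a structured tool call (to be executed without speaking).
TOOLS = {
    "get_time": lambda args: {"time": "14:32"},  # illustrative tool
}

def handle_event(event):
    """Speak text events aloud; run tool calls silently and return the
    result so it can be fed back to the model instead of being voiced."""
    if event["type"] == "text":
        print("SPEAK:", event["content"])
        return None
    if event["type"] == "tool_call":
        result = TOOLS[event["name"]](json.loads(event["arguments"]))
        return {"type": "tool_result", "name": event["name"],
                "content": json.dumps(result)}

# Events one assistant turn might produce (illustrative only):
for ev in [
    {"type": "tool_call", "name": "get_time", "arguments": "{}"},
    {"type": "text", "content": "It's 2:32 pm."},
]:
    feedback = handle_event(ev)
    if feedback:
        print("TO MODEL:", feedback)
```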
[video]
21.03.2025: Announced model launch.
- Launched real-time vision + text modality
- Launched real-time speech modality