NAMO Small Speech Language Model

videosdk-live/namo-sslm

namo-sslm (Coming Soon)

Introduction

We are excited to open-source NAMO-SSLM, a small yet powerful real-time multi-modal model. The AI landscape is shifting from massive, resource-intensive models to lightweight, optimized small models, and for good reason. Small models like NAMO-SSLM offer a compelling mix of efficiency, speed, and cost-effectiveness, making them the smarter choice for real-world applications.

Key capabilities include:

  1. Runs on CPU: Runs in real time on consumer CPU devices.
  2. Multimodal (voice + vision): Native support for real-time speech, vision, and OCR capabilities.
  3. Low-Latency, Real-Time Processing: Real-time streaming support with end-to-end latency as low as 80ms.
  4. Multilingual Support: Supports multiple languages and hybrid languages such as Hinglish.
  5. Multi-turn RAG: Supports multi-turn RAG to retrieve rich context while keeping the conversation real-time.
  6. Voiced + Silent Function/Tool Calling: Function calling with both silent (text-only) and voiced output.
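The multi-turn RAG idea above can be sketched in plain Python. This is an illustrative sketch only, not the namo-sslm API: the `MultiTurnRAG` class, the keyword-overlap retriever, and the corpus below are all hypothetical stand-ins for a real model-backed pipeline.

```python
# Hypothetical sketch of multi-turn RAG; none of these names come from namo-sslm.

class MultiTurnRAG:
    def __init__(self, documents):
        self.documents = documents  # tiny in-memory corpus
        self.history = []           # running conversation turns

    def retrieve(self, query, k=2):
        # Naive keyword-overlap scoring stands in for a real vector retriever.
        words = set(query.lower().split())
        scored = sorted(
            self.documents,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return scored[:k]

    def build_prompt(self, user_turn):
        # Retrieve against the new turn plus recent history so follow-up
        # questions still pull the right context across turns.
        query = " ".join(self.history[-2:] + [user_turn])
        context = self.retrieve(query)
        self.history.append(user_turn)
        return {"context": context, "history": list(self.history)}

rag = MultiTurnRAG([
    "NAMO-SSLM runs in real time on consumer CPUs.",
    "End-to-end streaming latency can be as low as 80ms.",
    "Hybrid languages such as Hinglish are supported.",
])
prompt = rag.build_prompt("How low is the streaming latency?")
```

In a streaming deployment the retrieval step would run concurrently with audio capture, so the retrieved context is ready before the model begins generating its reply.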

[video]

Updates

21.03.2025: Announced model launch.

Roadmap

  • Launch real-time vision+text modality
  • Launch real-time speech modality
