NELA

Models

NELA uses specialized model classes for different tasks. The task router picks from installed models based on task type and model priority.
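The routing described above might be sketched as follows. The class names, task-to-class table, and priority scheme here are illustrative assumptions, not NELA's actual API.

```python
# Hypothetical sketch of task-type routing: pick the highest-priority
# installed model whose class matches the task's required model class.
from dataclasses import dataclass

@dataclass
class InstalledModel:
    name: str
    model_class: str  # "LLM", "VLM", "ASR", "TTS", "Embedding", ...
    priority: int     # higher wins when several models share a class

# Assumed mapping from task type to model class.
TASK_TO_CLASS = {
    "chat": "LLM",
    "summarize": "LLM",
    "vision_qa": "VLM",
    "transcribe": "ASR",
    "speak": "TTS",
    "embed": "Embedding",
}

def route(task: str, installed: list[InstalledModel]) -> InstalledModel:
    """Return the highest-priority installed model for the task's class."""
    wanted = TASK_TO_CLASS[task]
    candidates = [m for m in installed if m.model_class == wanted]
    if not candidates:
        raise LookupError(f"no installed model for class {wanted!r}")
    return max(candidates, key=lambda m: m.priority)
```

With two LLMs installed, `route("chat", ...)` would return the one with the higher priority value.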

LLM

Generation

General text generation for chat, summarization, enrichment, mindmaps, and podcast scripting.

  • Examples: Qwen3.5 0.8B Q4, Qwen3.5 2B Q4, LFM 1.2B INT8.
  • Use this class for normal conversation and reasoning tasks.

VLM

Vision

Vision-language inference for image-grounded Q&A.

  • Uses both a vision model file and a projector (`mmproj`) file.
  • Selected automatically when you run Vision mode tasks.
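Because a VLM load needs two files (the vision model weights plus its `mmproj` projector), a loader would typically verify both are present before starting. A minimal sketch, with a hypothetical helper name:

```python
# Hypothetical check: a VLM is only loadable when both the model weights
# and the matching mmproj projector file exist on disk.
from pathlib import Path

def vlm_assets_ready(model_path: str, mmproj_path: str) -> bool:
    """Return True only when both VLM asset files are present."""
    return Path(model_path).is_file() and Path(mmproj_path).is_file()
```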

ASR

Speech Input

Speech-to-text transcription pipeline used for microphone and audio-to-text flows.

  • Current stack uses Parakeet TDT assets.
  • Feeds transcribed text into normal chat or workflow prompts.

TTS

Speech Output

Text-to-speech generation for audio outputs and podcast rendering.

  • Current stack uses KittenTTS ONNX runtime assets.
  • Supports voice selection and pacing controls in app workflows.

Embedding

RAG Retrieval

Converts chunks and queries into vectors for local similarity search.

  • Current options include BGE small and BGE base variants.
  • Higher-dimensional embeddings can improve retrieval quality, at the cost of larger indexes and slower search.
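Local similarity search boils down to ranking chunk vectors against a query vector. A minimal sketch using cosine similarity, with hand-written vectors standing in for real embedding model output:

```python
# Minimal local similarity search: rank chunk vectors by cosine
# similarity to a query vector. In NELA these vectors would come from
# the embedding model; here they are toy lists of floats.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```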

Classifier + Grader

Quality Control

The classifier routes query intent; the grader reranks retrieved evidence so the final context holds the most relevant passages.

  • Classifier: DistilBERT ONNX query router.
  • Grader: MS MARCO MiniLM cross-encoder.
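The two-stage flow can be sketched as below. The scoring functions are toy stand-ins for the DistilBERT router and the MiniLM cross-encoder, which score query/passage pairs with learned models rather than word overlap:

```python
# Sketch of the classify-then-rerank flow. Both scorers are toy
# stand-ins, not the real ONNX models.
def classify(query: str) -> str:
    """Stand-in intent router: questions go to RAG, the rest to chat."""
    return "rag" if "?" in query else "chat"

def rerank(query: str, passages: list[str]) -> list[str]:
    """Stand-in cross-encoder: score each passage against the query
    (here by shared words) and sort best-first."""
    q = set(query.lower().split())
    def score(p: str) -> int:
        return len(q & set(p.lower().split()))
    return sorted(passages, key=score, reverse=True)
```

A real cross-encoder scores each (query, passage) pair jointly, which is why it reranks better than embedding similarity alone but is too slow to run over the whole corpus.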

Suggested install order for new users:

  1. Install one LLM first.
  2. Add VLM only if you need image Q&A.
  3. Add TTS/ASR for audio workflows.
  4. Add Embedding, Classifier, and Grader for stronger RAG quality.

If you switch between embedding families with different vector dimensions (for example, BGE small at 384 to BGE base at 768), re-ingest your documents so stored vectors match the new model's dimension.
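A guard for that mismatch might look like the following sketch; the function name is hypothetical:

```python
# Hypothetical guard against mixing embedding dimensions: if the index
# was built with one model family (e.g. BGE small, 384 dims) and queries
# use another (e.g. BGE base, 768 dims), vector lengths won't match.
def check_dims(index_dim: int, query_vec: list[float]) -> None:
    """Raise if the query vector's dimension differs from the index's."""
    if len(query_vec) != index_dim:
        raise ValueError(
            f"query dim {len(query_vec)} != index dim {index_dim}; "
            "re-ingest documents with the current embedding model"
        )
```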