Models
NELA uses specialized model classes for different tasks. The task router picks from installed models based on task type and model priority.
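The selection rule above can be sketched in a few lines. This is a minimal illustration, not NELA's actual router: the `InstalledModel` type, the `TASK_TO_CLASS` mapping, the task names, and the convention that a larger `priority` number wins are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstalledModel:
    name: str
    model_class: str  # e.g. "llm", "vlm", "asr", "tts", "embedding"
    priority: int     # assumption: a larger number means higher priority


# Hypothetical mapping from task type to the model class that serves it.
TASK_TO_CLASS = {
    "chat": "llm",
    "summarization": "llm",
    "vision_qa": "vlm",
    "transcription": "asr",
    "speech": "tts",
    "embed": "embedding",
}


def pick_model(task_type: str, installed: list[InstalledModel]) -> Optional[InstalledModel]:
    """Return the highest-priority installed model for the task, or None."""
    model_class = TASK_TO_CLASS.get(task_type)
    candidates = [m for m in installed if m.model_class == model_class]
    return max(candidates, key=lambda m: m.priority, default=None)


# Example priorities are made up for illustration.
installed = [
    InstalledModel("Qwen3.5 0.8B Q4", "llm", priority=1),
    InstalledModel("Qwen3.5 2B Q4", "llm", priority=2),
]
```

With this setup, `pick_model("chat", installed)` returns the 2B entry, while `pick_model("vision_qa", installed)` returns `None` because no VLM is installed.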
LLM
Generation: General text generation for chat, summarization, enrichment, mindmaps, and podcast scripting.
- Examples: Qwen3.5 0.8B Q4, Qwen3.5 2B Q4, LFM 1.2B INT8.
- Use this class for normal conversation and reasoning tasks.
VLM
Vision: Vision-language inference for image-grounded Q&A.
- Uses both a vision model file and a projector (`mmproj`) file.
- Selected automatically when you run Vision mode tasks.
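Because a VLM needs two files, a quick pre-flight check can catch a missing projector before a Vision task runs. The sketch below is an assumption-laden illustration: the function name `missing_vlm_files` and the example file paths are hypothetical and do not reflect where NELA actually stores its models.

```python
from pathlib import Path


def missing_vlm_files(model_path: Path, mmproj_path: Path) -> list[str]:
    """Return the paths that are missing; an empty list means the pair is complete."""
    return [str(p) for p in (model_path, mmproj_path) if not p.is_file()]


# Hypothetical locations, for illustration only.
missing = missing_vlm_files(
    Path("models/vlm/model.gguf"),
    Path("models/vlm/mmproj.gguf"),
)
if missing:
    print("Vision mode unavailable, missing:", ", ".join(missing))
```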
ASR
Speech Input: Speech-to-text transcription pipeline used for microphone and audio-to-text flows.
- Current stack uses Parakeet TDT assets.
- Feeds transcribed text into normal chat or workflow prompts.
TTS
Speech Output: Text-to-speech generation for audio outputs and podcast rendering.
- Current stack uses KittenTTS ONNX runtime assets.
- Supports voice selection and pacing controls in app workflows.
Embedding
RAG Retrieval: Converts chunks and queries into vectors for local similarity search.
- Current options include BGE small and BGE base variants.
- Higher-dimensional embeddings (the base variant) usually improve retrieval quality, at the cost of more memory and slower indexing and search.
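To make "local similarity search" concrete, here is a minimal sketch of ranking chunk vectors against a query vector by cosine similarity. It uses tiny hand-written 3-dimensional vectors in place of real BGE embeddings, and the function names `cosine` and `top_k` are illustrative, not part of NELA.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy vectors standing in for real embeddings.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.0, 0.0]
best = top_k(query, chunks, k=2)
```

Here `best` holds the indices of the two chunks that point in nearly the same direction as the query. A real index would store one vector per chunk at the embedding model's output dimension, which is why larger embeddings cost more memory.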
Classifier + Grader
Quality Control: Classifier routes intent; grader reranks retrieval evidence for better final context.
- Classifier: DistilBERT ONNX query router.
- Grader: MS MARCO MiniLM cross-encoder.
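The grader's role can be sketched as a rerank step: score each retrieved passage against the query, then keep the best few for the final context. In the sketch below, `overlap_score` is a toy word-overlap function standing in for the cross-encoder, so the example runs without any model files. The function names and the `keep` parameter are assumptions for illustration.

```python
from typing import Callable


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    passages: list[str],
    score: Callable[[str, str], float],
    keep: int = 2,
) -> list[str]:
    """Order passages by score against the query and keep the top `keep`."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:keep]


retrieved = [
    "podcast voice pacing",
    "install an llm first",
    "how to install a vision model",
]
context = rerank("install vision model", retrieved, overlap_score, keep=2)
```

Swapping `overlap_score` for a function that calls a real cross-encoder would leave `rerank` unchanged, which is the main reason to pass the scorer in as a parameter.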
Suggested install order for new users:
- Install one LLM first.
- Add VLM only if you need image Q&A.
- Add TTS/ASR for audio workflows.
- Add Embedding, Classifier, and Grader for stronger RAG quality.