We are ampixa labs — a Nepal-based audio lab working on the raw signal: speech before it becomes text, and speech after text becomes voice.
We build text-to-speech, ASR data pipelines, forced alignment, pronunciation systems, human evaluation, and low-resource speech tooling for Nepali and regional languages.
Audio is messy. It carries accent, breath, noise, code-switching, hesitation, dialect, bad microphones, and the parts of language that plain text throws away. That is the layer we care about.
Sovereign AI for Nepal is not one giant model with a flag on it. It is Nepali data, speech corpora, lexicons, benchmarks, ASR, TTS, alignment, evaluation, and the systems around them.
Nepal does not live in English. It lives in spoken Nepali, mixed Nepali, regional languages, phone calls, classrooms, radio, clinics, government counters, farms, cities, and homes. If AI cannot understand that audio, it is not useful infrastructure here.
The bridge is speech: ASR that can listen, TTS that can answer, alignment that can clean data, pronunciation tools that know the language, and evaluation that uses native listeners instead of imported assumptions.
The future is audio. Ampixa exists to make Nepal audible to machines, and to make machines speak back without forcing Nepal through English first.
Speech evaluation: NepTTS-Bench, the first comprehensive Nepali TTS benchmark, with 365 designed sentences, baseline outputs, ASR round-trip metrics, human MOS ratings, and a NepaliMOS predictor. The dataset is available on Hugging Face.
Text-to-speech: Nepali Piper/VITS voice releases and reproducible training work, including real-nepali-v0.4, real-nepali-v0.2-kala, and nepali-voices-v0.
ASR data quality: nepali-mfa, a forced-alignment and review pipeline that turns raw Nepali audio and transcripts into cleaner manifests for ASR training and robustness work.
Pronunciation systems: nepali-reverse-g2p and lexicon tooling for moving between sounds and Devanagari spellings, repairing dictionaries, and supporting speech pipelines.
Regional speech tooling: limbu-speech-toolkit, with Limbu/Yakthung G2P, dictionary review tools, Piper training recipes, and the limbu-piper-lifwbt voice work.
Human feedback loops: collection and rating tools for native speakers, including voice recording and TTS evaluation.
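To make the "ASR round-trip" idea from the speech evaluation item concrete: a sentence is synthesized by a TTS voice, the audio is transcribed by an ASR model, and the transcript is compared with the original text. The sketch below is a minimal illustration under stated assumptions, not the NepTTS-Bench implementation. `synthesize` and `transcribe` are hypothetical callables standing in for whatever TTS and ASR systems are under test, and counting errors per Unicode code point (rather than per grapheme cluster) is an assumption made here for simplicity.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text):
    """NFC-normalize and collapse whitespace so equivalent spellings compare equal."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())


def cer(reference, hypothesis):
    """Character error rate over Unicode code points (an assumption of this sketch)."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def round_trip_cer(sentences, synthesize, transcribe):
    """Score each sentence by running text -> audio -> text and comparing to the input.

    `synthesize` and `transcribe` are hypothetical callables supplied by the caller.
    """
    return [cer(s, transcribe(synthesize(s))) for s in sentences]
```

A lower score means the ASR model recovered more of the intended text from the synthesized audio. Because the score mixes TTS errors with ASR errors, it is best read alongside human ratings rather than as a replacement for them.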
| Resource | Link |
| --- | --- |
| GitHub org | github.com/Ampixa |
| HuggingFace org | huggingface.co/ampixa |
| NepTTS-Bench | github.com/Ampixa/neptts-bench |
| Nepali MFA | github.com/Ampixa/nepali-mfa |
| Reverse G2P | github.com/Ampixa/nepali-reverse-g2p |
| Limbu toolkit | github.com/Ampixa/limbu-speech-toolkit |
| Voice recorder | tts.ampixa.com/speak |
| Rating platform | tts.ampixa.com/rating |
| Email | hello@ampixa.com |
© 2026 ampixa labs.
Devanagari wordmark uses "8-bit devanagari" by colonelhathii (CC BY-NC 3.0).