ampixa labs

raw Nepali audio infrastructure — speech in, speech out

new: nine tiny neural voices, six languages, live in your browser and on a $3 chip.

We are ampixa labs — a Nepal-based audio lab working on the raw signal: speech before it becomes text, and speech after text becomes voice.

We build text-to-speech, ASR data pipelines, forced alignment, pronunciation systems, human evaluation, and low-resource speech tooling for Nepali and regional languages.

Audio is messy. It carries accent, breath, noise, code-switching, hesitation, dialect, bad microphones, and the parts of language that plain text throws away. That is the layer we care about.

manifesto

Sovereign AI for Nepal is not one giant model with a flag on it. It is Nepali data, speech corpora, lexicons, benchmarks, ASR, TTS, alignment, evaluation, and the systems around them.

Nepal does not live in English. It lives in spoken Nepali, mixed Nepali, regional languages, phone calls, classrooms, radio, clinics, government counters, farms, cities, and homes. If AI cannot understand that audio, it is not useful infrastructure here.

The bridge is speech: ASR that can listen, TTS that can answer, alignment that can clean data, pronunciation tools that know the language, and evaluation that uses native listeners instead of imported assumptions.

The future is audio. Ampixa exists to make Nepal audible to machines, and to make machines speak back without forcing Nepal through English first.

what we build

Speech evaluation — NepTTS-Bench, the first comprehensive Nepali TTS benchmark, with 365 designed sentences, baseline outputs, ASR round-trip metrics, human MOS ratings, and a NepaliMOS predictor. The dataset is on Hugging Face.
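To make "ASR round-trip" concrete: the idea is to synthesize a sentence, transcribe the audio with an ASR system, and compare the transcript with the original text. The sketch below shows one common way to score that comparison, character error rate (CER). It is an illustration under assumptions, not NepTTS-Bench's actual scoring code; the function names are hypothetical, and it compares Unicode code points after NFC normalization rather than Devanagari grapheme clusters.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text):
    """NFC-normalize and collapse whitespace so equivalent spellings compare equal."""
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())


def round_trip_cer(reference_text, asr_transcript):
    """Hypothetical helper: CER of an ASR transcript against the TTS input text."""
    ref = list(normalize(reference_text))
    hyp = list(normalize(asr_transcript))
    if not ref:
        raise ValueError("reference text is empty")
    return edit_distance(ref, hyp) / len(ref)
```

A lower score means the ASR system recovered the input text more faithfully, which is usually read as a proxy for intelligibility. It says nothing about naturalness, which is why human MOS ratings sit alongside it.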

Tiny text-to-speech — sanoTTS (सानो = “small”), neural voices from 745k to 1.8M parameters in six languages: English, Nepali, Hindi, Vietnamese, Indonesian, and Chinese. They synthesize live in the browser via WebAssembly and run faster than real time on a $3 ESP32 microcontroller. Install with pip install sanotts. Code, models, and a live demo are available.
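"Faster than real time" is usually expressed as a real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced, where a value below 1.0 means the voice is generated faster than it plays back. The sketch below shows only that arithmetic. The function name and parameters are hypothetical, and it does not call the sanotts package, whose API is not described here.

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Return synthesis time divided by audio duration (RTF).

    synthesis_seconds: wall-clock time the synthesizer took
    num_samples:       number of audio samples it produced
    sample_rate:       samples per second of the output audio
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def is_faster_than_real_time(synthesis_seconds, num_samples, sample_rate):
    """True when audio is produced faster than it would take to play."""
    return real_time_factor(synthesis_seconds, num_samples, sample_rate) < 1.0
```

For example, producing one second of 16 kHz audio (16000 samples) in half a second gives an RTF of 0.5.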

ASR data quality — nepali-mfa, a forced-alignment and review pipeline that turns raw Nepali audio and transcripts into cleaner manifests for ASR training and robustness work.

Pronunciation systems — nepali-reverse-g2p and lexicon tooling for moving between sounds and Devanagari spellings, repairing dictionaries, and supporting speech pipelines.

Regional speech tooling — limbu-speech-toolkit, with Limbu/Yakthung G2P, dictionary review tools, Piper training recipes, and the limbu-piper-lifwbt voice work.

Human feedback loops — collection and rating tools for native speakers, including voice recording and TTS evaluation.

find us

GitHub org	github.com/Ampixa
HuggingFace org	huggingface.co/ampixa
NepTTS-Bench	github.com/Ampixa/neptts-bench
Nepali MFA	github.com/Ampixa/nepali-mfa
Reverse G2P	github.com/Ampixa/nepali-reverse-g2p
Limbu toolkit	github.com/Ampixa/limbu-speech-toolkit
Voice recorder	tts.ampixa.com/speak
Rating platform	tts.ampixa.com/rating
Email	hello@ampixa.com