v1 API

Build your own app (web, mobile, LMS) using loxai's Golden Speaker (GS) technology. Send practice text plus a learner recording; get back scores and playable word-level audio clips.

Base URL

https://loxai.tech/api/v1

All endpoints support CORS (responses include Access-Control-Allow-Origin: *).

API client mode

Send on every request (header preferred):

X-Loxai-Client: api

Alternatively, send client=api as a form field or in the JSON body.

Enroll creates one ElevenLabs voice clone per session_id. Re-enrolling the same session reuses the existing voice, which saves quota.
Compare returns inline base64 audio for the full golden utterance and, for each practice word, both the learner's clip and the golden clip.
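
To avoid forgetting the header on individual calls, it can be added in one place. The sketch below is only an illustration: withApiClient is a hypothetical helper name, not part of the API, and it relies on the standard Headers class available in browsers and recent Node.js versions.

```javascript
// Hypothetical helper: returns a copy of a fetch() init object
// with the X-Loxai-Client header set to "api".
function withApiClient(init = {}) {
  const headers = new Headers(init.headers || {});
  headers.set("X-Loxai-Client", "api");
  return { ...init, headers };
}

// Usage (not executed here):
// fetch("https://loxai.tech/api/v1/pangram?language=es", withApiClient());
```

Existing headers on the init object are preserved, so the same wrapper can be used for multipart and JSON requests.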

Sessions

Generate a UUID (or your own stable id) per learner or device in your application. Pass it as session_id on enroll, compare, listen, and reset. The backend stores the Golden Speaker voice against that id.
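
One possible way to keep a stable id per device is sketched below. It is a minimal example under stated assumptions: the storage key loxai_session_id and the function getSessionId are hypothetical names, and storage can be any object with getItem and setItem, such as window.localStorage in a browser.

```javascript
// Hypothetical storage key; any stable name works.
const SESSION_KEY = "loxai_session_id";

// Returns the stored session id, creating and saving one on first use.
// `makeId` defaults to crypto.randomUUID(), which is available in modern
// browsers (secure contexts) and recent Node.js versions.
function getSessionId(storage, makeId = () => crypto.randomUUID()) {
  let id = storage.getItem(SESSION_KEY);
  if (!id) {
    id = makeId();
    storage.setItem(SESSION_KEY, id);
  }
  return id;
}

// Usage in a browser (not executed here):
// const sessionId = getSessionId(window.localStorage);
```

Reusing the stored id matters because the backend keys the cloned voice on it; a fresh id on every page load would trigger a new enrollment each time.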

1. Enroll (once per session)

POST /api/v1/enroll

multipart/form-data

session_id — your app’s session id (required)
language — es or en (optional, default es)
audio — enrollment recording (pangram; ~3+ seconds)
client=api — optional if using header

curl -X POST https://loxai.tech/api/v1/enroll \
  -H "X-Loxai-Client: api" \
  -F session_id=my-app-learner-42 \
  -F language=es \
  -F audio=@enrollment.webm
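
The same request can be sent from JavaScript with fetch and FormData. This is a sketch rather than an official client: buildEnrollForm and enroll are hypothetical names, the filename is illustrative, and the code assumes the endpoint answers with JSON, which this section does not state explicitly.

```javascript
const BASE_URL = "https://loxai.tech/api/v1";

// Builds the multipart body for the enroll request.
// `audioBlob` is a Blob, for example one produced by MediaRecorder.
function buildEnrollForm(sessionId, audioBlob, language = "es") {
  const form = new FormData();
  form.append("session_id", sessionId);
  form.append("language", language);
  // Field name `audio` comes from the parameter list above; the filename is illustrative.
  form.append("audio", audioBlob, "enrollment.webm");
  return form;
}

// Sends the enrollment request. Assumption: the response body is JSON.
async function enroll(sessionId, audioBlob, language = "es") {
  const res = await fetch(`${BASE_URL}/enroll`, {
    method: "POST",
    headers: { "X-Loxai-Client": "api" },
    body: buildEnrollForm(sessionId, audioBlob, language),
  });
  if (!res.ok) throw new Error(`enroll failed: HTTP ${res.status}`);
  return res.json();
}
```

No Content-Type header is set by hand: fetch derives the multipart boundary from the FormData body.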

2. Compare (text + learner recording)

POST /api/v1/compare

multipart/form-data or application/json (API mode only)

session_id — same id as enroll
target_text — script the learner should read
audio — learner attempt (multipart), or attempt_audio_base64 (JSON)
language, fusion — optional

curl -X POST https://loxai.tech/api/v1/compare \
  -H "X-Loxai-Client: api" \
  -F session_id=my-app-learner-42 \
  -F target_text="Me gusta coleccionar conchas." \
  -F language=es \
  -F audio=@attempt.webm

JSON body example (same header):

{
  "client": "api",
  "session_id": "my-app-learner-42",
  "target_text": "Me gusta coleccionar conchas.",
  "language": "es",
  "attempt_audio_base64": "…",
  "attempt_audio_mime": "audio/webm"
}
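
A JSON body like the one above could be assembled as follows. The helper names bytesToBase64 and buildCompareBody are hypothetical; the field names are taken from the example above. The encoder works in chunks so that long recordings do not exceed the argument limit of String.fromCharCode.

```javascript
// Encodes a Uint8Array as base64, processing the bytes in chunks.
function bytesToBase64(bytes) {
  let binary = "";
  const CHUNK = 0x8000;
  for (let i = 0; i < bytes.length; i += CHUNK) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}

// Builds the JSON payload for the compare request.
function buildCompareBody(sessionId, targetText, audioBytes, mime, language = "es") {
  return {
    client: "api",
    session_id: sessionId,
    target_text: targetText,
    language,
    attempt_audio_base64: bytesToBase64(audioBytes),
    attempt_audio_mime: mime,
  };
}

// Usage (not executed here):
// const bytes = new Uint8Array(await recordingBlob.arrayBuffer());
// const body = buildCompareBody("my-app-learner-42", "Me gusta coleccionar conchas.", bytes, "audio/webm");
// fetch("https://loxai.tech/api/v1/compare", {
//   method: "POST",
//   headers: { "Content-Type": "application/json", "X-Loxai-Client": "api" },
//   body: JSON.stringify(body),
// });
```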

Response includes scoring fields plus a comparison object for your UI:

{
  "ok": true,
  "api": true,
  "session_id": "my-app-learner-42",
  "accuracy_percent": 88.2,
  "transcripts": { "attempt": "…", "golden": "…", "target_text": "…" },
  "golden_audio_base64": "…",
  "golden_audio_mime": "audio/mpeg",
  "attempt_audio_base64": "…",
  "attempt_audio_mime": "audio/wav",
  "comparison": {
    "target_text": "Me gusta coleccionar conchas.",
    "accuracy_percent": 88.2,
    "transcripts": { … },
    "phoneme_feedback": {
      "available": true,
      "practice_words": [
        {
          "label": "conchas",
          "reason": "phoneme",
          "word_index": 3,
          "attempt_audio_base64": "…",
          "golden_audio_base64": "…",
          "attempt_audio_mime": "audio/wav",
          "golden_audio_mime": "audio/wav"
        }
      ]
    },
    "audio": {
      "golden": { "base64": "…", "mime": "audio/mpeg" },
      "attempt": { "base64": "…", "mime": "audio/wav" }
    }
  }
}

Only words that have both a learner ("you") clip and a golden clip appear in practice_words.
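
For a results screen, the response can be reduced to the fields a UI typically needs. The sketch below uses a hypothetical helper, summarizeComparison, and makes one assumption not stated above: when phoneme_feedback.available is false, practice_words is treated as empty.

```javascript
// Returns the overall score and a compact list of practice words.
function summarizeComparison(data) {
  const comparison = data.comparison || {};
  const feedback = comparison.phoneme_feedback || {};
  // Assumption: ignore practice_words unless feedback is marked available.
  const words = feedback.available ? feedback.practice_words || [] : [];
  return {
    accuracy: comparison.accuracy_percent ?? data.accuracy_percent,
    words: words.map((w) => ({
      label: w.label,
      reason: w.reason,
      index: w.word_index,
      hasBothClips: Boolean(w.attempt_audio_base64 && w.golden_audio_base64),
    })),
  };
}
```

The top-level accuracy_percent is used as a fallback because the sample response carries the score both at the top level and inside comparison.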

3. Play audio in your app

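// Decode a base64 audio payload and play it in the browser.
function playBase64(b64, mime) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  const url = URL.createObjectURL(new Blob([bytes], { type: mime }));
  const audio = new Audio(url);
  // Release the object URL once playback ends or fails.
  const release = () => URL.revokeObjectURL(url);
  audio.onended = release;
  audio.onerror = release;
  // play() returns a Promise that rejects if the browser blocks playback.
  return audio.play().catch((err) => {
    release();
    throw err;
  });
}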


// `data` is the parsed JSON response from POST /api/v1/compare.
playBase64(data.comparison.audio.golden.base64, data.comparison.audio.golden.mime);

data.comparison.phoneme_feedback.practice_words.forEach(function (word) {
  // word.label, word.reason, word.attempt_audio_base64, word.golden_audio_base64
});

Other endpoints

GET /api/v1/session?session_id=…: check whether the session is enrolled
GET /api/v1/pangram?language=es: get the enrollment prompt text
POST /api/v1/listen: golden TTS audio only, for the given target_text
POST /api/v1/reset: clear the session and delete its cloned voice
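
The two GET endpoints take query parameters, which are easiest to build with the standard URL class so that values are encoded correctly. In this sketch, endpointUrl and getJson are hypothetical names, and the JSON response format is an assumption, since the response shapes of these endpoints are not documented here.

```javascript
const BASE_URL = "https://loxai.tech/api/v1";

// Builds an endpoint URL with encoded query parameters.
function endpointUrl(path, params = {}) {
  const url = new URL(`${BASE_URL}/${path}`);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Performs a GET in API client mode. Assumption: the endpoint returns JSON.
async function getJson(path, params) {
  const res = await fetch(endpointUrl(path, params), {
    headers: { "X-Loxai-Client": "api" },
  });
  if (!res.ok) throw new Error(`${path} failed: HTTP ${res.status}`);
  return res.json();
}

// Usage (not executed here):
// const status = await getJson("session", { session_id: "my-app-learner-42" });
// const prompt = await getJson("pangram", { language: "es" });
```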