AX TTS API Documentation

Overview

AX TTS is a high-performance Turkish text-to-speech service that exposes both a REST API and a WebSocket streaming API. It is built on the Piper TTS engine with Turkish language support.

📖 Swagger UI: Interactive API documentation
📚 ReDoc: Alternative API docs
📊 Monitoring: Real-time metrics
✅ Health Check: Service status

Features

⚡ Fast TTS: Real-time factor 0.07 (14x faster than real-time)
🔊 WebSocket Streaming: < 300ms time to first audio
🇹🇷 Turkish Normalization: Numbers, dates, currency, abbreviations
📊 High Throughput: ~1080 requests/minute capacity
🔒 Production Ready: Rate limiting, monitoring, SSL/TLS

Quick Start

1. REST API - Simple Synthesis

curl -X POST 'https://api.axtts.pixagor.net/api/v1/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Merhaba dünya",
    "format": "wav"
  }' \
  -o output.wav

2. Health Check

curl https://api.axtts.pixagor.net/health

✅ A successful response indicates that the service is operational. Base URL: https://api.axtts.pixagor.net/

REST API

POST /api/v1/synthesize

Synthesizes speech from text and returns the complete audio file in a single response.

Request Parameters

Parameter	Type	Required	Description
`text`	string	required	Text to synthesize (max 5000 characters)
`voice`	string	optional	Voice model (default: "default")
`speed`	float	optional	Speech speed (0.5-2.0, default: 1.0)
`format`	string	optional	Output format: "wav" or "pcm16" (default: "wav")
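
Checking these limits on the client avoids a round trip for requests the service would presumably reject. The sketch below mirrors the table above. The helper name `build_payload`, the use of `ValueError`, and the assumption that the 5000-character limit is counted the way Python's `len()` counts are illustrative and not part of the API.

```python
ALLOWED_FORMATS = {"wav", "pcm16"}
MAX_TEXT_LENGTH = 5000  # characters, per the parameter table


def build_payload(text, voice="default", speed=1.0, fmt="wav"):
    """Return a JSON-ready request body, or raise ValueError.

    Hypothetical helper: the name and error handling are assumptions.
    """
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError('format must be "wav" or "pcm16"')
    return {"text": text, "voice": voice, "speed": speed, "format": fmt}
```

The returned dictionary can be passed directly as the JSON body of a POST to /api/v1/synthesize.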

Response

Content-Type: audio/wav or audio/l16, depending on the requested format

The response body is the binary audio data.
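
A "pcm16" response carries no header, so most players cannot open it directly. As a minimal sketch, the raw samples can be wrapped in a WAV container with Python's standard `wave` module. The sample rate and channel count are not stated in this document, so the 22050 Hz mono defaults below are assumptions to verify against the service. The sketch also assumes little-endian samples, which is what WAV stores; the audio/L16 media type is formally big-endian.

```python
import io
import wave


def pcm16_to_wav(pcm_bytes, sample_rate=22050, channels=1):
    """Wrap raw 16-bit PCM samples in a WAV container.

    sample_rate and channels are ASSUMED defaults, not documented values.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(2)  # 16 bits = 2 bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```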

Example with Options

curl -X POST 'https://api.axtts.pixagor.net/api/v1/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Bugün 24.10.2025 tarihinde ₺150,50 ödeme yaptım",
    "speed": 1.1,
    "format": "wav"
  }' \
  -o turkish.wav

WebSocket Streaming

WebSocket wss://api.axtts.pixagor.net/ws/stream

Streams audio in chunks as it is synthesized, for low-latency playback.

Connection

const ws = new WebSocket('wss://api.axtts.pixagor.net/ws/stream');

ws.onopen = () => {
  ws.send(JSON.stringify({
    text: "Merhaba dünya",
    format: "pcm16",
    speed: 1.0
  }));
};

ws.onmessage = (event) => {
  const chunk = JSON.parse(event.data);

  if (chunk.type === 'audio_chunk') {
    // Decode the base64 payload into raw PCM16 bytes
    const audioData = Uint8Array.from(atob(chunk.payload), (c) => c.charCodeAt(0));
    // Use chunk.chunk_id to keep chunks in order
  } else if (chunk.type === 'done') {
    // Synthesis complete
    console.log(`Duration: ${chunk.duration_ms}ms`);
    console.log(`Chunks: ${chunk.chunks}`);
  } else if (chunk.type === 'error') {
    console.error(chunk.message);
  }
};

Message Types

Type	Direction	Description
`request`	Client → Server	Initial synthesis request with text and options
`audio_chunk`	Server → Client	Audio data chunk (base64 encoded PCM16)
`done`	Server → Client	Synthesis completed with stats
`error`	Server → Client	Error message
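
The message flow above can be sketched as a small reassembly routine that works on messages already received from the socket, independent of any particular WebSocket library. The field names (`type`, `payload`, `chunk_id`, `message`) are taken from the JavaScript example; the function name `assemble_stream` and the assumption that `chunk_id` values sort in playback order are hypothetical.

```python
import base64
import json


def assemble_stream(raw_messages):
    """Combine streamed messages into one PCM16 byte string.

    raw_messages: iterable of JSON strings as received from the socket.
    Returns (pcm_bytes, done_message); done_message is None if the
    stream ended without a "done" message.
    """
    chunks = {}
    done = None
    for raw in raw_messages:
        message = json.loads(raw)
        kind = message.get("type")
        if kind == "audio_chunk":
            # Key by chunk_id so out-of-order arrival is harmless
            # (assumes chunk_id values are sortable, e.g. integers).
            chunks[message["chunk_id"]] = base64.b64decode(message["payload"])
        elif kind == "done":
            done = message
            break
        elif kind == "error":
            raise RuntimeError(message.get("message", "unknown error"))
    pcm = b"".join(chunks[i] for i in sorted(chunks))
    return pcm, done
```

Raising on an `error` message keeps partial audio from being mistaken for a complete result.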

💡 Performance: WebSocket streaming provides < 300ms time to first audio (TTFA), ideal for real-time applications.

Text Normalization

The service automatically normalizes Turkish text for natural speech:

Input	Normalized Output
`1250`	bin iki yüz elli
`₺150,75`	yüz elli lira yetmiş beş kuruş
`24.10.2025`	yirmi dört Ekim iki bin yirmi beş
`14:30`	saat on dört otuz
`%20`	yüzde yirmi
`Dr.`	Doktor
`vb.`	ve benzeri
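
To illustrate the kind of rule behind the number rows above, here is a small sketch that spells out integers in Turkish. It is not the service's normalizer: the function names and the supported range (0 to 999999) are assumptions chosen for the example, and currency, dates, and abbreviations are left out.

```python
ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]


def _below_thousand(n):
    """Spell out 0..999 as a list of Turkish words (empty list for 0)."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        # 100 is "yüz", not "bir yüz"
        if hundreds > 1:
            words.append(ONES[hundreds])
        words.append("yüz")
    tens, ones = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if ones:
        words.append(ONES[ones])
    return words


def number_to_turkish(n):
    """Spell out an integer from 0 to 999999 in Turkish."""
    if not 0 <= n <= 999_999:
        raise ValueError("supported range is 0..999999")
    if n == 0:
        return "sıfır"
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands:
        # 1000 is "bin", not "bir bin"
        if thousands > 1:
            words.extend(_below_thousand(thousands))
        words.append("bin")
    words.extend(_below_thousand(rest))
    return " ".join(words)
```

With this sketch, `number_to_turkish(1250)` returns "bin iki yüz elli" and `number_to_turkish(2025)` returns "iki bin yirmi beş", matching the corresponding rows of the table.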

Code Examples

Python

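import requests

response = requests.post(
    'https://api.axtts.pixagor.net/api/v1/synthesize',
    json={
        'text': 'Merhaba dünya',
        'format': 'wav'
    },
    timeout=30  # seconds; avoid hanging forever on a stalled connection
)
response.raise_for_status()  # raise on HTTP errors instead of saving an error body

# Write the returned audio bytes to disk
with open('output.wav', 'wb') as f:
    f.write(response.content)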

JavaScript (Fetch)

fetch('https://api.axtts.pixagor.net/api/v1/synthesize', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: 'Merhaba dünya',
    format: 'wav'
  })
})
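.then(res => {
  // Fail early on HTTP errors instead of trying to play an error body
  if (!res.ok) throw new Error(`Synthesis failed: HTTP ${res.status}`);
  return res.blob();
})
.then(blob => {
  const url = URL.createObjectURL(blob);
  const audio = new Audio(url);
  return audio.play();
})
.catch(err => console.error(err));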

cURL with Turkish Text

curl -X POST 'https://api.axtts.pixagor.net/api/v1/synthesize' \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Toplam 1250 kişi, 24.10.2025 tarihinde ₺150,50 ödeme yaptı.",
    "speed": 1.0,
    "format": "wav"
  }' \
  -o payment.wav

Interactive API Documentation

Explore and test the API interactively:

📖 Swagger UI: Try API endpoints directly
📚 ReDoc: Detailed API reference
📄 OpenAPI Spec: Download JSON schema

Rate Limits & Performance

Rate Limits

REST API: 10 requests/second per IP (burst: 20)
WebSocket: 5 connections/second per IP (burst: 10)
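
A client can stay under these limits by pacing itself. The token bucket below is a minimal sketch of that idea using the REST figures above (10 requests/second, burst of 20). The class name `TokenBucket` and the injectable `clock` parameter are hypothetical, and how the server itself counts requests is not documented here, so treat this as a conservative client-side approximation.

```python
import time


class TokenBucket:
    """Client-side limiter: `rate` tokens per second, up to `burst` stored."""

    def __init__(self, rate=10.0, burst=20, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(burst)
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return True when a request may be sent."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Call `try_acquire()` before each request and wait briefly when it returns False. For WebSocket connections, the same class could be constructed with `rate=5.0, burst=10`.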

Performance Metrics

TTFA (WebSocket): < 300ms
Real-time Factor: 0.07 (14x real-time)
Throughput: ~1080 requests/minute
Max Text Length: 5000 characters
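
As a worked illustration of how these figures relate: the real-time factor is synthesis time divided by audio duration, so a factor of 0.07 corresponds to roughly 1 / 0.07 ≈ 14.3 times real-time speed, and ~1080 requests/minute is 18 requests/second. The helper names below are hypothetical, and the estimates cover synthesis only, not network or queueing time.

```python
def synthesis_seconds(audio_seconds, real_time_factor=0.07):
    """Estimated compute time: RTF = synthesis time / audio duration."""
    return audio_seconds * real_time_factor


def speedup(real_time_factor=0.07):
    """How many times faster than real time a given RTF is."""
    return 1.0 / real_time_factor
```

Under these nominal numbers, 10 seconds of audio would take about 0.7 seconds to synthesize.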

📊 Monitoring: View real-time metrics at grafana.axtts.pixagor.net

Support & Links

Base URL: api.axtts.pixagor.net
Health Check: api.axtts.pixagor.net/health
Monitoring: grafana.axtts.pixagor.net
Prometheus: prometheus.axtts.pixagor.net
