Audio Generation Quickstart

This quickstart walks you through generating your first audio with Infron.

Text-to-speech creation

Generates audio from the input text.

curl https://audio.onerouter.pro/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: <API_KEY>" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "A cute baby sea otter",
    "voice": "alloy"
  }' \
  --output speech.mp3

  • <API_KEY> is your API key, generated on the API page.

  • model is the model name, such as gpt-4o-mini-tts. The list of available models can be found on the Model page.

  • input is the text to generate audio from.

  • voice is the voice to use when generating the audio. Supported voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, and verse.
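
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names build_speech_request and synthesize are hypothetical, and the Authorization header is sent exactly as in the curl example (the API key with no prefix), since this page does not say whether a prefix is expected. The voice list is copied from the parameter description above.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this quickstart.
SPEECH_URL = "https://audio.onerouter.pro/v1/audio/speech"

# Voices listed in this quickstart.
SUPPORTED_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}


def build_speech_request(api_key, text, voice="alloy", model="gpt-4o-mini-tts"):
    """Build (but do not send) the text-to-speech POST request."""
    if voice not in SUPPORTED_VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Sent as in the curl example: the key itself, no prefix.
            "Authorization": api_key,
        },
        method="POST",
    )


def synthesize(api_key, text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(api_key, text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling synthesize("<API_KEY>", "A cute baby sea otter") would write the audio to speech.mp3, mirroring the --output flag of the curl command. Rejecting an unknown voice before sending avoids a round trip for a request that is likely to fail.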

Example response

The audio file content.
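
Because the response body is raw audio rather than JSON, it can help to sanity-check the bytes before saving them. The sketch below assumes the output is MP3 (the curl example saves to speech.mp3, but this page does not state the format explicitly) and that a failed request returns something other than audio, such as a text error message. The helper names looks_like_mp3 and save_audio are hypothetical.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic check that the bytes start like an MP3 file."""
    if data[:3] == b"ID3":  # ID3v2 metadata tag at the start of the file
        return True
    # MPEG audio frame header: the first 11 bits are all set.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


def save_audio(data: bytes, path: str = "speech.mp3") -> str:
    """Write the response bytes to path, refusing bodies that are not MP3-like."""
    if not looks_like_mp3(data):
        # ASSUMPTION: a non-audio body is a readable error message.
        preview = data[:200].decode("utf-8", errors="replace")
        raise ValueError(f"response does not look like MP3 audio: {preview}")
    with open(path, "wb") as f:
        f.write(data)
    return path
```

The check is only a heuristic on the first bytes; it catches the common case of an error message being written to disk under an .mp3 name, not every malformed file.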

Speech-to-text translation

Translates audio into English.

  • <API_KEY> is your API key, generated on the API page.

  • model is the model name, such as whisper-1. The list of available models can be found on the Model page.

  • file is the audio file object (not the file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
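
A translation request might be built as in the following sketch, which uploads the file contents as multipart form data using only the Python standard library. Several things here are assumptions rather than facts from this page: the endpoint path /v1/audio/translations (guessed by analogy with the /v1/audio/speech path), the multipart encoding, and the hypothetical helper names encode_multipart and build_translation_request. The Authorization header carries the API key with no prefix, as in the text-to-speech example.

```python
import uuid
import urllib.request

# ASSUMPTION: path guessed by analogy with /v1/audio/speech; confirm before use.
TRANSLATION_URL = "https://audio.onerouter.pro/v1/audio/translations"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
    )
    parts.append(file_bytes)  # the file contents, not the file name
    parts.append(f'\r\n--{boundary}--\r\n'.encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_translation_request(api_key, filename, audio_bytes, model="whisper-1"):
    """Build (but do not send) a speech-to-text translation request."""
    body, content_type = encode_multipart(
        {"model": model}, "file", filename, audio_bytes
    )
    return urllib.request.Request(
        TRANSLATION_URL,
        data=body,
        headers={"Content-Type": content_type, "Authorization": api_key},
        method="POST",
    )
```

To send it, read the audio with open(path, "rb").read(), pass the bytes to build_translation_request, and hand the result to urllib.request.urlopen. The shape of the response body is not shown on this page, so the sketch stops at building the request.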

Example response

Speech-to-text transcription

Transcribes audio into the input language.

  • <API_KEY> is your API key, generated on the API page.

  • model is the model name, such as whisper-1. The list of available models can be found on the Model page.

  • file is the audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
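
The sketch below checks a file against the format list above and then builds a transcription request with the Python standard library. It rests on assumptions not confirmed by this page: the endpoint path /v1/audio/transcriptions (guessed by analogy with /v1/audio/speech), multipart form-data as the upload encoding, and the hypothetical helper names audio_format and build_transcription_request. The extension check only inspects the file name, not the audio data itself.

```python
import os
import uuid
import urllib.request

# ASSUMPTION: path guessed by analogy with /v1/audio/speech; confirm before use.
TRANSCRIPTION_URL = "https://audio.onerouter.pro/v1/audio/transcriptions"

# Formats listed in this quickstart.
SUPPORTED_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def audio_format(path):
    """Return the lowercase extension of path, or raise if it is not listed."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    return ext


def build_transcription_request(api_key, path, audio_bytes, model="whisper-1"):
    """Build (but do not send) a speech-to-text transcription request."""
    audio_format(path)  # fail early on formats the page does not list
    boundary = uuid.uuid4().hex
    filename = os.path.basename(path)
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        TRANSCRIPTION_URL,
        data=head + audio_bytes + tail,  # file contents, not the file name
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Authorization": api_key,  # key with no prefix, as in the curl example
        },
        method="POST",
    )
```

Validating the extension locally is a cheap guard; the service may still reject a file whose contents do not match its extension.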

Example response
