API Reference

Complete reference for the img2voice API • Base URL: https://api.img2voice.com • Version: v1

Overview

The img2voice API converts images to speech. You submit an image (by URL or upload); the API uses Grok vision to understand and describe it, then generates speech in any of 60+ languages. The response contains both the AI description and a signed audio URL. There are no streaming endpoints, no WebSocket connections, and no SDKs required: just a single JSON endpoint.

Requests are authenticated using an IMG2VOICE-API-KEY header. Get your API key from your dashboard.

Authentication

All requests must include your API key in the IMG2VOICE-API-KEY header. API keys are prefixed with sk_live_ for production and sk_test_ for sandbox testing.

Never expose your API key in client-side code. Always make requests from your server.
http - request header
IMG2VOICE-API-KEY: sk_live_your_api_key_here
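
As a minimal sketch of the server-side handling described above, the following Python reads the key from an environment variable and builds the request header. The environment variable name `IMG2VOICE_API_KEY` and both helper functions are illustrative assumptions, not part of the API; only the header name and the key prefixes come from this reference.

```python
import os


def key_environment(api_key: str) -> str:
    """Classify a key by its documented prefix."""
    if api_key.startswith("sk_live_"):
        return "production"
    if api_key.startswith("sk_test_"):
        return "sandbox"
    raise ValueError("API key must start with sk_live_ or sk_test_")


def auth_headers(env_var: str = "IMG2VOICE_API_KEY") -> dict[str, str]:
    """Build the auth header from an environment variable (name is assumed)."""
    api_key = os.environ.get(env_var, "")
    key_environment(api_key)  # raises early on a missing or malformed key
    return {"IMG2VOICE-API-KEY": api_key}
```

Keeping the key in the server's environment, rather than in source code, is one common way to follow the advice above about never exposing it to clients.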

Endpoint

POST https://api.img2voice.com/v1

This is the main endpoint in v1. Submit an image and configuration parameters, receive a JSON response with an AI description and your audio URL.

Parameters

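Send a JSON body with the following fields. Provide the image either as a URL (image_url) or as base64-encoded data (image_base64). File uploads are sent as multipart/form-data instead, as in the file upload example under Code Examples.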

| Parameter | Type | Required | Description |
|---|---|---|---|
| image_url | string | Yes* | URL of the image to analyze. Must be a public HTTP/HTTPS URL. *Either image_url, image_base64, or a multipart file upload is required. |
| image_base64 | string | Yes* | Raw base64-encoded image data (no data URI prefix). Supported formats: JPEG, PNG, WebP, BMP, TIFF. |
| image_mime_type | string | No | MIME type of the base64 image. Required when using image_base64. e.g. "image/jpeg", "image/png", "image/webp". Default: image/jpeg. |
| voice_id | string | No | Voice: eve, ara, rex, sal, or leo. Default: eve. |
| language | string | No | BCP-47 language code (e.g. "en", "es", "fr") or "auto" for auto-detect. Default: en. |
| detail | string | No | Image detail level: "brief" (1-2 sentences), "standard" (2-3 sentences), or "detailed" (3+ sentences). Default: standard. |
| output_format | string | No | "mp3" or "wav". Default: mp3. |
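
To illustrate the base64 input path, here is a hedged sketch that builds the JSON body from raw image bytes. The function name `build_base64_payload` and its client-side validation are assumptions made for illustration; the field names, defaults, and allowed values are taken from the parameter list above.

```python
import base64

VOICES = {"eve", "ara", "rex", "sal", "leo"}
DETAIL_LEVELS = {"brief", "standard", "detailed"}
OUTPUT_FORMATS = {"mp3", "wav"}


def build_base64_payload(
    image_bytes: bytes,
    mime_type: str = "image/jpeg",
    voice_id: str = "eve",
    language: str = "en",
    detail: str = "standard",
    output_format: str = "mp3",
) -> dict[str, str]:
    """Return a JSON-serializable request body for a base64 image."""
    if voice_id not in VOICES:
        raise ValueError(f"unknown voice_id: {voice_id!r}")
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")
    return {
        # Raw base64 only: no "data:image/...;base64," prefix.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "image_mime_type": mime_type,
        "voice_id": voice_id,
        "language": language,
        "detail": detail,
        "output_format": output_format,
    }
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, in the same way as the URL-based Python example in this reference.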

Response fields

| Field | Type | Description |
|---|---|---|
| id | string | Unique request ID. |
| status | string | "completed" on success. |
| description | string | AI-generated description of the image. |
| audio_url | string | URL to the generated audio file, valid for 24 hours. Serve directly or download. |
| duration_seconds | number | Estimated length of the generated audio in seconds. |
| image_credits_used | integer | Image credits consumed by this request (always 1). |
| image_credits_remaining | integer | Remaining image credits on your account after this request. |
| voice_id | string | Voice used for this request. |
| created_at | string | ISO 8601 timestamp of when the request was processed. |
| source_language | string | Detected source language (only present when translation occurred). |
| translated | boolean | true if the description was translated to the requested language. |
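
The response fields above could be mapped onto a typed structure along these lines. This is a sketch, not an official client: the `SpeechResult` class and `parse_result` function are hypothetical names, and treating `translated` as possibly absent (defaulting to false) is an assumption, since the reference only states that `source_language` is optional.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SpeechResult:
    id: str
    status: str
    description: str
    audio_url: str
    duration_seconds: float
    image_credits_used: int
    image_credits_remaining: int
    voice_id: str
    created_at: str
    source_language: Optional[str] = None  # only present after translation
    translated: bool = False


def parse_result(body: dict[str, Any]) -> SpeechResult:
    """Convert a decoded JSON response into a SpeechResult."""
    if body.get("status") != "completed":
        raise ValueError(f"unexpected status: {body.get('status')!r}")
    return SpeechResult(
        id=body["id"],
        status=body["status"],
        description=body["description"],
        audio_url=body["audio_url"],
        duration_seconds=float(body["duration_seconds"]),
        image_credits_used=int(body["image_credits_used"]),
        image_credits_remaining=int(body["image_credits_remaining"]),
        voice_id=body["voice_id"],
        created_at=body["created_at"],
        source_language=body.get("source_language"),
        translated=bool(body.get("translated", False)),
    )
```

Because `audio_url` is only valid for 24 hours, a caller that needs the audio later would typically download and store the file rather than the URL.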

Code Examples

cURL (URL input)

bash
curl -X POST https://api.img2voice.com/v1 \
  -H "IMG2VOICE-API-KEY: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/photo.png",
    "voice_id": "eve",
    "language": "en",
    "detail": "standard",
    "output_format": "mp3"
  }'

Python (URL input)

python
import requests

response = requests.post(
    "https://api.img2voice.com/v1",
    headers={
        "IMG2VOICE-API-KEY": "sk_live_your_api_key_here",
        "Content-Type": "application/json",
    },
    json={
        "image_url": "https://example.com/photo.png",
        "voice_id": "eve",
        "language": "en",
        "detail": "standard",
        "output_format": "mp3",
    },
    timeout=15,
)
response.raise_for_status()
data = response.json()

print(data["description"])               # e.g. "A sunny beach with waves"
print(data["audio_url"])                 # Stream or download this URL
print(data["duration_seconds"])          # e.g. 3.2
print(data["image_credits_used"])        # 1
print(data["image_credits_remaining"])   # credits left

TypeScript (with file upload)

typescript
async function uploadImage(file: File): Promise<void> {
  const formData = new FormData();
  formData.append("image", file);  // field must be named "image"
  formData.append("voice_id", "eve");
  formData.append("language", "en");
  formData.append("detail", "standard");

  const res = await fetch("https://api.img2voice.com/v1", {
    method: "POST",
    headers: {
      "IMG2VOICE-API-KEY": process.env.IMG2VOICE_API_KEY!,
      // Do NOT set Content-Type: fetch adds it with the multipart boundary
    },
    body: formData,
  });

  if (!res.ok) {
    const error = await res.json();
    throw new Error(`Error: ${error.error.code} - ${error.error.message}`);
  }

  const data = await res.json();
  console.log(data.description);             // AI description
  console.log(data.audio_url);               // Download or stream
  console.log(data.image_credits_remaining); // Credits left
}

Rate Limits

Rate limits apply per API key. When a limit is exceeded, requests return a 429 status with a Retry-After header.

| Plan | Req / min | Concurrent | Image credits |
|---|---|---|---|
| Free | 10 | 1 | 25 |
| Starter | 200 | 10 | Per purchase |
| Paid | 200 | 10 | Per purchase |

*Paid users can purchase additional image credit packs as needed.
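
One way a client might respect these limits is to wait for the duration given in Retry-After before trying again. The sketch below is an illustration under stated assumptions: `post_with_retry` is a hypothetical helper, it assumes Retry-After carries a number of seconds, and the `send` callable is expected to return an object with `.status_code` and `.headers`, such as a `requests.Response`.

```python
import time
from typing import Any, Callable


def post_with_retry(
    send: Callable[[], Any],
    max_attempts: int = 3,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it returns a non-429 response or attempts run out."""
    attempt = 0
    while True:
        attempt += 1
        response = send()
        if response.status_code != 429 or attempt >= max_attempts:
            return response
        # Assumes Retry-After is in seconds; fall back if it is not numeric.
        try:
            wait = float(response.headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(max(wait, 0.0))
```

With the `requests` library, `send` could be `lambda: requests.post(url, headers=headers, json=payload, timeout=15)`. Passing `sleep` as a parameter keeps the helper easy to test without real delays.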

Error Codes

Errors return a JSON body with an error object containing code and message, plus optional fields such as param or retry_after.

json - error response
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded 200 requests per minute. Retry after 18 seconds.",
    "retry_after": 18
  }
}

| Code | HTTP | Description |
|---|---|---|
| invalid_api_key | 401 | API key is missing, malformed, or revoked. |
| invalid_request | 400 | Malformed JSON, missing required fields, unsupported image format, or invalid parameter value. |
| invalid_voice | 400 | voice_id is not recognized. Valid values: eve, ara, rex, sal, leo. |
| invalid_language | 400 | language is not a valid BCP-47 code. Use a supported code or "auto". |
| rate_limit_exceeded | 429 | Requests per minute limit exceeded. Check the Retry-After header. |
| quota_exceeded | 402 | No image credits remaining. Purchase more credits to continue. |
| plan_required | 403 | This feature requires a paid plan (e.g. batch processing). |
| server_error | 500 | Internal server error. Retry the request; if it persists, contact support. |
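
To show how the error body might be consumed, the following sketch wraps it in an exception and flags which codes are worth retrying. The `Img2VoiceError` class is hypothetical, and the choice of `rate_limit_exceeded` and `server_error` as retryable is an interpretation of the descriptions above rather than a documented rule.

```python
from typing import Any

# Codes that may succeed on a later attempt (an assumption, see lead-in).
RETRYABLE_CODES = {"rate_limit_exceeded", "server_error"}


class Img2VoiceError(Exception):
    """Hypothetical exception wrapping a documented error body."""

    def __init__(self, http_status: int, body: dict[str, Any]) -> None:
        error = body.get("error", {})
        self.http_status = http_status
        self.code = error.get("code", "unknown")
        self.message = error.get("message", "")
        self.param = error.get("param")              # optional field
        self.retry_after = error.get("retry_after")  # optional field
        super().__init__(f"{self.code} (HTTP {http_status}): {self.message}")

    @property
    def retryable(self) -> bool:
        """True when the error code suggests a later retry could succeed."""
        return self.code in RETRYABLE_CODES
```

A caller could raise `Img2VoiceError(response.status_code, response.json())` for any non-2xx response, then branch on `retryable` to decide between retrying and surfacing the message.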