API Reference

Complete reference for the img2voice API • Base URL: https://api.img2voice.com • Version: v1

Overview

The img2voice API converts images to speech. You submit an image (by URL or upload); the API uses Grok vision to understand and describe it, then generates speech in any of 60+ languages. The response contains both the AI description and a signed audio URL. There are no streaming endpoints, no WebSocket connections, and no SDKs required: just a simple JSON endpoint.

Requests are authenticated using an IMG2VOICE-API-KEY header. Get your API key from your dashboard.

Authentication

All requests must include your API key in the IMG2VOICE-API-KEY header. API keys are prefixed with sk_live_ for production and sk_test_ for sandbox testing.

Never expose your API key in client-side code. Always make requests from your server.

http - request header

IMG2VOICE-API-KEY: sk_live_your_api_key_here
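
One way to keep the key out of source code and client bundles is to read it from an environment variable on the server. The sketch below assumes a variable named IMG2VOICE_API_KEY (the name the file upload example in this reference reads); build_headers is a hypothetical helper, not part of the API.

```python
import os

# Key prefixes documented above: production and sandbox.
VALID_PREFIXES = ("sk_live_", "sk_test_")


def build_headers(env=os.environ):
    """Return request headers with the API key read from the environment.

    IMG2VOICE_API_KEY is an assumed variable name; use whatever your
    deployment defines.
    """
    api_key = env.get("IMG2VOICE_API_KEY", "")
    # Fail early on a missing or mistyped key instead of waiting for a 401.
    if not api_key.startswith(VALID_PREFIXES):
        raise ValueError("IMG2VOICE_API_KEY is missing or has an unexpected prefix")
    return {"IMG2VOICE-API-KEY": api_key}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.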

Endpoint

POST https://api.img2voice.com/v1

This is the main endpoint in v1. Submit an image and configuration parameters, receive a JSON response with an AI description and your audio URL.

Parameters

Send a JSON body with the following fields. Supply the image either as a URL or as base64-encoded data; file uploads use a multipart form instead (see the file upload example under Code Examples):

Parameter	Type	Required	Description
image_url	string	Yes*	URL of the image to analyze. Must be a public HTTP/HTTPS URL. *Either image_url, image_base64, or a multipart file upload is required.
image_base64	string	Yes*	Raw base64-encoded image data (no data URI prefix). Supported formats: JPEG, PNG, WebP, BMP, TIFF.
image_mime_type	string	No	MIME type of the base64 image. Required when using image_base64. e.g. "image/jpeg", "image/png", "image/webp". Default: image/jpeg.
voice_id	string	No	Voice: eve, ara, rex, sal, or leo. Default: eve.
language	string	No	BCP-47 language code (e.g. "en", "es", "fr") or "auto" for auto-detect. Default: en.
detail	string	No	Image detail level: "brief" (1-2 sentences), "standard" (2-3 sentences), or "detailed" (3+ sentences). Default: standard.
output_format	string	No	"mp3" or "wav". Default: mp3.
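
For base64 input, the request body could be assembled roughly as follows. This is a minimal sketch: build_base64_payload is a hypothetical helper, and its local checks only mirror the values listed in the table above, so the API remains the authority on what is accepted.

```python
import base64

# Allowed values copied from the parameter table above.
VOICES = {"eve", "ara", "rex", "sal", "leo"}
DETAIL_LEVELS = {"brief", "standard", "detailed"}
OUTPUT_FORMATS = {"mp3", "wav"}


def build_base64_payload(image_bytes, mime_type="image/jpeg", voice_id="eve",
                         language="en", detail="standard", output_format="mp3"):
    """Build a JSON-ready request body for base64 image input."""
    if voice_id not in VOICES:
        raise ValueError(f"unknown voice_id: {voice_id!r}")
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")
    return {
        # Raw base64 only: no "data:image/...;base64," prefix.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "image_mime_type": mime_type,
        "voice_id": voice_id,
        "language": language,
        "detail": detail,
        "output_format": output_format,
    }
```

Reading a file with open(path, "rb").read() and passing the bytes in gives a dictionary suitable for the json= argument of requests.post.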

Response fields

Field	Type	Description
id	string	Unique request ID.
status	string	"completed" on success.
description	string	AI-generated description of the image.
audio_url	string	URL to the generated audio file, valid for 24 hours. Serve directly or download.
duration_seconds	number	Estimated length of the generated audio in seconds.
image_credits_used	integer	Image credits consumed by this request (always 1).
image_credits_remaining	integer	Remaining image credits on your account after this request.
voice_id	string	Voice used for this request.
created_at	string	ISO 8601 timestamp of when the request was processed.
source_language	string	Detected source language (only present when translation occurred).
translated	boolean	true if the description was translated to the requested language.
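
A typed wrapper can make the optional fields explicit. The sketch below is illustrative only: SpeechResult and parse_result are hypothetical names, the timestamp is assumed to be an ISO 8601 string that may end in "Z", and the 24-hour validity of audio_url is assumed to start at created_at.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SpeechResult:
    id: str
    status: str
    description: str
    audio_url: str
    duration_seconds: float
    image_credits_remaining: int
    created_at: datetime
    translated: bool = False
    source_language: Optional[str] = None  # only present after translation

    @property
    def audio_expires_at(self):
        # Assumption: the 24-hour window is counted from created_at.
        return self.created_at + timedelta(hours=24)


def parse_result(data):
    """Convert the decoded JSON response into a SpeechResult."""
    return SpeechResult(
        id=data["id"],
        status=data["status"],
        description=data["description"],
        audio_url=data["audio_url"],
        duration_seconds=float(data["duration_seconds"]),
        image_credits_remaining=int(data["image_credits_remaining"]),
        # Older Python versions reject a trailing "Z", so normalize it.
        created_at=datetime.fromisoformat(data["created_at"].replace("Z", "+00:00")),
        translated=bool(data.get("translated", False)),
        source_language=data.get("source_language"),
    )
```

Using data.get for source_language avoids a KeyError on responses where no translation occurred.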

Code Examples

cURL (URL input)

bash

curl -X POST https://api.img2voice.com/v1 \
  -H "IMG2VOICE-API-KEY: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/photo.png",
    "voice_id": "eve",
    "language": "en",
    "detail": "standard",
    "output_format": "mp3"
  }'

Python (URL input)

python

import requests

response = requests.post(
    "https://api.img2voice.com/v1",
    headers={
        "IMG2VOICE-API-KEY": "sk_live_your_api_key_here",
        "Content-Type": "application/json",
    },
    json={
        "image_url": "https://example.com/photo.png",
        "voice_id": "eve",
        "language": "en",
        "detail": "standard",
        "output_format": "mp3",
    },
    timeout=15,
)
response.raise_for_status()
data = response.json()

print(data["description"])               # e.g. "A sunny beach with waves"
print(data["audio_url"])                 # Stream or download this URL
print(data["duration_seconds"])          # e.g. 3.2
print(data["image_credits_used"])        # 1
print(data["image_credits_remaining"])   # credits left
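
Because audio_url is only valid for 24 hours, you may want to save the file if you need it for longer. A minimal sketch using only the standard library (urllib rather than requests); download_audio is a hypothetical helper, and the sketch assumes the signed URL can be fetched with a plain GET and no extra headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download_audio(audio_url, dest_path, timeout=30):
    """Stream the audio at audio_url into dest_path and return the path."""
    dest = Path(dest_path)
    # copyfileobj copies in chunks, so the file is never fully held in memory.
    with urllib.request.urlopen(audio_url, timeout=timeout) as response, \
            dest.open("wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

For example, download_audio(data["audio_url"], "description.mp3") would store the result of the request above next to your script.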

TypeScript (file upload)

typescript

async function uploadImage(file: File): Promise<void> {
  const formData = new FormData();
  formData.append("image", file);  // field must be named "image"
  formData.append("voice_id", "eve");
  formData.append("language", "en");
  formData.append("detail", "standard");

  const res = await fetch("https://api.img2voice.com/v1", {
    method: "POST",
    headers: {
      "IMG2VOICE-API-KEY": process.env.IMG2VOICE_API_KEY!,
      // Do NOT set Content-Type: fetch adds it, including the multipart boundary
    },
    body: formData,
  });

  if (!res.ok) {
    const error = await res.json();
    throw new Error(`Error: ${error.error.code} - ${error.error.message}`);
  }

  const data = await res.json();
  console.log(data.description);             // AI description
  console.log(data.audio_url);               // Download or stream
  console.log(data.image_credits_remaining); // Credits left
}

Rate Limits

Rate limits apply per API key. When exceeded, requests return a 429 status with a Retry-After header.

Plan	Req / min	Concurrent	Image credits
Free	10	1	25
Starter	200	10	Per purchase
Paid	200	10	Per purchase

Paid users can purchase additional image credit packs as needed.
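
A client can honor the Retry-After header by waiting before it retries. The following is a sketch under assumptions: retry_delay and send_with_retry are hypothetical helpers, the header is assumed to carry a number of seconds, and send is any zero-argument callable that returns an object with status_code and headers (for example a requests.Response).

```python
import time


def retry_delay(headers, default=1.0):
    """Seconds to wait after a 429, taken from the Retry-After header.

    Falls back to default when the header is absent or is not a plain
    number (HTTP also permits a date in this header).
    """
    try:
        return max(float(headers.get("Retry-After")), 0.0)
    except (TypeError, ValueError):
        return default


def send_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() again after each 429 until attempts run out."""
    response = send()
    for _ in range(max_attempts - 1):
        if response.status_code != 429:
            break
        sleep(retry_delay(response.headers))
        response = send()
    return response
```

With requests, send could be a lambda that wraps the requests.post call from the Python example. Passing sleep as a parameter keeps the helper easy to test without real delays.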

Error Codes

Errors return a JSON body with an error object containing code, message, and optionally param. Some errors carry additional fields, such as retry_after on rate_limit_exceeded.

json - error response

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded 200 requests per minute. Retry after 18 seconds.",
    "retry_after": 18
  }
}

Code	HTTP	Description
invalid_api_key	401	API key is missing, malformed, or revoked.
invalid_request	400	Malformed JSON, missing required fields, unsupported image format, or invalid parameter value.
invalid_voice	400	voice_id is not recognized. Valid values: eve, ara, rex, sal, leo.
invalid_language	400	language is not a valid BCP-47 code. Use a supported code or "auto".
rate_limit_exceeded	429	Requests per minute limit exceeded. Check the Retry-After header.
quota_exceeded	402	No image credits remaining. Purchase more credits to continue.
plan_required	403	This feature requires a paid plan (e.g. batch processing).
server_error	500	Internal server error. Retry the request; if the error persists, contact support.
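
On the client side, the error object can be turned into an exception that records whether a retry is worthwhile. This is a minimal sketch: Img2VoiceError and error_from_response are hypothetical names, "unknown_error" is a placeholder for bodies that lack a code, and treating only 429 and 500 as retryable is an interpretation of the table above rather than documented behavior.

```python
# Statuses from the table above that may succeed on a later attempt.
RETRYABLE_STATUSES = {429, 500}


class Img2VoiceError(Exception):
    """Client-side representation of an API error response."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self):
        return self.status in RETRYABLE_STATUSES


def error_from_response(status, body):
    """Build an Img2VoiceError from an HTTP status and decoded JSON body."""
    # Tolerate bodies that are not the documented {"error": {...}} shape.
    error = body.get("error", {}) if isinstance(body, dict) else {}
    return Img2VoiceError(
        status,
        error.get("code", "unknown_error"),  # placeholder, not an API code
        error.get("message", ""),
    )
```

A caller might raise error_from_response(response.status_code, response.json()) for any non-2xx response and retry only when the exception's retryable property is true.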

Questions? support@img2voice.com