API Reference
Complete reference for the img2voice API • Base URL: https://api.img2voice.com • Version: v1
Overview
The img2voice API converts images to speech. You submit an image (URL or upload); the API uses Grok vision to understand and describe it, then generates speech in any of 60+ languages. The response contains both the AI description and a signed audio URL. There are no streaming endpoints, no WebSocket connections, and no SDKs required: just a simple JSON endpoint.
Requests are authenticated using an IMG2VOICE-API-KEY header. Get your API key from your dashboard.
Authentication
All requests must include your API key in the IMG2VOICE-API-KEY header. API keys are prefixed with sk_live_ for production and sk_test_ for sandbox testing.
IMG2VOICE-API-KEY: sk_live_your_api_key_here
Endpoint
POST https://api.img2voice.com/v1
This is the main endpoint in v1. Submit an image and configuration parameters to receive a JSON response containing an AI description and your audio URL.
Parameters
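Send a JSON body with the following fields. Provide the image as a URL (image_url) or as base64-encoded data (image_base64); file uploads are sent as multipart form data instead, with the file in a field named image: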
| Parameter | Type | Required | Description |
|---|---|---|---|
| image_url | string | Yes* | URL of the image to analyze. Must be a public HTTP/HTTPS URL. *Exactly one image source is required: image_url, image_base64, or a multipart file upload. |
| image_base64 | string | Yes* | Raw base64-encoded image data (no data URI prefix). Supported formats: JPEG, PNG, WebP, BMP, TIFF. |
| image_mime_type | string | No | MIME type of the base64 image, e.g. "image/jpeg", "image/png", "image/webp". Set this when using image_base64; if omitted, image/jpeg is assumed. |
| voice_id | string | No | Voice: eve, ara, rex, sal, or leo. Default: eve. |
| language | string | No | BCP-47 language code (e.g. "en", "es", "fr") or "auto" for auto-detect. Default: en. |
| detail | string | No | Image detail level: "brief" (1-2 sentences), "standard" (2-3 sentences), or "detailed" (3+ sentences). Default: standard. |
| output_format | string | No | "mp3" or "wav". Default: mp3. |
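For base64 input, the request body can be assembled roughly as follows. This is a minimal sketch, not an official client: the helper name build_base64_payload is hypothetical, and only the field names and defaults are taken from the table above.

```python
import base64


def build_base64_payload(image_bytes: bytes, mime_type: str = "image/jpeg",
                         voice_id: str = "eve", language: str = "en") -> dict:
    """Build a JSON-serializable request body for base64 image input.

    Hypothetical helper: only the field names come from the parameter table.
    """
    if not image_bytes:
        raise ValueError("image_bytes is empty")
    return {
        # Raw base64 only; no "data:image/...;base64," prefix.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "image_mime_type": mime_type,
        "voice_id": voice_id,
        "language": language,
    }


# The eight-byte PNG signature stands in for real image data here.
payload = build_base64_payload(b"\x89PNG\r\n\x1a\n", mime_type="image/png")
print(sorted(payload))
```

The resulting dict can be sent as the JSON body of the request in place of a body that uses image_url.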
Response fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique request ID. |
| status | string | "completed" on success. |
| description | string | AI-generated description of the image. |
| audio_url | string | URL to the generated audio file, valid for 24 hours. Serve directly or download. |
| duration_seconds | number | Estimated length of the generated audio in seconds. |
| image_credits_used | integer | Image credits consumed by this request (always 1). |
| image_credits_remaining | integer | Remaining image credits on your account after this request. |
| voice_id | string | Voice used for this request. |
| created_at | string | ISO 8601 timestamp of when the request was processed. |
| source_language | string | Detected source language (only present when translation occurred). |
| translated | boolean | true if the description was translated to the requested language. |
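The response fields above could be mapped onto a typed record along these lines. This is a sketch under assumptions: SpeechResult, parse_result, and audio_expiry are hypothetical names, the sample values are illustrative rather than real API output, and the 24-hour validity window is assumed to start at created_at.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SpeechResult:
    id: str
    description: str
    audio_url: str
    duration_seconds: float
    image_credits_remaining: int
    created_at: datetime
    source_language: Optional[str]  # only present when translation occurred
    translated: bool


def parse_result(data: dict) -> SpeechResult:
    """Convert a decoded JSON response into a typed record."""
    if data.get("status") != "completed":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    # fromisoformat() rejects a trailing "Z" before Python 3.11, so normalize it.
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return SpeechResult(
        id=data["id"],
        description=data["description"],
        audio_url=data["audio_url"],
        duration_seconds=float(data["duration_seconds"]),
        image_credits_remaining=int(data["image_credits_remaining"]),
        created_at=created,
        source_language=data.get("source_language"),
        translated=bool(data.get("translated", False)),
    )


def audio_expiry(result: SpeechResult) -> datetime:
    # Assumption: the 24-hour validity window starts at created_at.
    return result.created_at + timedelta(hours=24)


# Illustrative values only, not real API output.
sample = {
    "id": "req_123",
    "status": "completed",
    "description": "A sunny beach with waves",
    "audio_url": "https://example.com/audio.mp3",
    "duration_seconds": 3.2,
    "image_credits_used": 1,
    "image_credits_remaining": 24,
    "voice_id": "eve",
    "created_at": "2025-01-01T12:00:00Z",
    "translated": False,
}
result = parse_result(sample)
print(result.description, audio_expiry(result).isoformat())
```

Because audio_url expires, storing the computed expiry next to the URL makes it easier to decide when the file must be downloaded or regenerated.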
Code Examples
cURL (URL input)
curl -X POST https://api.img2voice.com/v1 \
  -H "IMG2VOICE-API-KEY: sk_live_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "image_url": "https://example.com/photo.png",
    "voice_id": "eve",
    "language": "en",
    "detail": "standard",
    "output_format": "mp3"
  }'
Python (URL input)
import requests

response = requests.post(
    "https://api.img2voice.com/v1",
    headers={
        "IMG2VOICE-API-KEY": "sk_live_your_api_key_here",
        "Content-Type": "application/json",
    },
    json={
        "image_url": "https://example.com/photo.png",
        "voice_id": "eve",
        "language": "en",
        "detail": "standard",
        "output_format": "mp3",
    },
    timeout=15,
)
response.raise_for_status()  # raise on 4xx/5xx responses
data = response.json()
print(data["description"])  # e.g. "A sunny beach with waves"
print(data["audio_url"])  # Stream or download this URL
print(data["duration_seconds"])  # e.g. 3.2
print(data["image_credits_used"])  # 1
print(data["image_credits_remaining"])  # credits left
TypeScript (with file upload)
async function uploadImage(file: File): Promise<void> {
  const formData = new FormData();
  formData.append("image", file); // field must be named "image"
  formData.append("voice_id", "eve");
  formData.append("language", "en");
  formData.append("detail", "standard");
  const res = await fetch("https://api.img2voice.com/v1", {
    method: "POST",
    headers: {
      "IMG2VOICE-API-KEY": process.env.IMG2VOICE_API_KEY!,
      // Do NOT set Content-Type; fetch sets it automatically, including the multipart boundary
    },
    body: formData,
  });
  if (!res.ok) {
    const error = await res.json();
    throw new Error(`Error: ${error.error.code} - ${error.error.message}`);
  }
  const data = await res.json();
  console.log(data.description); // AI description
  console.log(data.audio_url); // Download or stream
  console.log(data.image_credits_remaining); // Credits left
}
Rate Limits
Rate limits apply per API key. When exceeded, requests return a 429 status with a Retry-After header.
| Plan | Req / min | Concurrent | Image credits |
|---|---|---|---|
| Free | 10 | 1 | 25 |
| Starter | 200 | 10 | Per purchase |
| Paid | 200 | 10 | Per purchase |
Paid users can purchase additional image credit packs as needed.
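One way to honor the Retry-After header on a 429 is sketched below. The function names are hypothetical, the request itself is passed in as a callable so the sketch stays independent of any HTTP library, and the one-second fallback delay and three-attempt limit are assumptions rather than documented values.

```python
import time
from typing import Callable, Mapping, Protocol


class ResponseLike(Protocol):
    """Anything with a status code and headers, e.g. requests.Response."""
    status_code: int
    headers: Mapping[str, str]


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return the seconds to wait, read from Retry-After when it is numeric."""
    # Note: a plain dict lookup is case-sensitive; requests' headers are not.
    value = headers.get("Retry-After", "")
    try:
        return max(0.0, float(value))
    except ValueError:
        return default  # header missing or given as an HTTP date


def send_with_retry(send: Callable[[], ResponseLike], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> ResponseLike:
    """Call send() until it returns a non-429 response or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code != 429 or attempt == max_attempts:
            return response
        sleep(retry_delay(response.headers))
    raise ValueError("max_attempts must be at least 1")
```

With the requests library, the callable could be something like lambda: requests.post(url, headers=headers, json=body, timeout=15). The final response is returned even if it is still a 429, so the caller decides how to report the failure.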
Error Codes
Errors return a JSON body with an error object containing code and message, and optionally additional fields such as param or retry_after.
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "You have exceeded 200 requests per minute. Retry after 18 seconds.",
    "retry_after": 18
  }
}
| Code | HTTP | Description |
|---|---|---|
| invalid_api_key | 401 | API key is missing, malformed, or revoked. |
| invalid_request | 400 | Malformed JSON, missing required fields, unsupported image format, or invalid parameter value. |
| invalid_voice | 400 | voice_id is not recognized. Valid values: eve, ara, rex, sal, leo. |
| invalid_language | 400 | language is not a valid BCP-47 code. Use a supported code or "auto". |
| rate_limit_exceeded | 429 | Requests per minute limit exceeded. Check the Retry-After header. |
| quota_exceeded | 402 | No image credits remaining. Purchase more credits to continue. |
| plan_required | 403 | This feature requires a paid plan (e.g. batch processing). |
| server_error | 500 | Internal server error. Retry the request — if it persists, contact support. |
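Client code might turn the error object into an exception along these lines. This is a sketch only: Img2VoiceError and error_from_body are hypothetical names, the unknown_error fallback is invented for bodies that lack a code, and treating only rate_limit_exceeded and server_error as retryable is an assumption based on the descriptions above.

```python
from typing import Optional

# Assumption: only these codes are worth retrying automatically.
RETRYABLE_CODES = {"rate_limit_exceeded", "server_error"}


class Img2VoiceError(Exception):
    """Hypothetical exception type wrapping the documented error object."""

    def __init__(self, http_status: int, code: str, message: str,
                 param: Optional[str] = None,
                 retry_after: Optional[float] = None):
        super().__init__(f"{http_status} {code}: {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.param = param
        self.retry_after = retry_after

    @property
    def retryable(self) -> bool:
        return self.code in RETRYABLE_CODES


def error_from_body(http_status: int, body: dict) -> Img2VoiceError:
    """Build an exception from a decoded error response body."""
    err = body.get("error") or {}
    return Img2VoiceError(
        http_status=http_status,
        code=err.get("code", "unknown_error"),  # fallback name is an assumption
        message=err.get("message", ""),
        param=err.get("param"),
        retry_after=err.get("retry_after"),
    )


example = error_from_body(401, {"error": {"code": "invalid_api_key",
                                          "message": "API key is missing."}})
print(example, example.retryable)
```

Errors such as invalid_request or quota_exceeded need a change to the request or the account, so retrying them unchanged would only consume rate limit.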
Questions? support@img2voice.com