Version: v1.2

Image Caption

Use this API to analyze images and automatically generate captions or descriptive text.

Prerequisites

  • You’re familiar with the concepts described on the Platform overview page.
  • You have created a memories.ai API key.
  • Supported file formats: image/png, image/jpeg
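Because only two formats are accepted, it can help to check a local file before uploading it. The following is a minimal sketch, not part of the memories.ai API: the names SUPPORTED_MIME_TYPES and is_supported_image are hypothetical, and the check relies on the file extension only.

```python
import mimetypes

# Formats listed under Prerequisites; the constant name is illustrative.
SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg"}


def is_supported_image(path: str) -> bool:
    """Return True if the file name maps to a supported MIME type."""
    # guess_type looks only at the file extension, not at the file contents.
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type in SUPPORTED_MIME_TYPES
```

A client could call is_supported_image("test_image.png") before building the request and skip the upload when it returns False.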

Host URL

  • https://security.memories.ai

Endpoints

  • POST /v1/understand/uploadImg – Upload image by URL (JSON body)
  • POST /v1/understand/uploadImg – Upload image by local file (multipart form)

Both variants use the same path; the request body format distinguishes them.

Request Example (Upload by URL)

import requests

url = "https://security.memories.ai/v1/understand/uploadImg"
headers = {"Authorization": "<API_KEY>"}

# JSON body: the image URL plus the prompts that steer the caption.
json_body = {
    "url": "https://example.com/test_image.png",
    "user_prompt": "What's happening in this picture?",
    "system_prompt": "You are an image understanding system.",
    "thinking": False,
}

response = requests.post(url, headers=headers, json=json_body)
print(response.json())

Replace the following placeholders:

  • <API_KEY>: Your actual memories.ai API key.
  • "url" (in json_body): A publicly accessible URL of the image to caption.

Request Example (Upload by Local File)

import json
import requests

url = "https://security.memories.ai/v1/understand/uploadImg"
headers = {"Authorization": "<API_KEY>"}

# Prompt parameters travel as a JSON document in the "req" form part.
data = {
    "user_prompt": "What's happening in this picture?",
    "system_prompt": "You are an image understanding system.",
    "thinking": False,
}

# Open the image in a context manager so the file handle is closed after the upload.
with open("test_image.png", "rb") as image_file:
    files = [
        ("req", ("req.json", json.dumps(data), "application/json")),
        ("file", ("test_image.png", image_file, "image/png")),
    ]
    response = requests.post(url, files=files, headers=headers)

print(response.json())

Response Example

Status code 200

{
  "code": 0,
  "msg": "success",
  "data": {
    "text": "It shows a person lying on the ground. The person's clothing and posture are indistinct due to the poor image quality. It's impossible to determine if they are injured, unconscious, or simply resting.",
    "token": {
      "input": 273,
      "output": 79,
      "total": 352
    }
  }
}

Response Structure

Name        Type     Required   Description
code        int      Yes        Status code (0 for success, -1 for failure)
msg         string   Yes        Message text
data        object   Yes        Response data
» text      string   Yes        Generated caption or descriptive text
» token     object   Yes        Token usage details
»» input    int      Yes        Number of input tokens
»» output   int      Yes        Number of output tokens
»» total    int      Yes        Total token count
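The response fields can be unpacked as in the sketch below. It assumes the body has exactly the shape documented in this section; CaptionResult, CaptionError, and parse_caption_response are hypothetical names chosen for illustration, not part of any memories.ai SDK.

```python
from dataclasses import dataclass


@dataclass
class CaptionResult:
    """Caption text plus the token usage reported by the API."""
    text: str
    input_tokens: int
    output_tokens: int
    total_tokens: int


class CaptionError(Exception):
    """Raised when the response body reports a non-zero code."""


def parse_caption_response(payload: dict) -> CaptionResult:
    # A code of 0 means success; any other value is treated as a failure here.
    if payload.get("code") != 0:
        raise CaptionError(f"code={payload.get('code')}, msg={payload.get('msg')}")
    data = payload["data"]
    token = data["token"]
    return CaptionResult(
        text=data["text"],
        input_tokens=token["input"],
        output_tokens=token["output"],
        total_tokens=token["total"],
    )
```

With the request examples in this section, parse_caption_response(response.json()) would return the caption and token counts, or raise CaptionError when code is -1.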

Note: The thinking parameter toggles reasoning mode. Set it to True to request more detailed responses; the request examples in this section set it to False.
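One way to make the toggle explicit is a small helper that assembles the JSON body for the upload-by-URL request. This is only a sketch: build_caption_request is a hypothetical name, and the default system prompt is copied from the request example rather than from any documented default.

```python
def build_caption_request(
    image_url: str,
    user_prompt: str,
    system_prompt: str = "You are an image understanding system.",
    thinking: bool = False,
) -> dict:
    """Assemble the JSON body for an upload-by-URL caption request."""
    return {
        "url": image_url,
        "user_prompt": user_prompt,
        "system_prompt": system_prompt,
        # True enables reasoning mode; False matches the request examples.
        "thinking": thinking,
    }
```

The returned dict can be passed as json= to requests.post, for example build_caption_request("https://example.com/test_image.png", "What's happening in this picture?", thinking=True).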