Version: v1.2

Video Chat

By providing one or more videoNo values, developers can ask Memories.ai to analyze, summarize, annotate, or perform other reasoning tasks across the corresponding uploaded videos. This API also supports streaming responses to reduce latency during generation.



Prerequisites

  • You have created a memories.ai API key.
  • You have uploaded a video via the Upload API and obtained its videoNo.
  • The video is currently in the PARSE status.

Host URL

  • https://api.memories.ai

Endpoint

POST /serve/api/v1/chat


Request Example

Streaming mode:

import requests
import json

headers = {
    "Authorization": "<API_KEY>",
    "Content-Type": "application/json",
    "Accept": "text/event-stream"
}

payload = {
    "video_nos": ["<VIDEO_ID_1>", "<VIDEO_ID_2>"],  # List of video IDs to chat about
    "prompt": "Summarize the emotional moments in these videos",  # User query
    "session_id": "<SESSION_ID>",  # Chat session ID
    "unique_id": "<UNIQUE_ID>",
}

response = requests.post(
    "https://api.memories.ai/serve/api/v1/chat",
    headers=headers,
    data=json.dumps(payload),
    stream=True
)

if response.status_code != 200:
    print(response.status_code)
    print(response.text)
else:
    try:
        for line in response.iter_lines(decode_unicode=True):
            if not line:
                continue  # Skip blank lines
            if line.strip().lower() == 'data:"done"':
                print("\n")  # End-of-response marker
                break
            if line.startswith("data:"):
                # Strip only the leading "data:" prefix so the payload stays intact
                print(line[len("data:"):].strip(), end="", flush=True)
    except Exception as e:
        print(str(e))

Non-streaming mode:

import requests
import json

headers = {
    "Authorization": "<API_KEY>",
    "Content-Type": "application/json",
}

payload = {
    "video_nos": ["<VIDEO_ID_1>", "<VIDEO_ID_2>"],  # List of video IDs to chat about
    "prompt": "Summarize the emotional moments in these videos",  # User query
    "session_id": "<SESSION_ID>",  # Chat session ID
    "unique_id": "<UNIQUE_ID>",
}

response = requests.post(
    "https://api.memories.ai/serve/api/v1/chat",
    headers=headers,
    data=json.dumps(payload),
    stream=False
)

if response.status_code != 200:
    print(response.status_code)
    print(response.text)
else:
    try:
        for line in response.iter_lines(decode_unicode=True):
            if not line:
                continue  # Skip blank lines
            if line.strip().lower() == 'data:"done"':
                print("\n")  # End-of-response marker
                break
            if line.startswith("data:"):
                # Strip only the leading "data:" prefix so the payload stays intact
                print(line[len("data:"):].strip(), end="", flush=True)
    except Exception as e:
        print(str(e))

Request Body

{
    "video_nos": [
        "string"
    ],
    "prompt": "string",
    "session_id": "123456",
    "unique_id": "default"
}

Request Parameters

| Name | Location | Type | Required | Description |
|---|---|---|---|---|
| Authorization | header | string | Yes | API key used for authorization |
| Accept | header | string | No | Set to text/event-stream for streaming mode |
| video_nos | body | [string] | Yes | List of video numbers |
| prompt | body | string | Yes | Natural-language prompt |
| session_id | body | int | No | ID of the chat session |
| unique_id | body | string | No | Defaults to "default" |
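
The body parameters listed above can be assembled and checked before a request is sent. The following is a minimal sketch, not part of the official client: build_chat_payload is a hypothetical helper name, and leaving session_id out of the body when it is not supplied is an assumption based on the field being marked as not required.

```python
from typing import Any, Dict, List, Optional


def build_chat_payload(
    video_nos: List[str],
    prompt: str,
    session_id: Optional[int] = None,
    unique_id: str = "default",
) -> Dict[str, Any]:
    """Assemble a chat request body (hypothetical helper, not an official API)."""
    # video_nos and prompt are the two required body parameters.
    if not video_nos:
        raise ValueError("video_nos must contain at least one video number")
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    payload: Dict[str, Any] = {
        "video_nos": list(video_nos),
        "prompt": prompt,
        "unique_id": unique_id,
    }
    # Assumption: an optional field can simply be omitted when not supplied.
    if session_id is not None:
        payload["session_id"] = session_id
    return payload
```

The returned dictionary can be passed to json.dumps in either request example.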

Response Example

🧠 Thinking Message

{
    "type": "thinking",
    "title": "Based on selected videos, fetch detailed information",
    "content": "Okay, the user wants a \"Video summary.\" I've been given some selected videos and need to fetch their detailed information to understand their content. This means I need to go",
    "sessionId": "606120397607260160"
}

🔁 Reference Message (ref)

{
    "type": "ref",
    "sessionId": "606143186439766016",
    "ref": [{
        "video": {
            "duration": "10",
            "video_no": "VI606140356924534784",
            "video_name": "test_video_gz_visual_understanding_fuse_s9_video_fuse_4_video_fuse_4"
        },
        "refItems": [{
            "videoNo": "VI606140356924534784",
            "startTime": 23,
            "type": "keyframe"
        }, {
            "videoNo": "VI606140356924534784",
            "startTime": 30,
            "type": "visual_ts",
            "endTime": 36,
            "text": "A close-up view shows a collection of items on a concrete surface. To the left, two brown, round objects resembling small barrels or planters are visible. One has a blue lid or insert. Next to them, a black corrugated pipe is partially visible, along with a red flexible tube and a yellow pole with a red tip. A grey, textured wall or fence dominates the right side of the frame, with wooden planks visible at the top right."
        }, {
            "videoNo": "VI606140356924534784",
            "startTime": 30,
            "type": "audio_ts",
            "endTime": 36,
            "text": "A close-up view shows a collection of items on a concrete surface. To the left, two brown, round objects resembling small barrels or planters are visible. One has a blue lid or insert. Next to them, a black corrugated pipe is partially visible, along with a red flexible tube and a yellow pole with a red tip. A grey, textured wall or fence dominates the right side of the frame, with wooden planks visible at the top right."
        }]
    }]
}
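
A reference message can be flattened into a list of timestamped segments, for example to jump to the cited moments in a player. This is a rough sketch that assumes the message has already been decoded into a Python dictionary with the shape shown above. collect_ref_segments and the Segment tuple are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

# (videoNo, item type, startTime, endTime or None)
Segment = Tuple[str, str, float, Optional[float]]


def collect_ref_segments(message: Dict[str, Any]) -> List[Segment]:
    """Flatten the refItems of a decoded "ref" message (hypothetical helper)."""
    if message.get("type") != "ref":
        return []
    segments: List[Segment] = []
    for entry in message.get("ref", []):
        for item in entry.get("refItems", []):
            segments.append((
                item["videoNo"],
                item["type"],
                item["startTime"],
                item.get("endTime"),  # keyframe items carry no endTime
            ))
    return segments
```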

💬 Content Message

{
    "type": "content",
    "role": "assistant",
    "content": "A\" shape, is suggested to be inspired by Eiffel's past",
    "sessionId": "606122521255088128"
}
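
Because each content message carries only a fragment of the answer, a client typically concatenates the fragments in arrival order. The sketch below assumes that every streamed line has the form data:<JSON message> and that the stream ends with data:"Done", as the check in the request example suggests. parse_data_line and collect_answer are hypothetical helper names.

```python
import json
from typing import Any, Iterable, List, Optional


def parse_data_line(line: str) -> Optional[Any]:
    """Decode the payload of a "data:" line; return None for any other line."""
    if not line or not line.startswith("data:"):
        return None
    body = line[len("data:"):].strip()
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body  # Fall back to the raw text if the payload is not JSON


def collect_answer(lines: Iterable[str]) -> str:
    """Join the fragments of all "content" messages up to the end marker."""
    parts: List[str] = []
    for line in lines:
        payload = parse_data_line(line)
        # Assumption: a bare "Done" string marks the end of the response.
        if isinstance(payload, str) and payload.lower() == "done":
            break
        if isinstance(payload, dict) and payload.get("type") == "content":
            parts.append(payload.get("content", ""))
    return "".join(parts)
```

With the streaming request example, collect_answer(response.iter_lines(decode_unicode=True)) would return the assembled answer text while skipping thinking and ref messages.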

✅ Response End Example

Success Response (Status Code: 200)

{
    "code": "SUCCESS",
    "data": "Done"
}

⚠️ Error End Conditions

Any final response where "data" is not "Done" is considered an error. Common examples include:

  • "data": "Error"
  • "data": "No videos found for the provided video numbers."
  • "data": "user don't login"
  • "data": "Video is not parsed:"
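
The end-of-response rule can be enforced with a small guard. This is a sketch under the assumption that the final response has been decoded into a dictionary with a "data" field, as in the success example. ChatEndError and ensure_done are hypothetical names, and the comparison is an exact match on "Done" as the rule is worded.

```python
from typing import Any, Dict


class ChatEndError(RuntimeError):
    """Raised when the final response does not signal success (hypothetical)."""


def ensure_done(final: Dict[str, Any]) -> None:
    """Raise ChatEndError unless the final response's "data" is exactly "Done"."""
    data = final.get("data")
    if data != "Done":
        # Any other value, such as "Error" or "Video is not parsed:", is an error.
        raise ChatEndError(f"chat ended with an error: {data!r}")
```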

Response Result

| Status code | Status code msg | Description | Data |
|---|---|---|---|
| 200 | OK | none | Inline |

Response Structure

Status code 200

| Name | Type | Required | Restriction | Description |
|---|---|---|---|---|
| code | string | true | none | Response code |
| data | object or string | true | none | JSON data. "Done" on success; error message string or structured data otherwise |
| type | string | true | enum | Type of message, e.g., "thinking", "ref", "content" |
| title | string | false | none | Title of the thinking message (used when type is "thinking") |
| content | string | false | none | Text content of the message (used in "thinking" or "content" types) |
| sessionId | string | true | UUID/ID | ID of the current session |
| role | string | false | enum | Role of the responder; currently used with "assistant" |
| ref | array | false | none | List of reference objects containing video and timestamp-based metadata |
| video | object | true | nested | Video metadata (only when type is "ref") |
| video_no | string | true | none | Unique video identifier |
| video_name | string | true | none | Name of the video |
| duration | string or number | true | seconds | Duration of the video |
| refItems | array | true | none | Reference annotations such as keyframe, visual_ts, audio_ts |
| videoNo | string | true | none | Video identifier in refItems (redundant with video_no) |
| startTime | number | true | seconds | Start timestamp of the referenced segment |
| endTime | number | false | seconds | End timestamp (if applicable, e.g., for visual_ts or audio_ts) |
| text | string | false | none | Transcribed or described content of the referenced segment |
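
The required fields in the structure above can be checked on each decoded message before it is used. The sketch below is one possible reading of the flat field list, not a schema published by the API. It assumes that type and sessionId apply to every streamed message, that video and refItems are required inside each element of ref, and that code and data belong only to the final response. missing_fields is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Assumed grouping of the required fields; the original list is flat.
REQUIRED_TOP = ("type", "sessionId")
REQUIRED_VIDEO = ("video_no", "video_name", "duration")
REQUIRED_REF_ITEM = ("videoNo", "startTime")


def missing_fields(message: Dict[str, Any]) -> List[str]:
    """Return dotted paths of required fields absent from a decoded message."""
    missing = [key for key in REQUIRED_TOP if key not in message]
    for i, entry in enumerate(message.get("ref", [])):
        video = entry.get("video")
        if video is None:
            missing.append(f"ref[{i}].video")
        else:
            missing += [f"ref[{i}].video.{k}" for k in REQUIRED_VIDEO if k not in video]
        if "refItems" not in entry:
            missing.append(f"ref[{i}].refItems")
        for j, item in enumerate(entry.get("refItems", [])):
            missing += [
                f"ref[{i}].refItems[{j}].{k}" for k in REQUIRED_REF_ITEM if k not in item
            ]
    return missing
```

An empty list means the message carries every field this sketch treats as required.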