Upload Project
Create a new video project by uploading a video file directly for AI-powered caption generation
POST
Create a new video project by uploading a video file directly to Submagic. This endpoint accepts multipart/form-data uploads and is ideal for applications where you have video files stored locally or want to upload directly from user devices.
This endpoint requires authentication and has a rate limit of 500 requests per hour due to the resource-intensive nature of file uploads.
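Because uploads are capped at 500 requests per hour, a client that submits many files may want to pace itself. The following is a minimal sketch of one possible client-side approach, a sliding-window limiter. The class name and the sliding-window interpretation are assumptions for illustration; this page does not say how the server measures the hour.

```python
from collections import deque


class UploadRateLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not part of the API)."""

    def __init__(self, max_requests: int = 500, window_seconds: float = 3600.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of requests inside the current window

    def seconds_until_allowed(self, now: float) -> float:
        """Return 0.0 if a request may be sent at `now`, else the wait in seconds."""
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request must age out before another one fits.
        return self.window_seconds - (now - self._sent[0])

    def record(self, now: float) -> None:
        """Call after each upload request is actually sent."""
        self._sent.append(now)
```

In practice `now` would typically come from `time.monotonic()`, with `record` called once per request sent.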
Authentication
Your Submagic API key, starting with sk-.
Request Body (multipart/form-data)
A descriptive title for your video project (1-100 characters)
Language code for transcription (e.g., “en”, “es”, “fr”). Use the languages
endpoint to get available options.
Video file to upload. Must be in a supported format and under 2GB.
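To make the multipart/form-data shape concrete, here is a minimal sketch that encodes the title, language, and file fields using only the Python standard library. The function name is hypothetical, and the endpoint URL and authentication header are not shown in this section, so the sketch stops at building the body and leaves sending it to whichever HTTP client you use.

```python
import uuid


def encode_upload_form(title: str, language: str, filename: str, file_bytes: bytes):
    """Build a multipart/form-data body with the title, language, and file fields.

    Returns (content_type_header_value, body_bytes).
    """
    if not 1 <= len(title) <= 100:
        raise ValueError("title must be 1-100 characters")

    boundary = uuid.uuid4().hex
    # .mov files are QuickTime containers; everything else is treated as MP4 here.
    file_type = "video/quicktime" if filename.lower().endswith(".mov") else "video/mp4"

    parts = []
    for name, value in (("title", title), ("language", language)):
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)
```

This version holds the whole file in memory and does not escape quotes in filenames, so it suits small test clips; for files approaching the size limit, a client that streams the file from disk is likely the better choice.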
Name of an AI edit template to apply. AI edit templates automatically apply
AI-powered scene splitting, B-roll, music, and styling to your video. Available templates:
"kelly" (minimal, design), "karl" (effective, modern), "ella" (dynamic, bold).
When aiEditTemplate is provided, the only other fields you can pass
alongside it are title, language, file, webhookUrl, and dictionary.
All other fields will be ignored or rejected.
ID of a saved preset to apply to the project. A preset is a snapshot of your
project settings — including captions style, hook title, music, effects, and
more. When provided, all preset settings are automatically applied after
transcription completes. To get your preset ID, go to the Presets page in the
app, open the dropdown menu on any preset card, and click “Copy ID”.
Cannot be combined with templateName, userThemeId, hookTitle, music,
items, magicZooms, magicBrolls, magicBrollsPercentage,
removeSilencePace, or removeBadTakes, because the preset already controls these
settings.
Template to apply for styling. Use the templates
endpoint to get available options. Defaults to
“Sara” if not specified. Cannot be used together with userThemeId.
ID of a custom user theme to apply for styling. Must be a valid UUID of a
theme that belongs to you or your team. Cannot be used together with
templateName. To find the ID of a custom theme, open a project, select the
theme, and press the pen icon to edit it. The ID appears under the theme's name.
Adds an animated hook caption. Pass "true" to enable the default hook, or
pass a JSON string such as
{"text":"Stop scrolling—watch this in 30 seconds","template":"tiktok","top":45,"size":32}.
- text: Optional custom copy (1-100 characters)
- template: Optional template name (defaults to "tiktok"). Use the hook title templates endpoint to discover valid names.
- top: Optional vertical position between 0-80 (default 50)
- size: Optional font size between 0-80 (default 30)
VALIDATION_ERROR.
URL to receive webhook notifications when processing is complete. Must be a
valid HTTPS URL.
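The hookTitle field described above accepts either the literal string "true" or a JSON string. As a rough sketch, a helper like the one below could produce either form while checking the documented ranges before the request is sent. The function name is hypothetical, and template names are not validated here because the valid list comes from the hook title templates endpoint.

```python
import json


def build_hook_title(text=None, template=None, top=None, size=None) -> str:
    """Return the string to send as hookTitle.

    With no arguments, the default hook is requested by sending "true".
    """
    options = {}
    if text is not None:
        if not 1 <= len(text) <= 100:
            raise ValueError("text must be 1-100 characters")
        options["text"] = text
    if template is not None:
        options["template"] = template
    for key, value in (("top", top), ("size", size)):
        if value is not None:
            if not 0 <= value <= 80:
                raise ValueError(f"{key} must be between 0 and 80")
            options[key] = value
    if not options:
        return "true"
    # ensure_ascii=False keeps non-ASCII punctuation readable in the payload.
    return json.dumps(options, ensure_ascii=False, separators=(",", ":"))
```

Omitted keys are left out of the JSON so that the server-side defaults ("tiktok", 50, and 30) apply.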
JSON array string of custom words or phrases to improve transcription accuracy
(max 100 items, 50 characters each).
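Since dictionary is sent as a JSON array string rather than as repeated form fields, it may help to serialize and check it in one place. The sketch below assumes a hypothetical helper name and additionally rejects empty strings, which is an assumption rather than a documented rule.

```python
import json


def build_dictionary(terms) -> str:
    """Serialize custom terms as the JSON array string for the dictionary field."""
    terms = list(terms)
    if len(terms) > 100:
        raise ValueError("dictionary accepts at most 100 items")
    for term in terms:
        # Rejecting empty strings is an assumption; the 50-character cap is documented.
        if not isinstance(term, str) or not term or len(term) > 50:
            raise ValueError(f"invalid dictionary term: {term!r}")
    return json.dumps(terms, ensure_ascii=False)
```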
JSON string describing optional items to insert into the video. Pass an
array of objects. Each item must include a type field that specifies whether it is
user media from your library or AI-generated content.
Important: Each item must have a type field. Entries in items cannot overlap
in time. When present, this metadata is parsed after the upload completes and
queued for rendering.
Enable automatic zoom effects on the video to enhance visual engagement. Pass
“true” or “false” as string. Optional, defaults to “false”.
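The two rules stated for items (every entry needs a type field, and entries cannot overlap in time) can be checked before upload. The sketch below is illustrative only: this section does not list the timing field names, so `startTime` and `endTime` are placeholders passed in as parameters, and treating back-to-back entries as non-overlapping is also an assumption.

```python
import json


def build_items(items, start_key="startTime", end_key="endTime") -> str:
    """Check items for a type field and time overlaps, then serialize them.

    start_key and end_key are placeholder names; replace them with the timing
    fields that the items schema actually uses.
    """
    for item in items:
        if "type" not in item:
            raise ValueError("each item must include a type field")
    # Sort a copy by start time so only neighbours need to be compared.
    ordered = sorted(items, key=lambda item: item[start_key])
    for previous, current in zip(ordered, ordered[1:]):
        # An item starting exactly when the previous one ends is allowed here.
        if current[start_key] < previous[end_key]:
            raise ValueError("items overlap in time")
    return json.dumps(items)
```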
Enable automatic B-roll insertion to enhance video content with relevant
supplementary footage. Pass “true” or “false” as string. Optional, defaults to
“false”.
Percentage of automatic B-rolls to include in the video (0-100). Pass as
string. Only effective when magicBrolls is enabled. Optional, defaults to
“50”.
Automatically remove silence from the video at the specified pace. Pass as
string. Optional. Allowed values: “natural”, “fast”, “extra-fast”.
- “extra-fast”: 0.1-0.2 seconds of silence removal
- “fast”: 0.2-0.6 seconds of silence removal
- “natural”: 0.6+ seconds of silence removal
Automatically detect and remove bad takes and silence from the video using AI
analysis. Pass “true” or “false” as string. Optional, defaults to “false”.
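Several of the options above are sent as strings even though they represent booleans or numbers. The sketch below shows one way to convert Python values into those string fields. The function name is hypothetical, and the pairing of each description with the field names magicZooms, magicBrolls, magicBrollsPercentage, removeSilencePace, and removeBadTakes is inferred from the names listed elsewhere in this section.

```python
ALLOWED_PACES = {"natural", "fast", "extra-fast"}


def encode_enhancement_fields(magic_zooms=False, magic_brolls=False,
                              magic_brolls_percentage=50,
                              remove_silence_pace=None, remove_bad_takes=False):
    """Convert Python values into the string form fields described above."""
    if not 0 <= magic_brolls_percentage <= 100:
        raise ValueError("magicBrollsPercentage must be between 0 and 100")
    fields = {
        "magicZooms": "true" if magic_zooms else "false",
        "magicBrolls": "true" if magic_brolls else "false",
        "removeBadTakes": "true" if remove_bad_takes else "false",
    }
    if magic_brolls:
        # The percentage only has an effect when magicBrolls is enabled.
        fields["magicBrollsPercentage"] = str(magic_brolls_percentage)
    if remove_silence_pace is not None:
        if remove_silence_pace not in ALLOWED_PACES:
            raise ValueError(f"removeSilencePace must be one of {sorted(ALLOWED_PACES)}")
        fields["removeSilencePace"] = remove_silence_pace
    return fields
```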
Enable AI-powered audio cleanup that removes background noises from the video.
Pass “true” or “false” as string. Optional, defaults to “false”.
Hide captions from the exported video. Pass "true" or "false" as
string. Optional, defaults to "false".
JSON string describing an optional background music track that spans the full
project duration.
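The request body has several mutual-exclusion rules: aiEditTemplate restricts which other fields may be sent, a preset cannot be combined with the settings it already controls, and templateName and userThemeId exclude each other. A pre-flight check along the following lines could catch these before uploading. This is a sketch only: the function name is hypothetical, and `presetId` is an assumed name for the preset ID field, which this section describes but does not name.

```python
AI_EDIT_COMPANIONS = {"title", "language", "file", "webhookUrl", "dictionary"}
PRESET_CONFLICTS = {
    "templateName", "userThemeId", "hookTitle", "music", "items",
    "magicZooms", "magicBrolls", "magicBrollsPercentage",
    "removeSilencePace", "removeBadTakes",
}


def find_field_conflicts(fields, preset_key="presetId"):
    """Return human-readable problems with a collection of form field names.

    preset_key is an assumed name for the preset ID field.
    """
    names = set(fields)
    problems = []
    if "aiEditTemplate" in names:
        extra = names - AI_EDIT_COMPANIONS - {"aiEditTemplate"}
        if extra:
            problems.append(f"not allowed with aiEditTemplate: {sorted(extra)}")
    if preset_key in names:
        clash = names & PRESET_CONFLICTS
        if clash:
            problems.append(f"not allowed with {preset_key}: {sorted(clash)}")
    if {"templateName", "userThemeId"} <= names:
        problems.append("templateName and userThemeId are mutually exclusive")
    return problems
```

An empty list means none of the documented combination rules were violated; it does not validate the individual field values.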
Supported Formats & Limits
Supported Formats
- MP4 (.mp4)
- MOV (.mov)
File Limits
- Max size: 2GB
- Max duration: 2 hours
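Checking the format and size limits locally avoids spending a rate-limited request on a file that would be rejected. The sketch below uses a hypothetical helper name and assumes that "2GB" means 2 GiB (2 × 1024³ bytes), which the source does not specify. It does not check the 2-hour duration limit, since that requires reading the video container.

```python
import os

SUPPORTED_EXTENSIONS = {".mp4", ".mov"}
MAX_FILE_BYTES = 2 * 1024 ** 3  # assumes "2GB" means 2 GiB; the source does not say


def check_upload_candidate(filename: str, size_bytes: int):
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds the 2GB limit")
    return problems
```

For a file on disk, the size can be obtained with `os.path.getsize(path)`.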

