
Segment Document

Segment a document into sections using a schema. Returns page ranges for each identified segment. Provide a file for end-to-end processing, or a checkpoint_id from a previous /convert call.
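The choice between the two input paths can be sketched as a small helper. The name `choose_input` and the rule that exactly one of the two inputs must be given are assumptions made for illustration; the reference only says to provide one or the other.

```python
def choose_input(file_path=None, checkpoint_id=None):
    """Return (form_fields, path_to_upload) for one of the two input paths.

    Hypothetical helper, not part of the API: it only decides which
    fields a request to /api/v1/segment would carry.
    """
    # Assumption: exactly one of the two inputs is expected per request.
    if (file_path is None) == (checkpoint_id is None):
        raise ValueError("provide exactly one of file_path or checkpoint_id")
    if checkpoint_id is not None:
        # Reuse the parse from an earlier /convert call; nothing is uploaded.
        return {"checkpoint_id": checkpoint_id}, None
    # End-to-end path: the file is uploaded with this request.
    return {}, file_path
```

The returned fields can be merged into the form data of the request, and the returned path, when present, opened and sent as the `file` upload.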

POST /api/v1/segment

Authorizations

X-API-Key (string, header, required)
Your API key for authentication

Body Parameters

segmentation_schema (string, body, required)
The JSON schema for document segmentation. Should contain segment names and descriptions.
file_url (string, body)
Optional file URL. Provide either file/file_url or checkpoint_id.
checkpoint_id (string, body)
Checkpoint ID from a previous /convert request. Skips re-parsing.
mode (string, body)
Output mode for parsing. 'fast', 'balanced', or 'accurate'.
max_pages (integer, body)
Maximum number of pages to process.
page_range (string, body)
Page range to process, comma separated like 0,5-10,20.
save_checkpoint (boolean, body)
Save a checkpoint after processing.
skip_cache (boolean, body)
Skip the cache and re-run.
webhook_url (string, body)
Optional webhook URL to call when complete.
file (file, body)
Input PDF, Word, PowerPoint, or image file. Images must be png, jpg, or webp.
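As a rough illustration of how two of these fields might be prepared, the sketch below serializes a schema to a JSON string and builds a `page_range` value from a list of page indices. The schema layout (a `segments` list of `name`/`description` objects), the segment names, the helper `format_page_range`, and the reading of page indices as 0-based are all assumptions; the reference only states that the schema should contain segment names and descriptions and gives `0,5-10,20` as a sample range.

```python
import json


def format_page_range(pages):
    """Collapse page indices into a string such as "0,5-10,20".

    Hypothetical helper: consecutive pages are merged into "start-end" runs.
    """
    parts = []
    start = prev = None
    for p in sorted(set(pages)):
        if start is None:
            start = prev = p
        elif p == prev + 1:
            prev = p  # extend the current run
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = p
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


# Assumed schema layout; the exact structure the API expects is not
# specified in this reference beyond "segment names and descriptions".
schema = {
    "segments": [
        {"name": "Introduction", "description": "Opening section of the document"},
        {"name": "Appendix", "description": "Supplementary material at the end"},
    ]
}

payload = {
    # The parameter is typed as a string, so the schema is JSON-encoded.
    "segmentation_schema": json.dumps(schema),
    "page_range": format_page_range([0, 5, 6, 7, 8, 9, 10, 20]),
}
```

Here `payload["page_range"]` evaluates to `"0,5-10,20"`, matching the sample value in the parameter description.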

Cookies

wos-session (string, cookie)
Session cookie
access_token (string, cookie)
Access token cookie
datalab_active_team (string, cookie)
Active team cookie

Response

Successful Response
Example Request
import requests

url = "https://www.datalab.to/api/v1/segment"
payload = {
    "segmentation_schema": "<string>",  # JSON schema, passed as a string
    "mode": "fast",
    "skip_cache": "false",  # form fields are sent as strings
}
headers = {"X-API-Key": "<api-key>"}

# The multipart field name matches the `file` body parameter.
with open("example-file", "rb") as f:
    files = {"file": ("example-file", f)}
    response = requests.post(url, data=payload, files=files, headers=headers)
print(response.text)
200 Success
{
  "request_id": "<string>",
  "request_check_url": "<string>",
  "success": true,
  "error": "<string>",
  "versions": {}
}
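The `request_check_url` field suggests that results are retrieved with a follow-up request. A minimal polling sketch under that assumption follows; the `status` field and its `"complete"` value are assumptions not confirmed by this reference, and `wait_for_result` and `fetch_json` are hypothetical names.

```python
import time


def wait_for_result(fetch_json, check_url, interval=2.0, max_polls=150):
    """Poll check_url until the job reports completion, then return the body.

    fetch_json is any callable that takes a URL and returns decoded JSON,
    so the HTTP client and its headers stay outside this helper.
    """
    for _ in range(max_polls):
        body = fetch_json(check_url)
        # Assumption: the check endpoint reports a "status" field that
        # becomes "complete" once segmentation has finished.
        if body.get("status") == "complete":
            return body
        time.sleep(interval)
    raise TimeoutError(f"no result from {check_url} after {max_polls} polls")
```

With `requests`, `fetch_json` could be `lambda u: requests.get(u, headers=headers).json()`, and `check_url` would be the `request_check_url` value from the response above.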