API Reference
Convert Document
Convert a PDF, image, or document to markdown, HTML, JSON, or chunks. Use save_checkpoint=true to save parsed state for later /extract or /segment calls.
Authorizations
X-API-Key (string, header, required): Your API key for authentication.
Body Parameters
file_url (string, body): Optional file URL (http/https). If provided, the server will download and process it.
mode (string, body): Output mode: 'fast' (lowest latency), 'balanced', or 'accurate' (highest accuracy).
max_pages (integer, body): Maximum number of pages to convert.
page_range (string, body): Page range to convert, comma-separated, e.g. 0,5-10,20. Overrides max_pages.
paginate (boolean, body): Paginate the output. Pages are separated by a horizontal rule with the page number.
add_block_ids (boolean, body): Add data-block-id attributes to HTML elements for citation tracking.
include_markdown_in_chunks (boolean, body): Include the markdown field in chunks and JSON output.
disable_image_extraction (boolean, body): Disable image extraction from the document.
disable_image_captions (boolean, body): Disable synthetic image captions/descriptions in the output.
fence_synthetic_captions (boolean, body): Wrap synthetic image captions in HTML comment markers.
output_format (string, body): Output format: 'markdown', 'html', 'json', or 'chunks'.
token_efficient_markdown (boolean, body): Use token-efficient markdown output.
skip_cache (boolean, body): Skip the cache and re-run the conversion.
save_checkpoint (boolean, body): Save a checkpoint after processing for future /extract or /segment calls.
additional_config (string, body): Additional configuration as a JSON string.
webhook_url (string, body): Optional webhook URL to call when the conversion completes.
force_new (boolean, body): Force a new conversion even if a cached result exists.
file (file, body): Input PDF, Word, PowerPoint, or image file. Images must be png, jpg, or webp.
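The body parameters above are sent as form fields, so booleans, integers, and the JSON-valued additional_config all end up as strings. The sketch below shows one way to prepare them in Python. The helper names format_page_range and encode_form_fields are hypothetical and not part of the API. The lowercase "true"/"false" encoding follows the example request on this page, and the additional_config key used in the usage note is a placeholder, not a documented option.

```python
import json
from typing import Any, Dict, Iterable, List


def format_page_range(pages: Iterable[int]) -> str:
    """Collapse page numbers into a page_range string such as '0,5-10,20'."""
    ordered = sorted(set(pages))
    parts: List[str] = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next page number is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)


def encode_form_fields(options: Dict[str, Any]) -> Dict[str, str]:
    """Convert Python values to form-field strings, dropping options set to None."""
    encoded: Dict[str, str] = {}
    for name, value in options.items():
        if value is None:
            continue  # leave unset options out of the request entirely
        if isinstance(value, bool):
            # Checked before the generic case because bool is a subclass of int.
            encoded[name] = "true" if value else "false"
        elif isinstance(value, dict):
            # additional_config is documented as a JSON string.
            encoded[name] = json.dumps(value)
        else:
            encoded[name] = str(value)
    return encoded
```

For example, encode_form_fields({"mode": "fast", "paginate": True, "page_range": format_page_range([0, 5, 6, 7, 8, 9, 10, 20])}) produces a dictionary that can be passed as the data argument of requests.post.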
Cookies
wos-session (string, cookie): Session cookie.
access_token (string, cookie): Access token cookie.
datalab_active_team (string, cookie): Active team cookie.
Response
Successful Response
Convert Document
import requests

url = "https://www.datalab.to/api/v1/convert"
headers = {"X-API-Key": "<api-key>"}

# Form fields are sent as strings, including integers and booleans.
payload = {
    # Optional; the server downloads and processes this URL if provided.
    "file_url": "<string>",
    "mode": "fast",
    "max_pages": "123",
    "page_range": "<string>",
    "output_format": "<string>",
    "skip_cache": "false",
    "save_checkpoint": "false",
}

# Upload the document under the "file" form field and close it afterwards.
with open("example-file", "rb") as f:
    files = {"file": ("example-file", f)}
    response = requests.post(url, data=payload, files=files, headers=headers)

print(response.text)

200 Success
{
  "request_id": "<string>",
  "request_check_url": "<string>",
  "success": true,
  "error": "<string>",
  "versions": {}
}
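The response carries a request_check_url rather than the converted output, which suggests the result is fetched with a follow-up request. The sketch below shows one way that could look in Python, under several assumptions that this page does not confirm: that the check URL accepts a GET with the same X-API-Key header, and that a finished job is marked by a "status" field equal to "complete". The function names are hypothetical, and the standard library is used for the HTTP call so the block has no extra dependencies.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict


def check_url_from_submission(body: Dict[str, Any]) -> str:
    """Return request_check_url from the submission response, or raise on failure."""
    if not body.get("success"):
        raise RuntimeError(body.get("error") or "conversion request was rejected")
    check_url = body.get("request_check_url")
    if not check_url:
        raise RuntimeError("response did not include request_check_url")
    return check_url


def fetch_json(url: str, api_key: str) -> Dict[str, Any]:
    """GET a URL with the API key header and decode the JSON body."""
    # ASSUMPTION: the check URL is authenticated with the same X-API-Key header.
    request = urllib.request.Request(url, headers={"X-API-Key": api_key})
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


def wait_for_result(
    check_url: str,
    api_key: str,
    interval: float = 2.0,
    max_polls: int = 150,
    fetch: Callable[[str, str], Dict[str, Any]] = fetch_json,
) -> Dict[str, Any]:
    """Poll check_url until the job reports completion or max_polls is reached."""
    for _ in range(max_polls):
        result = fetch(check_url, api_key)
        # ASSUMPTION: a "status" field equal to "complete" marks a finished job.
        if result.get("status") == "complete":
            return result
        time.sleep(interval)
    raise TimeoutError(f"no result from {check_url} after {max_polls} polls")
```

A typical call would be wait_for_result(check_url_from_submission(response.json()), "<api-key>"). If webhook_url is set on the request, polling may be unnecessary, since the server calls the webhook when the conversion completes.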