API Reference

Convert Document

Convert a PDF, image, or document to markdown, HTML, JSON, or chunks. Use save_checkpoint=true to save parsed state for later /extract or /segment calls.

POST /api/v1/convert

Authorizations

X-API-Key (string, header, required)

Your API key for authentication

Body Parameters

file_url (string, body)

Optional file URL (http/https). If provided, the server will download and process it.
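As a rough sketch of a URL-based request, the helper below builds the form fields for a call that passes file_url instead of uploading a file. The helper name build_convert_form and the example.com URL are hypothetical. Encoding booleans as lowercase "true"/"false" strings is an assumption based on the string values used in the request example on this page.

```python
from typing import Any, Dict


def build_convert_form(file_url: str, **options: Any) -> Dict[str, str]:
    """Build form fields for a URL-based /api/v1/convert request (hypothetical helper)."""
    if not file_url.startswith(("http://", "https://")):
        raise ValueError("file_url must be an http or https URL")
    form = {"file_url": file_url}
    for name, value in options.items():
        if value is None:
            continue  # leave unset options out so server defaults apply
        if isinstance(value, bool):
            # Assumption: booleans travel as lowercase strings in form data.
            form[name] = "true" if value else "false"
        else:
            form[name] = str(value)
    return form


form = build_convert_form(
    "https://example.com/report.pdf",  # hypothetical document URL
    output_format="markdown",
    max_pages=5,
    save_checkpoint=True,
)
print(form)
```

The resulting dictionary can be passed as data= to requests.post together with the X-API-Key header.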

mode (string, body)

Processing mode: 'fast' (lowest latency), 'balanced', or 'accurate' (highest accuracy).

max_pages (integer, body)

Maximum number of pages to convert.

page_range (string, body)

Page range to convert, as a comma-separated list of pages and ranges, for example 0,5-10,20. Overrides max_pages.
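To preview which pages a page_range string selects before sending it, a client-side check along these lines may help. The function name expand_page_range is hypothetical, and treating both ends of a range as included is an assumption; the server's own parsing may differ.

```python
from typing import List


def expand_page_range(page_range: str) -> List[int]:
    """Expand a string such as "0,5-10,20" into a sorted list of page numbers."""
    pages = set()
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            # Assumption: both ends of a range are included.
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(expand_page_range("0,5-10,20"))  # [0, 5, 6, 7, 8, 9, 10, 20]
```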

paginate (boolean, body)

Paginate the output. Pages are separated by a horizontal rule that includes the page number.

add_block_ids (boolean, body)

Add data-block-id attributes to HTML elements for citation tracking.
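One way to use these attributes for citations is to collect them from the returned HTML. The sketch below uses only the Python standard library; the class and function names are hypothetical, and the sample markup and ID values ("b0", "b1") are made up for illustration, since the actual ID format is not specified here.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class BlockIdCollector(HTMLParser):
    """Record (tag, block id) pairs for elements carrying data-block-id."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "data-block-id" and value is not None:
                self.blocks.append((tag, value))


def collect_block_ids(html: str) -> List[Tuple[str, str]]:
    parser = BlockIdCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Made-up sample markup; real output from the API may look different.
sample = '<h1 data-block-id="b0">Title</h1><p data-block-id="b1">Body text.</p>'
print(collect_block_ids(sample))  # [('h1', 'b0'), ('p', 'b1')]
```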

include_markdown_in_chunks (boolean, body)

Include a markdown field in chunks and JSON output.

disable_image_extraction (boolean, body)

Disable image extraction from the document.

disable_image_captions (boolean, body)

Disable synthetic image captions/descriptions in output.

fence_synthetic_captions (boolean, body)

Wrap synthetic image captions with HTML comment markers.

output_format (string, body)

Output format: 'markdown', 'html', 'json', 'chunks'.

token_efficient_markdown (boolean, body)

Use token-efficient markdown output.

skip_cache (boolean, body)

Skip the cache and re-run.

save_checkpoint (boolean, body)

Save a checkpoint after processing for future /extract or /segment calls.

additional_config (string, body)

Additional configuration as JSON string.
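Because additional_config is a string field, a configuration dictionary has to be serialized before it goes into the form body. A minimal sketch follows; the helper name and the key example_option are hypothetical placeholders, since the accepted keys are not listed here.

```python
import json
from typing import Any, Dict


def encode_additional_config(config: Dict[str, Any]) -> str:
    """Serialize a config dictionary into the JSON string the field expects."""
    # sort_keys keeps the output stable across runs, which helps when logging requests.
    return json.dumps(config, sort_keys=True)


payload = {
    "output_format": "json",
    # "example_option" is a placeholder key, not a documented setting.
    "additional_config": encode_additional_config({"example_option": True}),
}
print(payload["additional_config"])  # {"example_option": true}
```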

webhook_url (string, body)

Optional webhook URL to call when complete.

force_new (boolean, body)

Force a new conversion even if cached.

file (file, body)

Input PDF, Word, PowerPoint, or image file. Images must be png, jpg, or webp.

Cookies

wos-session (string, cookie)

Session cookie

access_token (string, cookie)

Access token cookie

datalab_active_team (string, cookie)

Active team cookie

Response

Successful Response

import requests

url = "https://www.datalab.to/api/v1/convert"
headers = {"X-API-Key": "<api-key>"}

# Form fields are sent as strings. file_url is omitted here because the
# document is uploaded directly through the "file" body parameter.
payload = {
    "mode": "fast",
    "max_pages": "123",
    "output_format": "markdown",
    "skip_cache": "false",
    "save_checkpoint": "false",
}

# Open the document in binary mode and close it once the request is sent.
with open("example-file", "rb") as f:
    files = {"file": ("example-file", f)}
    response = requests.post(url, data=payload, files=files, headers=headers)

print(response.text)

{
  "request_id": "<string>",
  "request_check_url": "<string>",
  "success": true,
  "error": "<string>",
  "versions": {}
}
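The response contains a request_check_url, which suggests that results are fetched separately once processing finishes. A minimal polling sketch is shown below. It assumes that a GET on request_check_url returns JSON with a "status" field that becomes "complete"; that field is not shown on this page, so treat it as an assumption. The function name wait_for_result and the fetch callable are hypothetical, and the demo uses a stub instead of a real network call.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    submit_response: Dict[str, Any],
    fetch: Callable[[str], Dict[str, Any]],
    max_polls: int = 60,
    delay_seconds: float = 2.0,
) -> Dict[str, Any]:
    """Poll request_check_url until the job reports completion."""
    if not submit_response.get("success"):
        raise RuntimeError(f"submission failed: {submit_response.get('error')}")
    check_url = submit_response["request_check_url"]
    for _ in range(max_polls):
        result = fetch(check_url)
        # Assumption: the check response carries a "status" field.
        if result.get("status") == "complete":
            return result
        time.sleep(delay_seconds)
    raise TimeoutError(f"no result after {max_polls} polls of {check_url}")


# Stubbed demo: two canned replies stand in for the real endpoint.
_replies = iter([{"status": "processing"}, {"status": "complete", "success": True}])
demo = wait_for_result(
    {"success": True, "request_check_url": "https://example.invalid/check", "error": None},
    fetch=lambda url: next(_replies),
    delay_seconds=0,
)
print(demo)
```

With the requests library, fetch could be something like lambda u: requests.get(u, headers=headers).json(), reusing the same X-API-Key header as the original call.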
