Skip to main content
POST
/
api
/
agent
/
extract
Web Extract
curl --request POST \
  --url https://encrata.com/api/agent/extract \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "url": "<string>",
  "mode": "<string>",
  "selectors": {},
  "render_js": true,
  "block_ads": true,
  "block_trackers": true,
  "wait_for": "<string>",
  "timeout": 123,
  "headers": {}
}
'
{
  "success": true,
  "url": "https://example.com/product/123",
  "status_code": 200,
  "extracted": {
    "title": "Wireless Headphones",
    "price": "$129.00"
  },
  "metadata": {
    "title": "Wireless Headphones — Example",
    "language": "en"
  },
  "credits": 3,
  "latency_ms": 2140
}

Authentication

Requires an API key in the Authorization header.
Authorization: Bearer YOUR_API_KEY

Request

url
string
required
The URL to extract from. Must be a valid http:// or https:// URL.
mode
string
Extraction mode. markdown (default) returns clean page content; selectors returns structured JSON keyed by the selectors map.
selectors
object
Map of output field name → CSS/XPath selector. Used when mode is selectors.
render_js
boolean
Render the page in a headless browser before extracting. Default: true.
block_ads
boolean
Block ad networks while loading. Default: true.
block_trackers
boolean
Block tracking scripts while loading. Default: true.
wait_for
string
CSS selector to wait for before extracting (useful for lazy-loaded content).
timeout
integer
Page load timeout in milliseconds. Default: 30000. Maximum: 60000.
headers
object
Optional custom request headers to send with the page load.

Example request

curl -X POST "https://encrata.com/api/agent/extract" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/123",
    "mode": "selectors",
    "selectors": { "title": "h1", "price": ".price" }
  }'

Response

success
boolean
Whether the extraction succeeded.
url
string
The final URL that was extracted (after redirects).
status_code
integer
HTTP status code returned by the target page.
extracted
object
The extracted data — markdown content, or a JSON object keyed by your selectors.
metadata
object
Page metadata such as title, description, and language.
error_code
string
Machine-readable error code when success is false.
error
string
Human-readable error message when success is false.
credits
integer
Credits consumed. 0 on a failed (refunded) extraction.
latency_ms
integer
Total processing time in milliseconds.
{
  "success": true,
  "url": "https://example.com/product/123",
  "status_code": 200,
  "extracted": {
    "title": "Wireless Headphones",
    "price": "$129.00"
  },
  "metadata": {
    "title": "Wireless Headphones — Example",
    "language": "en"
  },
  "credits": 3,
  "latency_ms": 2140
}

Credits

3 credits per request. Failed extractions (blocked, captcha, rate-limited) are automatically refunded.