Batch API

Process thousands of records in a single request with MarketCheck's asynchronous Batch API.

The Batch API allows you to process large files of vehicle data asynchronously. Instead of making thousands of individual API calls, you submit one file, track progress via polling or webhooks, and download the results when processing completes.

The Batch API is an Enterprise package offering and is enabled on your account upon request. All batch requests require authentication.

When to Use Batch vs Single-Request API

| Scenario | Recommended API |
| --- | --- |
| Decode a single VIN | Single-request NeoVIN Decode |
| Decode 1,000–3,000 VINs at once | Batch NeoVIN Decode |
| Rank a single vehicle | Single-request MarketMatch |
| Rank a large CSV of vehicles | Batch MarketMatch Rank |

Use the single-request API for real-time, low-latency lookups. Use the Batch API when you have a file of records to process and can wait for asynchronous results.

Supported Batch Operations

| Operation | Base Path | Input Format | Output Format |
| --- | --- | --- | --- |
| NeoVIN Decode | /v2/batch/neovin/decode | Plain CSV (.csv) | Gzip JSONL (.jsonl.gz) |
| MarketMatch Rank | /v2/batch/marketmatch/rank | Gzip CSV (.csv.gz) | Gzip CSV (.csv.gz) |

Input file formats differ between operations. NeoVIN Decode accepts plain CSV files and rejects gzip. MarketMatch Rank accepts gzip-compressed CSV files and rejects plain CSV.
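Because MarketMatch Rank rejects plain CSV, input prepared for it must be compressed before upload. A minimal sketch using only the standard library (file names are illustrative):

```python
import gzip
from pathlib import Path

def gzip_csv(csv_path: str) -> str:
    """Compress a plain CSV into the .csv.gz form MarketMatch Rank expects."""
    src = Path(csv_path)
    dst = src.parent / (src.name + ".gz")  # vehicles.csv -> vehicles.csv.gz
    # Reads the whole file at once; stream in chunks for very large inputs.
    with open(src, "rb") as f_in, gzip.open(dst, "wb") as f_out:
        f_out.write(f_in.read())
    return str(dst)
```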

How It Works

Every batch job follows the same workflow:

1. Prepare your input file Create a CSV file with the required columns for your chosen operation. Compress it if the operation requires gzip.

2. Submit the job Upload the file via a POST request to the operation's submit endpoint. You receive a job_id immediately.

3. Track progress Poll the status endpoint using your job_id, or register a webhook URL at submission time to receive automatic notifications.

4. Download results When the job status is COMPLETED, download the result file from the download endpoint.
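The four steps above can be sketched as one orchestration function. To keep the sketch testable and transport-agnostic, the actual HTTP calls are passed in as callables; their names and return shapes are assumptions, not part of the documented API:

```python
import time

def run_batch_job(submit, get_status, download, poll_interval=60, sleep=time.sleep):
    """Run the full workflow: submit, poll until terminal, then download.

    submit()            -> job_id                     (step 2)
    get_status(job_id)  -> dict with a "status" key   (step 3)
    download(job_id)    -> result file bytes          (step 4)
    """
    job_id = submit()
    while True:
        info = get_status(job_id)
        if info["status"] == "COMPLETED":
            return download(job_id)
        if info["status"] == "FAILED":
            raise RuntimeError(f"Job {job_id} failed: {info.get('error_code')}")
        sleep(poll_interval)  # 60 seconds is the recommended polling interval
```

Injecting `sleep` lets tests exercise the loop without real delays; in production the default `time.sleep` applies.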

Job Statuses

Every batch job has one of three statuses:

| Status | Terminal | Description |
| --- | --- | --- |
| PROCESSING | No | The job is being processed. Check progress_percent for progress (0–100). |
| COMPLETED | Yes | Processing finished. Results are available for download. |
| FAILED | Yes | Processing failed. Check error_code and error_message for details. |

Jobs always start in PROCESSING and transition to either COMPLETED or FAILED. There is no way to cancel a job or return it to PROCESSING from a terminal state.

PROCESSING ───→ COMPLETED
     │
     └─────────→ FAILED
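One way to encode this small state machine is a helper that also guards against unexpected status strings (a hypothetical convenience, not part of the API):

```python
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}

def is_terminal(status: str) -> bool:
    """Return True once a job can no longer change state."""
    if status != "PROCESSING" and status not in TERMINAL_STATUSES:
        raise ValueError(f"Unknown job status: {status}")
    return status in TERMINAL_STATUSES
```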

Progress Tracking

The progress_percent field (0–100) reports processing progress as a percentage:

| Range | Meaning |
| --- | --- |
| 0–10 | Job submitted, preparing to process |
| 10–90 | Records are being processed |
| 90–99 | Processing complete, preparing and finalizing results |
| 100 | Complete — results ready for download |

Do not use progress_percent to estimate remaining time. Progress may advance in bursts and does not correlate linearly with elapsed time. For failed jobs, progress freezes at the last value reached before the failure.

Recommended polling interval: 60 seconds. More frequent polling provides no benefit and is subject to rate limiting.
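If you surface progress to users, the ranges above can be mapped to a human-readable phase. The boundary handling below (half-open ranges) is one interpretation of the table, not specified by the API:

```python
def progress_phase(percent: int) -> str:
    """Map progress_percent (0-100) to a documented processing phase."""
    if not 0 <= percent <= 100:
        raise ValueError("progress_percent must be between 0 and 100")
    if percent < 10:
        return "preparing"
    if percent < 90:
        return "processing records"
    if percent < 100:
        return "finalizing results"
    return "complete"
```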

Idempotency

To safely retry a job submission after a network timeout, include an Idempotency-Key header:

Idempotency-Key: your-unique-key-123

If a job with the same idempotency key already exists for your account, the API returns the existing job instead of creating a duplicate. Idempotency keys:

  • Are scoped to your account — different accounts can use the same key
  • Are permanent — once used, a key always returns the same job
  • Are payload-independent — resubmitting with a different file and the same key returns the original job
  • Have a maximum length of 255 characters
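A simple way to satisfy these constraints is to derive keys from a UUID. The helper below is illustrative; any stable, unique string of at most 255 characters works:

```python
import uuid

MAX_KEY_LENGTH = 255

def make_idempotency_key(prefix: str = "batch") -> str:
    """Build a unique Idempotency-Key value such as 'batch-550e8400-...'."""
    key = f"{prefix}-{uuid.uuid4()}"
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("Idempotency keys may be at most 255 characters")
    return key
```

Generate the key once per logical submission and reuse it for every retry of that submission; generating a fresh key on each retry defeats the purpose.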

Webhooks

Webhooks allow you to receive automatic notifications when your batch jobs complete or fail, instead of polling the status endpoint. Provide a webhook_url when submitting a job to enable them.

Requirements:

  • Must use HTTPS
  • Must resolve to a publicly routable IP address
  • Maximum length: 2,048 characters

Webhook Events

| Event | Trigger | Content-Type |
| --- | --- | --- |
| job.completed | Job finished processing successfully | application/json |
| job.failed | Job failed or timed out | application/json |

Verifying Webhook Signatures

If you provided a webhook_secret at submission time, each webhook includes a signature header for verification:

X-Webhook-Signature: t=1773585000,v1=a1b2c3d4e5f6...
X-Webhook-ID: 550e8400-e29b-41d4-a716-446655440000

Verification steps:

  1. Extract t (timestamp) and v1 (signature) from the X-Webhook-Signature header
  2. Reject the webhook if |current_time - t| > 300 seconds (5-minute replay window)
  3. Compute HMAC-SHA256(your_webhook_secret, "{t}.{raw_request_body}")
  4. Compare your computed signature with v1 using a constant-time comparison

Python example:

import hmac
import hashlib
import time

def verify_webhook(payload_body, signature_header, secret):
    parts = dict(p.split("=", 1) for p in signature_header.split(","))
    timestamp = parts["t"]
    expected_sig = parts["v1"]

    # Reject stale webhooks (> 5 minutes old)
    if abs(time.time() - int(timestamp)) > 300:
        raise ValueError("Webhook timestamp too old")

    computed = hmac.new(
        secret.encode(),
        f"{timestamp}.{payload_body}".encode(),
        hashlib.sha256
    ).hexdigest()

    if not hmac.compare_digest(computed, expected_sig):
        raise ValueError("Invalid webhook signature")

The X-Webhook-ID header contains the job_id and can be used as a deduplication key to handle duplicate deliveries.
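A minimal in-memory deduplicator keyed on X-Webhook-ID might look like this; a production system would persist seen IDs (for example in a database) so restarts do not re-process duplicates:

```python
class WebhookDeduplicator:
    """Process each X-Webhook-ID at most once."""

    def __init__(self):
        self._seen = set()

    def should_process(self, webhook_id: str) -> bool:
        """Return True the first time an ID is seen, False for duplicates."""
        if webhook_id in self._seen:
            return False
        self._seen.add(webhook_id)
        return True
```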

In webhook payloads, completed_at is always present for both job.completed and job.failed events. In the status API response, completed_at appears only for COMPLETED jobs — it is omitted for PROCESSING and FAILED jobs.

Webhook Retry Behavior

If your webhook endpoint is unavailable, the API retries delivery with exponential backoff:

| Attempt | Approximate Delay |
| --- | --- |
| 1 | Immediate |
| 2 | ~30 seconds |
| 3 | ~1 minute |
| 4 | ~2 minutes |
| 5 | ~4 minutes |

After 5 failed attempts (~9 minutes total), the API marks the webhook as permanently failed and stops retrying.

| Your Response | API Behavior |
| --- | --- |
| 2xx | Delivery successful — no retries |
| 429 | Rate limited — retried, honoring the Retry-After header |
| 4xx (except 429) | Permanent failure — no retries |
| 5xx | Temporary failure — retried with backoff |
| Connection error / timeout | Temporary failure — retried with backoff |

Return 200 OK immediately after receiving the webhook, then process it asynchronously. If your processing takes too long, the request may time out and trigger unnecessary retries.
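The retry table can be condensed into a single classification function, which is useful when testing your own endpoint against the documented behavior. Here `None` models a connection error or timeout:

```python
def delivery_action(status_code):
    """Classify a webhook delivery outcome: 'done', 'retry', or 'give_up'."""
    if status_code is None:
        return "retry"       # connection error / timeout: temporary failure
    if 200 <= status_code < 300:
        return "done"        # delivered successfully, no retries
    if status_code == 429:
        return "retry"       # rate limited, Retry-After honored
    if 400 <= status_code < 500:
        return "give_up"     # other 4xx: permanent failure
    return "retry"           # 5xx (and anything else): treated as temporary
```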

Downloads

Results for completed jobs are available for download via the operation's download endpoint. Each download call streams the result file directly.

Key constraints:

  • 3 downloads per job. Each call to the download endpoint consumes one slot. Save the file on first download.
  • POST method. The download endpoint uses POST (not GET) because each call has side effects (incrementing the download counter).

File Integrity Verification

Every download response includes an X-File-Checksum header containing the SHA-256 checksum of the file. Verify integrity after downloading:

echo "EXPECTED_CHECKSUM  results.jsonl.gz" | sha256sum -c -

Where EXPECTED_CHECKSUM is the value from the X-File-Checksum response header.
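The same check in Python, assuming you already hold the downloaded bytes and the header value:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def verify_checksum(data: bytes, expected: str) -> None:
    """Raise if downloaded bytes don't match the X-File-Checksum header value."""
    actual = sha256_hex(data)
    if actual != expected.lower():
        raise ValueError(f"Checksum mismatch: expected {expected}, got {actual}")
```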

Error Format

All batch API errors follow the RFC 7807 Problem Details standard:

{
    "type": "about:blank",
    "title": "Bad Request",
    "status": 400,
    "detail": "Uploaded file must be a valid CSV (.csv)",
    "code": "invalid_file_format",
    "instance": "req-a1b2c3d4"
}

| Field | Type | Description |
| --- | --- | --- |
| type | string | Always "about:blank" |
| title | string | HTTP status phrase |
| status | integer | HTTP status code |
| detail | string | Human-readable error description |
| code | string | Machine-readable error code for programmatic handling |
| instance | string | Request ID — include when contacting support |
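A small parser keeps error handling uniform across endpoints; the return shape below is a sketch, not a prescribed interface:

```python
import json

def parse_problem(body: str):
    """Extract (code, instance) from an RFC 7807 problem-details body."""
    problem = json.loads(body)
    return problem.get("code", "unknown"), problem.get("instance", "")
```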

Concurrency

Each account is limited to 1 active job per operation at a time. Submitting a second job while one is active returns 409 Conflict with error code active_job_exists. Wait for your current job to complete or fail before submitting a new one.

You can have one NeoVIN Decode job and one MarketMatch Rank job running simultaneously, as the limit is per operation.

Best Practices

Retry strategy:

  • Use the Idempotency-Key header on all submission requests to safely retry after timeouts
  • For 5xx errors and network timeouts, retry with exponential backoff: wait 1s, then 2s, then 4s, up to 30s between attempts
  • Never retry 4xx errors — these indicate a problem with your request that must be fixed
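The retry strategy above can be sketched as follows; `sleep` is injectable so the logic can be tested without real delays. Callers should ensure 4xx errors are surfaced as non-retryable (handled inside `call` or raised past this wrapper), since only 5xx errors and timeouts warrant retries:

```python
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at 30s (attempt starts at 0)."""
    return min(base * (2 ** attempt), cap)

def with_retries(call, max_attempts: int = 5, sleep=time.sleep):
    """Retry `call` on exception, waiting backoff_delay(attempt) between tries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: propagate the last failure
            sleep(backoff_delay(attempt))
```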

Efficient polling:

  • Poll the status endpoint every 60 seconds, not more frequently
  • Use webhooks instead of polling when possible — they provide immediate notification with no wasted requests

File handling:

  • Verify downloaded files using the X-File-Checksum header
  • Save downloaded files immediately — you have only 3 download attempts per job

Error handling:

  • Always check the code field in error responses for programmatic error handling
  • Log the instance field from error responses — include this when contacting support
  • Handle 409 active_job_exists gracefully — this means your previous job is still running, not that something is broken