
RentHistory Analyzer API

Submit a DHCR rent history PDF by URL and poll for structured JSON. Generate a key from your dashboard; calls are authenticated with a bearer token.

Base URL: https://renthistory.org/v1 · Auth: Bearer token · Async jobs

Overview

The Analyzer API takes a hosted DHCR rent history PDF and returns a parsed, structured result you can feed into your own tools. The flow is:

  1. Upload the PDF to your own storage (or any HTTPS URL we can fetch).
  2. POST /v1/analyze with the file_url. You get a job_id.
  3. Poll GET /v1/jobs/{job_id} until status is completed (or error).

Quickstart

End-to-end in three calls: submit, poll, and check your usage. The same flow is shown in cURL, JavaScript, and Python.

# 1. Submit a PDF for analysis
curl https://renthistory.org/v1/analyze \
  -H "Authorization: Bearer $RH_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_url":"https://yourhost.com/rent-history.pdf"}'
# → {"job_id":"job_80b524d73aa4661023c8361c","status":"pending"}

# 2. Poll for the result
curl https://renthistory.org/v1/jobs/job_80b524d73aa4661023c8361c \
  -H "Authorization: Bearer $RH_KEY"

# 3. Check your daily usage
curl https://renthistory.org/v1/usage \
  -H "Authorization: Bearer $RH_KEY"
The same flow in JavaScript (top-level await and built-in fetch, so Node 18+ as an ES module):

const KEY = process.env.RH_KEY;
const BASE = "https://renthistory.org";
const headers = { "Authorization": `Bearer ${KEY}` };

const { job_id } = await fetch(`${BASE}/v1/analyze`, {
  method: "POST",
  headers: { ...headers, "Content-Type": "application/json" },
  body: JSON.stringify({ file_url: "https://yourhost.com/rent-history.pdf" }),
}).then(r => r.json());

let job;
do {
  await new Promise(r => setTimeout(r, 2000));
  job = await fetch(`${BASE}/v1/jobs/${job_id}`, { headers }).then(r => r.json());
} while (job.status === "pending" || job.status === "processing");

console.log(job);
And in Python (requests):

import os, time, json, requests

KEY  = os.environ["RH_KEY"]
BASE = "https://renthistory.org"
H    = {"Authorization": f"Bearer {KEY}"}

r = requests.post(f"{BASE}/v1/analyze", headers=H,
                  json={"file_url": "https://yourhost.com/rent-history.pdf"})
job_id = r.json()["job_id"]

while True:
    time.sleep(2)
    job = requests.get(f"{BASE}/v1/jobs/{job_id}", headers=H).json()
    if job["status"] in ("completed", "error"):
        break

# Pretty-print as real JSON (not Python's single-quoted repr)
print(json.dumps(job, indent=2))

Authentication

Every request requires a bearer token. Generate one from the Dashboard → API Access app. Keys look like rh_live_… and are shown once at creation time.

Authorization: Bearer rh_live_d6924970c13182bb6383851eb7d15fbb
Keep your key server-side. Never ship it to a browser, mobile app, or public repo. If a key leaks, revoke it from the dashboard and generate a new one.
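Since keys are shown only once and must stay server-side, a minimal sketch of loading the key from the environment and failing fast. The `auth_headers` helper and the `rh_live_` prefix check are our own assumptions, not part of the API:

```python
import os

def auth_headers() -> dict:
    """Build the Authorization header from the RH_KEY environment variable,
    failing fast if the key is missing or looks malformed.

    Hypothetical helper; only the header format comes from the docs."""
    key = os.environ.get("RH_KEY")
    if not key:
        raise RuntimeError("RH_KEY is not set; create a key in Dashboard -> API Access")
    if not key.startswith("rh_live_"):
        # Live keys look like rh_live_...; anything else is probably a paste
        # error. (Assumption: we only guard against obvious mistakes here.)
        raise RuntimeError("RH_KEY does not look like a RentHistory API key")
    return {"Authorization": f"Bearer {key}"}
```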

Submit a document

POST /v1/analyze

Queues a new analysis job. Returns immediately with a job_id.

Body

Field      Type     Description
file_url   string   Required. An HTTPS URL we can fetch. Must point to a DHCR rent history PDF (or image). HTTP URLs are rejected.

Example

curl https://renthistory.org/v1/analyze \
  -H "Authorization: Bearer $RH_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file_url":"https://yourhost.com/rent-history.pdf"}'

Response

{
  "job_id": "job_80b524d73aa4661023c8361c",
  "status": "pending"
}

Check a job

GET /v1/jobs/{job_id}

Returns the current state of a job. Poll every 1–3 seconds until status leaves pending/processing.

While the job is still running

{
  "job_id": "job_80b524d73aa4661023c8361c",
  "status": "processing",
  "created_at": "2026-04-21T14:05:22.097Z"
}

On failure

{
  "job_id": "job_80b524d73aa4661023c8361c",
  "status": "error",
  "created_at": "2026-04-21T14:05:22.097Z",
  "error": "Request failed with status code 500"
}

On success

When status is completed, the response includes the parsed rent history. The exact shape depends on the document; typical fields include the normalized apartment address and the per-year registration rows.
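The polling loop can be wrapped in a small helper that follows the 1–3 second guidance and gives up after a deadline. A sketch; `wait_for_job` and its defaults are our own, not part of the API:

```python
import time
import requests

BASE = "https://renthistory.org"

def wait_for_job(job_id: str, headers: dict,
                 timeout: float = 120.0, interval: float = 2.0) -> dict:
    """Poll GET /v1/jobs/{job_id} until the status leaves pending/processing,
    or raise TimeoutError after `timeout` seconds.

    Hypothetical helper; the endpoint and status values are documented,
    the timeout/interval defaults are our choice."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = requests.get(f"{BASE}/v1/jobs/{job_id}", headers=headers).json()
        if job["status"] not in ("pending", "processing"):
            return job  # completed or error
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {timeout:.0f}s")
```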

Stream job progress

GET /v1/jobs/{job_id}/stream

Server-Sent Events (text/event-stream). Holds the connection open and pushes an update event every time the job’s status changes, then a final done event when the job reaches completed or error. No polling required — great for CLIs, watchers, and UIs that want a live progress line.

Heartbeat comments (: ping) are sent every 15 seconds so proxies don’t drop the connection. Streams time out after 5 minutes; just reconnect (or fall back to /v1/jobs/{id}) if that happens.

Example — watch a job from the terminal

curl -N \
  -H "Authorization: Bearer $RH_KEY" \
  https://renthistory.org/v1/jobs/job_80b524d73aa4661023c8361c/stream

Typical output:

event: update
data: {"job_id":"job_80b524d73aa4661023c8361c","status":"processing","created_at":"2026-04-21T15:02:09.031Z"}

: ping 1776783744144

event: update
data: {"job_id":"job_80b524d73aa4661023c8361c","status":"completed","created_at":"…","completed_at":"…","results":{…}}

event: done
data: {"job_id":"…","status":"completed",…}

One-liner with a spinner

Drop this into your shell. It submits a PDF, streams the job, and shows a live status line with elapsed time.

rh_analyze() {
  local key="$RH_KEY" url="$1"
  local job=$(curl -s -X POST "https://renthistory.org/v1/analyze" \
    -H "Authorization: Bearer $key" -H "Content-Type: application/json" \
    -d "{\"file_url\":\"$url\"}" | python3 -c 'import sys,json;print(json.load(sys.stdin)["job_id"])')
  local start=$SECONDS frames='⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏' i=0
  curl -sN -H "Authorization: Bearer $key" \
    "https://renthistory.org/v1/jobs/$job/stream" | \
  while IFS= read -r line; do
    case "$line" in
      data:*)
        status=$(echo "${line#data: }" | python3 -c 'import sys,json;print(json.load(sys.stdin)["status"])')
        f=${frames:$((i%10)):1}; i=$((i+1))
        printf "\r\033[K%s %s · %s · %ss" "$f" "$job" "$status" $((SECONDS-start))
        [ "$status" = completed ] || [ "$status" = error ] && echo && break
        ;;
    esac
  done
}
rh_analyze https://yourhost.com/rent-history.pdf

Python (EventSource-style, no SSE library needed)

import os, sys, time, json, itertools, threading, requests

KEY, BASE = os.environ["RH_KEY"], "https://renthistory.org"
H = {"Authorization": f"Bearer {KEY}"}

job_id = requests.post(f"{BASE}/v1/analyze", headers=H,
    json={"file_url": sys.argv[1]}).json()["job_id"]

frames = itertools.cycle("⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏")
status = ["pending"]; done = threading.Event(); t0 = time.time()

def spin():
    while not done.is_set():
        sys.stdout.write(f"\r\033[K{next(frames)} {job_id} · {status[0]} · {time.time()-t0:5.1f}s")
        sys.stdout.flush(); time.sleep(0.1)
threading.Thread(target=spin, daemon=True).start()

with requests.get(f"{BASE}/v1/jobs/{job_id}/stream", headers=H, stream=True) as r:
    for line in r.iter_lines(decode_unicode=True):
        if line.startswith("data: "):
            payload = json.loads(line[6:])
            status[0] = payload["status"]
            if status[0] in ("completed", "error"):
                done.set(); print(); print(json.dumps(payload, indent=2)); break
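Per the timeout note above, a stream can be cut off after 5 minutes. A sketch of a reader that reconnects a few times and then falls back to plain polling; the helper name and reconnect budget are our own choices:

```python
import json
import requests

BASE = "https://renthistory.org"

def stream_until_done(job_id: str, headers: dict, max_connects: int = 3) -> dict:
    """Follow /v1/jobs/{id}/stream, reconnecting if the server closes the
    stream before a terminal status, then fall back to one plain poll.

    Hypothetical helper built on the documented endpoints."""
    for _ in range(max_connects):
        with requests.get(f"{BASE}/v1/jobs/{job_id}/stream",
                          headers=headers, stream=True) as r:
            for line in r.iter_lines(decode_unicode=True):
                if not line or not line.startswith("data: "):
                    continue  # skip blanks, "event:" lines, ": ping" heartbeats
                payload = json.loads(line[len("data: "):])
                if payload.get("status") in ("completed", "error"):
                    return payload
        # Stream closed without a terminal status: reconnect and resume.
    # Reconnect budget exhausted: fall back to the polling endpoint.
    return requests.get(f"{BASE}/v1/jobs/{job_id}", headers=headers).json()
```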

Check your usage

GET /v1/usage

Returns the counts tied to the calling key.

{
  "key_prefix": "rh_live_d692",
  "daily_limit": 100,
  "used_today": 3,
  "used_this_month": 87,
  "total": 412
}
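A sketch of turning that response into a pre-flight quota check before submitting a batch. The `remaining_today` helper is hypothetical; only the field names come from the response above:

```python
import requests

BASE = "https://renthistory.org"

def remaining_today(headers: dict) -> int:
    """How many requests the calling key has left today, computed from the
    documented /v1/usage fields (daily_limit minus used_today)."""
    usage = requests.get(f"{BASE}/v1/usage", headers=headers).json()
    return max(0, usage["daily_limit"] - usage["used_today"])
```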

Job states

Status       Meaning
pending      Queued, not yet picked up.
processing   Actively being parsed.
completed    Finished successfully. Result is in the response.
error        Failed. The error field contains a short reason.

Errors

Errors are JSON with a single error string. Representative codes:

200   Success.
400   Missing or invalid file_url. Must be a https:// URL.
401   Missing, malformed, revoked, or invalid API key.
404   No such job.
429   Daily limit reached.
5xx   Server or upstream error. Safe to retry.

{ "error": "file_url must be a valid https:// URL" }
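One way to surface these codes in client code is to map them onto exceptions at submit time. A sketch; the exception choices are ours, not anything the API prescribes:

```python
import requests

BASE = "https://renthistory.org"

def submit(file_url: str, headers: dict) -> str:
    """POST /v1/analyze and translate the representative error codes into
    Python exceptions. Hypothetical wrapper around the documented endpoint."""
    r = requests.post(f"{BASE}/v1/analyze", headers=headers,
                      json={"file_url": file_url})
    if r.status_code == 400:
        raise ValueError(r.json()["error"])      # bad or non-HTTPS file_url
    if r.status_code == 401:
        raise PermissionError("missing, revoked, or invalid API key")
    if r.status_code == 429:
        raise RuntimeError("daily limit reached; check /v1/usage")
    if r.status_code >= 500:
        raise RuntimeError("server error; safe to retry")
    r.raise_for_status()
    return r.json()["job_id"]
```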

Limits

  • Each key has a daily request limit (daily_limit in /v1/usage). The default starting limit is 100/day; contact support for higher.
  • file_url must be HTTPS. We will not fetch http:// URLs or private IP addresses.
  • Very large PDFs may take longer to process. Expect most jobs to finish within ~30–60 seconds.
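The HTTPS rule can be pre-checked client-side before spending a request against the daily limit. A minimal sketch; the private-IP rule is left to the server, since checking it properly would require DNS resolution:

```python
from urllib.parse import urlparse

def looks_fetchable(file_url: str) -> bool:
    """Cheap client-side pre-check mirroring the HTTPS limit above:
    the scheme must be https, since http:// URLs are rejected server-side."""
    return urlparse(file_url).scheme == "https"
```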

Support

Email support@renthistory.org for higher limits, partner integrations, or help debugging a specific job_id.

Manage keys and view real-time usage in the Dashboard → API Access app.