scriva documentation
scriva — documentation index
scriva is a composable, engine-agnostic OCR framework for Python. Every
step of a recognition run — load, detect, recognise, post-process, export —
is a swappable stage, and every stage is a small Protocol.
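
As a rough illustration of what "every stage is a small Protocol" can look like, here is a minimal sketch using Python's standard `typing.Protocol`. The names `Recognizer`, `EchoRecognizer`, `UpperRecognizer`, and `run_stage` are hypothetical and are not taken from scriva's API; the sketch only shows how structural typing lets one stage be swapped for another without inheritance.

```python
from typing import Protocol


class Recognizer(Protocol):
    """Hypothetical stage interface; not scriva's real Protocol."""

    def recognise(self, crop: bytes) -> str: ...


class EchoRecognizer:
    """Toy stage: decodes the crop bytes as UTF-8 text."""

    def recognise(self, crop: bytes) -> str:
        return crop.decode("utf-8")


class UpperRecognizer:
    """Another toy stage with the same shape, so it can be swapped in."""

    def recognise(self, crop: bytes) -> str:
        return crop.decode("utf-8").upper()


def run_stage(recognizer: Recognizer, crops: list[bytes]) -> list[str]:
    # Structural typing: any object with a matching recognise() is accepted.
    return [recognizer.recognise(c) for c in crops]


crops = [b"total", b"42.00"]
default_out = run_stage(EchoRecognizer(), crops)
swapped_out = run_stage(UpperRecognizer(), crops)
```

Neither toy class inherits from `Recognizer`; matching the method signature is enough, which is what keeps a stage small and replaceable.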
The one-liner

    import scriva

    text = scriva.read("scan.png")

That uses sensible defaults end-to-end. When you want to swap a model, change
a prompt, or stream events, drop down to the Pipeline API — every default
above is one constructor argument away.
Three ways in

    import scriva
    from scriva.schemas import Invoice

    scriva.read("scan.png")                      # text out
    scriva.extract("acme.pdf", schema=Invoice)   # typed pydantic instance out
    scriva.presets.invoice("acme.pdf")           # tuned pipeline + schema

Pick the highest-level entry that fits and drop a rung when you need more
control. See scriva.extract and Presets.
Three pillars for high-accuracy production OCR
When the project’s bar is “wrong answers are unacceptable,” scriva exposes three composable pillars. Every other reference page slots into one of them.
- Learn from your environment. Keep a SampleStore of your own corrections. The same store powers few-shot exemplars on the input side (RecognitionHint.from_store) and derived dictionaries on the output side (postprocess.dictionary.from_samples). One store, two roads — see samples.md › Two roads from a SampleStore.
- Cross-check, then route. Three accuracy levers — human-in-the-loop review, cross-check against the original crop (round-trip rendering or multi-recognizer consensus), and cross-check against ground-truth data — compose freely. See Pipeline › High-accuracy patterns.
- Manipulate the pixels the recognizer actually sees. Page-level preprocessing (orientation, deskew, dewarp) and region-level preprocessing (per-cell binarisation, horizontal/vertical slicing, per-role padding, glare removal, whiteboard clean-up) are both first-class. See Preprocessors.
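
To make "cross-check, then route" concrete, the sketch below shows one possible shape for multi-recognizer consensus: several candidate transcriptions of the same crop are put to a majority vote, and low agreement routes the cell to human review. It uses only the standard library. `Decision`, `cross_check`, and the `min_agreement` threshold are hypothetical names chosen for illustration, not scriva's implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Decision:
    text: str           # winning transcription
    agreement: float    # share of recognizers that voted for it
    needs_review: bool  # True when agreement is below the threshold


def cross_check(candidates: list[str], min_agreement: float = 0.6) -> Decision:
    """Majority vote over candidate transcriptions of one crop (illustrative)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    # Ignore surrounding whitespace so trivially different outputs still agree.
    votes = Counter(c.strip() for c in candidates)
    text, count = votes.most_common(1)[0]
    agreement = count / len(candidates)
    return Decision(text, agreement, needs_review=agreement < min_agreement)


agreed = cross_check(["INV-1042", "INV-1042 ", "INV-1042"])
majority = cross_check(["INV-1042", "INV-1042", "INV-1O42"])
split = cross_check(["INV-1042", "INV-1O42", "lNV-1042"])
```

With the assumed threshold of 0.6, a unanimous or two-of-three result passes straight through, while a three-way split is flagged for review. Where the threshold sits is a project decision that depends on how costly a wrong answer is.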
Read in order
Reference — high level
- scriva.extract — schema-first one-liner, classify, batch, watch.
- Presets — pre-tuned pipelines per document kind.
- Schemas — built-in pydantic models (Invoice, Receipt, IdCard, …).
- Working with results — DocumentResult accessors and serialisation.
- CLI — scriva extract, scriva watch, scriva eval, scriva annotate.
Reference — pipeline internals
Reference — production
- Reliability — retries, rate limits, timeouts, cost caps, PII redaction, telemetry.
- Evaluation — scriva.eval, ground-truth format, calibration.
Recipes
- Cookbook — worked recipes per document kind and per workflow.
- Domain packs — pre-built pipelines for forms, P&ID, agentic extraction, annotation.
- Pipeline › Recipes — human-in-the-loop review, confidence-driven re-OCR, recognizer diff.
- Cookbook › Rebuilding ocr-agent — case study porting a Japanese-forms / P&ID / annotation app, plus FastAPI hosting.
One-line map
| You want to… | Read… |
|---|---|
| just read an image to text | Quickstart › one-liner |
| extract a typed pydantic instance | scriva.extract |
| understand the design | Concepts |
| OCR an invoice / receipt / ID | Cookbook |
| OCR a tabular form to Excel | Domains › forms |
| route an unknown document | scriva.classify_document |
| ingest a folder continuously | Cookbook › Batch & watch |
| swap OpenAI for Anthropic | Recognizers |
| add a new output format | Exporters |
| serialise a result on the fly | Exporters › result verbs |
| cut API spend | Caching |
| set a hard cost cap | Reliability › Cost caps |
| retry on 429 / 5xx | Reliability › Retries |
| redact PII before sending to a VLM | Reliability › PII redaction |
| stream pages from a 100-page PDF | Cookbook › Long PDFs |
| score a pipeline against ground truth | Evaluation |
| OCR pages in parallel | Pipeline › options |
| straighten a rotated scan | Preprocessors › orientation |
| binarise / sharpen each cell separately | Preprocessors › Region preprocessors |
| slice a tall cell into rows | Preprocessors › Slicers |
| use a few-shot exemplar from past corrections | Recognizers › RecognitionHint |
| derive a correction dictionary from a labelled store | Post-processors › dictionary.from_samples |
| let a human review detected cells | Pipeline › Lever 1: HITL |
| cross-check the recognised text against the crop | Pipeline › Lever 2: Cross-check vs original |
| cross-check a run against ground truth | Pipeline › Lever 3: Cross-check vs ground-truth |
| re-OCR only the low-confidence cells | Pipeline › 2c: Confidence-driven re-OCR |
| plug in your own ML detector | Detectors › Writing your own |
| write a one-off post-processor | Post-processors › decorator |
| use scriva from the shell | CLI |
| port a FastAPI OCR app onto scriva | Cookbook › Rebuilding ocr-agent |