
scriva documentation

scriva is a composable, engine-agnostic OCR framework for Python. Every step of a recognition run — load, detect, recognise, post-process, export — is a swappable stage, and every stage is a small Protocol.

import scriva
text = scriva.read("scan.png")

That uses sensible defaults end-to-end. When you want to swap a model, change a prompt, or stream events, drop down to the Pipeline API — every default above is one constructor argument away.
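The "every stage is a small Protocol" idea can be illustrated without scriva at all. The sketch below is not scriva's API: `Recognizer`, `PostProcessor`, `MiniPipeline` and the toy stage classes are hypothetical names, chosen only to show how structural typing lets any object with the right method be passed as a constructor argument.

```python
from dataclasses import dataclass
from typing import Protocol


class Recognizer(Protocol):
    """Hypothetical stage contract: turn one image crop into text."""

    def recognize(self, crop: bytes) -> str: ...


class PostProcessor(Protocol):
    """Hypothetical stage contract: clean up recognised text."""

    def process(self, text: str) -> str: ...


class UppercaseRecognizer:
    """Toy stand-in for an OCR engine: decodes the bytes and upper-cases them."""

    def recognize(self, crop: bytes) -> str:
        return crop.decode("utf-8").upper()


class LowercaseRecognizer:
    """A second toy engine, used to show that stages are interchangeable."""

    def recognize(self, crop: bytes) -> str:
        return crop.decode("utf-8").lower()


class CollapseWhitespace:
    """Toy post-processor: trims the text and collapses runs of whitespace."""

    def process(self, text: str) -> str:
        return " ".join(text.split())


@dataclass
class MiniPipeline:
    """Hypothetical two-stage pipeline; each stage is one constructor argument."""

    recognizer: Recognizer
    postprocessor: PostProcessor

    def run(self, crop: bytes) -> str:
        return self.postprocessor.process(self.recognizer.recognize(crop))


crop = b"  Total   42.00 "
default_text = MiniPipeline(UppercaseRecognizer(), CollapseWhitespace()).run(crop)
# Swapping a stage means changing one argument; nothing else moves.
swapped_text = MiniPipeline(LowercaseRecognizer(), CollapseWhitespace()).run(crop)
```

Neither toy recognizer inherits from `Recognizer`; matching the method signature is enough, which is what keeps such stages small and easy to replace.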

import scriva
from scriva.schemas import Invoice
scriva.read("scan.png") # text out
scriva.extract("acme.pdf", schema=Invoice) # typed pydantic instance out
scriva.presets.invoice("acme.pdf") # tuned pipeline + schema

Pick the highest-level entry that fits and drop a rung when you need more control. See scriva.extract and Presets.

Three pillars for high-accuracy production OCR


When the project’s bar is “wrong answers are unacceptable,” scriva exposes three composable pillars. Every other reference page slots into one of them.

  1. Learn from your environment. Keep a SampleStore of your own corrections. The same store powers few-shot exemplars on the input side (RecognitionHint.from_store) and derived dictionaries on the output side (postprocess.dictionary.from_samples). One store, two roads — see samples.md › Two roads from a SampleStore.
  2. Cross-check, then route. Three accuracy levers — human-in-the-loop review, cross-check against the original crop (round-trip rendering or multi-recognizer consensus), and cross-check against ground-truth data — compose freely. See Pipeline › High-accuracy patterns.
  3. Manipulate the pixels the recognizer actually sees. Page-level preprocessing (orientation, deskew, dewarp) and region-level preprocessing (per-cell binarisation, horizontal/vertical slicing, per-role padding, glare removal, whiteboard clean-up) are both first-class. See Preprocessors.
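As a rough illustration of the second pillar, the following sketch shows one way multi-recognizer consensus could feed a routing decision. It is a standalone example under stated assumptions, not scriva's implementation: `Decision`, `cross_check` and the `min_agreement` threshold are hypothetical names, and the readings are made-up strings standing in for the output of several recognizers on the same crop.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Decision:
    text: str           # the reading that received the most votes
    agreement: float    # share of recognizers that produced `text`
    needs_review: bool  # True when agreement is below the threshold


def cross_check(readings: list[str], min_agreement: float = 0.6) -> Decision:
    """Majority vote over several readings of the same crop (hypothetical helper)."""
    if not readings:
        raise ValueError("need at least one reading")
    # On a tie, Counter.most_common keeps the reading that was seen first.
    winner, votes = Counter(readings).most_common(1)[0]
    agreement = votes / len(readings)
    return Decision(winner, agreement, needs_review=agreement < min_agreement)


# All three recognizers agree: accept automatically.
accepted = cross_check(["INV-1001", "INV-1001", "INV-1001"])
# Two of three agree: still above the assumed 0.6 threshold.
majority = cross_check(["INV-1001", "INV-1001", "INV-1OO1"])
# No two recognizers agree: route the cell to a human reviewer.
disputed = cross_check(["INV-1001", "INV-1OO1", "INV-100I"])
```

The threshold is the routing knob: raising it sends more cells to human review, lowering it trusts the recognizers more.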
  1. Concepts
  2. Quickstart
  3. Architecture
  1. scriva.extract — schema-first one-liner, classify, batch, watch.
  2. Presets — pre-tuned pipelines per document kind.
  3. Schemas — built-in pydantic models (Invoice, Receipt, IdCard, …).
  4. Working with results — DocumentResult accessors and serialisation.
  5. CLI — scriva extract, scriva watch, scriva eval, scriva annotate.
  1. Pipeline
  2. Preprocessors
  3. Detectors
  4. Recognizers
  5. Post-processors
  6. Exporters
  7. Caching
  8. Sample stores
  1. Reliability — retries, rate limits, timeouts, cost caps, PII redaction, telemetry.
  2. Evaluation — scriva.eval, ground-truth format, calibration.
  1. Cookbook — worked recipes per document kind and per workflow.
  2. Domain packs — pre-built pipelines for forms, P&ID, agentic extraction, annotation.
  3. Pipeline › Recipes — human-in-the-loop review, confidence-driven re-OCR, recognizer diff.
  4. Cookbook › Rebuilding ocr-agent — case study porting a Japanese-forms / P&ID / annotation app, plus FastAPI hosting.
| You want to… | Read… |
| --- | --- |
| just read an image to text | Quickstart › one-liner |
| extract a typed pydantic instance | scriva.extract |
| understand the design | Concepts |
| OCR an invoice / receipt / ID | Cookbook |
| OCR a tabular form to Excel | Domains › forms |
| route an unknown document | scriva.classify_document |
| ingest a folder continuously | Cookbook › Batch & watch |
| swap OpenAI for Anthropic | Recognizers |
| add a new output format | Exporters |
| serialise a result on the fly | Exporters › result verbs |
| cut API spend | Caching |
| set a hard cost cap | Reliability › Cost caps |
| retry on 429 / 5xx | Reliability › Retries |
| redact PII before sending to a VLM | Reliability › PII redaction |
| stream pages from a 100-page PDF | Cookbook › Long PDFs |
| score a pipeline against ground truth | Evaluation |
| OCR pages in parallel | Pipeline › options |
| straighten a rotated scan | Preprocessors › orientation |
| binarise / sharpen each cell separately | Preprocessors › Region preprocessors |
| slice a tall cell into rows | Preprocessors › Slicers |
| use a few-shot exemplar from past corrections | Recognizers › RecognitionHint |
| derive a correction dictionary from a labelled store | Post-processors › dictionary.from_samples |
| let a human review detected cells | Pipeline › Lever 1: HITL |
| cross-check the recognised text against the crop | Pipeline › Lever 2: Cross-check vs original |
| cross-check a run against ground truth | Pipeline › Lever 3: Cross-check vs ground-truth |
| re-OCR only the low-confidence cells | Pipeline › 2c: Confidence-driven re-OCR |
| plug in your own ML detector | Detectors › Writing your own |
| write a one-off post-processor | Post-processors › decorator |
| use scriva from the shell | CLI |
| port a FastAPI OCR app onto scriva | Cookbook › Rebuilding ocr-agent |