Quickstart

A five-minute tour. We’ll go from the one-liner to a full custom recipe that OCRs a scanned tabular form and writes an Excel file.

Install

pip install "scriva[openai,excel]"

Extras pull optional engine adapters and exporters: openai, anthropic, bedrock, tesseract, paddle, azure, excel, pdf, parquet, vector-cache. The core library has no mandatory dependency on any OCR engine.

Set credentials

export OPENAI_API_KEY=sk-...

The one-liner

import scriva
text = scriva.read("scan.png")

scriva.read picks a sensible Worker and Orchestrator recipe based on the extras you installed and returns the recognised text. A few common knobs:

scriva.read("scan.png")                              # -> str
scriva.read("scan.pdf", as_="json")                  # -> dict
scriva.read("scan.png", as_="dataframe")             # -> pandas.DataFrame
scriva.read("scan.png", to="out.xlsx")               # writes file, returns Path
scriva.read("scan.png", language="ja", worker="anthropic")

That is the entire API for the 80% case. The rest of this page is for when you want to inspect, swap, or extend.

Loading PDFs

For multi-page inputs the loader exposes a few knobs:

from scriva import Document

doc = Document.load(
    "scan.pdf",
    pages=None,                  # Sequence[int] | slice | None — None = all pages
    dpi=300,                     # raster resolution for non-text PDFs
    password=None,               # encrypted PDFs
    renderer="pdfium",           # "pdfium" | "poppler"
)

PDFs without an embedded text layer are rasterised at dpi and then treated as page images by the rest of the recipe. PDFs with a text layer are still rasterised: scriva does not currently short-circuit on embedded text, so treat that text as an OCR cache, not a substitute. The pdf extra installs pypdfium2; renderer="poppler" requires the poppler binary on $PATH.
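To get a feel for what dpi does to image size: PDF page dimensions are given in points (1/72 inch by default), so pixel dimensions scale linearly with dpi. The helper below is a hypothetical illustration, not a scriva API, and a real renderer may round slightly differently.

```python
def raster_size(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Approximate pixel size of a page rasterised at `dpi`.

    Hypothetical helper for illustration only; not part of scriva.
    """
    # One PDF point is 1/72 inch in default user space.
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# US Letter is 612 x 792 pt.
print(raster_size(612, 792, dpi=300))  # (2550, 3300)
```

Halving dpi to 150 halves each side and cuts the pixel count to a quarter, which is worth keeping in mind when choosing a resolution.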

pages accepts a list ([0, 2, 5]) or a slice (slice(0, 10)); None, the default, loads every page. One-page PDFs do not need it: passing the path straight to a recipe, as in recipe("file.pdf"), always works.
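One way to picture how the three accepted forms of pages could expand into explicit page indices is the sketch below. resolve_pages is a hypothetical name, not part of scriva; zero-based indexing and raising on out-of-range list entries are assumptions made for the example.

```python
from __future__ import annotations

from collections.abc import Sequence


def resolve_pages(pages: Sequence[int] | slice | None, page_count: int) -> list[int]:
    """Expand a `pages` argument into explicit indices (illustrative only)."""
    if pages is None:
        # None means "all pages".
        return list(range(page_count))
    if isinstance(pages, slice):
        # Slicing a range clamps to the document length, like list slicing.
        return list(range(page_count)[pages])
    for index in pages:
        # Assumption: explicit indices are zero-based and must exist.
        if not 0 <= index < page_count:
            raise IndexError(f"page {index} out of range for a {page_count}-page document")
    return list(pages)


print(resolve_pages(slice(0, 10), page_count=4))  # [0, 1, 2, 3]
print(resolve_pages([0, 2, 5], page_count=6))     # [0, 2, 5]
```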

Build a Worker

A Worker converts one crop into one Recognition. Build it by chaining decorators on an engine factory:

from scriva import Worker
from scriva.prompts import Prompt

worker = (
    Worker.openai("gpt-4o")
    .prompt(Prompt.cell)
    .cache(".scriva_cache")
    .score(method="rendering")
    .retry(times=3)
    .parallel(max=8)
)

Each call returns a new Worker; chaining never mutates. See worker.md for the full reference.
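If "returns a new Worker" sounds abstract, here is a minimal sketch of the same non-mutating chaining style using a frozen dataclass. SketchWorker and its fields are invented for the illustration; this is not how scriva's Worker is necessarily implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchWorker:
    """Toy stand-in for a chainable, immutable worker (not scriva's class)."""

    engine: str
    retries: int = 0
    cache_dir: str | None = None

    def retry(self, times: int) -> SketchWorker:
        # replace() copies the instance with one field changed.
        return replace(self, retries=times)

    def cache(self, path: str) -> SketchWorker:
        return replace(self, cache_dir=path)


base = SketchWorker("gpt-4o")
tuned = base.cache(".scriva_cache").retry(times=3)

print(base)   # SketchWorker(engine='gpt-4o', retries=0, cache_dir=None)
print(tuned)  # SketchWorker(engine='gpt-4o', retries=3, cache_dir='.scriva_cache')
```

Because base is never modified, it is safe to keep it around as a shared starting point and derive several variants from it.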

Specialised Workers

from scriva import Worker

Worker.number()                # numbers / currency
Worker.date()                  # locale-aware dates
Worker.checkbox()              # checked / unchecked / ambiguous
Worker.handwriting()           # cursive / printed handwriting
Worker.signature()             # presence + similarity, not text
Worker.barcode()               # 1D / 2D / QR
Worker.skip()                  # short-circuit; returns text=None

Build an Orchestrator recipe

The Orchestrator owns the flow around the Worker. Chain steps in the natural order:

from scriva import Orchestrator

recipe = (
    Orchestrator()
    .deskew()
    .crop(margins="auto")
    .split.grid()
    .classify(blank=True, merged=True)
    .recognize(worker)
    .reconstruct.grid()
    .export.xlsx("out.xlsx")
    .options(page_concurrency=4)
)

result = recipe("scan.png")
print(f"Recognised {len(result.regions_with_text())} cells, "
      f"mean confidence {result.mean_confidence:.2f}")

A few things to notice:

Recipes are reusable. recipe is immutable; call it on as many sources as you like.
recipe(doc) is sync. The pipeline is async internally, but the call blocks until it finishes. Use await recipe.aio(doc) from an async caller, or async for page in recipe.stream(doc): … for streaming.
A cache can be just a string. The path in worker.cache(".scriva_cache") becomes a FileSystemCache. See caching.md for Cache.layered(...), Redis, and friends.
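The sync-over-async point is easier to see in miniature. The sketch below shows one common way to put a blocking call in front of an async implementation; SketchRecipe is a made-up class, and scriva's internals may well differ.

```python
import asyncio


class SketchRecipe:
    """Toy recipe with an async core and a blocking front door (illustrative only)."""

    async def aio(self, source: str) -> str:
        await asyncio.sleep(0)  # stand-in for real async work
        return f"processed {source}"

    def __call__(self, source: str) -> str:
        # asyncio.run starts a fresh event loop and blocks until the coroutine
        # finishes. It raises RuntimeError if a loop is already running in this
        # thread, which is why async callers should await aio() instead.
        return asyncio.run(self.aio(source))


sketch = SketchRecipe()
print(sketch("scan.png"))  # processed scan.png
```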

Per-cell-type Workers

Forms mix data types. Dispatch by region.kind:

worker_text = Worker.openai("gpt-4o").cache(".scriva_cache")

recipe = (
    Orchestrator()
    .deskew()
    .split.grid()
    .classify(blank=True, merged=True)
    .recognize.by_kind({
        "text":     worker_text,
        "number":   Worker.number(),
        "date":     Worker.date(),
        "checkbox": Worker.checkbox(),
        "blank":    Worker.skip(),
    })
    .reconstruct.grid()
    .export.xlsx("out.xlsx")
)

region.kind is set by .classify(...). Override or extend with .classify(kinds=...) — see postprocessors.md.
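Conceptually, dispatching by kind is a dictionary lookup. The sketch below uses plain functions as stand-in workers; make_dispatcher is a hypothetical name, and falling back to the "text" worker for unmapped kinds is an assumption of this example, not documented scriva behaviour.

```python
from __future__ import annotations

from typing import Callable, Optional

WorkerFn = Callable[[str], Optional[str]]


def make_dispatcher(workers: dict[str, WorkerFn], default: str = "text") -> Callable[[str, str], Optional[str]]:
    """Return a function that routes each crop to the worker for its kind."""

    def recognise(kind: str, crop: str) -> Optional[str]:
        # Assumption: kinds without their own worker use the default one.
        worker = workers.get(kind, workers[default])
        return worker(crop)

    return recognise


recognise = make_dispatcher({
    "text": str.strip,
    "number": lambda crop: crop.replace(",", ""),
    "blank": lambda crop: None,  # mirrors a skip worker returning text=None
})

print(recognise("number", "1,200"))      # 1200
print(recognise("signature", " Jane "))  # Jane
print(recognise("blank", "   "))         # None
```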

Get text out

The DocumentResult knows how to serialise itself, so you don’t need to wire an exporter just to print things:

result.render()              # str — reading-order text
result.to_dict()             # dict
result.to_dataframe()        # pandas
result.to_excel("out.xlsx")  # writes file, returns Path
result.show()                # matplotlib viz of boxes (optional extra)

Use the in-recipe .export.xlsx(...) when you want the write to happen during the run (it emits an event and respects page_concurrency); use result.to_excel(...) for ad-hoc serialisation.

Inspect what happened

Every step emits structured events. Pass a callback to see them live:

recipe("scan.png", on_event=print)

Or stream them yourself:

# Run this inside an async function (or an async REPL); async for is not valid at module level.
async for event in recipe.events("scan.png"):
    print(event.stage, event.kind, event.payload)

Sample output:

preprocess     started
preprocess     finished      ms=42 step=deskew angle=-0.8
split          finished      rows=12 cols=6 cells=72
classify       finished      blank=14 merged=3
recognize      started       total=58
recognize      progress      done=10 total=58 cache_hits=3 worker=text
recognize      finished      ms=4910
reconstruct    finished      shape=grid regions=58
export         finished      path=out.xlsx format=xlsx bytes=18432

See Architecture › Observability for SSE and async-iterator patterns.
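A callback does not have to be print. As a rough sketch, here is a callable that totals the ms values of finished events per stage. SketchEvent stands in for whatever event type scriva emits; the stage, kind, and payload attributes come from the snippet above, while payload being a dict is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SketchEvent:
    """Stand-in event with the attributes used above (not scriva's class)."""

    stage: str
    kind: str
    payload: dict = field(default_factory=dict)


class StageTimer:
    """Callable that sums reported milliseconds per stage."""

    def __init__(self) -> None:
        self.ms_by_stage: dict[str, int] = defaultdict(int)

    def __call__(self, event: SketchEvent) -> None:
        # Only finished events that actually report a duration are counted.
        if event.kind == "finished" and "ms" in event.payload:
            self.ms_by_stage[event.stage] += event.payload["ms"]


timer = StageTimer()
for event in [
    SketchEvent("preprocess", "started"),
    SketchEvent("preprocess", "finished", {"ms": 42, "step": "deskew"}),
    SketchEvent("recognize", "progress", {"done": 10, "total": 58}),
    SketchEvent("recognize", "finished", {"ms": 4910}),
]:
    timer(event)

print(dict(timer.ms_by_stage))  # {'preprocess': 42, 'recognize': 4910}
```

If the real events match this shape, the same object could be passed as on_event=timer.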

Swap any piece

from scriva import Worker
recipe = recipe.replace("recognize", Worker.anthropic("claude-opus-4-7"))

Want a different splitter? Implement the protocol — five lines:

from scriva import Splitter, Layout, Region

class MySplitter(Splitter):
    async def split(self, page) -> Layout:
        ...

…or skip the class entirely with the decorator:

from scriva import splitter, Layout, Region

@splitter
async def my_splitter(page):
    boxes = my_model.predict(page.image)  # my_model is your own detector, defined elsewhere
    return Layout.from_regions([Region(bbox=b, role="data") for b in boxes],
                               page=page)

recipe = recipe.replace("split", my_splitter)

The same pattern works for @worker, @preprocessor, @classifier, @reconstruct, and @exporter.
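For the curious, a decorator like this can be little more than a wrapper that gives a plain async function the method the protocol expects. The sketch below is a guess at the general shape, with invented names (FunctionSplitter, splitter_sketch) and strings standing in for pages; it is not scriva's implementation.

```python
import asyncio
from typing import Awaitable, Callable


class FunctionSplitter:
    """Adapts an async function to an object with a split() method."""

    def __init__(self, func: Callable[[str], Awaitable[list]]) -> None:
        self._func = func
        self.__name__ = getattr(func, "__name__", "splitter")

    async def split(self, page: str) -> list:
        return await self._func(page)


def splitter_sketch(func: Callable[[str], Awaitable[list]]) -> FunctionSplitter:
    """Hypothetical decorator: wrap the function instead of subclassing."""
    return FunctionSplitter(func)


@splitter_sketch
async def halves(page: str) -> list:
    # Toy "layout": cut the page text into two halves.
    mid = len(page) // 2
    return [page[:mid], page[mid:]]


print(asyncio.run(halves.split("abcdef")))  # ['abc', 'def']
```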
