FastAPI enhancements roadmap
This page tracks planned and shipped improvements for using pydantable with FastAPI and Pydantic. For the current integration guide, see FASTAPI; for the shortest runnable service, see GOLDEN_PATH_FASTAPI.
When to use what (quick reference)
| Need | Prefer | Avoid / notes |
|---|---|---|
| Lazy file scan + transforms + async materialize | await Model.aread_*, then select / filter, then await …acollect() | amaterialize_* when you only need a lazy pipeline (eager loads full columns first). |
| Eager columns from file or SQL, then typed frame | await amaterialize_* / await afetch_sql from pydantable.io, then MyModel(cols) | Mixing eager load with unnecessary extra copies; see IO_OVERVIEW. |
| Row list JSON response_model | await df.acollect() (or collect() in sync routes) | to_dict() when clients expect columnar JSON. |
| Columnar JSON response | await df.ato_dict() | acollect() if the client expects a list of objects. |
| Large responses / back-pressure | astream() + ndjson_streaming_response (FASTAPI helpers) | Single giant to_dict() in memory. |
| Untrusted user data | Default DataFrameModel validation | trusted_mode="shape_only" unless upstream guarantees cells. |
| Shared thread pool for engine / I/O offload | executor_lifespan + Depends(get_executor) | Relying only on the default asyncio thread pool under heavy load. |
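The difference between the row-list and columnar response shapes in the table can be illustrated without pydantable. The sketch below is plain Python: the helper name columnar_to_rows and the sample data are hypothetical, and pydantable's own acollect() / to_dict() are not called here.

```python
from typing import Any


def columnar_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Convert a columnar mapping (column name -> values) into a list of row dicts."""
    names = list(columns)
    # zip(*...) walks the columns in lockstep, producing one tuple per row.
    return [dict(zip(names, values)) for values in zip(*columns.values())]


# Columnar shape: one list per column (the shape this page associates with to_dict()).
columnar = {"id": [1, 2], "name": ["ada", "bob"]}

# Row-list shape: one object per row (the shape this page associates with acollect()).
rows = columnar_to_rows(columnar)
```

Columnar payloads repeat each key once, while row lists repeat every key per row; which one to return depends on what the client expects to parse.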
Production pattern (lifespan + handlers + NDJSON)
Typical main.py wiring:
from contextlib import asynccontextmanager

from fastapi import FastAPI
from pydantable.fastapi import (
    executor_lifespan,
    get_executor,
    ndjson_streaming_response,
    register_exception_handlers,
)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Keep one shared thread pool alive for the lifetime of the app.
    async with executor_lifespan(app, max_workers=8, thread_name_prefix="myapp"):
        yield


app = FastAPI(lifespan=lifespan)
register_exception_handlers(app)
In routes, inject the shared pool with Depends(get_executor) and pass it to acollect(executor=...). For streaming responses, return ndjson_streaming_response(df.astream(..., executor=executor)). Install the FastAPI extras with pip install 'pydantable[fastapi]'.
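For readers unfamiliar with the lifespan pattern, here is a minimal stdlib-only sketch of what a shared-executor lifespan could look like. shared_executor_lifespan and the app.state.executor attribute are hypothetical stand-ins, not pydantable's executor_lifespan; only the max_workers and thread_name_prefix parameter names are borrowed from the wiring above.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def shared_executor_lifespan(app, max_workers=8, thread_name_prefix="myapp"):
    """Hypothetical stand-in: create one pool at startup, shut it down at exit."""
    executor = ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix=thread_name_prefix
    )
    app.state.executor = executor  # assumed storage location, for illustration only
    try:
        yield executor
    finally:
        # Wait for in-flight work so nothing is dropped on shutdown.
        executor.shutdown(wait=True)


async def demo() -> str:
    app = SimpleNamespace(state=SimpleNamespace())  # stands in for a FastAPI app
    async with shared_executor_lifespan(app, max_workers=2):
        loop = asyncio.get_running_loop()
        # Offload a blocking call to the shared pool instead of the default one.
        return await loop.run_in_executor(
            app.state.executor, lambda: threading.current_thread().name
        )


worker_name = asyncio.run(demo())
```

Creating the pool in the lifespan rather than at import time keeps ownership explicit: the pool exists only while the app is running and is shut down with it.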
NDJSON helpers (pydantable.fastapi)
- ndjson_streaming_response(async_iter, media_type=...): returns a Starlette StreamingResponse. Default media_type is application/x-ndjson; use application/jsonlines if your clients expect that label instead.
- ndjson_chunk_bytes(async_iter): async iterator of UTF-8 lines (JSON object + \n per chunk). Use when you set headers or status yourself on StreamingResponse.
- Chunks are whatever astream() yields (dict[str, list]). Values must be JSON-serializable (standard json.dumps rules).
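As an illustration of the line format described above, the following stdlib-only sketch encodes columnar chunks as NDJSON. fake_astream and encode_ndjson are hypothetical names standing in for astream() and ndjson_chunk_bytes; the real helpers may differ in detail.

```python
import asyncio
import json
from typing import Any, AsyncIterator


async def fake_astream() -> AsyncIterator[dict[str, list[Any]]]:
    """Hypothetical stand-in for astream(): yields columnar chunks."""
    yield {"id": [1, 2], "name": ["ada", "bob"]}
    yield {"id": [3], "name": ["cy"]}


async def encode_ndjson(
    chunks: AsyncIterator[dict[str, list[Any]]],
) -> AsyncIterator[bytes]:
    """Encode each chunk as one UTF-8 JSON line terminated by a newline."""
    async for chunk in chunks:
        # json.dumps raises TypeError for values that are not JSON-serializable.
        yield (json.dumps(chunk) + "\n").encode("utf-8")


async def collect_lines() -> list[bytes]:
    return [line async for line in encode_ndjson(fake_astream())]


lines = asyncio.run(collect_lines())
```

Each element of lines is one complete JSON document, so a client can parse the stream line by line without buffering the whole body.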
Troubleshooting
| Symptom | Likely cause |
|---|---|
| 422 on POST before your handler runs | FastAPI RequestValidationError (body shape/types). This is not the pydantic.ValidationError handler installed by register_exception_handlers, which covers errors raised in-route. |
| 503 with detail about _core | MissingRustExtensionError: install a wheel or build the native extension (DEVELOPER). |
| Empty stream body | astream() produced no chunks (empty frame or zero batches); NDJSON is still valid with an empty body. |
| High latency under concurrent requests | Raise executor_lifespan(..., max_workers=...) or reduce competing work on the default thread pool (EXECUTION). |
| Client disconnect does not stop engine work | Documented limitation: cancelling await acollect() does not cancel in-flight Rust work (EXECUTION). |
| 500 after 422-valid columnar JSON (no handlers) | ColumnLengthMismatchError when column lengths differ. Call register_exception_handlers for 400 with a detail string, or catch in-route. |
| OpenAPI shows list but clients send wrong element type | FastAPI returns 422 from Pydantic; fix payload types. |
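The cancellation row describes behaviour that also applies to any work offloaded to a thread. The stdlib-only sketch below shows the general effect with asyncio and a thread pool; blocking_work is a hypothetical stand-in for engine work, and nothing here calls pydantable.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor

finished = threading.Event()


def blocking_work() -> None:
    """Stand-in for engine work running on a worker thread."""
    time.sleep(0.2)
    finished.set()  # still reached after the awaiting side was cancelled


async def demo() -> bool:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = loop.run_in_executor(pool, blocking_work)
        await asyncio.sleep(0.05)  # give the worker time to start
        future.cancel()  # simulates a client disconnect cancelling the await
        try:
            await future
        except asyncio.CancelledError:
            pass
        cancelled = future.cancelled()
    # Leaving the `with` block waits for the thread, which runs to completion.
    return cancelled


was_cancelled = asyncio.run(demo())
```

The awaiting side sees a cancellation, yet the worker thread keeps going until its function returns, which is why long-running work should be bounded by design rather than by relying on disconnects.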
Phased roadmap
Phase 1 (shipped with this doc’s tooling)
- ndjson_streaming_response / ndjson_chunk_bytes in pydantable.fastapi: NDJSON StreamingResponse from astream() without duplicating encode logic.
- Combined lifespan + exception handlers pattern documented here (explicit executor_lifespan + register_exception_handlers; no hidden globals).
- This quick reference table and cross-links.
Phase 2: OpenAPI / bodies (shipped)
- columnar_body_model / columnar_body_model_from_dataframe_model in pydantable.fastapi: generated Pydantic models with list[T] per column, optional example= / json_schema_extra= for OpenAPI.
- Use the same model as response_model for columnar JSON (to_dict() shape). NDJSON streams still have no per-line OpenAPI schema.
Phase 3: Dependencies (shipped)
- columnar_dependency / rows_dependency build a DataFrameModel from validated bodies; see FASTAPI Columnar OpenAPI and Depends and fastapi_columnar_bodies.
- Optional named executors (e.g. app.state.executors["io"]) for separate pools: not packaged; pattern only.
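Since the named-executor idea is described as a pattern rather than a packaged feature, one possible shape is sketched below using only the standard library. The lifespan name, pool sizes, and the "engine" key are assumptions made for illustration; only app.state.executors["io"] comes from the text above.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import ExitStack, asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def named_executors_lifespan(app):
    """Hypothetical lifespan that keeps separate pools under app.state.executors."""
    with ExitStack() as stack:
        app.state.executors = {
            # Pool sizes and prefixes are illustrative assumptions.
            "io": stack.enter_context(
                ThreadPoolExecutor(4, thread_name_prefix="io")
            ),
            "engine": stack.enter_context(
                ThreadPoolExecutor(2, thread_name_prefix="engine")
            ),
        }
        # When the lifespan ends, ExitStack shuts both pools down.
        yield


async def demo() -> dict[str, str]:
    app = SimpleNamespace(state=SimpleNamespace())  # stands in for a FastAPI app
    loop = asyncio.get_running_loop()
    async with named_executors_lifespan(app):
        names = {}
        for key, pool in app.state.executors.items():
            # Each call runs on a thread owned by the pool selected by name.
            names[key] = await loop.run_in_executor(
                pool, lambda: threading.current_thread().name
            )
        return names


thread_names = asyncio.run(demo())
```

Separate pools keep slow I/O from starving CPU-bound engine work, at the cost of having to choose a pool explicitly at each call site.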
Phase 4: Errors (shipped)
- PydantableUserError, ColumnLengthMismatchError in pydantable.errors (subclass ValueError). register_exception_handlers maps ColumnLengthMismatchError → 400. Further narrow types can be added incrementally (FASTAPI error table).
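The mapping from a user-facing error to a 400 response can be sketched in plain Python. ColumnLengthMismatch, check_column_lengths, and to_http_error are hypothetical names for illustration only; they are not the classes in pydantable.errors or the handlers that register_exception_handlers installs.

```python
class ColumnLengthMismatch(ValueError):
    """Hypothetical stand-in for an error raised when column lengths differ."""


def check_column_lengths(columns: dict[str, list]) -> None:
    """Raise if the columns of a columnar payload do not all have the same length."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ColumnLengthMismatch(f"column lengths differ: {lengths}")


def to_http_error(exc: Exception) -> tuple[int, dict[str, str]]:
    """Map the user error to 400 with a detail string; anything else stays a 500."""
    if isinstance(exc, ColumnLengthMismatch):
        return 400, {"detail": str(exc)}
    return 500, {"detail": "Internal Server Error"}


try:
    check_column_lengths({"id": [1, 2], "name": ["ada"]})
except ColumnLengthMismatch as exc:
    status, body = to_http_error(exc)
```

Subclassing ValueError means existing `except ValueError` blocks keep working, while a dedicated type lets an exception handler pick out only the errors that are the caller's fault.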
Phase 5: Ops / observability (shipped as docs)
- fastapi_observability: request-ID middleware, observe.emit / span, optional OpenTelemetry bridge note.
Phase 6: Long-running work (shipped as docs)
- fastapi_background_tasks: BackgroundTasks + submit / ExecutionHandle, executor alignment, cancellation limits (EXECUTION).
Phase 7: Testing (shipped)
- pydantable.testing.fastapi: fastapi_app_with_executor, fastapi_test_client (lifespan-aware TestClient). See FASTAPI Columnar OpenAPI and Depends.
Phase 8: Templates (shipped as layout + docs)
- Checked-in example service layout: docs/examples/fastapi/service_layout/ (main.py, routers/, README.md). Not a published package; copy it into your repo. A Cookiecutter / uv template remains optional future work.
See also
- FASTAPI
- GOLDEN_PATH_FASTAPI
- fastapi_observability
- fastapi_background_tasks
- fastapi_columnar_bodies
- async_lazy_pipeline
- EXECUTION
- DEVELOPER (native extension build / wheels)
- ROADMAP (product-wide backlog)