pydantable.io
Eager and lazy I/O helpers. Most symbols are re-exported from the top-level pydantable package.
Unified data I/O: lazy read_* roots, materialize_* column dicts, and export_*.
- read_* / aread_* return :class:pydantable_native._core.ScanFileRoot for lazy Polars scans (no full Python column lists).
- materialize_* (and fetch_sql_raw / fetch_sql / fetch_*_url / fetch_mongo) return dict[str, list].
- export_* / aexport_* write column dicts to files; write_mongo inserts into a PyMongo collection.
- afetch_mongo / aiter_mongo / awrite_mongo use the native PyMongo async API when given an async collection; otherwise asyncio.to_thread (or an optional executor) for sync pymongo.collection.Collection.
BeanieWriteOptions
dataclass
Options that control ODM-aware write behavior.
Source code in python/pydantable/io/beanie.py
arrow_table_to_column_dict
Convert a PyArrow Table to dict[str, list] (copies into Python lists).
Source code in python/pydantable/io/arrow.py
record_batch_to_column_dict
Convert a PyArrow RecordBatch to dict[str, list].
Source code in python/pydantable/io/arrow.py
iter_chain_batches
Yield batches from several files by chaining per-file iterators.
iter_one is typically lambda p: iter_parquet(p) (or another iter_*).
This does not merge schemas across files; callers must ensure compatible layouts.
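The chaining described above can be sketched with the standard library alone. This is an illustrative stand-in, not pydantable's implementation: chain_batches and fake_iter are hypothetical names, and fake_iter plays the role of a per-file iterator such as iter_parquet.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import chain

Batch = dict[str, list]


def chain_batches(
    paths: Iterable[str], iter_one: Callable[[str], Iterator[Batch]]
) -> Iterator[Batch]:
    """Yield every batch of the first path, then the second, and so on."""
    # chain.from_iterable is lazy: iter_one(p) is only called when p is reached.
    return chain.from_iterable(iter_one(p) for p in paths)


def fake_iter(path: str) -> Iterator[Batch]:
    """Hypothetical per-file iterator standing in for something like iter_parquet."""
    yield {"source": [path], "n": [1]}
    yield {"source": [path], "n": [2]}


batches = list(chain_batches(["a.parquet", "b.parquet"], fake_iter))
print([b["source"][0] for b in batches])
```

Because batches are passed through untouched, two files with different columns would simply produce differently shaped batches, which is why schema compatibility is left to the caller.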
Source code in python/pydantable/io/batches.py
afetch_beanie
async
afetch_beanie(document_or_query, *, criteria=None, projection_model=None, fields=None, fetch_links=False, nesting_depth=None, nesting_depths_per_field=None, flatten=True, id_column='id')
Fetch Beanie documents (or a Beanie query) into dict[str, list].
- document_or_query: a Beanie Document class (preferred) or a query object returned from Document.find(...).
- fields: convenience projection (builds a temporary projection model).
- projection_model: passed to Beanie .project(...).
- fetch_links / nesting_depth*: forwarded to the Beanie find call (relations).
- flatten: if True, nested objects are flattened into dot-path keys.
- id_column: normalize the Mongo id to id (default) or _id.
Source code in python/pydantable/io/beanie.py
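To make the flatten and id_column options concrete, here is a minimal sketch of dot-path flattening on a plain dict. flatten_doc is a hypothetical helper, not the function pydantable uses, and leaving lists unflattened is an assumption of this sketch.

```python
from typing import Any


def flatten_doc(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into a single level keyed by dot paths."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_doc(value, path))  # recurse into nested objects
        else:
            flat[path] = value  # lists and scalars are kept as-is (an assumption)
    return flat


doc = {"_id": "abc", "user": {"name": "Ada", "address": {"city": "London"}}}
flat = flatten_doc(doc)
# Rename the Mongo "_id" key to "id", mirroring what id_column="id" describes.
flat["id"] = flat.pop("_id")
print(flat)
```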
aiter_beanie
async
aiter_beanie(document_or_query, *, criteria=None, batch_size=1000, projection_model=None, fields=None, fetch_links=False, nesting_depth=None, nesting_depths_per_field=None, flatten=True, id_column='id')
Yield rectangular column-dict batches from Beanie results.
Source code in python/pydantable/io/beanie.py
awrite_beanie
async
Insert documents via Beanie so validate_on_save/actions can run.
This is intentionally not a high-throughput bulk insert helper; if you want raw
speed and are ok bypassing ODM hooks, use :func:pydantable.write_mongo /
:func:pydantable.io.write_mongo instead.
Source code in python/pydantable/io/beanie.py
iter_avro
Yield Avro batches via PyArrow (falls back to full read if needed).
Source code in python/pydantable/io/extras.py
iter_bigquery
Yield BigQuery results in Arrow-backed batches when available.
Source code in python/pydantable/io/extras.py
iter_delta
Yield Delta (Parquet dataset) batches via PyArrow dataset scanner.
Source code in python/pydantable/io/extras.py
iter_excel
Yield Excel rows as dict[str, list] batches (openpyxl read-only).
Source code in python/pydantable/io/extras.py
iter_kafka_json
iter_kafka_json(topic, *, bootstrap_servers, max_messages=None, batch_size=1000, experimental=True, **consumer_config)
Stream JSON payloads from Kafka, yielding batches as dict[str, list].
Stops after max_messages if provided; otherwise runs until poll returns empty.
Source code in python/pydantable/io/extras.py
iter_orc
Yield ORC batches via PyArrow.
Source code in python/pydantable/io/extras.py
iter_snowflake
Yield Snowflake query results in batches (cursor.fetchmany).
Source code in python/pydantable/io/extras.py
read_bigquery
Run a BigQuery SQL string via google-cloud-bigquery → dict[str, list].
Source code in python/pydantable/io/extras.py
read_csv_stdin
Read CSV from stdin (or stream) via a temporary file + :func:materialize_csv.
Source code in python/pydantable/io/extras.py
read_delta
Read a Delta table directory via PyArrow dataset ([arrow] extra).
Source code in python/pydantable/io/extras.py
read_excel
Load the first sheet (or sheet_name) from .xlsx via openpyxl → dict[str, list].
Source code in python/pydantable/io/extras.py
read_kafka_json_batch
read_kafka_json_batch(topic, *, bootstrap_servers, max_messages=100, experimental=True, **consumer_config)
Poll JSON payloads from topic into columns key, value, partition, offset.
At-least-once delivery only; values must be JSON objects whose keys become columns when unioning (best-effort flatten).
Source code in python/pydantable/io/extras.py
read_snowflake
Execute sql on Snowflake via snowflake-connector-python (experimental).
Source code in python/pydantable/io/extras.py
write_csv_stdout
Write data as CSV to stdout (or stream) using :func:export_csv to a temp file.
Source code in python/pydantable/io/extras.py
fetch_bytes
Download url and return raw bytes (stdlib urllib).
If max_bytes is set, read in chunks and raise ValueError if the body
would exceed that size (partial data is not returned).
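The max_bytes behavior can be illustrated with a chunked read over any binary stream. This is a sketch under stated assumptions: read_capped is a hypothetical name, io.BytesIO stands in for the HTTP response body, and the chunk size is arbitrary.

```python
import io
from typing import BinaryIO, Optional


def read_capped(
    stream: BinaryIO, max_bytes: Optional[int] = None, chunk_size: int = 8192
) -> bytes:
    """Read a stream in chunks; raise ValueError once the body exceeds max_bytes."""
    if max_bytes is None:
        return stream.read()
    chunks: list[bytes] = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Fail instead of returning a truncated body.
            raise ValueError(f"response body exceeds max_bytes={max_bytes}")
        chunks.append(chunk)
    return b"".join(chunks)


small = read_capped(io.BytesIO(b"x" * 100), max_bytes=1024)
try:
    read_capped(io.BytesIO(b"x" * 5000), max_bytes=1024, chunk_size=512)
    too_big = False
except ValueError:
    too_big = True
```

Reading in chunks means an oversized body is rejected after roughly max_bytes have been read rather than after the whole download.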
Source code in python/pydantable/io/http.py
fetch_csv_url
Download CSV from url to a temp file and read via the Rust CSV path when possible.
Source code in python/pydantable/io/http.py
fetch_parquet_url
Download a Parquet file from url (HTTP(S) only) and materialize as dict[str, list].
Extra kwargs are forwarded to :func:fetch_bytes (e.g. max_bytes, timeout).
Source code in python/pydantable/io/http.py
read_from_object_store
Read s3://, gs://, or az:// style URIs via fsspec (optional dependency).
Experimental: requires pip install 'pydantable[cloud]' (or fsspec + backend).
max_bytes caps how much of the object is read into memory (streaming reads the
remote file in chunks until the limit). Full streaming without a cap is not implemented.
Source code in python/pydantable/io/http.py
iter_csv
Yield CSV rows in batches as dict[str, list].
Values are yielded as strings (or None for missing cells); downstream typed
constructors can validate/coerce as needed.
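The batch shape described above (string cells, None for missing cells) can be reproduced with the standard csv module. csv_batches is a hypothetical stand-in for illustration only; it does not claim to match how iter_csv is implemented.

```python
import csv
import io
from collections.abc import Iterator
from typing import Optional, TextIO


def csv_batches(
    stream: TextIO, batch_size: int = 2
) -> Iterator[dict[str, list[Optional[str]]]]:
    """Yield rectangular column dicts of up to batch_size rows; cells stay strings."""
    reader = csv.DictReader(stream, restval=None)  # short rows get None for missing cells
    names = reader.fieldnames or []
    batch: dict[str, list] = {name: [] for name in names}
    count = 0
    for row in reader:
        for name in names:
            batch[name].append(row[name])
        count += 1
        if count == batch_size:
            yield batch
            batch = {name: [] for name in names}
            count = 0
    if count:
        yield batch  # final partial batch


text = "id,score\n1,9.5\n2\n3,7.0\n"
csv_out = list(csv_batches(io.StringIO(text)))
print(csv_out)
```

Note that "9.5" stays a string here; converting it to a float is the job of whatever typed constructor consumes the batch.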
Source code in python/pydantable/io/iter_file.py
iter_ipc
Yield Arrow IPC in batches as dict[str, list].
- as_stream=False reads IPC file format.
- as_stream=True reads IPC stream format.
Source code in python/pydantable/io/iter_file.py
iter_json_array
Yield a JSON array-of-objects file in batches.
This is not incremental JSON parsing: the full file is currently loaded and then split into batches. It is provided for API uniformity.
Source code in python/pydantable/io/iter_file.py
iter_json_lines
Alias of :func:iter_ndjson.
Source code in python/pydantable/io/iter_file.py
iter_ndjson
Yield NDJSON (JSON Lines) as dict[str, list] batches.
Each line must be a JSON object; keys are unioned within each batch.
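Key union within a batch is easiest to see on a small input. The sketch below uses only the standard library; ndjson_batches is a hypothetical name, and both the first-seen column order and the None fill for absent keys are assumptions of this sketch rather than documented behavior.

```python
import io
import json
from collections.abc import Iterator
from typing import TextIO


def ndjson_batches(stream: TextIO, batch_size: int = 2) -> Iterator[dict[str, list]]:
    """Yield column dicts; keys are unioned within each batch, gaps filled with None."""

    def to_columns(rows: list[dict]) -> dict[str, list]:
        keys: list[str] = []
        for row in rows:
            keys.extend(k for k in row if k not in keys)  # first-seen order (an assumption)
        return {k: [row.get(k) for row in rows] for k in keys}

    rows: list[dict] = []
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        rows.append(json.loads(line))
        if len(rows) == batch_size:
            yield to_columns(rows)
            rows = []
    if rows:
        yield to_columns(rows)


text = '{"a": 1}\n{"a": 2, "b": "x"}\n{"c": true}\n'
nd_out = list(ndjson_batches(io.StringIO(text)))
print(nd_out)
```

Since the union is per batch, the second batch here has only column c; consumers that need one stable schema across batches have to align columns themselves.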
Source code in python/pydantable/io/iter_file.py
iter_parquet
Yield Parquet data in batches as dict[str, list].
Requires pyarrow (install pydantable[arrow]).
Source code in python/pydantable/io/iter_file.py
afetch_mongo_async
async
afetch_mongo_async(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, fields=None, session=None, max_time_ms=None)
Async :func:fetch_mongo for :class:~pymongo.asynchronous.collection.AsyncCollection.
Source code in python/pydantable/io/mongo.py
aiter_mongo_async
async
aiter_mongo_async(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, batch_size=1000, fields=None, session=None, max_time_ms=None)
Async :func:iter_mongo for :class:~pymongo.asynchronous.collection.AsyncCollection.
Source code in python/pydantable/io/mongo.py
awrite_mongo_async
async
Async :func:write_mongo for :class:~pymongo.asynchronous.collection.AsyncCollection.
Source code in python/pydantable/io/mongo.py
fetch_mongo
fetch_mongo(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, fields=None, session=None, max_time_ms=None)
Load all matching documents into a single dict[column_name, list].
Materializes the full cursor in memory; for large scans prefer :func:iter_mongo.
Source code in python/pydantable/io/mongo.py
is_async_mongo_collection
True for :class:pymongo.asynchronous.collection.AsyncCollection instances.
Source code in python/pydantable/io/mongo.py
iter_mongo
iter_mongo(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, batch_size=1000, fields=None, session=None, max_time_ms=None)
Yield dict[column_name, list] batches from a PyMongo Collection.find.
Each batch is rectangular. Document keys are merged per batch (sorted union unless
fields fixes the column order). Install pymongo (or pydantable[mongo]).
Source code in python/pydantable/io/mongo.py
write_mongo
Insert rows from a rectangular column dict via insert_many.
Returns the number of inserted document ids. Empty data or zero rows is a no-op
(returns 0).
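The column-dict-to-documents step can be sketched without a database. FakeCollection and write_columns are hypothetical names; the only PyMongo detail relied on is that insert_many returns a result object with an inserted_ids attribute. The rectangularity check is an assumption of this sketch.

```python
from types import SimpleNamespace
from typing import Any


class FakeCollection:
    """Hypothetical stand-in exposing the insert_many shape used by PyMongo."""

    def __init__(self) -> None:
        self.docs: list[dict[str, Any]] = []

    def insert_many(self, documents: list[dict[str, Any]]) -> SimpleNamespace:
        self.docs.extend(documents)
        return SimpleNamespace(inserted_ids=list(range(len(documents))))


def write_columns(collection: Any, data: dict[str, list]) -> int:
    """Pivot a rectangular column dict into documents and insert them."""
    if not data:
        return 0  # empty data: no-op
    lengths = {len(values) for values in data.values()}
    if len(lengths) != 1:
        raise ValueError("column dict is not rectangular")
    if lengths == {0}:
        return 0  # zero rows: nothing to insert
    names = list(data)
    documents = [dict(zip(names, row)) for row in zip(*data.values())]
    return len(collection.insert_many(documents).inserted_ids)


coll = FakeCollection()
inserted = write_columns(coll, {"id": [1, 2], "name": ["a", "b"]})
print(inserted, coll.docs)
```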
Source code in python/pydantable/io/mongo.py
aread_csv_rap
async
Load a CSV file with rapcsv.AsyncDictReader (non-blocking I/O).
Requires pip install 'pydantable[rap]' (rapcsv + rapfiles).
Source code in python/pydantable/io/rap_support.py
rap_csv_available
True when rapcsv and rapfiles are importable and rapfiles.open exists.
Source code in python/pydantable/io/rap_support.py
fetch_sql
fetch_sql(sql, bind, *, parameters=None, batch_size=None, auto_stream=True, auto_stream_threshold_rows=None)
Execute sql and return rows as dict[column_name, list] (materialized).
Source code in python/pydantable/io/sql.py
fetch_sql_raw
fetch_sql_raw(sql, bind, *, parameters=None, batch_size=None, auto_stream=True, auto_stream_threshold_rows=None)
Execute sql and return rows as dict[column_name, list] (materialized).
bind may be any SQLAlchemy URL your environment has drivers for, or a
:class:~sqlalchemy.engine.Engine / :class:~sqlalchemy.engine.Connection.
Use bound parameters only — never interpolate untrusted input into sql.
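The bound-parameter rule is worth a concrete illustration. The sketch below uses the standard library's sqlite3 driver as a stand-in for a SQLAlchemy bind, so that it runs without extra dependencies; fetch_columns is a hypothetical helper and is not fetch_sql_raw.

```python
import sqlite3


def fetch_columns(conn: sqlite3.Connection, sql: str, parameters: dict) -> dict[str, list]:
    """Run a parameterized query and pivot the rows into a column dict."""
    cursor = conn.execute(sql, parameters)  # values are bound by the driver, not formatted in
    names = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "bob")])

query = "SELECT id, name FROM users WHERE name = :name"
hostile = "ada' OR '1'='1"
safe = fetch_columns(conn, query, {"name": hostile})  # treated as a literal string
match = fetch_columns(conn, query, {"name": "ada"})
print(safe, match)
```

The hostile value matches no rows because it is compared as data; had it been interpolated into the SQL text, the OR clause would have matched every row.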
Source code in python/pydantable/io/sql.py
iter_sql
Execute sql and yield results in batches as dict[column_name, list].
Source code in python/pydantable/io/sql.py
iter_sql_raw
Execute sql and yield results in batches as dict[column_name, list].
Streaming alternative to :func:fetch_sql_raw for large result sets.
Notes:
- sql should be a SELECT (or other statement returning rows).
- Use bound parameters only — never interpolate untrusted input into sql.
- bind may be a SQLAlchemy URL string, Engine, or Connection.
Source code in python/pydantable/io/sql.py
write_sql
Insert data (column dict) into table_name.
Source code in python/pydantable/io/sql.py
write_sql_raw
Insert data (column dict) into table_name.
- append: table must already exist; rows are appended.
- replace: drops the table if it exists, recreates it with inferred column types, then inserts.
- table_name / schema must be trusted identifiers (not user-controlled).
bind is any SQLAlchemy-supported URL or Engine (same driver rules as fetch_sql_raw).
if_exists="replace" uses generic DDL; exotic dialects may need app-specific migrations instead.
Source code in python/pydantable/io/sql.py
fetch_sqlmodel
fetch_sqlmodel(model, bind, *, where=None, parameters=None, columns=None, order_by=None, limit=None, batch_size=None, auto_stream=True, auto_stream_threshold_rows=None)
Load rows for model into dict[column_name, list] (or :class:StreamingColumns).
Semantics match :func:pydantable.io.fetch_sql for batch_size,
auto_stream, and auto_stream_threshold_rows.
Source code in python/pydantable/io/sqlmodel_read.py
iter_sqlmodel
iter_sqlmodel(model, bind, *, where=None, parameters=None, columns=None, order_by=None, limit=None, batch_size=None)
Stream rows for model as dict[column_name, list] batches.
model must be a :class:sqlmodel.SQLModel subclass with table=True.
Source code in python/pydantable/io/sqlmodel_read.py
sqlmodel_columns
Return ordered SQLAlchemy column keys for model (table=True SQLModel).
Matches the default key set returned by :func:~pydantable.io.fetch_sqlmodel
and expected by :func:~pydantable.io.write_sqlmodel for a full-row payload.
Source code in python/pydantable/io/sqlmodel_schema.py
write_sqlmodel
write_sqlmodel(data, model, bind, *, schema=None, if_exists='append', chunk_size=None, validate_rows=False, replace_ok=False)
Insert data into the table defined by model (table=True SQLModel).
- append: table must already exist.
- replace: drops the table if present, recreates from model.__table__, inserts. Requires replace_ok=True (destructive).
validate_rows=True runs model.model_validate per row; failures include the
row index in the error.
Source code in python/pydantable/io/sqlmodel_write.py
write_csv_batches
Write an iterator of rectangular column dict batches to a single CSV file.
path must be a file path (or open text stream), not a directory. mode="w"
truncates or creates the file; mode="a" appends rows. With mode="a", a header
row is written only when the file is new and write_header is true (first batch).
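The append-mode header rule can be sketched with the standard csv module. append_csv_batches is a hypothetical helper written for this illustration; treating "new" as "the path does not exist yet" is an assumption.

```python
import csv
import os
import tempfile
from collections.abc import Iterable


def append_csv_batches(
    path: str, batches: Iterable[dict[str, list]], write_header: bool = True
) -> None:
    """Append column-dict batches to path; emit a header only for a new file."""
    is_new = not os.path.exists(path)  # must be checked before open() creates the file
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for index, batch in enumerate(batches):
            if index == 0 and is_new and write_header:
                writer.writerow(batch.keys())
            writer.writerows(zip(*batch.values()))  # pivot columns into rows


path = os.path.join(tempfile.mkdtemp(), "out.csv")
append_csv_batches(path, [{"id": [1, 2], "name": ["a", "b"]}])
append_csv_batches(path, [{"id": [3], "name": ["c"]}])  # file exists: no second header
with open(path, newline="") as handle:
    lines = handle.read().splitlines()
print(lines)
```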
Source code in python/pydantable/io/write_batches.py
write_ipc_batches
Write batches to a single Arrow IPC file or stream (not a dataset directory).
path must be a file path (or open binary stream). Requires pyarrow (install
pydantable[arrow]).
Source code in python/pydantable/io/write_batches.py
write_ndjson_batches
Write an iterator of rectangular column dict batches to a single NDJSON file.
path must be a file path (or open text stream), not a directory. mode="w"
truncates or creates the file; mode="a" appends one JSON object per line.
Source code in python/pydantable/io/write_batches.py
write_parquet_batches
Write batches to a single Parquet file (one row group per non-empty batch).
path must be a file path (or open binary stream), not a directory or hive
dataset root. For partitioned Parquet on disk, use :meth:~pydantable.dataframe.DataFrame.write_parquet
with partition_by. Requires pyarrow (install pydantable[arrow]).
Source code in python/pydantable/io/write_batches.py
read_parquet
Lazy Parquet read (local path); returns ScanFileRoot. Use DataFrame[Schema].read_parquet.
Extra keyword arguments are forwarded as Polars scan options (e.g. low_memory, n_rows,
parallel, glob, hive_partitioning, hive_start_idx, try_parse_hive_dates,
include_file_paths, row_index_name, row_index_offset). Unknown keys raise
ValueError from the Rust layer. Per-scan details: IO_PARQUET on the doc site; kwargs matrix: DATA_IO_SOURCES (Audit: Polars 0.53.x vs pydantable).
Source code in python/pydantable/io/__init__.py
read_csv
Lazy CSV read (local path); returns ScanFileRoot. Use DataFrame[Schema].read_csv.
Extra keyword arguments are forwarded as Polars LazyCsvReader options (e.g. has_header,
separator, skip_rows, skip_lines, n_rows, infer_schema_length,
ignore_errors, low_memory, rechunk, glob, cache, quote_char, eol_char,
include_file_paths, row_index_name, row_index_offset, raise_if_empty,
truncate_ragged_lines, decimal_comma, try_parse_dates). Unknown keys raise
ValueError from the Rust layer. Per-scan details: IO_CSV on the doc site; kwargs matrix:
DATA_IO_SOURCES (Audit: Polars 0.53.x vs pydantable).
Source code in python/pydantable/io/__init__.py
read_ndjson
Lazy newline-delimited JSON read (local path); returns ScanFileRoot.
Extra keyword arguments are forwarded as Polars LazyJsonLineReader options (e.g.
low_memory, rechunk, ignore_errors, n_rows, infer_schema_length,
glob, include_file_paths, row_index_name, row_index_offset). glob=False
raises ValueError (Polars 0.53 NDJSON scans always expand paths). Unknown keys raise
ValueError from the Rust layer. Per-scan details: IO_NDJSON on the doc site; kwargs
matrix: DATA_IO_SOURCES (Audit: Polars 0.53.x vs pydantable).
Source code in python/pydantable/io/__init__.py
read_ipc
Lazy Arrow IPC file read (local path); returns ScanFileRoot.
Extra keyword arguments are forwarded to the Rust layer: IpcScanOptions
(record_batch_statistics) and UnifiedScanArgs (glob, cache,
rechunk, n_rows, hive_partitioning, hive_start_idx,
try_parse_hive_dates, include_file_paths, row_index_name,
row_index_offset). Unknown keys raise ValueError. Per-scan details: IO_IPC on
the doc site; kwargs matrix: DATA_IO_SOURCES (Audit: Polars 0.53.x vs pydantable).
Source code in python/pydantable/io/__init__.py
read_json
Lazy JSON Lines read (local path); alias of :func:read_ndjson (same ScanFileRoot).
Not a lazy reader for a single-file JSON array [{...}, ...] — use
:func:materialize_json or :func:iter_json_array for array layout.
Paths: directory, glob, or a single file behave like :func:read_ndjson (Polars
LazyJsonLineReader). Pass glob=True when using a directory or *.jsonl-style
pattern so kwargs match other read_* APIs. scan_kwargs are the same as NDJSON
(e.g. low_memory, rechunk, ignore_errors, n_rows, infer_schema_length,
glob, include_file_paths, row_index_name, row_index_offset); glob=False
raises ValueError. Unknown keys raise from the Rust layer. See IO_JSON and
DATA_IO_SOURCES (Audit).
Source code in python/pydantable/io/__init__.py
read_parquet_url
Download Parquet from url (HTTP(S)) to a temp file; return ScanFileRoot.
The file is not removed automatically: delete it when the pipeline finishes (see the DATA_IO_SOURCES guide (project docs)).
Source code in python/pydantable/io/__init__.py
read_parquet_url_ctx
Download Parquet from url to a temp file, yield DataFrame[Schema], delete the file after.
Pass the parametrized frame class (e.g. DataFrame[MySchema]). The lazy plan must not
be used after the context exits (the backing file is removed).
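The lifetime rule (the backing file disappears when the context exits) follows from the usual temp-file context manager pattern, sketched here with the standard library. temp_download is a hypothetical name, and writing a bytes payload stands in for the HTTP download.

```python
import os
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def temp_download(payload: bytes, suffix: str = ".parquet") -> Iterator[str]:
    """Write payload to a temp file, yield its path, and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)  # stands in for the HTTP download
        yield path
    finally:
        os.remove(path)  # runs even if the body of the with-block raises


with temp_download(b"not really parquet") as tmp_path:
    existed_inside = os.path.exists(tmp_path)
exists_after = os.path.exists(tmp_path)
print(existed_inside, exists_after)
```

Anything lazy that still points at tmp_path after the with-block would fail on first use, which is why results should be collected or written out inside the context.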
Source code in python/pydantable/io/__init__.py
aread_parquet_url_ctx
async
aread_parquet_url_ctx(dataframe_cls, url, *, experimental=True, columns=None, executor=None, **kwargs)
Async variant of :func:read_parquet_url_ctx (uses :func:aread_parquet_url).
Source code in python/pydantable/io/__init__.py
materialize_parquet
Eagerly read Parquet into dict[str, list] (loads full data into Python).
Single file: one local path, buffer, or file-like per call. For multiple Parquet
files, use :func:read_parquet with glob=True / a directory and materialize via
:meth:~pydantable.dataframe.DataFrame.to_dict, or call materialize_parquet per file
and merge (mind schema alignment).
- engine="auto" (default): Rust for local file paths when columns is None; otherwise PyArrow.
- engine="rust" / "pyarrow": force that implementation.
For out-of-core pipelines prefer :func:read_parquet + :meth:~pydantable.dataframe.DataFrame.write_parquet.
Source code in python/pydantable/io/__init__.py
materialize_ipc
Read Arrow IPC (file or stream) into dict[str, list].
Single file per call. For multiple IPC files, prefer lazy :func:read_ipc + glob /
to_dict, or iterate :func:materialize_ipc per path.
Source code in python/pydantable/io/__init__.py
materialize_csv
Read CSV from a local path into dict[str, list].
Single file per call. For multiple CSVs, use :func:read_csv with glob=True /
to_dict, or call materialize_csv per file and merge.
- engine="auto": try Rust, then fall back to stdlib csv on failure.
- use_rap=True: load via :func:aread_csv_rap (only when no running event loop).
Source code in python/pydantable/io/__init__.py
materialize_ndjson
Read newline-delimited JSON from a single local path into dict[str, list].
For multiple NDJSON files, use :func:read_ndjson with glob=True / to_dict, or
call materialize_ndjson per file and merge.
Source code in python/pydantable/io/__init__.py
materialize_json
Load a JSON file into dict[str, list]: either a JSON array of objects or JSON Lines.
Single file per call. For multiple JSON files, use lazy :func:read_json /
read_ndjson with glob=True and to_dict, or call materialize_json per path.
If the first non-whitespace character is [, the file is parsed as one JSON array.
Otherwise the file is read as newline-delimited JSON (same as :func:materialize_ndjson).
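The layout detection described above amounts to a one-character sniff. The following sketch works on an in-memory string and returns row dicts for brevity; load_json_rows is a hypothetical helper, not materialize_json.

```python
import json


def load_json_rows(text: str) -> list[dict]:
    """Parse text as one JSON array if it starts with '[', else as JSON Lines."""
    if text.lstrip().startswith("["):
        return json.loads(text)
    # Otherwise: one JSON object per non-blank line.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


from_array = load_json_rows('  [{"a": 1}, {"a": 2}]')
from_lines = load_json_rows('{"a": 1}\n{"a": 2}\n')
print(from_array == from_lines)
```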
Source code in python/pydantable/io/__init__.py
export_json
Write dict[str, list] as one JSON array of row objects.
Uses :func:json.dump with default=str. Nested dict/list cells
serialize as JSON objects/arrays; non-JSON-native scalars (e.g. datetime,
Decimal, UUID) become str(value), not necessarily ISO-8601. For
stable JSON output, prefer normalizing rows or using Pydantic
model_dump(mode="json") after materialization.
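The effect of default=str is easy to check directly with the standard json module. The row below is made-up sample data.

```python
import json
from datetime import datetime
from decimal import Decimal
from uuid import UUID

row = {
    "when": datetime(2024, 1, 2, 3, 4, 5),
    "price": Decimal("1.50"),
    "key": UUID("12345678-1234-5678-1234-567812345678"),
    "tags": ["a", {"nested": True}],
}

# default=str is only called for values json cannot encode natively.
encoded = json.dumps([row], default=str)
decoded = json.loads(encoded)[0]

print(decoded["when"])   # 2024-01-02 03:04:05  (space, not the ISO "T" separator)
print(decoded["price"])  # 1.50  (a string after the round trip, not a number)
print(decoded["tags"])   # ['a', {'nested': True}]  (nested values stay structured)
```

The round trip is lossy for the first three fields: they come back as plain strings, so a reader has to know which columns to parse again.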
Source code in python/pydantable/io/__init__.py
export_parquet
Write dict[str, list] to Parquet (eager). For lazy plan output use :meth:DataFrame.write_parquet.
Source code in python/pydantable/io/__init__.py
export_csv
Write dict[str, list] to CSV (eager).
Source code in python/pydantable/io/__init__.py
export_ndjson
Write dict[str, list] as newline-delimited JSON (eager).
Source code in python/pydantable/io/__init__.py
export_ipc
Write dict[str, list] to Arrow IPC file (eager).
Source code in python/pydantable/io/__init__.py
afetch_sql_raw
async
afetch_sql_raw(sql, bind, *, parameters=None, batch_size=None, auto_stream=True, auto_stream_threshold_rows=None, executor=None)
Async :func:fetch_sql_raw via :func:asyncio.to_thread (optional Executor).
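The "to_thread, or an optional executor" dispatch can be sketched as follows. run_blocking and slow_query are hypothetical names used only for illustration; slow_query stands in for a synchronous fetch.

```python
import asyncio
import time
from concurrent.futures import Executor, ThreadPoolExecutor
from functools import partial
from typing import Any, Callable, Optional


async def run_blocking(
    func: Callable[..., Any],
    *args: Any,
    executor: Optional[Executor] = None,
    **kwargs: Any,
) -> Any:
    """Await a blocking call without stalling the event loop."""
    if executor is None:
        return await asyncio.to_thread(func, *args, **kwargs)
    loop = asyncio.get_running_loop()
    # run_in_executor takes positional args only, so bind kwargs with partial.
    return await loop.run_in_executor(executor, partial(func, *args, **kwargs))


def slow_query(sql: str, *, delay: float = 0.01) -> dict[str, list]:
    """Hypothetical blocking fetch standing in for a synchronous SQL call."""
    time.sleep(delay)
    return {"sql": [sql]}


async def main() -> tuple:
    default = await run_blocking(slow_query, "SELECT 1")
    with ThreadPoolExecutor(max_workers=1) as pool:
        pooled = await run_blocking(slow_query, "SELECT 2", executor=pool)
    return default, pooled


default_result, pooled_result = asyncio.run(main())
print(default_result, pooled_result)
```

Passing a dedicated executor is mainly useful for bounding how many blocking database calls run at once, instead of sharing the loop's default thread pool.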
Source code in python/pydantable/io/__init__.py
aiter_sql_raw
async
Async generator yielding batches from :func:iter_sql_raw without blocking the loop.
This runs the synchronous SQLAlchemy cursor in a background thread and streams
batch dicts through an asyncio.Queue.
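A minimal sketch of the thread-plus-queue pattern described above, using only the standard library. aiter_from_thread and blocking_batches are hypothetical names, and this simplified version does not handle a consumer that stops early (the daemon worker would stay blocked on the queue).

```python
import asyncio
import threading
from collections.abc import AsyncIterator, Callable, Iterator

Batch = dict[str, list]
_DONE = object()  # sentinel marking the end of the stream


async def aiter_from_thread(
    make_iter: Callable[[], Iterator[Batch]], maxsize: int = 4
) -> AsyncIterator[Batch]:
    """Run a blocking batch iterator in a thread and re-yield its batches asynchronously."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)

    def worker() -> None:
        try:
            for batch in make_iter():
                # Block this thread (not the loop) until the queue has room.
                asyncio.run_coroutine_threadsafe(queue.put(batch), loop).result()
            item = _DONE
        except Exception as exc:  # forward errors to the consumer
            item = exc
        asyncio.run_coroutine_threadsafe(queue.put(item), loop).result()

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def blocking_batches() -> Iterator[Batch]:
    """Hypothetical synchronous source standing in for a SQL cursor."""
    for start in range(0, 6, 2):
        yield {"n": [start, start + 1]}


async def main() -> list[Batch]:
    return [batch async for batch in aiter_from_thread(blocking_batches)]


thread_batches = asyncio.run(main())
print(thread_batches)
```

The bounded queue gives backpressure: the worker thread pauses when the consumer falls behind, so at most maxsize batches are buffered in memory.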
Source code in python/pydantable/io/__init__.py
aiter_sql
async
Source code in python/pydantable/io/__init__.py
aiter_sqlmodel
async
aiter_sqlmodel(model, bind, *, where=None, parameters=None, columns=None, order_by=None, limit=None, batch_size=65536, executor=None)
Async generator yielding batches from :func:iter_sqlmodel without blocking the loop.
Source code in python/pydantable/io/__init__.py
aiter_parquet
async
Async batches from :func:iter_parquet.
Source code in python/pydantable/io/__init__.py
aiter_ipc
async
Async batches from :func:iter_ipc.
Source code in python/pydantable/io/__init__.py
aiter_csv
async
Async batches from :func:iter_csv.
Source code in python/pydantable/io/__init__.py
aiter_ndjson
async
Async batches from :func:iter_ndjson.
Source code in python/pydantable/io/__init__.py
aiter_json_lines
async
Async batches from :func:iter_json_lines.
Source code in python/pydantable/io/__init__.py
aiter_json_array
async
Async batches from :func:iter_json_array.
Source code in python/pydantable/io/__init__.py
afetch_mongo
async
afetch_mongo(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, fields=None, session=None, max_time_ms=None, executor=None)
Async :func:fetch_mongo.
Uses PyMongo's async driver for :class:~pymongo.asynchronous.collection.AsyncCollection;
otherwise :func:asyncio.to_thread (optional Executor) for sync collections.
Source code in python/pydantable/io/__init__.py
aiter_mongo
async
aiter_mongo(collection, *, match=None, projection=None, sort=None, skip=None, limit=None, batch_size=1000, fields=None, session=None, max_time_ms=None, executor=None)
Async batches from :func:iter_mongo (native async or thread-backed).
Source code in python/pydantable/io/__init__.py
awrite_mongo
async
Async :func:write_mongo (native async or thread-backed).
Source code in python/pydantable/io/__init__.py
awrite_sql_raw
async
awrite_sql_raw(data, table_name, bind, *, schema=None, if_exists='append', chunk_size=None, executor=None)
Async :func:write_sql_raw via :func:asyncio.to_thread (optional Executor).
Source code in python/pydantable/io/__init__.py
write_sql_batches
Write an iterator of batch column dicts to SQL.
Deprecated: prefer calling :func:write_sql_raw per batch.
Each batch is a dict[str, list] (e.g. from :func:iter_sql_raw or a
:class:~pydantable.DataFrameModel batch via .to_dict()).
Source code in python/pydantable/io/__init__.py
write_sqlmodel_batches
write_sqlmodel_batches(batches, model, bind, *, schema=None, if_exists='append', chunk_size=None, validate_rows=False, replace_ok=False)
Write an iterator of batch column dicts via :func:write_sqlmodel.
The first batch uses if_exists; later batches always append.
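The first-batch rule can be shown with a recording callback in place of a real writer. write_batches and the lambda below are hypothetical; the point is only which mode each batch receives.

```python
from collections.abc import Callable, Iterable


def write_batches(
    batches: Iterable[dict[str, list]],
    write_one: Callable[[dict[str, list], str], None],
    if_exists: str = "append",
) -> int:
    """Call write_one per batch: the first with if_exists, the rest with 'append'."""
    count = 0
    for index, batch in enumerate(batches):
        mode = if_exists if index == 0 else "append"
        write_one(batch, mode)
        count += 1
    return count


calls: list[tuple[int, str]] = []
written = write_batches(
    [{"id": [1, 2]}, {"id": [3]}, {"id": [4]}],
    lambda batch, mode: calls.append((len(batch["id"]), mode)),
    if_exists="replace",
)
print(calls)
```

Applying "replace" to every batch would drop the rows written by earlier batches, which is why only the first batch may recreate the table.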