Stateful Processing Concepts ============================ This page introduces core ideas behind processing ordered data in chunks. - **Ordered streams**: data indexed by a monotonically increasing key (e.g., timestamps). - **Chunked iteration**: process batches that fit in memory; keep ordering across chunks. - **Warm-up**: some operations require a look-back window; handle via state or ``n_prev``. - **State**: persist minimal mutable state across chunks (e.g., counters, last values). - **Resuming**: on subsequent runs, restore state to continue seamlessly via the loop persistence file. Building blocks --------------- - **StatefulLoop**: orchestrates iteration, exposes ``is_last_iteration`` and ``iteration_count``, manages state bindings, and provides memory-aware ``buffer``. - **Stateful ops**: vectorized operations designed for chunked usage, such as ``SegmentedAggregator`` (planned, refactoring of ``AggStream``) and ``AsofMerger``. - **Store**: ordered parquet datasets with merging and intersections across datasets. Warm-up strategies ------------------ - Use ``StatefulLoop`` state to carry needed history between different runs. - Use ``Store.iter_intersections(..., n_prev=...)`` to prepend previous rows in the first chunk. Cross-links ----------- - Quickstart: :doc:`quickstart` - StatefulLoop guide: :doc:`stateful_loop` - AsofMerger guide: :doc:`asof_merger` - Store architecture: :doc:`store`