from_pyarrow
- from_pyarrow(table: Table, reject_nesting: list[str] | str | None = None, autocast_list: bool = False, use_pandas_metadata: bool = True) -> NestedFrame
Load a pyarrow Table object into a NestedFrame.
- Parameters:
table (pa.Table) – PyArrow Table to load the NestedFrame from.
reject_nesting (list or str, default=None) – Column(s) to reject from being cast to a nested dtype. By default, nested-pandas assumes that any struct column with all fields being lists is castable to a nested column. However, this assumption is invalid if the lists within the struct have mismatched lengths for any given item. Columns specified here will be read using the corresponding pandas.ArrowDtype.
autocast_list (bool, default=False) – If True, automatically cast list columns to nested columns with NestedDtype.
use_pandas_metadata (bool, default=True) – If True, apply the pandas metadata stored in the table’s schema when constructing the NestedFrame (e.g. restoring the index and column dtypes). This matches the default behavior of pd.read_parquet. Set to False to ignore the metadata.
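The mismatched-length condition described for reject_nesting can be checked before loading. The sketch below is illustrative only and is not how nested-pandas performs the check: has_mismatched_lengths is a hypothetical helper, and it assumes the struct column has already been converted to Python dicts (for example with to_pylist() on the pyarrow column).

```python
def has_mismatched_lengths(struct_rows):
    """Return True if any row's list fields differ in length.

    struct_rows: iterable of dicts mapping field name -> list (or None),
    one dict per row of a struct-of-lists column. Hypothetical helper,
    not part of nested-pandas.
    """
    for row in struct_rows:
        if row is None:
            continue  # a null struct has nothing to compare
        lengths = {len(values) for values in row.values() if values is not None}
        if len(lengths) > 1:
            return True
    return False


# Every row has equally long "flux" and "time" lists.
aligned = [
    {"flux": [0.5], "time": [1]},
    {"flux": [1.2, 0.8], "time": [2, 3]},
]
# The single row has two fluxes but only one time.
ragged = [
    {"flux": [0.5, 0.7], "time": [1]},
]

print(has_mismatched_lengths(aligned))  # False
print(has_mismatched_lengths(ragged))   # True
```

A column for which this returns True is a candidate for reject_nesting, since its lists cannot be aligned row by row.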
- Return type:
NestedFrame
Examples
>>> import nested_pandas as npd
>>> import pyarrow as pa
>>> table = pa.table({
...     "obj_id": [1, 2, 3],
...     "nested": pa.array([
...         [{"flux": 0.5, "time": 1}],
...         [{"flux": 1.2, "time": 2}, {"flux": 0.8, "time": 3}],
...         [{"flux": 2.0, "time": 4}],
...     ])
... })
>>> npd.from_pyarrow(table)
   obj_id                              nested
0       1              [{flux: 0.5, time: 1}]
1       2  [{flux: 1.2, time: 2}; …] (2 rows)
2       3              [{flux: 2.0, time: 4}]