NestedFrame

Constructor

NestedFrame(*args, **kwargs)

A pandas DataFrame extension with support for nested structure.

Helpful Properties

NestedFrame.nested_columns

Returns the names of the base columns that hold nested dataframes.

NestedFrame.base_columns

Returns the list of base (non-nested) column names.

NestedFrame.all_columns

Returns a dictionary of columns for each base/nested dataframe.

Nesting

NestedFrame.join_nested(obj, name, *[, how, ...])

Packs the input object into a nested column and adds it to the NestedFrame.

NestedFrame.nest_lists(columns, name)

Creates a new NestedFrame where the specified list-value columns are packed into a nested column.

NestedFrame.from_flat(df, base_columns[, ...])

Creates a NestedFrame with base and nested columns from a flat dataframe.

NestedFrame.from_lists(df[, base_columns, ...])

Creates a NestedFrame with base and nested columns from a dataframe with list-valued columns.

Extended pandas.DataFrame Interface

Note

NestedFrame extends the pandas.DataFrame interface, so all pandas.DataFrame methods are available. The methods listed below are either newly added or extend their pandas counterparts to support nested data. See the pandas documentation for details on the inherited interface: https://pandas.pydata.org/docs/reference/frame.html

NestedFrame.get_subcolumns([nested_columns])

Returns a set of all subcolumn names from a set of nested columns, written in dot notation.

NestedFrame.eval(expr, *[, inplace])

Evaluate a string describing operations on NestedFrame columns.

NestedFrame.query(expr, *[, inplace])

Query the columns of a NestedFrame with a boolean expression.

NestedFrame.dropna(*[, axis, how, ...])

Remove missing values for one layer of the NestedFrame.

NestedFrame.sort_values(by, *[, axis, ...])

Sort by the values along either axis.

NestedFrame.map_rows(func[, columns, ...])

Takes a function and applies it to each top-level row of the NestedFrame.

NestedFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

NestedFrame.min([exclude_nest, numeric_only])

Return the minimum value of each column as a series, including nested columns with prefix to indicate the source column.

NestedFrame.max([exclude_nest, numeric_only])

Return the maximum value of each column as a series, including nested columns with prefix to indicate the source column.

NestedFrame.describe([exclude_nest, ...])

Generate descriptive statistics, including nested columns with prefix to indicate the source.

NestedFrame.explode(column[, ignore_index])

Transform each element of a list-like base column to a row, replicating index values.

NestedFrame.fillna([value, axis, inplace, limit])

Fill NA/NaN values using the specified method for base and nested columns.

I/O

NestedFrame.to_parquet(path[, large_list])

Creates parquet file(s) with the data of a NestedFrame, either as a single parquet file where each nested dataset is packed into its own column or as an individual parquet file for each layer.

NestedFrame.to_pandas([list_struct, large_list])

Convert to an ordinary pandas DataFrame, with no NestedDtype series.

read_parquet(data[, columns, ...])

Load a parquet object from a file path into a NestedFrame.

from_pyarrow(table[, reject_nesting, ...])

Load a pyarrow Table object into a NestedFrame.