query
- NestedFrame.query(expr: str, *, inplace: bool = False, **kwargs) → NestedFrame | None
Query the columns of a NestedFrame with a boolean expression. Queries can target nested columns in addition to the usual base columns.
- Parameters:
expr (str) –
The query string to evaluate.
Access nested columns using nested_df.nested_col (where nested_df refers to a particular nested dataframe and nested_col is a column of that nested dataframe).
You can refer to variables in the environment by prefixing them with an '@' character, like @a + b.
You can refer to column names that are not valid Python variable names by surrounding them in backticks. Thus, column names containing spaces or punctuation (besides underscores) or starting with digits must be surrounded by backticks. (For example, a column named "Area (cm^2)" would be referenced as `Area (cm^2)`.) Column names that are Python keywords (like "list", "for", "import", etc.) cannot be used.
For example, if one of your columns is called a a and you want to sum it with b, your query should be `a a` + b.
inplace (bool) – Whether to modify the NestedFrame in place rather than creating a new one.
**kwargs – See the documentation for pandas.DataFrame.query() for complete details on the keyword arguments accepted by query().
- Returns:
NestedFrame resulting from the provided query expression.
- Return type:
NestedFrame | None
Examples
>>> from nested_pandas.datasets.generation import generate_data
>>> nf = generate_data(5,5, seed=1)
>>> nf = nf.query("nested.t > 10")
>>> nf
          a         b                                             nested
0  0.417022  0.184677  [{t: 13.40935, flux: 98.886109, flux_error: 1....
1  0.720324  0.372520  [{t: 13.70439, flux: 68.650093, flux_error: 1....
2  0.000114  0.691121  [{t: 11.173797, flux: 28.044399, flux_error: 1...
3  0.302333  0.793535  [{t: 17.562349, flux: 1.828828, flux_error: 1....
4  0.146756  1.077633  [{t: 17.527783, flux: 13.002857, flux_error: 1...
Most of the Series and NestedSeries attributes and methods are available through the query interface. For example, to query based on the length of the nested frames, you can do:
>>> nf = nf.query("nested.len() > 2")
>>> nf
          a         b                                             nested
0  0.417022  0.184677  [{t: 13.40935, flux: 98.886109, flux_error: 1....
3  0.302333  0.793535  [{t: 17.562349, flux: 1.828828, flux_error: 1....
4  0.146756  1.077633  [{t: 17.527783, flux: 13.002857, flux_error: 1...
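The '@' and backtick rules described for expr can be sketched with a plain pandas DataFrame, since keyword arguments are handed on to pandas.DataFrame.query(). This is only an illustration: the frame, its values, and the variable name threshold are made up, and it assumes a NestedFrame treats its base columns the same way.

```python
import pandas as pd

# Hypothetical data; "a a" and "Area (cm^2)" are not valid Python identifiers.
df = pd.DataFrame(
    {
        "a a": [1, 2, 3],
        "b": [10, 20, 30],
        "Area (cm^2)": [0.5, 1.5, 2.5],
    }
)

threshold = 21  # environment variable, referenced in the query as @threshold

# Backticks quote the column name containing a space; '@' pulls in the variable.
result_sum = df.query("`a a` + b > @threshold")

# Backticks also cover punctuation such as parentheses and '^'.
result_area = df.query("`Area (cm^2)` > 1.0")

print(result_sum)
print(result_area)
```

Both queries keep the rows at index 1 and 2 of this made-up frame.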
See also
Notes
Queries that target a particular nested structure return a NestedFrame in which the rows of that nested structure are filtered. For example, querying the NestedFrame "nf" with nested structure "mynested" as below will return all rows of nf, but with mynested filtered by the condition: nf.query("mynested.a > 2")