`Screener`¶

Classifies Prevention of Future Death (PFD) reports against a user-defined topic using an LLM.

This class takes a DataFrame of reports, a user query, and various configuration options to classify whether each report matches the query. It can either filter the DataFrame to return only matching reports or add a classification column to the original DataFrame.

Parameters:

Name	Type	Description	Default
`llm`	`LLM`	An instance of the LLM class from `pfd_toolkit`.	`None`
`reports`	`DataFrame`	A DataFrame containing Prevention of Future Death reports.	`None`
`verbose`	`bool`	If True, print more detailed logs. Defaults to False.	`False`
`include_date`	`bool`	Flag to determine if the 'date' column is included. Defaults to False.	`False`
`include_coroner`	`bool`	Flag to determine if the 'coroner' column is included. Defaults to False.	`False`
`include_area`	`bool`	Flag to determine if the 'area' column is included. Defaults to False.	`False`
`include_receiver`	`bool`	Flag to determine if the 'receiver' column is included. Defaults to False.	`False`
`include_investigation`	`bool`	Flag to determine if the 'investigation' column is included. Defaults to True.	`True`
`include_circumstances`	`bool`	Flag to determine if the 'circumstances' column is included. Defaults to True.	`True`
`include_concerns`	`bool`	Flag to determine if the 'concerns' column is included. Defaults to True.	`True`
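The table above does not spell out where a flagged column is "included". A minimal sketch follows, assuming each `include_*` flag controls whether that report field is added to the text handed to the LLM. The helper names `selected_columns` and `build_report_text` are hypothetical and are not part of `pfd_toolkit`; only the flag names and their defaults come from the table.

```python
# Defaults copied from the parameter table above.
DEFAULT_FLAGS = {
    "include_date": False,
    "include_coroner": False,
    "include_area": False,
    "include_receiver": False,
    "include_investigation": True,
    "include_circumstances": True,
    "include_concerns": True,
}


def selected_columns(**overrides):
    """Return the column names whose include_* flag is True (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULT_FLAGS)
    if unknown:
        raise TypeError(f"Unknown flag(s): {sorted(unknown)}")
    flags = {**DEFAULT_FLAGS, **overrides}
    # Strip the "include_" prefix to get the column name.
    return [name[len("include_"):] for name, enabled in flags.items() if enabled]


def build_report_text(report, columns):
    """Join the selected fields of one report into a single text block.

    Assumption: fields that are missing or None are skipped.
    """
    return "\n".join(
        f"{col}: {report[col]}" for col in columns if report.get(col) is not None
    )


# With the defaults, only the three narrative sections are used.
print(selected_columns())
# Swap the investigation section for the report date.
print(selected_columns(include_date=True, include_investigation=False))
```

Under this reading, enabling more flags gives the model more context per report at the cost of a longer prompt, which would explain why only the three narrative sections are on by default.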

Examples:

from pfd_toolkit import LLM, Screener

# reports_df: a DataFrame of PFD reports loaded beforehand
user_topic = "medication errors"
llm_client = LLM()
screener = Screener(llm=llm_client, reports=reports_df)
screened_reports = screener.screen_reports(user_query=user_topic)
print(f"Found {len(screened_reports)} report(s) on '{user_topic}'.")

screen_reports ¶

screen_reports(
    reports=None,
    user_query=None,
    filter_df=True,
    result_col_name="matches_query",
    produce_spans=False,
    drop_spans=False,
)

Classifies reports in the DataFrame against the user-defined topic using the LLM.

Parameters:

Name	Type	Description	Default
`reports`	`DataFrame`	If provided, this DataFrame will be used for screening, replacing any DataFrame stored in the instance for this call.	`None`
`user_query`	`str`	If provided, this query will be used, overriding any query stored in the instance for this call. The prompt template will be rebuilt.	`None`
`filter_df`	`bool`	If `True` the returned DataFrame is filtered to only matching reports. Defaults to `True`.	`True`
`result_col_name`	`str`	Name of the boolean column added when `filter_df` is `False`. Defaults to `"matches_query"`.	`'matches_query'`
`produce_spans`	`bool`	When `True` a `spans_matches_topic` column is created containing the text snippet that justified the classification. Defaults to `False`.	`False`
`drop_spans`	`bool`	When `True` and `produce_spans` is also `True`, the `spans_matches_topic` column is removed from the returned DataFrame. Defaults to `False`.	`False`

Returns:

Type	Description
`DataFrame`	Either a filtered DataFrame (if `filter_df` is `True`), or the original DataFrame with an added classification column.
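The two return shapes described above can be illustrated without calling an LLM. The sketch below uses a hypothetical stand-in, `mock_screen`, in which a plain keyword match on the `concerns` column replaces the model's classification. It is not how `Screener` works internally, and the sample rows are invented for illustration. It only mirrors the documented behaviour of `filter_df` and `result_col_name`.

```python
import pandas as pd


def mock_screen(reports, user_query, filter_df=True, result_col_name="matches_query"):
    """Stand-in for screen_reports: a keyword match replaces the LLM call."""
    result = reports.copy()
    # Hypothetical classifier: case-insensitive substring match on 'concerns'.
    matches = result["concerns"].str.contains(
        user_query, case=False, regex=False, na=False
    )
    if filter_df:
        # Filtered mode: return only the matching rows, with no extra column.
        return result[matches]
    # Classification mode: keep every row and add a boolean column.
    result[result_col_name] = matches
    return result


# Invented sample rows, for illustration only.
reports_df = pd.DataFrame(
    {
        "concerns": [
            "Medication was prescribed at ten times the intended dose.",
            "A decayed tree beside the footpath had not been inspected.",
        ]
    }
)

filtered = mock_screen(reports_df, "medication")
classified = mock_screen(reports_df, "medication", filter_df=False)

print(len(filtered))                # number of matching rows
print(classified.columns.tolist())  # original columns plus the boolean column
```

Keeping all rows with `filter_df=False` is useful when the non-matching reports are needed later, for example to estimate how common a topic is across the whole collection.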

Examples:

import pandas as pd
from pfd_toolkit import LLM, Screener

reports_df = pd.DataFrame(data)  # `data`: report records loaded beforehand
screener = Screener(LLM(), reports=reports_df)

# Screen reports with a first query, keeping only the matches
filtered_df = screener.screen_reports(user_query="medication safety")

# Screen the same reports with a new query and add a classification column
classified_df = screener.screen_reports(user_query="tree safety", filter_df=False)
