Tutorial: Detention under the Mental Health Act

This page walks you through an example workflow using PFD Toolkit. Here, we will load a dataset and screen for cases related to "detention under the Mental Health Act" (often referred to as 'being sectioned').

We will also discover themes to understand more about the issues coroners keep raising.

This is just an example. PFD reports contain a breadth of information across a whole range of topics and domains. Through this workflow, we hope to give you a sense of how the toolkit can be used, and how it might support your own project.

Load your first dataset

First, you'll need to load a PFD dataset. As this is just an example workflow, we'll only load reports between January 2024 and May 2025.

from pfd_toolkit import load_reports

# Load all PFD reports from January 2024 to May 2025
reports = load_reports(
    start_date="2024-01-01",
    end_date="2025-05-01")

reports.head(n=5)

url	date	coroner	area	receiver	investigation	circumstances	concerns
[...]	2025-05-01	A. Hodson	Birmingham and...	NHS England; The Rob...	On 9th December 2024...	At 10.45am on 23rd November...	To The Robert Jones...
[...]	2025-04-30	J. Andrews	West Sussex, Br...	West Sussex C...	On 2 November 2024 I...	They drove their car into...	The inquest was told t...
[...]	2025-04-30	A. Mutch	Manchester Sou...	Fluxton Road Medical...	On 1 October 2024 I...	They were prescribed long...	The inquest heard evide...
[...]	2025-04-25	J. Heath	North Yorkshire...	Townhead Surgery	On 4th June 2024 I...	On 15 March 2024, Richar...	When a referral docume...
[...]	2025-04-25	M. Hassell	Inner North Lo...	The President Royal...	On 23 August 2024, on...	They were a big baby and...	With the benefit of a m...
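Because the loaded reports are an ordinary pandas DataFrame, the usual pandas checks apply before you move on. The sketch below is only an illustration: it uses a toy frame built from the dates and coroner names in the preview above rather than a live download, and the variable names are our own.

```python
import pandas as pd

# Toy stand-in for the loaded reports: values copied from the preview above.
# In a real session you would run the same checks on `reports` directly.
sample = pd.DataFrame(
    {
        "date": ["2025-05-01", "2025-04-30", "2025-04-30", "2025-04-25", "2025-04-25"],
        "coroner": ["A. Hodson", "J. Andrews", "A. Mutch", "J. Heath", "M. Hassell"],
    }
)

# Parse dates so comparisons are chronological rather than alphabetical
sample["date"] = pd.to_datetime(sample["date"])

earliest = sample["date"].min()
latest = sample["date"].max()

# Number of reports published on each date, in date order
reports_per_day = sample["date"].value_counts().sort_index()

print(f"{len(sample)} reports, from {earliest.date()} to {latest.date()}")
print(reports_per_day)
```

A quick check like this confirms that the date range you asked for is the one you actually received.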

Set up an LLM client

Before exploring the other features of PFD Toolkit, we first need to set up an LLM client. This is the AI engine that powers those features.

You'll need to head to platform.openai.com and create an API key. Once you have it, pass it to the LLM class.

from pfd_toolkit import LLM

# Set up LLM client
llm_client = LLM(api_key="YOUR-API-KEY")  # Replace with your actual API key

Note

For a more detailed guide on using LLMs in this toolkit, see Setting up an LLM client.
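If you would rather not paste the key into your script, one common pattern is to read it from an environment variable. This is a minimal sketch, not part of PFD Toolkit: the helper read_api_key is hypothetical, and the variable name OPENAI_API_KEY is an assumption you can change.

```python
import os


def read_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key held in an environment variable (hypothetical helper).

    Raises a RuntimeError with a clear message if the variable is unset or blank.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before creating the client."
        )
    return key


# Usage (assumes the variable has been set in your shell beforehand):
# llm_client = LLM(api_key=read_api_key())
```

Keeping the key out of the source file makes it harder to commit it to version control by accident.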

Screen for relevant reports

You're likely using PFD Toolkit because you want to answer a specific question. In our example, we're asking: "Do any PFD reports raise concerns related to detention under the Mental Health Act?"

PFD Toolkit lets you query reports in plain English — no need to know precise keywords or categories. Just describe the cases you care about, and the toolkit will return matching reports.

from pfd_toolkit import Screener

# Create a user query to screen/filter reports by
user_query = "Concerns about detention under the Mental Health Act **only**"

# Set up & run our Screener
screener = Screener(
    llm=llm_client,   # LLM client you set up above
    reports=reports,  # Reports that you loaded earlier
)

filtered_reports = screener.screen_reports(
    user_query=user_query)

# Optionally, count number of identified reports
len(filtered_reports)

>> 51

filtered_reports is a filtered version of our original PFD DataFrame, containing the 51 reports that the LLM judged to match our query.

Note

For more information on screening reports, see Screening relevant reports.

Discover recurring themes

Now that we've loaded our reports and screened them for relevance to detention under the Mental Health Act, our next step is to discover recurring themes: in other words, the concerns that coroners keep raising.

Set up the Extractor

Before we get the model to generate a list of themes for us, we first need to set up our Extractor. This class dictates how the model interacts with your filtered list of PFD reports.

Each include_* flag controls whether a specific section of the report is sent to the LLM for analysis.

For example, if we were only interested in patterns related to coroners' concerns, we would set the include_concerns flag to True and the others to False:

from pfd_toolkit import Extractor

extractor = Extractor(
    llm=llm_client,             # The same client you created earlier
    reports=filtered_reports,   # Your screened reports

    include_date=False,
    include_coroner=False,
    include_area=False,
    include_receiver=False,
    include_investigation=False,
    include_circumstances=False,
    include_concerns=True       # <--- Only supply the 'concerns' text
)

Note

The main reason we're hiding all report sections other than the coroners' concerns is to keep the LLM's input short and focused. LLMs often perform better when they are given only relevant information.

Your own research question might be different. For example, you might be interested in discovering recurring themes related to 'cause of death', in which case you'll likely want to set include_investigation and include_circumstances to True.

To understand more about what information is contained within each of the report sections, please see About the data.
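To make the idea of section flags concrete, here is a rough conceptual sketch. It is not the toolkit's implementation: build_llm_input is a hypothetical helper, and the report text is placeholder content. It simply shows how switching flags on or off changes what text would reach a model.

```python
def build_llm_input(report: dict, include: dict) -> str:
    """Join only the report sections whose flag is True (hypothetical helper)."""
    parts = []
    for section, wanted in include.items():
        # Skip sections that are switched off or missing from the report
        if wanted and report.get(section):
            parts.append(f"{section.upper()}: {report[section]}")
    return "\n".join(parts)


# Placeholder report: the section names mirror the dataset's columns,
# but the text values are invented for illustration.
report = {
    "coroner": "S. Reeves",
    "area": "South London",
    "circumstances": "Placeholder circumstances text",
    "concerns": "Placeholder concerns text",
}

# Mirror the tutorial's choice: only the concerns section is switched on
include = {"coroner": False, "area": False, "circumstances": False, "concerns": True}

text = build_llm_input(report, include)
print(text)
```

With only the concerns flag on, the coroner, area and circumstances never appear in the text, which is the effect the flags above are meant to achieve.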

Summarise reports

Some PFD reports can be long. Because of this, we need to summarise reports before we discover themes:

# Create short summaries of the concerns
extractor.summarise(trim_intensity="medium")

Get a list of themes

Now that we've done this, we can run the discover_themes method and assign the result to a new class, which we've named ThemeInstructions:

# Ask the LLM to propose recurring themes
ThemeInstructions = extractor.discover_themes(
    max_themes=6,  # Limit the list to keep things manageable
)

Note

discover_themes() will warn you if the word count of your summaries is still too high. In these cases, you might want to set your trim_intensity to high or very high (though please note that the more we trim, the more detail we lose).

To print our list of themes, run:

print(extractor.identified_themes)

...which gives us:

{
  "bed_shortage": "Insufficient availability of inpatient mental health beds or suitable placements, leading to delays, inappropriate care environments, or patients being placed far from home.",

  "staff_training": "Inadequate staff training, knowledge, or awareness regarding policies, risk assessment, clinical procedures, or the Mental Health Act.",

  "record_keeping": "Poor, inconsistent, or falsified documentation and record keeping, including failures in care planning, observation records, and communication of key information.",

  "policy_gap": "Absence, inconsistency, or lack of clarity in policies, protocols, or guidance, resulting in confusion or unsafe practices.",

  "communication_failures": "Breakdowns in communication or information sharing between staff, agencies, families, or across systems, impacting patient safety and care continuity.",

  "risk_assessment": "Failures or omissions in risk assessment, escalation, or monitoring, including inadequate recognition of suicide risk, self-harm, or other patient safety concerns."
}

Tag the reports with our themes

Above, we've only identified a list of themes: we haven't yet assigned these themes to each of our reports.

Here, we take the ThemeInstructions object that we created earlier and pass it back into the extractor, which assigns themes to reports via extract_features():

labelled_reports = extractor.extract_features(
    feature_model=ThemeInstructions,
    force_assign=True,  # (Force the model to make a decision)
    allow_multiple=True  # (A single report might touch on several themes)
)

labelled_reports.head()

The resulting DataFrame now contains our existing columns along with one new column per theme, each filled with either True or False depending on whether the theme was present in that report.

url	id	date	coroner	area	receiver	investigation	circumstances	concerns	bed_shortage	staff_training	record_keeping	policy_gap	communication_failures	risk_assessment
[…]	2025-0172	2025-04-07	S. Reeves	South London	South London and Maudsley NHS …	On 21 March 2023 an inquest …	Christopher McDonald was …	The evidence heard …	False	True	False	False	False	True
[…]	2025-0144	2025-03-17	S. Horstead	Essex	Chief Executive Officer of Essex …	On 31 October 2023 I …	On the 23rd September 2023 …	(a) Failures in care …	False	False	True	False	True	True
[…]	2025-0104	2025-03-13	A. Harris	South London	Oxleas NHS Foundation Trust; …	On 15 January 2020 an …	Mr Paul Dunne had a …	Individual mental health …	False	True	True	True	True	True
[…]	2025-0124	2025-03-06	D. Henry	Coventry	Chair of the Coventry and …	On 13 August 2021 I …	Mr Gebrsselasié on the 2nd …	The inquest explored issues …	False	False	False	True	False	True
[…]	2025-0119	2025-03-04	L. Hunt	Birmingham and Solihull	Birmingham and Solihull Mental …	On 20 July 2023 I …	Mr Lynch resided in room 1 …	To Birmingham and Solihull …	False	True	True	True	True	True
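Because the theme columns are plain booleans, you can combine them with standard pandas filtering. The sketch below rebuilds two of the theme columns for the five rows previewed above (the frame name labelled_sample is our own) and selects the reports tagged with both staff_training and risk_assessment.

```python
import pandas as pd

# Toy frame: ids and theme flags copied from the five preview rows above.
# In a real session you would filter `labelled_reports` the same way.
labelled_sample = pd.DataFrame(
    {
        "id": ["2025-0172", "2025-0144", "2025-0104", "2025-0124", "2025-0119"],
        "staff_training": [True, False, True, False, True],
        "risk_assessment": [True, True, True, True, True],
    }
)

# Keep only the reports where both themes are present
both = labelled_sample[
    labelled_sample["staff_training"] & labelled_sample["risk_assessment"]
]

print(both["id"].tolist())
# → ['2025-0172', '2025-0104', '2025-0119']
```

Swapping & for | would instead return reports that carry either theme.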

Tabulate reports

Finally, we can count how often a theme appears in our collection of reports:

extractor.tabulate()

Category	Count	Percentage
bed_shortage	14	27.5
staff_training	22	43.1
record_keeping	13	25.5
policy_gap	35	68.6
communication_failures	19	37.3
risk_assessment	34	66.7
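The Percentage column is each theme's count as a share of the 51 screened reports. The sketch below reproduces that arithmetic with plain pandas. It is an illustration rather than the toolkit's own code: the per-theme totals come from the table above, but the row-level arrangement of True and False values is synthetic.

```python
import pandas as pd

n_reports = 51  # Number of screened reports in this tutorial

# Per-theme totals, taken from the table above
theme_counts = {
    "bed_shortage": 14,
    "staff_training": 22,
    "record_keeping": 13,
    "policy_gap": 35,
    "communication_failures": 19,
    "risk_assessment": 34,
}

# Synthetic boolean columns: the right number of True values per theme,
# but NOT the real report-by-report assignment.
flags = pd.DataFrame(
    {theme: [True] * n + [False] * (n_reports - n) for theme, n in theme_counts.items()}
)

summary = pd.DataFrame(
    {
        "Count": flags.sum(),                        # True counts as 1
        "Percentage": (flags.mean() * 100).round(1)  # Share of all reports
    }
)
summary.index.name = "Category"

print(summary)
```

Because a single report can carry several themes, the percentages add up to more than 100.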

That's it! You've gone from a mass of PFD reports, to a focused set of cases relating to detention under the Mental Health Act, to a theme-tagged dataset ready for deeper exploration.

From here, you might want to export your curated dataset to a .csv for any final qualitative/manual analysis:

# The filename below is only an example; choose your own path
labelled_reports.to_csv("labelled_reports.csv", index=False)

Alternatively, you might want to check out the other analytical features that PFD Toolkit offers.