
LLM

Wrapper around the OpenAI Python SDK for batch prompting.

The helper provides:

  • generate for plain or vision-enabled prompts with optional Pydantic validation.
  • _call_llm_fallback, used by the scraper when HTML and PDF heuristics fail.
  • Built-in exponential back-off and host-wide throttling via a semaphore.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| api_key | str | OpenAI (or proxy) API key. If None, the key is read from the environment. | None |
| model | str | Chat model name. | 'gpt-4.1' |
| base_url | str or None | Override the OpenAI endpoint. | None |
| max_workers | int | Maximum parallel workers for batch calls and for the global semaphore. | 8 |
| temperature | float | Sampling temperature used for all requests. | 0.0 |
| seed | int or None | Deterministic seed value passed to the API. | None |
| validation_attempts | int | Number of times to retry parsing LLM output into a Pydantic model. | 2 |
| timeout | float, Timeout, or None | HTTP timeout in seconds; None falls back to the OpenAI client default of 600 seconds. | 120 |

Attributes:

| Name | Type | Description |
|------|------|-------------|
| _sem | Semaphore | Global semaphore that limits concurrent requests to max_workers. |
| client | Client | Low-level SDK client configured with the API key and base URL. |

Examples:

llm_client = LLM(
    api_key="sk-...",
    model="gpt-4o-mini",
    temperature=0.2,
    timeout=600,
)
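
A minimal batch call with the client above; a sketch assuming, per the generate docs below, that calls without response_format return plain strings in prompt order:

    # One prompt per model call; results come back in the same order.
    answers = llm_client.generate([
        "What is the capital of France?",
        "List three prime numbers.",
    ])
    print(answers[0])  # a plain string, since no response_format was given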

estimate_tokens

estimate_tokens(texts, model=None)

Return token counts for the given texts using tiktoken.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| texts | list[str] or str | Input strings to tokenise. | required |
| model | str | Model name for selecting the encoding; None uses self.model. | None |

Returns:

| Type | Description |
|------|-------------|
| list[int] | Token counts in the same order as texts. |
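
A quick sketch; the counts in the comment are illustrative, since the exact numbers depend on the encoding tiktoken selects for the model:

    counts = llm_client.estimate_tokens(
        ["Hello world", "A longer sentence to count."]
    )
    # counts is a list[int] aligned with the inputs, e.g. [2, 6]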

generate

generate(
    prompts,
    images_list=None,
    response_format=None,
    max_workers=None,
    tqdm_extra_kwargs=None,
)

Run many prompts either sequentially or in parallel.

Parameters:

prompts : list[str]
    List of user prompts. One prompt per model call.

images_list : list[list[bytes]] or None, optional
    For vision models: a parallel list where each inner list holds
    base64-encoded JPEG pages for that prompt. Use None to send no
    images.

response_format : type[pydantic.BaseModel] or None, optional
    If provided, each response is parsed into that model via the
    beta/parse endpoint; otherwise a raw string is returned.

max_workers : int or None, optional
    Thread count just for this batch. None uses the instance-wide
    max_workers value. Defaults to None.

tqdm_extra_kwargs : dict or None, optional
    Extra keyword arguments forwarded to the tqdm progress bar.
    Defaults to None.
Returns:

list[pydantic.BaseModel or str]
    Results in the same order as prompts.

Raises:

openai.RateLimitError
    Raised only if the exponential back-off exhausts all retries.
openai.APIConnectionError
    Raised if network issues persist beyond the retry window.
openai.APITimeoutError
    Raised if the API repeatedly times out.
Examples:

    msgs = ["Summarise:\n\n" + txt for txt in docs]
    summaries = llm.generate(msgs)