LLM ¶
Wrapper around the OpenAI Python SDK for batch prompting.

The helper provides:

- `generate` for plain or vision-enabled prompts with optional pydantic validation.
- `_call_llm_fallback`, used by the scraper when HTML and PDF heuristics fail.
- Built-in back-off and host-wide throttling via a semaphore (sketched below).
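The throttling pattern works roughly like this (an illustrative sketch, not the library's code; the semaphore and helper names are assumptions):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # mirrors the max_workers parameter
_sem = threading.Semaphore(MAX_WORKERS)

def _call_one(prompt: str) -> str:
    with _sem:  # at most MAX_WORKERS requests are in flight at once
        return f"response to {prompt!r}"  # placeholder for the real SDK call

def generate(prompts: list[str]) -> list[str]:
    # Even if several batches run concurrently, the shared semaphore
    # keeps the host-wide request count capped.
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(_call_one, prompts))
```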
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `api_key` | `str` | OpenAI (or proxy) API key. | `None` |
| `model` | `str` | Chat model name. | `'gpt-4.1'` |
| `base_url` | `str or None` | Override the OpenAI endpoint. | `None` |
| `max_workers` | `int` | Maximum parallel workers for batch calls and for the global semaphore. | `8` |
| `temperature` | `float` | Sampling temperature used for all requests. | `0.0` |
| `seed` | `int or None` | Deterministic seed value passed to the API. | `None` |
| `validation_attempts` | `int` | Number of times to retry parsing LLM output into a pydantic model. | `2` |
| `timeout` | `float \| Timeout \| None` | Override the HTTP timeout in seconds. | `120` |
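The `validation_attempts` behaviour can be pictured as a parse-and-retry loop. A minimal sketch, assuming pydantic v2 (`parse_with_retries` and `ask` are hypothetical names, not the class's internals):

```python
from pydantic import BaseModel, ValidationError

def parse_with_retries(ask, model_cls: type[BaseModel], attempts: int = 2):
    last_err = None
    for _ in range(attempts):
        raw = ask()  # one LLM call returning a JSON string
        try:
            return model_cls.model_validate_json(raw)
        except ValidationError as err:
            last_err = err  # malformed output: try again
    raise last_err
```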
Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `_sem` | `Semaphore` | Global semaphore that limits concurrent requests to `max_workers`. |
| `client` | `Client` | Low-level SDK client configured with key and base URL. |
Examples:

```python
llm_client = LLM(api_key="sk-...", model="gpt-4o-mini", temperature=0.2,
                 timeout=600)
```
estimate_tokens ¶

Return token counts for text using `tiktoken`.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `texts` | `list[str] \| str` | Input strings to tokenise. | *required* |
| `model` | `str` | Model name for selecting the encoding. | `None` |
Returns:

| Type | Description |
| --- | --- |
| `list[int]` | Token counts in the same order as `texts`. |
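A minimal sketch of such a counter, using tiktoken's public API (the `gpt-4` default and the `cl100k_base` fallback are assumptions, not documented behaviour):

```python
import tiktoken

def estimate_tokens(texts, model=None):
    if isinstance(texts, str):  # accept a single string too
        texts = [texts]
    try:
        enc = tiktoken.encoding_for_model(model or "gpt-4")
    except KeyError:
        # Unknown model name: fall back to a general-purpose encoding.
        enc = tiktoken.get_encoding("cl100k_base")
    return [len(enc.encode(t)) for t in texts]
```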
generate ¶

```python
generate(
    prompts,
    images_list=None,
    response_format=None,
    max_workers=None,
    tqdm_extra_kwargs=None,
)
```
Run many prompts either sequentially or in parallel.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `prompts` | `list[str]` | List of user prompts; one prompt per model call. | *required* |
| `images_list` | `list[list[bytes]] or None` | For vision models: a parallel list where each inner list holds **base64-encoded** JPEG pages for that prompt. Use `None` to send no images. | `None` |
| `response_format` | `type[pydantic.BaseModel] or None` | If provided, each response is parsed into that model via the *beta/parse* endpoint; otherwise a raw string is returned. | `None` |
| `max_workers` | `int or None` | Thread count just for this batch; `None` uses the instance-wide `max_workers` value. | `None` |
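Building `images_list` typically means base64-encoding raw JPEG page bytes. A hedged sketch (`encode_pages` is a hypothetical helper, not part of this class):

```python
import base64

def encode_pages(pages: list[bytes]) -> list[bytes]:
    # One inner list per prompt: each page becomes a base64-encoded payload.
    return [base64.b64encode(page) for page in pages]

# Two prompts, each with one (dummy) JPEG page.
images_list = [encode_pages([b"\xff\xd8\xff\xe0"]) for _ in range(2)]
```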
Returns:

| Type | Description |
| --- | --- |
| `list[Union[pydantic.BaseModel, str]]` | Results in the same order as `prompts`. |
Raises:

| Type | Description |
| --- | --- |
| `openai.RateLimitError` | Raised only if the exponential back-off exhausts all retries. |
| `openai.APIConnectionError` | Raised if network issues persist beyond the retry window. |
| `openai.APITimeoutError` | Raised if the API repeatedly times out. |
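The retry behaviour implied by these exceptions can be sketched as exponential back-off with jitter (an assumed shape, not the actual implementation):

```python
import random
import time

import openai

def _with_backoff(call, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return call()
        except (openai.RateLimitError,
                openai.APIConnectionError,
                openai.APITimeoutError):
            if attempt == max_retries - 1:
                raise  # retry budget spent: propagate to the caller
            time.sleep(base_delay * 2 ** attempt + random.random())
```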
Examples:

```python
msgs = ["Summarise:\n" + txt for txt in docs]
summaries = llm.generate(msgs)
```