sdialog

This package provides utilities for generating synthetic dialogues using instruction-tuned large language models (LLMs). Dialogues are generated primarily via role-playing, where each agent is defined by a Persona object. The package supports dialogue orchestration, context management, and flexible serialization for downstream tasks.

Main components:

  • Dialog, Turn, Event: Data structures for representing dialogues and their events.

  • Context, Persona and Agent: For defining and simulating role-played agents in a given context.

  • Orchestrators: For controlling agent behavior during dialogue generation.

  • Evaluation: Utilities and metrics for assessing dialogue quality (coherence, turn balance, persona/goal adherence, safety screening, lexical/statistical reports) and for building reproducible evaluation pipelines.

  • Interpretability: Layer/token-level activation capture, inspection (Inspector, hooks), steering (directional modulation of representations), and instruction extraction utilities (see interpretability.py).

class sdialog.Turn(*, speaker: str | None = None, text: str)

Bases: BaseModel

Represents a single turn in a dialogue.

Parameters:
  • speaker (Optional[str]) – The name or role of the speaker.

  • text (str) – The utterance text for this turn.

speaker: str | None
text: str
prompt() str

Generates a prompt string for this turn.

print()
class sdialog.Event(*, agent: str | None = None, action: str, actionLabel: str | None = None, content: str | Dict | List, timestamp: int)

Bases: BaseModel

Represents an event in a dialogue, which may be an utterance, instruction, or other action.

Parameters:
  • agent (Optional[str]) – The agent responsible for the event (e.g., “user”, “system”).

  • action (str) – The type of event (e.g., “utter”, “instruct”).

  • actionLabel (Optional[str]) – A label describing the action (e.g., type of instruction).

  • content (Union[str, Dict, List]) – The content of the event.

  • timestamp (int) – The Unix timestamp of the event.

agent: str | None
action: str
actionLabel: str | None
content: str | Dict | List
timestamp: int
class sdialog.Dialog(*, version: str | None = <factory>, timestamp: str | None = <factory>, model: str | Dict | None = None, seed: int | None = None, id: int | str | None = <factory>, parentId: int | str | None = None, complete: bool | None = None, personas: dict[str, ~typing.Any] | None=None, context: str | dict[str, ~typing.Any] | None=None, scenario: dict | str | None = None, turns: List[Turn] | None = <factory>, events: List[Event] | None = None, notes: Any | None = None)

Bases: BaseModel

A pydantic model representing a conversational dialogue with rich metadata, container-like access to turns, text utilities, analytics, and I/O helpers.

Parameters:
  • version (Optional[str]) – Version of the dialogue format.

  • timestamp (Optional[str]) – Timestamp of dialogue creation.

  • model (Optional[Union[str, Dict]]) – The model used to generate the dialogue.

  • seed (Optional[int]) – The random seed used for generation.

  • id (Optional[Union[int, str]]) – Unique ID for the dialogue.

  • parentId (Optional[Union[int, str]]) – ID of the parent dialogue, if any.

  • complete (Optional[bool]) – Whether the dialogue is complete.

  • personas (Optional[dict[str, Any]]) – Any is a subclass of MetaPersona.

  • context (Optional[Union[str, dict[str, Any]]]) – Shared context for the dialogue.

  • scenario (Optional[Union[dict, str]]) – Scenario description or metadata.

  • turns (Optional[List[Turn]]) – List of dialogue turns.

  • events (Optional[List[Event]]) – List of dialogue events (optional).

  • notes (Optional[str]) – Free-text notes or comments about the dialogue.

version: str | None
timestamp: str | None
model: str | Dict | None
seed: int | None
id: int | str | None
parentId: int | str | None
complete: bool | None
personas: dict[str, Any] | None
context: str | dict[str, Any] | None
scenario: dict | str | None
turns: List[Turn] | None
events: List[Event] | None
notes: Any | None
lower(in_place: bool = True) Dialog

Apply str.lower() to every turn’s text.

Parameters:

in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

upper(in_place: bool = True) Dialog

Apply str.upper() to every turn’s text.

Parameters:

in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

title(in_place: bool = True) Dialog

Apply str.title() to every turn’s text.

Parameters:

in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

capitalize(in_place: bool = True) Dialog

Apply str.capitalize() to every turn’s text.

Parameters:

in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

strip(chars: str = None, in_place: bool = True) Dialog

Apply str.strip(chars) to every turn’s text.

Parameters:
  • chars (Optional[str]) – Characters to strip; if None, default whitespace is stripped.

  • in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

replace(old: str, new: str, count: int = -1, in_place: bool = True) Dialog

Apply str.replace(old, new, count) to every turn’s text. If count < 0 all occurrences are replaced.

Parameters:
  • old (str) – Substring to be replaced.

  • new (str) – Replacement substring.

  • count (int) – Maximum number of replacements per text; if < 0 replace all.

  • in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

re_sub(pattern: str | Pattern, repl: str | callable, count: int = 0, flags: int = 0, in_place: bool = True) Dialog

Apply re.sub(pattern, repl, text, count=count, flags=flags) to every turn’s text. If pattern is a compiled regex, flags are ignored.

Parameters:
  • pattern (Union[str, Pattern]) – Regex pattern (string or compiled).

  • repl (Union[str, Callable]) – Replacement string or callable.

  • count (int) – Max substitutions per text (0 means unlimited).

  • flags (int) – Regex flags (ignored if compiled pattern provided).

  • in_place (bool) – Mutate this Dialog if True; else return a cloned transformed Dialog.

Returns:

The modified (or cloned) Dialog.

Return type:

Dialog

length(mode: str = 'words', words_per_minute: int = 130) int

Returns the length of the dialogue according to the specified mode (number of words by default).

Parameters:
  • mode (str) –

    The mode for measuring length. Options:

    • "turns": Number of turns (default)

    • "words": Total number of words in all turns

    • "minutes" / "time": Approximate duration in minutes (words_per_minute/minute)

  • words_per_minute (int) – Words per minute for “minutes” mode (default is 130, which is a common estimate).

Returns:

The computed length according to the mode.

Return type:

int

Raises:

ValueError – If an unknown mode is specified.

clone(new_id: int = None) Dialog

Creates a deep copy of the dialogue.

Parameters:

new_id (int, optional) – Optional ID to assign to the cloned dialog. If None, a new universal ID is generated.

Returns:

A new Dialog object that is a deep copy of this one, with updated id and parentId.

Return type:

Dialog

description(turn_template: str = None)

Returns a human-readable string representation of the dialogue.

Parameters:

turn_template (str) – Template for formatting each turn (default “{speaker}: {text}”).

Returns:

The formatted dialogue.

Return type:

str

prompt() str

Generates a prompt string for the entire dialogue.

json(string: bool = False, indent: int = 2, ensure_ascii: bool = False)

Serializes the dialogue to JSON.

Parameters:
  • string (bool) – If True, returns a JSON string; otherwise, returns a dict.

  • indent (int) – Indentation level for pretty-printing.

Returns:

The serialized dialogue.

Return type:

Union[str, dict]

print(*a, **kw)

Pretty-prints a dialogue to the console, with optional scenario and orchestration details.

Parameters:
  • scenario (bool) – If True, prints scenario information.

  • orchestration (bool) – If True, prints also orchestration events.

  • think (bool) – If True, prints “thinking” events.

  • all (bool) – If True, prints all types of events.

to_file(path: str = None, type: str = 'auto', makedir: bool = True, overwrite: bool = True, ensure_ascii: bool = False)

Saves the dialogue to a file in JSON, CSV, or plain text format.

Parameters:
  • path (str) – Output file path, if not provided, uses the same path used to load the dialogue.

  • type (str) – “json”, “csv”, “txt”, or “auto” (determined by file extension).

  • makedir (bool) – If True, creates parent directories as needed.

  • overwrite (bool) – If False and the file exists, raise FileExistsError instead of overwriting.

  • ensure_ascii (bool) – If True and type is “json”, escape non-ASCII characters in the output.

to_audio(path: str = None, **kwargs: dict)

Convert the dialogue to an audio dialogue. This is a convenience wrapper around the full sdialog.audio.pipeline.to_audio function. All keyword arguments are passed to it.

Parameters:
  • path (str) – Directory path for storing audio outputs.

  • dialog_dir_name (str) – Custom name for the dialogue directory.

  • dscaper_data_path (Optional[str]) – Path to dSCAPER data directory.

  • room_name (Optional[str]) – Custom name for the room configuration.

  • perform_tts (Optional[bool]) – Convert the dialog into audio using the text-to-speech engine.

  • perform_room_acoustics (Optional[bool]) – Enable room acoustics simulation and dSCAPER timeline generation.

  • tts_engine (BaseTTS) – Text-to-speech engine for audio generation.

  • voice_database (BaseVoiceDatabase) – Voice database for speaker selection.

  • dscaper_datasets (List[str]) – List of Hugging Face datasets for dSCAPER.

  • room (Room) – Room configuration for acoustics simulation.

  • speaker_positions (dict[Role, dict]) – Speaker positioning configuration.

  • background_effect (str) – Background audio effect type.

  • foreground_effect (str) – Foreground audio effect type.

  • foreground_effect_position (RoomPosition) – Position for foreground effects.

  • kwargs_pyroom (dict) – PyRoomAcoustics configuration parameters.

  • source_volumes (dict[SourceType, SourceVolume]) – Volume levels for different audio sources.

  • audio_file_format (str) – Audio file format (wav, mp3, flac).

  • seed (int) – Seed for random number generator.

  • re_sampling_rate (Optional[int]) – Re-sampling rate for the output audio.

  • recording_devices (Optional[List[Union[RecordingDevice, str]]]) – The identifiers of the recording devices to simulate.

  • impulse_response_database (Optional[ImpulseResponseDatabase]) – The database for impulse responses.

  • override_tts_audio (Optional[bool]) – Override the TTS audio if it already exists.

  • verbose (Optional[bool]) – Verbose mode for logging.

Returns:

Audio dialogue with processed audio data.

Return type:

“sdialog.audio.dialog.AudioDialog”

Raises:

Exception – If the audio module is not installed.

static from_huggingface(repo_id: str, local_dir: str = None, collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') List[Dialog] | Dict[str, List[Dialog]]

Loads dialogues from a HuggingFace dataset.

This method downloads a dataset from HuggingFace Hub and loads dialogues from it. The dataset must follow the SDialog format with a ‘data’ folder containing dialogue files. If the data folder contains train/test/val split subdirectories, dialogues are loaded from all splits and returned as a dictionary mapping split names to dialogue lists. Otherwise, all dialogues from the data folder are returned as a single list.

Parameters:
  • repo_id (str) – HuggingFace repository ID (e.g., “sdialog/Primock-57”).

  • local_dir (str, optional) – Local directory to download to. If None, uses a temporary directory.

  • collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker.

  • collapse_separator (str) – Separator used when collapsing consecutive turns.

Returns:

List of dialogs or dict mapping splits to lists of dialogs.

Return type:

Union[List[Dialog], Dict[str, List[Dialog]]]

Raises:
  • ImportError – If huggingface_hub is not installed.

  • ValueError – If the dataset is not a valid sdialog dataset.

static from_folder(path: str, type: str = 'auto', txt_template: str = '{speaker}: {text}', csv_speaker_col: int | str = 'speaker', csv_text_col: int | str = 'text', collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') List[Dialog]

Loads all dialogues from a folder.

Parameters:
  • path (str) – Path to the directory containing dialogue files.

  • type (str) – "json", "txt", "csv", "tsv", or "auto" (determined by file extension).

  • txt_template (str) – Template for parsing text dialogue turns (default “{speaker}: {text}”).

  • csv_speaker_col (Union[int, str]) – Column identifier for speaker in CSV/TSV files (can be index or header name).

  • csv_text_col (Union[int, str]) – Column identifier for text in CSV/TSV files (can be index or header name).

  • collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker into one turn.

  • collapse_separator (str) – String used to join texts when collapsing consecutive turns (default: "\n").

Returns:

A list of loaded dialogue objects from the folder.

Return type:

List[Dialog]

Raises:

ValueError – If the path is not a directory.

static from_file(path: str, type: str = 'auto', txt_template: str = '{speaker}: {text}', csv_speaker_col: int | str = 'speaker', csv_text_col: int | str = 'text', collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') Dialog | List[Dialog]

Loads a dialogue from a file.

Parameters:
  • path (str) – Path to the dialogue file or directory. In case of a directory, all dialogues in the directory will be loaded and returned as a list of Dialog objects.

  • type (str) – "json", "txt", "csv", "tsv", or "auto" (determined by file extension).

  • txt_template (str) – Template for parsing text dialogue turns (default “{speaker}: {text}”).

  • csv_speaker_col (Union[int, str]) – Column identifier for speaker in CSV/TSV files (can be index or header name).

  • csv_text_col (Union[int, str]) – Column identifier for text in CSV/TSV files (can be index or header name).

  • collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker into one turn.

  • collapse_separator (str) – String used to join texts when collapsing consecutive turns (default: "\n").

Returns:

The loaded dialogue object.

Return type:

Dialog

Raises:

ValueError – If the file format is not recognized or if required columns are missing.

static from_str(dialog_text: str, template: str = '{speaker}: {text}', default_speakers: List[str] = None, id: str | int = None) Dialog

Creates a Dialog object from a string representation of a dialogue.

Parameters:
  • dialog_text (str) – The dialogue text, with each turn on a new line.

  • template (str) – The template for parsing each turn. Default is “{speaker}: {text}”.

  • default_speakers (List[str]) – Optional list of default speakers to use if no present in the text or template. The speakers will be assigned in order of appearance, in alternating turns. Default is None (speaker field will be empty in each turn).

  • id (Union[str, int]) – Optional ID for the dialogue. If not provided, a universal ID will be generated.

Returns:

The created Dialog object.

Return type:

Dialog

static from_dict(data: dict)

Creates a Dialog object from a dictionary.

Parameters:

data (dict) – The dictionary containing dialogue data.

Returns:

The created Dialog object.

Return type:

Dialog

from_json(json_str: str)

Creates a Dialog object from a JSON string.

Parameters:

json_str (str) – The JSON string containing dialogue data.

Returns:

The created Dialog object.

Return type:

Dialog

rename_speaker(old_name: str, new_name: str, case_sensitive: bool = False, in_events: bool = True) Dialog

Renames all occurrences of a speaker in the dialogue’s turns (and optionally events).

Parameters:
  • old_name (str) – The current speaker name to replace.

  • new_name (str) – The new speaker name.

  • case_sensitive (bool) – Whether to match speaker names case-sensitively (default: False).

  • in_events (bool) – Whether to also rename in events’ agent fields (default: True).

Returns:

Self (the same Dialog instance) after in-place modification.

Return type:

Dialog

get_speakers(keep_case: bool = True) List[str]

Returns a list of unique speaker names in the dialogue.

Parameters:

keep_case (bool) – Whether to keep the original case of speaker names or convert them to lowercase (default: True).

Returns:

A list of unique speaker names.

Return type:

List[str]

filter(speaker: str) Dialog

Filters the dialogue turns by speaker.

Parameters:

speaker (str) – The speaker name to filter by (case-insensitive).

Returns:

A new Dialog containing only that speaker’s turns; returns None if speaker not found.

Return type:

Optional[Dialog]

class sdialog.Context(*, location: str | None = None, datetime: str | None = None, environment: str | None = None, objects: str | List[str] | None = None, participants_shared_knowledge: str | None = None, circumstances: str | List[str] | None = None, goals: str | List[str] | None = None, constraints: str | List[str] | None = None, topics: str | List[str] | None = None, style_guidelines: str | List[str] | None = None, notes: str | None = None)

Bases: BaseAttributeModel

Dialogue-shared context class.

Parameters:
  • location (Optional[str]) – Physical or virtual location where the dialogue occurs.

  • datetime (Optional[str]) – Timestamp or temporal setting relevant to the dialogue.

  • environment (Optional[str]) – Physical environment description, environmental conditions, or contextual atmosphere.

  • objects (Optional[Union[str, List[str]]]) – Relevant objects (single value or list of values).

  • participants_shared_knowledge (Optional[str]) – Information all participants are assumed to know.

  • circumstances (Optional[Union[str, List[str]]]) – Situational circumstances impacting the dialogue.

  • goals (Optional[Union[str, List[str]]]) – Stated or implicit goals of the participants.

  • constraints (Optional[Union[str, List[str]]]) – Limitations or constraints affecting actions or dialogue.

  • topics (Optional[Union[str, List[str]]]) – Main topics or themes (single or list).

  • style_guidelines (Optional[Union[str, List[str]]]) – Stylistic or formatting guidelines to follow.

  • notes (Optional[str]) – Additional free-form contextual notes.

location: str | None
datetime: str | None
environment: str | None
objects: str | List[str] | None
participants_shared_knowledge: str | None
circumstances: str | List[str] | None
goals: str | List[str] | None
constraints: str | List[str] | None
topics: str | List[str] | None
style_guidelines: str | List[str] | None
notes: str | None
static attributes(_cls=<class 'sdialog.Context'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.Instruction(*, text: str = None, events: Event | List[Event] | None = None)

Bases: BaseModel

Represents an instruction to an agent, optionally with associated events.

Parameters:
  • text (str) – The instruction text.

  • events (Optional[Union[Event, List[Event]]]) – Associated event(s), either a single Event or a list of Events.

text: str
events: Event | List[Event] | None

sdialog.base

Model foundations for sdialog root module.

Provides:

  • Metadata: common provenance fields (version, timestamp, ids).

  • BaseAttributeModel: pydantic-based abstract base for persona/context-like objects with cloning, serialization, and dynamic subclass discovery utilities.

class sdialog.base.Metadata(*, version: str | None = <factory>, timestamp: str | None = <factory>, model: str | Dict | None = None, seed: int | None = None, id: int | str | None = <factory>, parentId: int | str | None = None, className: str = None, notes: str | None = None)

Bases: BaseModel

Metadata class for object, context and other objects.

Parameters:
  • version (Optional[str]) – Version of the object format (matches sdialog version).

  • timestamp (Optional[str]) – Timestamp of when the object was generated.

  • model (Optional[str]) – The model used to generate the object.

  • seed (Optional[int]) – The random seed used for object generation.

  • id (Optional[Union[int, str]]) – Unique identifier for the object.

  • parentId (Optional[Union[int, str]]) – ID of the parent object, if any.

  • notes (Optional[str]) – Free-text notes or comments about the generated object.

  • className (str) – The class name of the object (a subclass of BaseAttributeModel).

version: str | None
timestamp: str | None
model: str | Dict | None
seed: int | None
id: int | str | None
parentId: int | str | None
className: str
notes: str | None
class sdialog.base.BaseAttributeModel

Bases: BaseModel, ABC

Base class for defining an attribute-based object.

Features:

  • Strict field control.

  • Automatic static attributes() helper listing declared fields.

  • Metadata tracking (id, parentId, version, timestamp).

  • Clone with optional field overrides and proper lineage linkage.

  • JSON / prompt serialization helpers.

clone(new_id: int = None, **kwargs) BaseAttributeModel

Create a deep copy of this object with optional attribute overrides.

Metadata handling:

  • parentId of clone = original id (if present).

  • id of clone = new_id if provided else a new universal id.

  • Other metadata fields are copied.

Parameters:
  • new_id (Optional[int]) – Optional new unique id for the clone.

  • kwargs (Any) – Field overrides applied to the cloned instance.

Returns:

Independent cloned instance.

Return type:

BaseAttributeModel

description() str

Returns a string description of the object’s attributes.

Returns:

Description of the object.

Return type:

str

print()

Pretty-prints the object, including its metadata information.

json(string: bool = False, indent=2, output_metadata: bool = True)

Serializes the object to JSON.

Parameters:
  • string (bool) – If True, returns a JSON string; otherwise, returns a dict.

  • indent (int) – Indentation level for pretty-printing.

  • output_metadata (bool) – Include the metadata in the serialization.

Returns:

The serialized object.

Return type:

Union[str, dict]

prompt() str

Returns the textual representation of the object, used as part of the system prompt.

Returns:

JSON string without metadata (intended for prompt inclusion).

Return type:

str

to_file(path: str, makedir: bool = True)

Saves the object to a file in either JSON or plain text format.

Parameters:
  • path (str) – Output file path.

  • makedir (bool) – If True, creates parent directories as needed.

static from_file(path: str, object_class: BaseAttributeModel | None = None)

Load an object from a JSON file.

Parameters:
  • path (str) – Path to file.

  • object_class (Optional[BaseAttributeModel]) – Optional explicit subclass to force (bypasses className dispatch).

Returns:

Loaded instance.

Return type:

BaseAttributeModel

Raises:

ValueError – If metadata/className is missing or unknown.

static from_dict(data: dict, object_class: BaseAttributeModel | None = None)

Create an object instance from a dictionary.

Dispatch rules:

  • If object_class is provided and is a BaseAttributeModel subclass, it is used directly.

  • Else uses _metadata.className to resolve a registered subclass.

Parameters:
  • data (dict) – Source dictionary (must include _metadata.className).

  • object_class (Optional[BaseAttributeModel]) – Optional explicit subclass.

Returns:

Instantiated object.

Return type:

BaseAttributeModel

Raises:

ValueError – If className missing or cannot be resolved.

static from_json(json_str: str, object_class: BaseAttributeModel | None = None)

Create an object instance from a JSON string.

Parameters:
  • json_str (str) – JSON serialization including _metadata.className.

  • object_class (Optional[BaseAttributeModel]) – Optional explicit subclass override.

Returns:

Instantiated object.

Return type:

BaseAttributeModel


sdialog.personas

This module provides classes for defining personas (character profiles) and simulating agents that role-play these personas in synthetic dialogue generation.

sdialog.personas.BasePersona

Abstract base class for defining personas. Alias for sdialog.base.BaseAttributeModel

class sdialog.personas.Persona(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', role: str = '', background: str = '', personality: str = '', circumstances: str = '', rules: str = '')

Bases: BaseAttributeModel

Standard persona class with common attributes for role-play.

Parameters:
  • name (str) – Name of the persona.

  • age (Union[int, str]) – Age of the persona (can be an int or a descriptive string like “middle-aged”).

  • race (str) – Race / ethnicity of the persona.

  • gender (str) – Gender of the persona.

  • language (str) – Preferred language of communication.

  • role (str) – Role, profession, or primary identity descriptor.

  • background (str) – Background or life history summary.

  • personality (str) – Personality traits summary (free text).

  • circumstances (str) – Current situational context (e.g., “recently moved”, “under stress”).

  • rules (str) – Constraints, style or behavioral rules to enforce.

name: str
age: int | str
race: str
gender: str
language: str
role: str
background: str
personality: str
circumstances: str
rules: str
static attributes(_cls=<class 'sdialog.personas.Persona'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.ExtendedPersona(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '')

Bases: BaseAttributeModel

Extended persona class with additional demographic, personality, and background attributes.

Parameters:
  • name (str) – Name of the persona.

  • age (Union[int, str]) – Age (numeric or descriptive string).

  • race (str) – Race / ethnicity.

  • gender (str) – Gender identity.

  • language (str) – Preferred language.

  • weight (Union[str, int, float]) – Weight (numeric with unit or descriptive string).

  • height (Union[str, int, float]) – Height (numeric with unit or descriptive string).

  • voice_characteristics (str) – Voice, accent, tone, pacing, etc.

  • occupation (str) – Current occupation or professional role.

  • education (str) – Education level or academic background.

  • socioeconomic_status (str) – Socioeconomic status descriptor.

  • interests (str) – General interests (comma-separated or free text).

  • hobbies (str) – Hobbies (comma-separated or free text).

  • politeness (str) – Politeness style/level.

  • forgetfulness (str) – Forgetfulness tendency.

  • attentiveness (str) – Attentiveness or focus tendency.

  • communication_style (str) – Style of communication (e.g., direct, verbose).

  • empathy_level (str) – Empathy level or descriptor.

  • political_views (str) – Political alignment (e.g., conservative, moderate, apolitical).

  • religious_beliefs (str) – Religious stance (e.g., religious, agnostic, atheist).

name: str
age: int | str
race: str
gender: str
language: str
weight: str | int | float
height: str | int | float
voice_characteristics: str
occupation: str
education: str
socioeconomic_status: str
interests: str
hobbies: str
politeness: str
forgetfulness: str
attentiveness: str
communication_style: str
empathy_level: str
political_views: str
religious_beliefs: str
static attributes(_cls=<class 'sdialog.personas.ExtendedPersona'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Customer(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', customer_id: str | int = '', occupation: str = '', account_tenure: str = '', membership_level: str = '', loyalty_status: str = '', fidelity_score: str | float | int = '', issue: str = '', issue_category: str = '', issue_description: str = '', issue_history: str = '', desired_outcome: str = '', knowledge_domain: str = '', technical_expertise: str = '', sentiment: str = '', anger_level: str = '', tiredness: str = '', patience_level: str = '', politeness: str = '', personality: str = '', instruction_following: str = '', forgetfulness: str = '', times_called: int | str = '', preferred_channel: str = '', prior_interactions_summary: str = '', urgency: str = '', rules: str = '')

Bases: BaseAttributeModel

Persona for a customer in a customer service interaction.

Parameters:
  • name (str) – Customer name.

  • age (Union[int, str]) – Customer age (numeric or descriptive).

  • gender (str) – Customer gender.

  • language (str) – Preferred language.

  • customer_id (Union[str, int]) – Internal customer identifier.

  • occupation (str) – Customer occupation.

  • account_tenure (str) – How long they have been a customer (e.g., “2 years”).

  • membership_level (str) – Plan/tier (e.g., basic, premium).

  • loyalty_status (str) – Loyalty descriptor (e.g., loyal, at-risk).

  • fidelity_score (Union[str, float, int]) – Loyalty score (numeric or descriptive).

  • issue (str) – Short summary of current problem.

  • issue_category (str) – High-level category (billing, technical, etc.).

  • issue_description (str) – Detailed issue description.

  • issue_history (str) – Brief summary of related past issues.

  • desired_outcome (str) – Customer’s desired resolution / goal.

  • knowledge_domain (str) – Subject/domain familiarity (e.g., novice, expert).

  • technical_expertise (str) – Legacy field for backward compatibility.

  • sentiment (str) – Overall emotional tone (e.g., frustrated, neutral).

  • anger_level (str) – Anger intensity descriptor.

  • tiredness (str) – Fatigue level.

  • patience_level (str) – Patience descriptor.

  • politeness (str) – Politeness style (e.g., polite, curt).

  • personality (str) – Personality descriptor (e.g., analytical).

  • instruction_following (str) – Likelihood of following instructions.

  • forgetfulness (str) – Tendency to forget prior guidance.

  • times_called (Union[int, str]) – Number of prior contacts (numeric or descriptive).

  • preferred_channel (str) – Preferred support channel.

  • prior_interactions_summary (str) – Summary of earlier interactions.

  • urgency (str) – Perceived urgency (e.g., low, high).

  • rules (str) – Constraints or special handling notes.

name: str
age: int | str
gender: str
language: str
customer_id: str | int
occupation: str
account_tenure: str
membership_level: str
loyalty_status: str
fidelity_score: str | float | int
issue: str
issue_category: str
issue_description: str
issue_history: str
desired_outcome: str
knowledge_domain: str
technical_expertise: str
sentiment: str
anger_level: str
tiredness: str
patience_level: str
politeness: str
personality: str
instruction_following: str
forgetfulness: str
times_called: int | str
preferred_channel: str
prior_interactions_summary: str
urgency: str
rules: str
static attributes(_cls=<class 'sdialog.personas.Customer'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.SupportAgent(*, name: str = '', language: str = 'English', agent_id: str | int = '', role: str = 'Customer Support Agent', experience_years: int | str = '', product_scope: str = '', product_knowledge_level: str = '', communication_style: str = '', empathy_level: str = '', politeness: str = '', resolution_authority_level: str = '', escalation_policy: str = '', average_handle_time: int | float | str = '', adherence_notes: str = '', stress_tolerance: str = '', performance_notes: str = '', rules: str = '')

Bases: BaseAttributeModel

Persona for a customer service / support agent.

Parameters:
  • name (str) – Agent name.

  • language (str) – Working language.

  • agent_id (Union[str, int]) – Internal agent identifier.

  • role (str) – Agent role or queue designation.

  • experience_years (Union[int, str]) – Years (or range) of support experience.

  • product_scope (str) – Products or domains covered.

  • product_knowledge_level (str) – Knowledge depth (e.g., basic, expert).

  • communication_style (str) – Communication style (e.g., concise, empathetic).

  • empathy_level (str) – Empathy descriptor.

  • politeness (str) – Politeness level descriptor.

  • resolution_authority_level (str) – Authority level for resolutions/escalations.

  • escalation_policy (str) – Summary of escalation criteria/process.

  • average_handle_time (Union[int, float, str]) – Typical handling time (e.g., “6m”).

  • adherence_notes (str) – Notes on process or QA adherence.

  • stress_tolerance (str) – Stress handling capability descriptor.

  • performance_notes (str) – Performance KPIs or evaluation notes.

  • rules (str) – Internal rules, compliance reminders, or constraints.

name: str
language: str
agent_id: str | int
role: str
experience_years: int | str
product_scope: str
product_knowledge_level: str
communication_style: str
empathy_level: str
politeness: str
resolution_authority_level: str
escalation_policy: str
average_handle_time: int | float | str
adherence_notes: str
stress_tolerance: str
performance_notes: str
rules: str
static attributes(_cls=<class 'sdialog.personas.SupportAgent'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Patient(*, name: str = '', age: int | str = None, race: str = '', gender: str = '', language: str = 'English', forgetfulness: str | float = '', formality: str | float = '', hurriedness: str | float = '', openness: str | float = '', height: str | int | float = '', weight: str | int | float = '', occupation: str = '', marital_status: str = '', insurance: str = '', reason_for_visit: str = '', symptoms: str | List[str] = '', medical_history: str | List[str] = '', medical_conditions: str | List[str] = '', medications: str | List[str] = '', allergies: str | List[str] = '', family_history: str | List[str] = '')

Bases: BaseAttributeModel

Patient persona with essential / minimal plus behavioral and demographic attributes for dialogue generation.

Parameters:
  • name (str) – Patient name.

  • age (Union[int, str]) – Patient age (numeric or descriptive).

  • race (str) – Race / ethnicity.

  • gender (str) – Gender identity.

  • language (str) – Preferred communication language.

  • forgetfulness (Union[str, float]) – Forgetfulness tendency (qualitative or numeric).

  • formality (Union[str, float]) – Formality of speech (qualitative or numeric scale).

  • hurriedness (Union[str, float]) – Degree of impatience / hurriedness.

  • openness (Union[str, float]) – Openness to share information.

  • height (Union[str, int, float]) – Height (numeric with unit or descriptive).

  • weight (Union[str, int, float]) – Weight (numeric with unit or descriptive).

  • occupation (str) – Occupation or employment status.

  • marital_status (str) – Marital status.

  • insurance (str) – Insurance provider / status.

  • reason_for_visit (str) – Chief complaint / presenting problem.

  • symptoms (Union[str, List[str]]) – Reported symptoms.

  • medical_history (Union[str, List[str]]) – Past medical history (string or list of conditions).

  • medical_conditions (Union[str, List[str]]) – Known diagnosed conditions (string or list).

  • medications (Union[str, List[str]]) – Current medications (string or list).

  • allergies (Union[str, List[str]]) – Known allergies (string or list).

  • family_history (Union[str, List[str]]) – Family medical history (string or list).

name: str
age: int | str
race: str
gender: str
language: str
forgetfulness: str | float
formality: str | float
hurriedness: str | float
openness: str | float
height: str | int | float
weight: str | int | float
occupation: str
marital_status: str
insurance: str
reason_for_visit: str
symptoms: str | List[str]
medical_history: str | List[str]
medical_conditions: str | List[str]
medications: str | List[str]
allergies: str | List[str]
family_history: str | List[str]
static attributes(_cls=<class 'sdialog.personas.Patient'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.ExtendedPatient(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '', reason_for_visit: str = '', symptoms: str | List[str] = '', vital_signs: str = '', health_literacy: str = '', medical_conditions: str | List[str] = '', medications: str | List[str] = '', allergies: str | List[str] = '', family_history: str | List[str] = '')

Bases: ExtendedPersona

ExtendedPatient persona with additional health-related attributes. Inherits all attributes from ExtendedPersona plus medical context fields.

Parameters:
  • reason_for_visit (str) – Chief complaint or reason for consultation.

  • symptoms (Union[str, List[str]]) – Reported symptoms (free text or summarized list).

  • vital_signs (str) – Vital signs summary (e.g., “BP 120/80, HR 72”).

  • health_literacy (str) – Health literacy level descriptor.

  • medical_conditions (Union[str, List[str]]) – Known or chronic conditions (free text summary).

  • medications (Union[str, List[str]]) – Current medications summary.

  • allergies (Union[str, List[str]]) – Allergy list / summary.

  • family_history (Union[str, List[str]]) – Family medical history summary.

reason_for_visit: str
symptoms: str | List[str]
vital_signs: str
health_literacy: str
medical_conditions: str | List[str]
medications: str | List[str]
allergies: str | List[str]
family_history: str | List[str]
static attributes(_cls=<class 'sdialog.personas.ExtendedPatient'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Doctor(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', specialty: str = '', forgetfulness: str = '', formality: str = '', hurriedness: str = '', openness: str = '')

Bases: BaseAttributeModel

Doctor persona with essential professional and behavioral attributes.

Parameters:
  • name (str) – Doctor’s name.

  • age (Union[int, str]) – Doctor’s age (numeric or descriptive).

  • race (str) – Race / ethnicity.

  • gender (str) – Gender identity.

  • language (str) – Working language.

  • years_of_experience (Union[int, str]) – Years (or range) of medical practice.

  • specialty (str) – Medical specialty (as spelled in this class).

  • forgetfulness (str) – Forgetfulness tendency.

  • formality (str) – Formality level in communication.

  • hurriedness (str) – Degree of time pressure / haste.

  • openness (str) – Openness / approachability.

name: str
age: int | str
race: str
gender: str
language: str
years_of_experience: int | str
specialty: str
forgetfulness: str
formality: str
hurriedness: str
openness: str
static attributes(_cls=<class 'sdialog.personas.Doctor'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.ExtendedDoctor(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '', specialty: str = '', years_of_experience: int | str = '', certifications: str = '', work_experience: str = '')

Bases: ExtendedPersona

ExtendedDoctor persona adding professional credentials. Inherits all attributes from ExtendedPersona plus, the following ones.

Parameters:
  • specialty (str) – Medical specialty / domain focus.

  • years_of_experience (Union[int, str]) – Years (or range) of clinical experience.

  • certifications (str) – Professional certifications / board statuses.

  • work_experience (str) – Summary of prior practice settings / roles.

specialty: str
years_of_experience: int | str
certifications: str
work_experience: str
static attributes(_cls=<class 'sdialog.personas.ExtendedDoctor'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Nurse(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', specialty: str = '', shift: str = '', empathy_level: str = '', politeness: str = '', attentiveness: str = '', stress_tolerance: str = '')

Bases: BaseAttributeModel

Nurse persona for healthcare dialogues.

Parameters:
  • name (str) – Nurse name.

  • age (Union[int, str]) – Nurse age (numeric or descriptive).

  • gender (str) – Gender identity.

  • language (str) – Working language.

  • years_of_experience (Union[int, str]) – Years of nursing experience.

  • specialty (str) – Nursing specialty.

  • shift (str) – Typical work shift.

  • empathy_level (str) – Empathy descriptor.

  • politeness (str) – Politeness style.

  • attentiveness (str) – Attentiveness descriptor.

  • stress_tolerance (str) – Stress handling capability.

name: str
age: int | str
gender: str
language: str
years_of_experience: int | str
specialty: str
shift: str
empathy_level: str
politeness: str
attentiveness: str
stress_tolerance: str
static attributes(_cls=<class 'sdialog.personas.Nurse'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Pharmacist(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', workplace: str = '', expertise: str = '', politeness: str = '', communication_style: str = '')

Bases: BaseAttributeModel

Pharmacist persona for healthcare dialogues.

Parameters:
  • name (str) – Pharmacist name.

  • age (Union[int, str]) – Pharmacist age (numeric or descriptive).

  • gender (str) – Gender identity.

  • language (str) – Working language.

  • years_of_experience (Union[int, str]) – Years of pharmacy experience.

  • workplace (str) – Pharmacy or hospital name.

  • expertise (str) – Pharmaceutical expertise.

  • politeness (str) – Politeness style.

  • communication_style (str) – Communication style.

name: str
age: int | str
gender: str
language: str
years_of_experience: int | str
workplace: str
expertise: str
politeness: str
communication_style: str
static attributes(_cls=<class 'sdialog.personas.Pharmacist'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Caregiver(*, name: str = '', age: int | str = '', gender: str = '', relationship: str = '', experience_years: int | str = '', empathy_level: str = '', attentiveness: str = '')

Bases: BaseAttributeModel

Caregiver persona for healthcare dialogues.

Parameters:
  • name (str) – Caregiver name.

  • age (Union[int, str]) – Caregiver age (numeric or descriptive).

  • gender (str) – Gender identity.

  • relationship (str) – Relationship to care recipient.

  • experience_years (Union[int, str]) – Years of caregiving experience.

  • empathy_level (str) – Empathy descriptor.

  • attentiveness (str) – Attentiveness descriptor.

name: str
age: int | str
gender: str
relationship: str
experience_years: int | str
empathy_level: str
attentiveness: str
static attributes(_cls=<class 'sdialog.personas.Caregiver'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Teacher(*, name: str = '', age: int | str = '', gender: str = '', subject: str = '', years_of_experience: int | str = '', education_level: str = '', politeness: str = '', communication_style: str = '')

Bases: BaseAttributeModel

Teacher persona for education dialogues.

Parameters:
  • name (str) – Teacher name.

  • age (Union[int, str]) – Teacher age (numeric or descriptive).

  • gender (str) – Gender identity.

  • subject (str) – Teaching subject.

  • years_of_experience (Union[int, str]) – Years of teaching experience.

  • education_level (str) – Highest degree.

  • politeness (str) – Politeness style.

  • communication_style (str) – Communication style.

name: str
age: int | str
gender: str
subject: str
years_of_experience: int | str
education_level: str
politeness: str
communication_style: str
static attributes(_cls=<class 'sdialog.personas.Teacher'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Student(*, name: str = '', age: int | str = '', gender: str = '', grade_level: str = '', major: str = '', interests: str = '', politeness: str = '')

Bases: BaseAttributeModel

Student persona for education dialogues.

Parameters:
  • name (str) – Student name.

  • age (Union[int, str]) – Student age (numeric or descriptive).

  • gender (str) – Gender identity.

  • grade_level (str) – Grade or year.

  • major (str) – Major or focus area.

  • interests (str) – Interests.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
grade_level: str
major: str
interests: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Student'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.AcademicAdvisor(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')

Bases: BaseAttributeModel

AcademicAdvisor persona for education dialogues.

Parameters:
  • name (str) – Advisor name.

  • age (Union[int, str]) – Advisor age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of advising experience.

  • specialty (str) – Advising specialty.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
specialty: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.AcademicAdvisor'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.FinancialAdvisor(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', certifications: str = '', specialty: str = '', politeness: str = '')

Bases: BaseAttributeModel

FinancialAdvisor persona for finance dialogues.

Parameters:
  • name (str) – Advisor name.

  • age (Union[int, str]) – Advisor age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of financial advising experience.

  • certifications (str) – Certifications.

  • specialty (str) – Financial specialty.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
certifications: str
specialty: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.FinancialAdvisor'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Banker(*, name: str = '', age: int | str = '', gender: str = '', branch: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

Banker persona for finance dialogues.

Parameters:
  • name (str) – Banker name.

  • age (Union[int, str]) – Banker age (numeric or descriptive).

  • gender (str) – Gender identity.

  • branch (str) – Bank branch.

  • years_of_experience (Union[int, str]) – Years of banking experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
branch: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Banker'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.InsuranceAgent(*, name: str = '', age: int | str = '', gender: str = '', company: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')

Bases: BaseAttributeModel

InsuranceAgent persona for finance dialogues.

Parameters:
  • name (str) – Agent name.

  • age (Union[int, str]) – Agent age (numeric or descriptive).

  • gender (str) – Gender identity.

  • company (str) – Insurance company.

  • years_of_experience (Union[int, str]) – Years of insurance experience.

  • specialty (str) – Insurance specialty.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
company: str
years_of_experience: int | str
specialty: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.InsuranceAgent'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.StoreManager(*, name: str = '', age: int | str = '', gender: str = '', store_name: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

StoreManager persona for retail dialogues.

Parameters:
  • name (str) – Manager name.

  • age (Union[int, str]) – Manager age (numeric or descriptive).

  • gender (str) – Gender identity.

  • store_name (str) – Store name.

  • years_of_experience (Union[int, str]) – Years of management experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
store_name: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.StoreManager'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.SalesAssociate(*, name: str = '', age: int | str = '', gender: str = '', store_name: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

SalesAssociate persona for retail dialogues.

Parameters:
  • name (str) – Associate name.

  • age (Union[int, str]) – Associate age (numeric or descriptive).

  • gender (str) – Gender identity.

  • store_name (str) – Store name.

  • years_of_experience (Union[int, str]) – Years of sales experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
store_name: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.SalesAssociate'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Shopper(*, name: str = '', age: int | str = '', gender: str = '', shopping_goal: str = '', loyalty_status: str = '', politeness: str = '')

Bases: BaseAttributeModel

Shopper persona for retail dialogues.

Parameters:
  • name (str) – Shopper name.

  • age (Union[int, str]) – Shopper age (numeric or descriptive).

  • gender (str) – Gender identity.

  • shopping_goal (str) – Shopping goal.

  • loyalty_status (str) – Loyalty descriptor.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
shopping_goal: str
loyalty_status: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Shopper'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.HotelReceptionist(*, name: str = '', age: int | str = '', gender: str = '', hotel_name: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

HotelReceptionist persona for hospitality dialogues.

Parameters:
  • name (str) – Receptionist name.

  • age (Union[int, str]) – Receptionist age (numeric or descriptive).

  • gender (str) – Gender identity.

  • hotel_name (str) – Hotel name.

  • years_of_experience (Union[int, str]) – Years of hospitality experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
hotel_name: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.HotelReceptionist'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.TravelAgent(*, name: str = '', age: int | str = '', gender: str = '', agency_name: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

TravelAgent persona for travel dialogues.

Parameters:
  • name (str) – Agent name.

  • age (Union[int, str]) – Agent age (numeric or descriptive).

  • gender (str) – Gender identity.

  • agency_name (str) – Travel agency name.

  • years_of_experience (Union[int, str]) – Years of travel experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
agency_name: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.TravelAgent'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Tourist(*, name: str = '', age: int | str = '', gender: str = '', travel_goal: str = '', politeness: str = '')

Bases: BaseAttributeModel

Tourist persona for travel dialogues.

Parameters:
  • name (str) – Tourist name.

  • age (Union[int, str]) – Tourist age (numeric or descriptive).

  • gender (str) – Gender identity.

  • travel_goal (str) – Travel goal.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
travel_goal: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Tourist'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Lawyer(*, name: str = '', age: int | str = '', gender: str = '', specialty: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

Lawyer persona for legal dialogues.

Parameters:
  • name (str) – Lawyer name.

  • age (Union[int, str]) – Lawyer age (numeric or descriptive).

  • gender (str) – Gender identity.

  • specialty (str) – Legal specialty.

  • years_of_experience (Union[int, str]) – Years of legal experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
specialty: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Lawyer'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Paralegal(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

Paralegal persona for legal dialogues.

Parameters:
  • name (str) – Paralegal name.

  • age (Union[int, str]) – Paralegal age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of paralegal experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Paralegal'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.LegalClient(*, name: str = '', age: int | str = '', gender: str = '', case_type: str = '', politeness: str = '')

Bases: BaseAttributeModel

LegalClient persona for legal dialogues.

Parameters:
  • name (str) – Client name.

  • age (Union[int, str]) – Client age (numeric or descriptive).

  • gender (str) – Gender identity.

  • case_type (str) – Type of legal case.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
case_type: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.LegalClient'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.ITSupportSpecialist(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', expertise_area: str = '', politeness: str = '')

Bases: BaseAttributeModel

ITSupportSpecialist persona for tech support dialogues.

Parameters:
  • name (str) – Specialist name.

  • age (Union[int, str]) – Specialist age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of IT support experience.

  • expertise_area (str) – Area of technical expertise.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
expertise_area: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.ITSupportSpecialist'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.HelpdeskTechnician(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

HelpdeskTechnician persona for tech support dialogues.

Parameters:
  • name (str) – Technician name.

  • age (Union[int, str]) – Technician age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of helpdesk experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.HelpdeskTechnician'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.EndUser(*, name: str = '', age: int | str = '', gender: str = '', device_type: str = '', issue_description: str = '', politeness: str = '')

Bases: BaseAttributeModel

EndUser persona for tech support dialogues.

Parameters:
  • name (str) – End user name.

  • age (Union[int, str]) – End user age (numeric or descriptive).

  • gender (str) – Gender identity.

  • device_type (str) – Type of device used.

  • issue_description (str) – Description of technical issue.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
device_type: str
issue_description: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.EndUser'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.CivilServant(*, name: str = '', age: int | str = '', gender: str = '', department: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

CivilServant persona for government dialogues.

Parameters:
  • name (str) – Civil servant name.

  • age (Union[int, str]) – Civil servant age (numeric or descriptive).

  • gender (str) – Gender identity.

  • department (str) – Government department.

  • years_of_experience (Union[int, str]) – Years of public service.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
department: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.CivilServant'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.SocialWorker(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')

Bases: BaseAttributeModel

SocialWorker persona for public service dialogues.

Parameters:
  • name (str) – Social worker name.

  • age (Union[int, str]) – Social worker age (numeric or descriptive).

  • gender (str) – Gender identity.

  • years_of_experience (Union[int, str]) – Years of social work experience.

  • specialty (str) – Social work specialty.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
years_of_experience: int | str
specialty: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.SocialWorker'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Citizen(*, name: str = '', age: int | str = '', gender: str = '', inquiry_topic: str = '', politeness: str = '')

Bases: BaseAttributeModel

Citizen persona for government dialogues.

Parameters:
  • name (str) – Citizen name.

  • age (Union[int, str]) – Citizen age (numeric or descriptive).

  • gender (str) – Gender identity.

  • inquiry_topic (str) – Topic of inquiry.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
inquiry_topic: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Citizen'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Chef(*, name: str = '', age: int | str = '', gender: str = '', restaurant_name: str = '', years_of_experience: int | str = '', cuisine_specialty: str = '', politeness: str = '')

Bases: BaseAttributeModel

Chef persona for food service dialogues.

Parameters:
  • name (str) – Chef name.

  • age (Union[int, str]) – Chef age (numeric or descriptive).

  • gender (str) – Gender identity.

  • restaurant_name (str) – Restaurant name.

  • years_of_experience (Union[int, str]) – Years of culinary experience.

  • cuisine_specialty (str) – Cuisine specialty.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
restaurant_name: str
years_of_experience: int | str
cuisine_specialty: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Chef'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.Waiter(*, name: str = '', age: int | str = '', gender: str = '', restaurant_name: str = '', years_of_experience: int | str = '', politeness: str = '')

Bases: BaseAttributeModel

Waiter persona for food service dialogues.

Parameters:
  • name (str) – Waiter name.

  • age (Union[int, str]) – Waiter age (numeric or descriptive).

  • gender (str) – Gender identity.

  • restaurant_name (str) – Restaurant name.

  • years_of_experience (Union[int, str]) – Years of service experience.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
restaurant_name: str
years_of_experience: int | str
politeness: str
static attributes(_cls=<class 'sdialog.personas.Waiter'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None

class sdialog.personas.RestaurantCustomer(*, name: str = '', age: int | str = '', gender: str = '', dietary_preferences: str = '', politeness: str = '')

Bases: BaseAttributeModel

RestaurantCustomer persona for food service dialogues.

Parameters:
  • name (str) – Customer name.

  • age (Union[int, str]) – Customer age (numeric or descriptive).

  • gender (str) – Gender identity.

  • dietary_preferences (str) – Dietary preferences.

  • politeness (str) – Politeness style.

name: str
age: int | str
gender: str
dietary_preferences: str
politeness: str
static attributes(_cls=<class 'sdialog.personas.RestaurantCustomer'>, print=False)

List (or pretty-print) public attribute field names for this subclass.

Parameters:

print (bool) – If True, pretty-prints instead of returning the list.

Returns:

List of attribute names (if print=False).

Return type:

List[str] | None


sdialog.agents

This module provides classes for Agents and related utilities for simulating persona-conditioned dialogue with Large Language Models (LLMs). Agents maintain structured conversation memory, integrate orchestrators that inject dynamic (persistent or ephemeral) system instructions, and expose inspection / interpretability hooks for token- and layer-level analysis and optional representation steering.

sdialog.agents.final_response_tool(func=None)

Decorator to mark a tool whose raw output should be returned directly as the agent response (bypassing the post-tool LLM synthesis step).

This is useful for pre-formatted outputs (e.g., large markdown tables) where token-by-token regeneration by the LLM is unnecessary.

Usage:

from sdialog.agents import final_response_tool

@final_response_tool
def my_tool(...) -> str:
    ...
Parameters:

func (Optional[callable]) – The tool function to mark.

Returns:

Decorated function.

Return type:

callable

class sdialog.agents.Agent(persona: BaseAttributeModel = None, name: str | None = None, context: str | Context | None = None, first_utterance: str | List[str] | None = None, dialogue_details: str = '', response_details: str = 'Unless necessary, responses SHOULD be only one utterance long, and SHOULD NOT contain many questions or topics in one single turn.', example_dialogs: List[Dialog] | None = None, tools: List | None = None, think: bool = False, thinking_pattern: str | None = '<think>(.*?)</think>', can_finish: bool = True, orchestrators: BaseOrchestrator | List[BaseOrchestrator] | None = None, inspectors: Inspector | List[Inspector] | None = None, preprocessing_fn: callable | None = None, postprocess_fn: callable | None = None, system_prompt: str | None = None, model: str | langchain_core.language_models.base.BaseLanguageModel = None, **llm_kwargs)

Bases: object

Agent that simulates a persona-driven conversational actor using an LLM.

This class wraps:

  • A persona (traits / role)

  • Optional context + exemplar dialogues

  • Orchestrators (dynamic / persistent injected instructions)

  • Interpretability hooks (token / layer events, steering)

  • Simple dialogue loop utilities (dialog_with)

Example:

from sdialog import Persona, Context
from sdialog.agents import Agent

# Create two agents
user = Agent(persona=Persona(name="Dr. Nebula",
                             role="Astrobotanist seeking alien spores"),
             name="Scientist")
bot = Agent(persona=Persona(name="StationCore",
                            role="Sarcastic habitat control AI"),
            name="Bot")

# Create an (optional) context for the conversation
context = Context(location="Orbiting Research Station Theta-9",
                  environment="Zero-gravity greenhouse",
                  objects=["alien spores", "hydroponic garden", "research equipment"])

# Create a dialogue
dialog = user.dialog_with(bot, context=context)

# Print dialog
dialog.print()
Parameters:
  • persona (BasePersona) – The persona to role-play.

  • name (Optional[str]) – Name of the agent (defaults to persona.name if not provided).

  • context (Optional[Union[str, Context]]) – Optional default context for the agent’s conversations.

  • first_utterance (Optional[Union[str, List[str]]]) – Optional fixed first utterance or list of possible first utterances.

  • dialogue_details (str) – Additional details about the dialogue.

  • response_details (str) – Instructions for response style.

  • example_dialogs (Optional[List[Dialog]]) – Optional list of default example dialogues as a reference for the agent.

  • tools (Optional[List[callable]] Tools decorated with @final_response_tool return their raw output directly as the final agent response.) – List of functions to be used as tools by the agent (if supported by the LLM).

  • think (bool) – If True, enables “thinking” segments in responses (if supported by the LLM).

  • thinking_pattern (Optional[str]) – Regex pattern to manually identify “thinking” segments in responses.

  • can_finish (bool) – If True, agent can end the conversation.

  • orchestrators (Optional[Union[BaseOrchestrator, List[BaseOrchestrator]]]) – Orchestrators for agent behavior.

  • inspectors (Optional[Union[Inspector, List[Inspector]]]) – Inspector(s) to add to the agent.

  • preprocessing_fn (Optional[callable]) – Optional function to preprocess each input utterance before calling the LLM (input string, output string).

  • postprocess_fn (Optional[callable]) – Optional function to postprocess each output utterance after calling the LLM (input string, output string).

  • system_prompt (Optional[str]) – Custom system prompt to use as-is (takes precedence over persona; if provided, persona is disabled and this prompt is used directly).

  • model (Union[str, BaseLanguageModel], optional) – The LLM or model name to use (defaults to config[“llm”][“model”]).

  • llm_kwargs (dict) – Additional parameters for the LLM.

property memory: List[langchain_core.messages.base.BaseMessage]

The conversation memory as a list of messages.

property base_model

Return the underlying base (wrapped) model object (e.g., a HuggingFace Transformers model).

Resolution order:
  1. ChatHuggingFace wrapper: self.llm.llm.pipeline.model

  2. Objects exposing pipeline.model

  3. Objects exposing model

If none are found, self.llm is returned as a fallback.

property tokenizer

Return the underlying tokenizer object (e.g., a HuggingFace Transformers tokenizer).

Resolution order:
  1. ChatHuggingFace wrapper: self.llm.llm.tokenizer

  2. Objects exposing pipeline.tokenizer

  3. Objects exposing tokenizer

__call__(utterance: str | List[langchain_core.messages.base.BaseMessage] = '', return_events: bool = False, current_dialog: Dialog = None) str

Processes an input utterance and generates a response.

Parameters:
  • utterance (Union[str, List[BaseMessage]]) – The input utterance from the other agent or, in case of stateless operation, the full context as a list of Langchain messages.

  • return_events (bool) – If True, returns a list of events instead of just the response string.

  • current_dialog (Dialog) – The current dialog state as a Dialog object for orchestrators.

Returns:

The agent’s response or events, or None if finished.

Return type:

Union[str, List[Event], None]

serve(host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, log_level: str = 'info')

Starts a REST API server to interact with the agent.

Parameters:
  • host (str) – Host address to bind the server to.

  • port (int) – Port number to listen on.

  • stateless (bool) – If True, the server does not maintain conversation state (as such the full context must be provided with each request).

  • log_level (str) – Logging level for the server.

response_lookahead(message: str = None)

Generates a response without updating the agent’s memory.

  • If message is None, predicts the next reply given current memory.

  • If message is provided, predicts a reply to that hypothetical input.

Notes: - Orchestrators and inspectors are not invoked. - Tools may be called, but their outputs are not persisted. - Only postprocess_fn is applied (no preprocessing).

Parameters:

message (Optional[str]) – The hypothetical message to reply to (optional).

Returns:

The predicted response text.

Return type:

str

add_orchestrators(orchestrators)

Adds orchestrators to the agent.

Parameters:

orchestrators (Union[BaseOrchestrator, List[BaseOrchestrator]]) – Orchestrator(s) to add.

add_inspectors(inspectors)

Adds inspectors to the agent.

Parameters:

inspectors (Union[Inspector, List[Inspector]]) – Inspector(s) to add.

clear_orchestrators()

Removes all orchestrators from the agent.

clear_inspectors()

Removes all inspectors from the agent.

instruct(instruction: str, persist: bool = False)

Adds a system instruction to the agent’s memory.

Parameters:
  • instruction (str) – The instruction text.

  • persist (bool) – If True, instruction persists across turns.

set_first_utterances(utterances: str | List[str])

Sets the agent’s first utterance(s) for dialogue initialization.

Parameters:

utterances (Union[str, List[str]]) – The greeting(s) to use.

get_name(default: str = 'Me') str

Returns the agent’s name.

Parameters:

default (str) – Fallback name if agent has no explicit name.

Returns:

The agent’s name.

Return type:

str

prompt() str

Returns the current system prompt.

Returns:

The system prompt.

Return type:

str

json(string: bool = False, indent=None)

Serializes the agent’s configuration and persona to JSON.

Parameters:
  • string (bool) – If True, returns a JSON string; otherwise, returns a dict.

  • indent (int) – Indentation level for pretty-printing.

Returns:

The serialized agent.

Return type:

Union[str, dict]

reset(seed: int = None, context: str | Context = None, example_dialogs: List[Dialog] = None)

Resets the agent’s memory and orchestrators, optionally reseeding the LLM. Also clears interpretability state and components if any.

Parameters:
  • seed – Random seed for reproducibility (if None, generated).

  • context – Optional context override.

  • example_dialogs – Optional replacement example dialogs for prompt regeneration.

dialog_with(agent: Agent, context: str | Context = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, max_turns: int = 100, id: int = None, parent_id: int = None, seed: int = None, notes: str = None, keep_bar: bool = True)

Simulates a dialogue between this agent and another Agent.

Parameters:
  • agent (Agent) – The other agent to converse with.

  • context (Optional[Union[str, Context]]) – The context for the dialogue (optional).

  • example_dialogs (Optional[List[Dialog]]) – Example dialogues to guide the conversation (optional).

  • scenario (Optional[Union[dict, str]]) – Optional scenario metadata for the dialogue.

  • max_turns (int) – Maximum number of dialogue turns.

  • id (int) – Dialogue ID.

  • parent_id (int) – ID of the parent dialogue, if any.

  • seed (int) – Random seed for reproducibility.

  • notes (str) – Optional notes to include in the dialogue.

  • keep_bar (bool) – If True, keeps the progress bar visible.

Returns:

The generated dialogue object.

Return type:

Dialog

memory_dump(as_dict: bool = False) list

Returns a copy of the agent’s memory (list of messages).

Parameters:

as_dict (bool) – If True, returns list of message dicts (serialization-friendly).

Returns:

Conversation memory snapshot.

Return type:

list

talk_with(agent: Agent, context: str | Context = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, max_turns: int = 100, id: int = None, parent_id: int = None, seed: int = None, notes: str = None, keep_bar: bool = True)

Alias for Agent.dialog_with().


sdialog.orchestrators

This module provides base and concrete classes for orchestrating agent behavior during synthetic dialogue generation. Orchestrators can inject instructions, control agent responses, and manage dialogue flow for more complex scenarios.

class sdialog.orchestrators.SimpleReflexOrchestrator(condition: callable, instruction: str, persistent: bool = False, event_label: str = None)

Bases: BaseOrchestrator

Simple reflex orchestrator that provides fixed instructions when a condition matches.

Example:

from sdialog.orchestrators import SimpleReflexOrchestrator
from sdialog.agents import Agent
from sdialog.personas import Persona

# If the last utterance contains 'quest', steer to suggest party planning (tutorial style)
reflex = SimpleReflexOrchestrator(
    condition=lambda utt: "quest" in utt.lower(),
    instruction="Acknowledge the quest idea, then suggest one concrete themed activity."
)

bob = Agent(persona=Persona(name="Bob", role="dad")) | reflex
alice = Agent(persona=Persona(name="Alice", role="daughter"))
dialog = bob.talk_with(alice)
dialog.print(orchestration=True)
Parameters:
  • condition (callable) – Predicate function receiving the last utterance; returns True to trigger.

  • instruction (str) – Instruction text to return when condition is satisfied.

  • persistent (bool) – Whether orchestrator persists across turns.

  • event_label (str) – Optional event label override.

instruct(dialog: List[Turn], utterance: str) str

Return the configured instruction if condition holds.

Parameters:
  • dialog (List[Turn]) – Current dialog (unused except for extensibility).

  • utterance (str) – Last opposite-party utterance.

Returns:

Instruction text or None.

Return type:

Union[str, None]

class sdialog.orchestrators.LengthOrchestrator(min: int = None, max: int = None, persistent: bool = False, event_label: str = None)

Bases: BaseOrchestrator

Orchestrator that encourages continuation or termination based on current number of turns.

Example:

from sdialog.orchestrators import LengthOrchestrator
from sdialog.personas import Persona
from sdialog.agents import Agent

# Keep dialogue going at least 8 turns; try to wrap by turn 12
len_orch = LengthOrchestrator(min=8, max=12)

planner = Agent(persona=Persona(name="Planner", role="organizer"))
guest = Agent(persona=Persona(name="Guest", role="participant"))

# Attach orchestrator to planner
planner = planner | len_orch

dialog = planner.dialog_with(guest)
dialog.print(orchestration=True)
Parameters:
  • min (int) – Minimum turns before allowing termination (encourages continuation if not reached).

  • max (int) – Maximum turns threshold after which termination is enforced.

  • persistent (bool) – Whether orchestrator persists.

  • event_label (str) – Optional event label.

instruct(dialog: List[Turn], utterance: str) str

Provide an instruction to continue or finish based on dialog length.

Parameters:
  • dialog (List[Turn]) – Current dialog state.

  • utterance (str) – Last opposite-party utterance.

Returns:

Instruction text or None.

Return type:

Union[str, None]

class sdialog.orchestrators.ChangeMindOrchestrator(probability: float = 0.3, reasons: str | List[str] = None, max_times: int = 1, persistent: bool = False, event_label: str = None)

Bases: BaseOrchestrator

Orchestrator that probabilistically injects a ‘change your mind’ instruction a limited number of times.

Example:

from sdialog.orchestrators import ChangeMindOrchestrator
from sdialog.personas import Persona
from sdialog.agents import Agent

# 40% chance (once) to pivot party theme justification
changer = ChangeMindOrchestrator(probability=0.4,
                                 reasons=["a better surprise", "budget constraints"],
                                 max_times=1)

alice = Agent(persona=Persona(name="Alice", role="daughter"))
bob = Agent(persona=Persona(name="Bob", role="dad"))

# Attach orchestrator to Bob
bob = bob | changer

dialog = alice.dialog_with(bob)
dialog.print(orchestration=True)
Parameters:
  • probability (float) – Probability (0-1) each eligible turn to trigger a mind-change.

  • reasons (Union[str, List[str]]) – Optional reason(s) appended; single string or list.

  • max_times (int) – Maximum number of injections allowed.

  • persistent (bool) – Persistence flag.

  • event_label (str) – Event label override.

reset()

Reset internal counter of times triggered.

Returns:

None

Return type:

None

instruct(dialog: List[Turn], utterance: str) str

Possibly return a mind-change instruction based on probability and remaining allowance.

Parameters:
  • dialog (List[Turn]) – Current dialog state.

  • utterance (str) – Last opposite-party utterance.

Returns:

Instruction text or None.

Return type:

Union[str, None]

class sdialog.orchestrators.SimpleResponseOrchestrator(responses: List[str | Dict[str, str]], graph: Dict[str, str] = None, sbert_model: str = 'sergioburdisso/dialog2flow-joint-bert-base', top_k: int = 5)

Bases: BaseOrchestrator

Orchestrator that suggests next responses based on semantic similarity against a response set (or action graph).

Example:

from sdialog.orchestrators import SimpleResponseOrchestrator
from sdialog.agents import Agent
from sdialog.personas import Persona

canned = [
    "Could you clarify that?",
    "Let me summarize the plan.",
    "That sounds exciting—tell me more.",
    "Maybe we should adjust the theme.",
    "Can you give one concrete example?"
]

sugg = SimpleResponseOrchestrator(responses=canned, top_k=3)

guide = Agent(persona=Persona(name="Guide", role="facilitator"))
user = Agent(persona=Persona(name="User", role="participant"))

# Attach orchestrator to guide
guide = guide | sugg

dialog = guide.dialog_with(user)
dialog.print(orchestration=True)
Parameters:
  • responses (List[Union[str, Dict[str, str]]]) – List (plain strings) or dict (action -> response) entries.

  • graph (Dict[str, str]) – Optional action transition graph (current_action -> next_action).

  • sbert_model (str) – SentenceTransformer model name.

  • top_k (int) – Number of top similar responses/actions to surface.

instruct(dialog: List[Turn], utterance: str) str

Build an Instruction containing candidate responses (and events for traceability).

Parameters:
  • dialog (List[Turn]) – Current dialog.

  • utterance (str) – Last opposite-party utterance (unused directly; similarity uses lookahead / last turn).

Returns:

Instruction object with suggestion list.

Return type:

Instruction

class sdialog.orchestrators.InstructionListOrchestrator(instructions: List[str | Dict[int, str]], persistent: bool = False)

Bases: BaseOrchestrator

Orchestrator that dispenses predefined instructions sequentially or by turn index mapping.

Example:

from sdialog.orchestrators import InstructionListOrchestrator
from sdialog.personas import Persona
from sdialog.agents import Agent

steps = [
    "Greet warmly and ask about preferred theme.",
    "Ask for constraints (budget / space).",
    "Suggest one fitting activity.",
    "Confirm decisions and wrap up politely."
]

coach = Agent(persona=Persona(name="Coach", role="planner"))
client = Agent(persona=Persona(name="Client", role="requester"))

# Attach a new instance of InstructionListOrchestrator to the coach
coach = coach | InstructionListOrchestrator(steps)

dialog = coach.dialog_with(client)
dialog.print(orchestration=True)
Parameters:
  • instructions (List[Union[str, Dict[int, str]]]) – Either list (indexed per agent turn) or dict mapping agent turn index -> instruction.

  • persistent (bool) – Persistence flag.

instruct(dialog: List[Turn], utterance: str) str

Return the next scheduled instruction if available.

Parameters:
  • dialog (List[Turn]) – Current dialog.

  • utterance (str) – Last opposite-party utterance.

Returns:

Instruction text or None.

Return type:

Union[str, None]

sdialog.orchestrators.base

Base classes for creating custom orchestrators to guide Agent behavior during dialogue generation.

class sdialog.orchestrators.base.BaseOrchestrator(target_agent=None, persistent: bool = None, event_label: str = None)

Bases: ABC

Base abstract class to create orchestrators that control or influence Agent behavior during dialogue generation. Abstract method instruct() must be implemented by subclasses.

Responsibilities:
  • Observe dialogue (agent memory) and produce turn-level instructions.

  • Optionally emit events describing guidance injected.

  • Support persistence across turns when marked persistent.

Example:

from sdialog.orchestrators import BaseOrchestrator
from sdialog.personas import Persona
from sdialog.agents import Agent

# Let's create our own orchestrator
class EncourageDetailOrchestrator(BaseOrchestrator):
    def instruct(self, dialog, utterance):
        if utterance and len(utterance.split()) < 5:
            return "Add a bit more detail in your next reply."
        return None

orch_encourage = EncourageDetailOrchestrator()

bob = Agent(persona=Persona(role="Guide"))
alice = Agent(persona=Persona(role="User"))

# Let's orchestrate bob to provide more detailed answers if alice is brief
bob = bob | orch_encourage

dialog = bob.talk_with(alice)
dialog.print(orchestration=True)
Parameters:
  • target_agent (Agent) – Agent instance to orchestrate (can be set later).

  • persistent (bool) – Whether produced instructions should persist each turn automatically.

  • event_label (str) – Optional label to tag generated events; defaults to class name.

__call__(current_dialog)

Produce an instruction for the target agent given current dialog state.

Returns:

Instruction object/string or None if no instruction is produced.

Return type:

Union[str, Instruction, None]

json(string: bool = False, indent: int = None)

Serialize orchestrator configuration.

Parameters:
  • string (bool) – If True returns JSON string; otherwise a dict.

  • indent (int) – Indentation for pretty JSON output (only if string=True).

Returns:

Serialized configuration.

Return type:

Union[str, dict]

get_event_label() str

Get the label used for events generated by this orchestrator.

Returns:

Event label.

Return type:

str

get_target_agent()

Get the currently assigned target agent.

Returns:

Agent instance or None.

Return type:

Agent

is_persistent()

Whether this orchestrator is persistent.

Returns:

True if persistent.

Return type:

bool

set_persistent(value: bool)

Set persistence flag.

Parameters:

value (bool) – New persistence state.

agent_response_lookahead()

Retrieve the agent’s lookahead response (preview of next response if available).

Returns:

Lookahead response string.

Return type:

str

abstractmethod instruct(dialog: List[Turn], utterance: str) str

Abstract method: Subclasses are expected to implement this method. Implementations should analyze the dialog state and optionally the most recent utterance to produce an instruction for the target agent.

Parameters:
  • dialog (List[Turn]) – Current reconstructed dialog (list of turns).

  • utterance (str) – Last opposite-party utterance (may be empty string).

Returns:

Instruction text, Instruction object, or None if no action needed.

Return type:

Union[str, Instruction, None]

reset()

Reset any internal state (overridden in stateful orchestrators).

Returns:

None

Return type:

None

class sdialog.orchestrators.base.BasePersistentOrchestrator(target_agent=None, persistent: bool = None, event_label: str = None)

Bases: BaseOrchestrator

Persistent orchestrator base class to create custom persistent orchestrators. Abstract method instruct() must be implemented by subclasses.

Automatically sets persistence to True; intended for orchestrators that maintain state across the whole dialogue unless explicitly removed.

Example:

from sdialog.orchestrators import BasePersistentOrchestrator
from sdialog.personas import Persona
from sdialog.agents import Agent

# Let's create our custom persistent orchestrator to permanently flip tone after a trigger word
class FlipToneOrchestrator(BasePersistentOrchestrator):
    def __init__(self, trigger=None):
        self.trigger = trigger
    def instruct(self, dialog, utterance):
        if self.trigger and self.trigger in utterance.lower():
            return ("From now on adopt an annoyed, curt tone; keep answers short and a bit irritable.")

# Let's create our agents
alice = Agent(persona=Persona(name="Alice", role="daughter"))
bob = Agent(persona=Persona(name="Bob", role="dad"))

# Let's create our orchestrator using "sweet" as the trigger word
orchestrator = FlipToneOrchestrator(trigger="sweet")

# Let's attach the orchestrator to Alice
alice = alice | orchestrator

dialog = alice.dialog_with(bob)
dialog.print(orchestration=True)
abstractmethod instruct(dialog: List[Turn], utterance: str) str

Persistent variant of BaseOrchestrator.instruct().

Parameters:
  • dialog (List[Turn]) – Current dialog state.

  • utterance (str) – Last opposite-party utterance.

Returns:

Instruction text/object or None.

Return type:

Union[str, Instruction, None]

reset()

Reset internal persistent state (override as needed).

Returns:

None

Return type:

None


sdialog.generators

This module provides classes for generating synthetic dialogues using LLMs, including support for persona-based role-play and context-driven dialogue generation.

class sdialog.generators.DialogGenerator(dialogue_details: str, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, personas: dict[str, dict[str, ~typing.Any]]=None, output_format: dict | BaseModel = <class 'sdialog.generators.base.LLMDialogOutput'>, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: object

Base class for generating synthetic dialogues using an LLM.

Typical workflow:

  1. Instantiate with default dialogue instructions and optional context / examples.

  2. Call generate(…) to produce a Dialog (or raw structured output).

Example:

from sdialog.generators import DialogGenerator

gen = DialogGenerator("Generate a short friendly greeting between two speakers")

dialog = gen.generate()
dialog.print()
Parameters:
  • dialogue_details (str) – Instructions or details for the dialogue.

  • context (Optional[Union[str, Context]]) – The default context for the dialogue (optional).

  • example_dialogs (List[Dialog]) – Optional default list of example dialogues to guide the generation.

  • scenario (Optional[Union[dict, str]]) – Default scenario metadata for the dialogue.

  • personas (dict[str, dict[str, Any]]) – Optional personas (serialized) involved in the dialogue (e.g., for logging).

  • output_format (Union[dict, BaseModel]) – Output schema / model used to parse LLM output (or None for raw text).

  • model (Union[BaseLanguageModel, str]) – The LLM instance or model name to use.

  • llm_kwargs (dict) – Additional keyword arguments for the LLM (override config).

prompt() str

Returns the current system prompt used for dialogue generation.

Returns:

The system prompt string.

Return type:

str

generate(dialogue_details: str = None, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None)

Generates a synthetic dialogue using the LLM.

Parameters:
  • dialogue_details (str) – Override instructions / details for this generation.

  • context (Optional[Union[str, Context]]) – Override context for this generation.

  • example_dialogs (List[Dialog]) – Override example dialogues for few-shot style guidance.

  • scenario (Optional[Union[dict, str]]) – Override scenario metadata.

  • seed (int) – Random seed for reproducibility.

  • id (int) – Optional dialogue ID to assign (otherwise autogenerated).

  • parent_id (int) – Optional parent dialogue ID (thread linkage).

  • notes (str) – Optional free-form notes stored in metadata.

Returns:

Dialog instance if output_format is LLMDialogOutput; BaseModel if custom schema; raw string if output_format is falsy.

Return type:

Union[Dialog, BaseModel, str]

__call__(dialogue_details: str = None, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None)

Generates a synthetic dialogue using the LLM.

Parameters:
  • dialogue_details (str) – Override instructions / details for this generation.

  • context (Optional[Union[str, Context]]) – Override context for this generation.

  • example_dialogs (List[Dialog]) – Override example dialogues for few-shot style guidance.

  • scenario (Optional[Union[dict, str]]) – Override scenario metadata.

  • seed (int) – Random seed for reproducibility.

  • id (int) – Optional dialogue ID to assign (otherwise autogenerated).

  • parent_id (int) – Optional parent dialogue ID (thread linkage).

  • notes (str) – Optional free-form notes stored in metadata.

Returns:

Dialog instance if output_format is LLMDialogOutput; BaseModel if custom schema; raw string if output_format is falsy.

Return type:

Union[Dialog, BaseModel, str]

class sdialog.generators.PersonaDialogGenerator(persona_a: Persona | Agent, persona_b: Persona | Agent, speaker_a: str = 'SPEAKER_A', speaker_b: str = 'SPEAKER_B', context: str | Context | None = None, example_dialogs: List[Dialog] = None, dialogue_details: str = '', response_details: str = '', scenario: dict | str | None = None, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: DialogGenerator

Generates dialogues between two personas (or Agents wrapping personas) using an LLM.

Example:

from sdialog.personas import Persona
from sdialog.generators import PersonaDialogGenerator

p1 = Persona(name="Alice", role="Curious student")
p2 = Persona(name="Mentor", role="Helpful tutor")

gen = PersonaDialogGenerator(p1, p2, dialogue_details="Explain one concept briefly.")

dialog = gen()
dialog.print()
Parameters:
  • persona_a (Union[Persona, Agent]) – The first persona or an Agent containing one.

  • persona_b (Union[Persona, Agent]) – The second persona or an Agent containing one.

  • speaker_a (str) – Name/ID of the first speaker in the dialogue.

  • speaker_b (str) – Name/ID of the second speaker in the dialogue.

  • context (Optional[Union[str, Context]]) – Default context for the dialogue (optional).

  • example_dialogs (List[Dialog]) – Optional list of example dialogues for guidance.

  • dialogue_details (str) – Additional dialogue-level instructions.

  • response_details (str) – Style / formatting instructions for responses.

  • scenario (Optional[Union[dict, str]]) – Default scenario metadata.

  • model (Union[BaseLanguageModel, str]) – LLM instance or model name.

  • llm_kwargs (dict) – Extra LLM keyword arguments (override config).

generate(context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, max_turns: int = 200, notes: str = None)

Generates a dialogue between two personas (or drives an Agent-to-Agent interaction).

Parameters:
  • context (Optional[Union[str, Context]]) – Override context.

  • example_dialogs (List[Dialog]) – Override example dialogues.

  • scenario (Optional[Union[dict, str]]) – Override scenario metadata.

  • seed (int) – Random seed for reproducibility.

  • id (int) – Dialogue ID override.

  • parent_id (int) – Parent dialogue ID (thread).

  • max_turns (int) – Max turns (only applies when both participants are Agents).

  • notes (str) – Optional metadata notes.

Returns:

Generated dialogue object.

Return type:

Dialog

__call__(context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, max_turns: int = 200, notes: str = None)

Generates a dialogue between two personas (or drives an Agent-to-Agent interaction).

Parameters:
  • context (Optional[Union[str, Context]]) – Override context.

  • example_dialogs (List[Dialog]) – Override example dialogues.

  • scenario (Optional[Union[dict, str]]) – Override scenario metadata.

  • seed (int) – Random seed for reproducibility.

  • id (int) – Dialogue ID override.

  • parent_id (int) – Parent dialogue ID (thread).

  • max_turns (int) – Max turns (only applies when both participants are Agents).

  • notes (str) – Optional metadata notes.

Returns:

Generated dialogue object.

Return type:

Dialog

class sdialog.generators.PersonaGenerator(persona: BaseAttributeModel, generated_attributes: str = 'all', extra_instructions: str = 'Attributes must be in English', model: str = None, **llm_kwargs)

Bases: BaseAttributeModelGenerator

Generates persona objects (subclasses of sdialog.personas.BasePersona) with randomized or LLM-populated attributes (see sdialog.generators.BaseAttributeModelGenerator for more information).

Example:

from sdialog.personas import Doctor
from sdialog.generators import PersonaGenerator

base_persona = Doctor(specialty="Cardiology")

doctor_generator = PersonaGenerator(base_persona)

doctor_generator.set(
    years_of_experience="{4-10}",
    gender=["male", "female", "non-binary"]
)

doctor = doctor_generator.generate()
doctor.print()
Parameters:
  • persona (BasePersona) – Persona instance or class to generate.

  • generated_attributes (Union[str, list, dict]) – Strategy specifying which attributes to fill (“all”, list, or dict).

  • extra_instructions (str) – Additional instructions to include in the LLM prompt.

  • model (str) – LLM model name (optional).

  • llm_kwargs (dict) – Extra LLM keyword arguments.

class sdialog.generators.ContextGenerator(context: Context = None, generated_attributes: str = 'all', extra_instructions: str = 'Attributes must be in English', model: str = None, **llm_kwargs)

Bases: BaseAttributeModelGenerator

Generates Context objects with randomized or LLM-populated attributes (see sdialog.generators.BaseAttributeModelGenerator for more information).

Example:

from sdialog import Context
from sdialog.generators import ContextGenerator

base_context = Context(location="Mars Forward Base Alpha")

ctx_generator = ContextGenerator(base_context)

ctx_generator.set(
    objects=get_objects_from_db,  # callable function
    topics=["terraforming", "resource logistics", "crew morale"]
    circumstances="{csv:circumstances:./data/circumstances.csv}",
    goals="{llm:Suggest a realistic goal for the context}"
)

my_context = ctx_generator.generate()
my_context.print()
Parameters:
  • context (Context) – Context instance or subclass to generate.

  • generated_attributes (Union[str, list, dict]) – Attribute selection strategy (“all”, list, or dict).

  • extra_instructions (str) – Additional instructions to include in the LLM prompt.

  • model (str) – LLM model name (optional).

  • llm_kwargs (dict) – Extra LLM keyword arguments.

class sdialog.generators.Paraphraser(extra_instructions: str = 'Keep entities and values identical while making it sound more natural', target_speaker: str = None, turn_by_turn: bool = False, model: str | langchain_core.language_models.base.BaseLanguageModel = None, **llm_kwargs)

Bases: object

Paraphrases dialogue turns while preserving semantic entities/values.

Usage modes:

  • Whole dialogue paraphrasing (default, returns full set of possibly modified turns).

  • Turn-by-turn paraphrasing (stream-like, for smaller LLMs).

Example:

from sdialog.generators import Paraphraser

# Assume 'original_dialog' is an existing `Dialog` with one of the speaker being "Bot"
paraphraser = Paraphraser("Make the text sound more natural and less robotic",
                          target_speaker="Bot")

new_dialog = paraphraser(original_dialog)
new_dialog.print()
Parameters:
  • extra_instructions (str) – Additional style or behavior instructions for the paraphrase.

  • target_speaker (Optional[str]) – If provided, only paraphrases turns spoken by this speaker.

  • turn_by_turn (bool) – Whether to paraphrase one turn at a time.

  • model (Union[str, BaseLanguageModel]) – The LLM instance or model name to use (falls back to config if None).

  • llm_kwargs (dict) – Additional keyword arguments for the LLM.

__call__(dialog: Dialog, target_speaker: str = None, seed: int = None) Dialog

Paraphrase a dialog (entirely or selectively by speaker).

Parameters:
  • dialog (Dialog) – Source dialogue to paraphrase.

  • target_speaker (Optional[str]) – Override target speaker filter for this call.

  • seed (Optional[int]) – Optional random seed (used for reproducibility where supported).

Returns:

New Dialog instance with paraphrased turns.

Return type:

Dialog

Raises:

ValueError – (Indirectly) if underlying validation fails.

prompt() str

Returns the combined system prompt and current instruction template.

Returns:

Combined prompt preview.

Return type:

str

sdialog.generators.base

Base and abstract classes for generators in sdialog.

class sdialog.generators.base.BaseAttributeModelGenerator(attribute_model: BaseAttributeModel, generated_attributes: str = 'all', extra_instructions: str = '', model: str = None, system_prompt: str = None, llm_prompt: str = None, llm_prompt_n: str = None, **llm_kwargs)

Bases: ABC

Abstract class to create subclasses for generators with randomized and/or LLM-populated attributes.

Workflow:

  1. Provide a target attribute model instance or class.

  2. Configure attribute generation rules (e.g. .set(...) or generated_attributes='all').

  3. Call generate(n=...) to produce validated instances.

Parameters:
  • attribute_model (BaseAttributeModel) – Instance or subclass of BaseAttributeModel to generate.

  • generated_attributes (Union[str, list, dict]) – Attribute selection strategy (“all”, iterable, or dict of rules).

  • extra_instructions (str) – Additional instructions to include in the LLM prompt.

  • model (str) – LLM model name (overrides config if provided).

  • system_prompt (str) – Override system prompt for generation.

  • llm_prompt (str) – Template for single-object generation.

  • llm_prompt_n (str) – Template for multi-object generation (n > 1).

  • llm_kwargs (dict) – Extra LLM instantiation parameters.

prompt() str

Returns the single-object prompt template text.

Returns:

The prompt string.

Return type:

str

set(**attributes)

Define per-attribute randomization / generation specifications as attribute_name=<value>.

Where <value> can be:

  • “*”: Defer to LLM.

  • A callable: Invoked (with current partial object as kwargs if compatible).

  • A list: Random element chosen.

  • A fixed scalar / str: Assigned directly.

  • A templated string "{...}":

    • "{min-max}": Random int in inclusive range.

    • "{txt:PATH}": Random non-empty line from file.

    • "{csv:COLUMN:PATH}": Random value from CSV column (name or index).

    • "{tsv:COLUMN:PATH}": Same for TSV.

    • "{llm}": Defer to LLM.

    • "{llm:INSTRUCTION}": Defer with custom instruction.

Example:

from sdialog.generators import ContextGenerator

ctx_gen = ContextGenerator()

ctx_gen.set(
    location=["office", "home", "school"],
    objects=get_objects_from_db,  # callable function
    circumstances="{csv:circumstances:./data/circumstances.csv}",
    goals="{llm:Suggest a realistic goal for the context}"
)

my_context = ctx_gen.generate()
my_context.print()
Parameters:

attributes – Mapping of attribute name -> generation rule.

Raises:

ValueError – If any attribute is not defined on the target model.

generate(n: int = 1, temperature: float = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None, max_attempts: int = 3) BaseAttributeModel

Generate one or many model instances using random rules, templates, and/or LLM completion.

Parameters:
  • n (int) – Number of instances to generate.

  • temperature (float) – LLM temperature (if LLM used).

  • seed (int) – Random seed for reproducibility.

  • id (int) – Optional explicit ID for single-object generation (each object gets its own if multiple).

  • parent_id (int) – Optional parent ID linkage.

  • notes (str) – Optional metadata notes.

  • max_attempts (int) – Maximum retries to fill missing attributes.

Returns:

A single instance if n == 1, else a list of instances.

Return type:

Union[BaseAttributeModel, List[BaseAttributeModel]]

Raises:

ValueError – On missing files referenced in template specifications.


sdialog.interpretability

This submodule provides classes and hooks for inspecting and interpreting the internal representations of PyTorch-based language models during forward passes. It enables the registration of hooks on specific model layers to capture token-level and response-level information, facilitating analysis of model behavior and interpretability. The module is designed to work with conversational agents and integrates with tokenizers and memory structures, supporting the extraction and inspection of tokens, representations, and system instructions across responses.

Typical usage involves attaching one or more Inspector objects to an agent, accumulating response and token data during inference, and providing interfaces for downstream interpretability and analysis tasks.

class sdialog.interpretability.DirectionSteerer(direction, inspector=None)

Bases: BaseSteerer

Concrete Steerer binding a direction vector for additive or subtractive steering.

Example:

import torch
from sdialog.agents import Agent
from sdialog.interpretability import Inspector, DirectionSteerer

agent = Agent()
insp = Inspector(target='model.layers.5.post_attention_layernorm')
agent = agent | insp

direction = torch.randn(4096)  # Random direction in activation space
steer = DirectionSteerer(direction)

# Add the direction (push activations along vector)
insp = steer + insp
# Or remove its projection:
insp = steer - insp

agent("Test prompt")  # steering applied during generation
Parameters:
  • direction (Union[torch.Tensor, np.ndarray]) – Direction vector (torch.Tensor or numpy array).

  • inspector (Optional[Inspector]) – Optional Inspector to bind immediately.

class sdialog.interpretability.Inspector(target: Dict | List[str] | str = None, agent: Any | None = None, steering_function: Callable | None = None, steering_interval: Tuple[int, int] | None = ('*', '*'), top_k: int | None = None, lm_head_layer: str | None = 'lm_head', inspect_input: bool = True)

Bases: object

Main class to manage layer hooks, cached activations, and optional steering functions for an Agent.

Example:

from sdialog.agents import Agent
from sdialog.interpretability import Inspector

agent = Agent()
insp = Inspector(target='model.layers.2.post_attention_layernorm')
agent = agent | insp  # pipe attach

agent("Explain gravity briefly.")  # Generates first response
agent("Sounds cool!")  # Generates second response

print("Num responses captured:", len(insp))
print("Last response, first token string:", insp[-1][0])
print("Last response, first token activation:", insp[-1][0].act)
# Output:
# Num responses captured: 2
# Last response, first token string: <bos>
# Last response, first token activation:
# tensor([[-0.0109, -0.1128, -0.1216,  ..., -0.0157,  0.2100, -0.2637]])
Parameters:
  • target (Union[Dict, List[str], str, None]) – Mapping (cache_key->layer_name) or list / single layer name (optional). If None, no hooks are added until add_hooks/add_agent is called. Defaults to None.

  • agent (Optional[Agent]) – Agent instance to attach to (optional). If provided with a non-empty target, hooks are registered immediately. Defaults to None.

  • steering_function (Optional[Union[Callable, List[Callable]]]) – Initial steering function or list of functions (optional). Applied to token activations during generation. Defaults to None.

  • steering_interval (Optional[Tuple[int, int]]) – (min_token, max_token) steering window (optional). Defaults to (“*”, “*”), where “*” means no lower or/and upper bound.

  • top_k (Optional[int]) – Number of top token predictions to store for each token. If None, logits are not captured. If -1, all tokens in the vocabulary are returned with their logits. Defaults to None.

  • lm_head_layer (Optional[str]) – Name of the language model head layer (e.g., “lm_head”). Defaults to “lm_head”. If the specified layer is not found, the code will attempt to auto-detect it.

  • inspect_input (bool) – If True (default), captures activations before the layer processes them (input activations). If False, captures activations after the layer processes them (output activations). Defaults to False.

property input
property vocab_size

Return the vocabulary size of the tokenizer.

Returns:

Vocabulary size (number of tokens in the tokenizer’s vocabulary).

Return type:

int

Raises:

ValueError – If no agent is attached.

add_agent(agent)

Attach an Agent after construction and (re)register hooks if target specified.

Parameters:

agent (Agent) – Agent instance.

add_steering_function(steering_function)

Adds a steering function to the inspector’s list of functions.

Parameters:

steering_function (Callable) – Callable accepting activation tensor.

add_hooks(target)

Adds hooks to the agent’s model based on the provided target mapping.

Parameters:

target (Dict) – Dict mapping cache_key -> layer_name to append.

Raises:

ValueError – If no agent is attached.

recap()

Prints and returns the current hooks assigned to the inspector’s agent. Also prints the ‘target’ mapping in a clean, readable format. Includes any found instructions across responses.

find_instructs(verbose=False)

Return list with ‘index’ and ‘content’ for each SystemMessage (excluding first memory) found in the agent’s memory. If verbose is True, also print each.

Parameters:

verbose (bool) – If True, logs each found instruction.

Returns:

List of dicts with keys ‘index’ and ‘content’.

Return type:

List[Dict[str, Union[int, str]]]


sdialog.evaluation

Evaluation components for dialogue generation and analysis.

This module provides classes for evaluating dialogues, including LLM judges, metrics, and similarity scores.

class sdialog.evaluation.ConversationalFeatures(feature: List[Literal['mean-turn-length', 'hesitation-rate', 'turn-taking-ratio', 'question-rate', 'lexical-diversity', 'back-channel-rate', 'filler-word-density']] | None = None, name: str = None, speaker: str | None = None)

Bases: BaseDialogScore

Compute conversational and dialogue-specific features.

These metrics measure dialogue structure, speech patterns, and interaction dynamics rather than text readability.

Example:

from sdialog.evaluation import ConversationalFeatures

# All conversational features
scorer_all = ConversationalFeatures()
# Single feature
scorer_hes = ConversationalFeatures(feature="hesitation-rate")
# Multiple features
scorer_multi = ConversationalFeatures(feature=["question-rate", "lexical-diversity"])

print(scorer_all(dialog))      # dict with all feature values
print(scorer_hes(dialog))      # single float (hesitation rate)
print(scorer_multi(dialog))    # dict with selected features
Parameters:
  • feature (Optional[List[Literal["mean-turn-length", "hesitation-rate", "turn-taking-ratio", "question-rate", "lexical-diversity", "back-channel-rate", "filler-word-density"]]]) –

    List of feature names to compute. If None (default) compute all. If the resulting set has size 1, __call__ / score returns a single float; otherwise a dict. Available features:

    • "mean-turn-length": average number of words per dialogue turn.

    • "hesitation-rate": percentage of hesitation tokens over total words (%).

    • "turn-taking-ratio": distribution of turns between speakers

      (entropy-based, 0=monopolized, 1=balanced).

    • "question-rate": percentage of turns containing questions (%).

    • "lexical-diversity": type-token ratio measuring vocabulary richness (0-1).

    • "back-channel-rate": percentage of minimal response turns (%).

    • "filler-word-density": percentage of filler words over total words (%).

  • name (str) – Internal score name (defaults to "conversational_features" or the single feature name if only one provided).

  • speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered. Note: turn-taking-ratio ignores this parameter as it requires multiple speakers.

static count_hesitations(text)

Count hesitation tokens in the provided text (e.g., uh, um, hmm).

Parameters:

text (str) – Input text to search for hesitation markers.

Returns:

Number of detected hesitation tokens in the provided text.

Return type:

int

static count_filler_words(text)

Count filler words in the provided text (e.g., like, you know, I mean, basically).

Parameters:

text (str) – Input text to search for filler words.

Returns:

Number of detected filler words in the provided text.

Return type:

int

static is_back_channel(turn_text)

Check if a turn is a back-channel response (minimal acknowledgment).

Parameters:

turn_text (str) – Text of a single turn.

Returns:

True if the turn is a back-channel response.

Return type:

bool

static calculate_turn_taking_ratio(dialog)

Calculate turn-taking balance using normalized entropy.

Returns a value between 0 (monopolized conversation) and 1 (perfectly balanced). Based on Shannon entropy normalized by maximum possible entropy.

Parameters:

dialog (Dialog) – Dialog object with turns from multiple speakers.

Returns:

Turn-taking ratio (0-1).

Return type:

float

score(dialog: Dialog) float | dict

Compute one or multiple conversational features for the dialogue.

Parameters:

dialog (Dialog) – Dialogue instance to evaluate.

Returns:

If exactly one feature is requested, returns a single float. Otherwise returns a dict mapping feature-name to numeric value.

Return type:

Union[float, dict]

class sdialog.evaluation.ReadabilityScore(feature: List[Literal['gunning-fog', 'flesch-reading-ease', 'coleman-liau', 'linsear-write', 'dale-chall']] | None = None, name: str = None, speaker: str | None = None)

Bases: BaseDialogScore

Compute one or multiple readability metrics for a dialogue text: Gunning Fog index, Flesch Reading Ease score, Coleman-Liau Index, Linsear Write metric, and Dale-Chall Readability Formula.

These metrics measure text complexity and reading difficulty, not dialogue structure.

Example:

from sdialog.evaluation import ReadabilityScore

# All readability metrics
scorer_all = ReadabilityScore()
# Single metric
scorer_flesch = ReadabilityScore(feature="flesch-reading-ease")
# Subset of metrics
scorer_subset = ReadabilityScore(feature=["gunning-fog", "coleman-liau"])

print(scorer_all(dialog))      # dict with all metric values
print(scorer_flesch(dialog))   # single float (Flesch score)
print(scorer_subset(dialog))   # dict with the two requested metrics
Parameters:
  • feature (Optional[List[Literal["gunning-fog", "flesch-reading-ease", "coleman-liau", "linsear-write", "dale-chall"]]]) –

    List of feature names to compute. If None (default) compute all. If the resulting set has size 1, __call__ / score returns a single float; otherwise a dict. Available features:

    • "gunning-fog": Gunning Fog readability index.

    • "flesch-reading-ease": Flesch Reading Ease score.

    • "coleman-liau": Coleman-Liau Index.

    • "linsear-write": Linsear Write readability metric.

    • "dale-chall": Dale-Chall Readability Formula.

  • name (str) – Internal score name (defaults to "readability_score" or the single feature name if only one provided).

  • speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered.

static calculate_gunning_fog(text)

Compute the Gunning Fog index of the provided text.

Parameters:

text (str) – Input text.

Returns:

Gunning Fog index value.

Return type:

float

static calculate_flesch_reading_ease(text)

Compute the Flesch Reading Ease score of the provided text.

Parameters:

text (str) – Input text.

Returns:

Reading ease score.

Return type:

float

static calculate_coleman_liau(text)

Compute the Coleman-Liau Index of the provided text.

The Coleman-Liau Index estimates the U.S. grade level needed to understand the text. It uses character counts instead of syllable counts.

Parameters:

text (str) – Input text.

Returns:

Coleman-Liau Index value (minimum 0).

Return type:

float

static calculate_linsear_write(text)

Compute the Linsear Write readability metric of the provided text.

The Linsear Write formula estimates the U.S. grade level needed to understand the text. It focuses on easy vs. difficult words (based on syllable count).

Parameters:

text (str) – Input text.

Returns:

Linsear Write score.

Return type:

float

static calculate_dale_chall(text)

Compute the Dale-Chall Readability Formula score of the provided text.

The Dale-Chall formula uses a list of 3000 familiar words that 80% of 4th-grade students understand. Words not on this list are considered “difficult”.

Note: This implementation uses a simplified approximation based on word length and syllable count as a proxy for the Dale-Chall word list, since the full list is proprietary.

Parameters:

text (str) – Input text.

Returns:

Dale-Chall score.

Return type:

float

score(dialog: Dialog) float | dict

Compute one or multiple readability metrics for the dialogue.

Parameters:

dialog (Dialog) – Dialogue instance to evaluate.

Returns:

If exactly one metric is requested, returns a single float. Otherwise returns a dict mapping metric-name to numeric value.

Return type:

Union[float, dict]

class sdialog.evaluation.MeanTurnLengthScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the mean turn length (average number of words per turn) for a dialogue.

This is a conversational metric that measures dialogue structure, not text readability.

Example:

from sdialog.evaluation import MeanTurnLengthScore

scorer = MeanTurnLengthScore()
print(scorer(dialog))  # Outputs mean turn length as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “mean-turn-length”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.TurnLength(name: str = None, speaker: str | None = None)

Bases: BaseDialogScore

Compute individual turn lengths (number of words per turn) for a dialogue.

Returns a list of word counts for each turn in the dialogue. This is a granular metric that captures turn length distribution, often used as raw input for downstream aggregations (e.g., computing mean or median turn length).

Example:

from sdialog.evaluation import TurnLength

scorer = TurnLength()
lengths = scorer(dialog)  # Returns list of integers
print(lengths)  # [5, 12, 3, 18, ...] words per turn

# Filter by speaker
scorer_system = TurnLength(speaker="System")
system_lengths = scorer_system(dialog)
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “turn-length”).

  • speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered.

score(dialog: Dialog) List[int]

Compute word count for each turn in the dialogue.

Parameters:

dialog (Dialog) – Dialogue instance to evaluate.

Returns:

List of integers representing word count per turn.

Return type:

List[int]

class sdialog.evaluation.HesitationRateScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the hesitation rate (percentage of hesitation tokens) for a dialogue.

This is a conversational metric that measures speech disfluencies, not text readability.

Example:

from sdialog.evaluation import HesitationRateScore

scorer = HesitationRateScore()
print(scorer(dialog))  # Outputs hesitation rate as percentage
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “hesitation-rate”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.TurnTakingRatioScore(name: str = None)

Bases: ConversationalFeatures

Compute the turn-taking ratio (balance of conversation between speakers) for a dialogue.

Returns a value between 0 (monopolized) and 1 (perfectly balanced), based on normalized Shannon entropy of turn distribution across speakers.

Example:

from sdialog.evaluation import TurnTakingRatioScore

scorer = TurnTakingRatioScore()
print(scorer(dialog))  # Outputs turn-taking ratio (0-1)
Parameters:

name (Optional[str]) – Optional score name (defaults to “turn-taking-ratio”).

class sdialog.evaluation.QuestionRateScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the question rate (percentage of turns containing questions) for a dialogue.

This metric measures the interrogative nature of the conversation.

Example:

from sdialog.evaluation import QuestionRateScore

scorer = QuestionRateScore()
print(scorer(dialog))  # Outputs question rate as percentage
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “question-rate”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.LexicalDiversityScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the lexical diversity (type-token ratio) for a dialogue.

Measures vocabulary richness as the ratio of unique words to total words (0-1). Higher values indicate more varied vocabulary.

Example:

from sdialog.evaluation import LexicalDiversityScore

scorer = LexicalDiversityScore()
print(scorer(dialog))  # Outputs lexical diversity (0-1)
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “lexical-diversity”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.BackChannelRateScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the back-channel rate (percentage of minimal response turns) for a dialogue.

Back-channels are brief responses like “yeah”, “okay”, “I see” that indicate active listening without contributing substantial content.

Example:

from sdialog.evaluation import BackChannelRateScore

scorer = BackChannelRateScore()
print(scorer(dialog))  # Outputs back-channel rate as percentage
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “back-channel-rate”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.FillerWordDensityScore(name: str = None, speaker: str | None = None)

Bases: ConversationalFeatures

Compute the filler word density (percentage of filler words) for a dialogue.

Filler words include “like”, “you know”, “I mean”, “basically”, etc. These differ from hesitations and indicate informal speech patterns.

Example:

from sdialog.evaluation import FillerWordDensityScore

scorer = FillerWordDensityScore()
print(scorer(dialog))  # Outputs filler word density as percentage
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “filler-word-density”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.GunningFogScore(name: str = None, speaker: str | None = None)

Bases: ReadabilityScore

Compute the Gunning Fog readability index for a dialogue.

The Gunning Fog index estimates the years of formal education needed to understand the text on a first reading. Higher values indicate more complex text.

Example:

from sdialog.evaluation import GunningFogScore

scorer = GunningFogScore()
print(scorer(dialog))  # Outputs Gunning Fog index as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “gunning-fog”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.FleschReadingEaseScore(name: str = None, speaker: str | None = None)

Bases: ReadabilityScore

Compute the Flesch Reading Ease score for a dialogue.

The Flesch Reading Ease score rates text on a 100-point scale. Higher scores indicate text that is easier to read. Scores typically range from 0 (very difficult) to 100 (very easy).

Example:

from sdialog.evaluation import FleschReadingEaseScore

scorer = FleschReadingEaseScore()
print(scorer(dialog))  # Outputs Flesch Reading Ease score as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “flesch-reading-ease”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.ColemanLiauScore(name: str = None, speaker: str | None = None)

Bases: ReadabilityScore

Compute the Coleman-Liau Index for a dialogue.

The Coleman-Liau Index estimates the U.S. grade level needed to understand the text. Unlike other readability formulas, it uses character counts instead of syllable counts, making it more suitable for automated text analysis.

Example:

from sdialog.evaluation import ColemanLiauScore

scorer = ColemanLiauScore()
print(scorer(dialog))  # Outputs Coleman-Liau Index as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “coleman-liau”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.LinsearWriteScore(name: str = None, speaker: str | None = None)

Bases: ReadabilityScore

Compute the Linsear Write readability metric for a dialogue.

The Linsear Write formula estimates the U.S. grade level needed to understand the text. It focuses on easy versus difficult words (based on syllable count) and is particularly useful for technical writing assessment.

Example:

from sdialog.evaluation import LinsearWriteScore

scorer = LinsearWriteScore()
print(scorer(dialog))  # Outputs Linsear Write score as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “linsear-write”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.DaleChallScore(name: str = None, speaker: str | None = None)

Bases: ReadabilityScore

Compute the Dale-Chall Readability Formula score for a dialogue.

The Dale-Chall formula uses a list of familiar words that most 4th-grade students understand. Words not on this list are considered “difficult”. This implementation uses a simplified approximation based on word length and syllable count as a proxy for the Dale-Chall word list.

Example:

from sdialog.evaluation import DaleChallScore

scorer = DaleChallScore()
print(scorer(dialog))  # Outputs Dale-Chall score as float
Parameters:
  • name (Optional[str]) – Optional score name (defaults to “dale-chall”).

  • speaker (Optional[str]) – If set, only turns by this speaker are considered.

class sdialog.evaluation.ToolSequenceValidator(tool_names: List[str], name: str = 'tool-sequence-validator')

Bases: BaseDialogScore

Validate that an agent used specific tools in the correct sequence during a dialogue.

This validator checks whether the agent called the specified tools in the expected order based on the dialogue’s event history. It returns 1 if the sequence is valid, 0 otherwise.

Tool names can be prefixed with "not:" to indicate that the tool must NOT be called before subsequent tools in the list. This allows for flexible validation of tool usage patterns.

Example 1: Basic sequence validation

from sdialog.evaluation import ToolSequenceValidator

# Validate that tools were called in exact order
validator = ToolSequenceValidator(["search_flights", "book_flight", "confirm_booking"])

score = validator(dialog)
print(score)  # 1 if sequence correct, 0 otherwise

Example 2: Using negative constraints

from sdialog.evaluation import ToolSequenceValidator

# Ensure send_receipt is NOT called before charging payment
# (don't send receipt before actually charging the customer)
# But send_receipt may be called after charge_payment, or not at all
validator = ToolSequenceValidator([
    "not:send_receipt",
    "charge_payment",
    "update_inventory"
])

score = validator(dialog)

Example 3: With evaluators

from sdialog.evaluation import ToolSequenceValidator, FrequencyEvaluator

validator = ToolSequenceValidator(["authenticate", "fetch_data", "logout"])
freq_eval = FrequencyEvaluator(validator)

# Get percentage of dialogues with correct tool sequence
percentage = freq_eval(dialogs)
print(f"{percentage * 100:.1f}% of dialogues follow correct sequence")
Parameters:
  • tool_names (List[str]) –

    List of tool names defining the expected sequence. Each tool name can be: - A plain string (e.g., "search_flights"): tool must be called in sequence. - Prefixed with "not:" (e.g., "not:verify_account"): tool must NOT be

    called before the next required tool in the sequence, though it may be called after or omitted entirely.

  • name (str) – Custom score name (defaults to "tool-sequence-validator").

Note

  • Tools must appear in the specified order within the dialogue’s event history.

  • The first tool in the sequence must come after at least one user utterance.

  • If a required tool (without "not:" prefix) is missing, the score is 0.

  • Tools with "not:" prefix that don’t appear in the dialogue are ignored.

score(dialog: Dialog) int

Compute the validation score for the dialogue’s tool usage sequence.

Extracts tool calls from the dialogue’s event history and validates that: 1. All required tools (without "not:" prefix) are present. 2. Tools appear in the specified order. 3. Tools with "not:" prefix do not appear before subsequent tools. 4. The first tool call comes after at least one user utterance.

Parameters:

dialog (Dialog) – Dialogue instance to validate.

Returns:

1 if the tool sequence is valid, 0 otherwise.

Return type:

int

Note

Returns 0 if: - The dialogue has no events or tool_names is empty. - A required tool is missing from the event history. - Tools appear in incorrect order. - A "not:" prefixed tool appears before subsequent tools.

class sdialog.evaluation.DialogFlowPPL(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_known_edges: bool = False, name: str = None, verbose: bool = False, **d2f_kwargs)

Bases: BaseDialogFlowScore

Compute flow perplexity-like score of a dialogue against reference dialogues.

Given a collection of reference dialogues, it first builds the dialogue flow graph that represent them. Then, given a candidate dialogue, it computes a flow perplexity-like score (i.e. “how well it fits on the reference graph in terms of perplexity?”).

Example:

from sdialog.evaluation import DialogFlowPPL

# reference_dialogs = [...]
flow_ppl = DialogFlowPPL(reference_dialogs)

value = flow_ppl(candidate_dialog)

print("Flow Perplexity:", value)
Parameters:
  • reference_dialogues (Union[str, List[Dialog]]) – List of reference dialogues or file path.

  • ai_speaker (Optional[str]) – If set, restrict scoring to AI/system turns.

  • k_neighbors (int) – Neighbor count for embedding lookup.

  • use_softmax (bool) – Whether to weight neighbors via softmax.

  • use_only_known_edges (bool) – If True, ignore unknown transitions (penalize less).

  • name (Optional[str]) – Custom score name override.

  • verbose (bool) – Verbosity flag.

  • d2f_kwargs (dict) – Extra kwargs to dialog2graph.

score(dialog: Dialog) float

Compute flow perplexity-like score (exp of negative average log probability).

Parameters:

dialog (Dialog) – Dialogue to score.

Returns:

Perplexity value or None if insufficient transitions.

Return type:

Optional[float]

class sdialog.evaluation.DialogFlowScore(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_ai_speaker: bool = False, use_only_known_edges: bool = False, name: str = None, verbose: bool = False, graph=None, nodes=None, **d2f_kwargs)

Bases: BaseDialogFlowScore

Compute flow likelihood score of a dialogue against reference dialogues.

Given a collection of reference dialogues, it first builds the dialogue flow graph that represent them. Then, given a candidate dialogue, it computes a flow likelihood score based on the geometric mean of edge probabilities (i.e. “how well the dialogue fits on the reference graph”).

Example:

from sdialog.evaluation import DialogFlowScore

flow_score = DialogFlowScore(reference_dialogs)

print(flow_score(candidate_dialog))
Parameters:
  • reference_dialogues (Union[str, List[Dialog]]) – List of reference dialogues or file path.

  • ai_speaker (Optional[str]) – Restrict scoring to AI/system turns if provided.

  • k_neighbors (int) – Neighbor count for embedding lookup.

  • use_softmax (bool) – Whether to weight neighbors via softmax.

  • use_only_ai_speaker (bool) – If True, only AI turns are used to build the graph and compute the scores.

  • use_only_known_edges (bool) – If True, only known edges contribute.

  • name (Optional[str]) – Custom score name.

  • verbose (bool) – Verbosity flag.

  • graph (Any) – Pre-built graph (optional).

  • nodes (dict) – Pre-built node metadata (optional).

  • d2f_kwargs (dict) – Extra kwargs to dialog2graph.

score(dialog: Dialog) float

Compute geometric mean transition likelihood.

Parameters:

dialog (Dialog) – Dialogue to score.

Returns:

Score value or None if insufficient transitions.

Return type:

Optional[float]

class sdialog.evaluation.LLMJudgeYesNo(prompt_template: str, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: BaseDialogScore, BaseLLMJudge

LLM judge for classifying a dialogue as “yes or no” (boolean) output and reason.

Example:

from sdialog.evaluation import LLMJudgeYesNo

magic_judge = LLMJudgeYesNo("Is this dialogue magical?", reason=True)

result = magic_judge.judge(dialog)

print(result.positive)
print(result.reason)
Parameters:
  • prompt_template (str) – Jinja2 template for judging prompt.

  • reason (bool) – Whether to request reason field.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or model name.

  • llm_kwargs (dict) – Extra LLM initialization kwargs.

judge(dialogs: Dialog | List[Dialog], reason: bool = None, **template_kwargs) LLMJudgeYesNoOutput | int

Run judgment over one or multiple dialogues.

Parameters:
  • dialogs (Union[Dialog, List[Dialog]]) – A single Dialog or list of Dialogs.

  • reason (Optional[bool]) – Override reason flag (falls back to self.reason).

  • template_kwargs (dict) – Extra template kwargs.

Returns:

Structured yes/no output model.

Return type:

LLMJudgeYesNoOutput

score(dialog: Dialog) int

Computes the score for the provided dialog, 1 if dialogues is judged as real, 0 otherwise.

Parameters:

dialog – The dialog to score.

Returns:

An int representing the score of the dialog.

class sdialog.evaluation.LLMJudgeScore(prompt_template: str, min_score: float = 1, max_score: float = 5, score_type: type = <class 'int'>, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: BaseDialogScore, BaseLLMJudge

LLM judge for scoring a dialogue with a numerical score and optional reason.

Example 1:

from sdialog.evaluation import LLMJudgeScore

magic_judge = LLMJudgeScore("From 1 to 5, how magical is this dialogue?", reason=True)

result = magic_judge.judge(dialog)

print(result.score)
print(result.reason)

Example 2:

from sdialog.evaluation import LLMJudgeScore

# You can use the `min_score`, `max_score`, `score_type` and/or `reason` parameters
# as variables in your prompt template.
prompt = (
    "On a scale from {{ min_score }} to {{ max_score }}, "
    "how magical is this dialogue?"
    "Provide a {{ score_type }} score."
)
magic_judge = LLMJudgeScore(prompt,
                            min_score=1,
                            max_score=10,
                            score_type=int)
result = magic_judge.judge(dialog)
print(result.score)
print(result.reason)
Parameters:
  • prompt_template (str) – Jinja2 template text.

  • min_score (float) – Minimum allowed score.

  • max_score (float) – Maximum allowed score.

  • score_type (type) – int or float score type.

  • reason (bool) – Whether to request reason field.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or model name.

  • llm_kwargs (dict) – Extra LLM kwargs.

judge(dialogs: Dialog | List[Dialog], reason: bool = None, **template_kwargs) LLMJudgeScoreOutput

Produce a numeric judgment for one or more dialogues.

Parameters:
  • dialogs (Union[Dialog, List[Dialog]]) – Dialogue or list of dialogues.

  • reason (Optional[bool]) – Override reason flag.

  • template_kwargs (dict) – Extra template kwargs.

Returns:

Structured output containing the score and an optional reason.

Return type:

LLMJudgeScoreOutput

score(dialog: Dialog, **template_kwargs) float | int

Return the numeric score.

Parameters:
  • dialog (Dialog) – Dialogue to score.

  • template_kwargs (dict) – Extra template kwargs.

Returns:

Score value.

Return type:

Union[int, float]

Raises:

ValueError – If model output malformed.

class sdialog.evaluation.LLMJudgeRealDialog(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: LLMJudgeYesNo

LLM judge for classifying a dialogue as real (human) or synthetic (machine-generated), with boolean output and reason. Returns an instance of LLMJudgeYesNoOutput.

Example:

from sdialog.evaluation import LLMJudgeRealDialog

judge_real = LLMJudgeRealDialog(reason=True)

result = judge_real.judge(dialog)

print("Real?", result.positive)
print("Reason:", result.reason)
Parameters:
  • reason (bool) – Whether to request reason.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.

  • llm_kwargs (dict) – Additional LLM kwargs.

class sdialog.evaluation.LLMJudgeRealDialogLikertScore(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: LLMJudgeScore

LLM judge for evaluating whether a dialogue appears real (human) or synthetic (machine-generated), providing a Likert score between 1 (definitely synthetic) and 5 (definitely real), with optional reason.

Example:

from sdialog.evaluation import LLMJudgeRealDialogLikertScore

judge_real = LLMJudgeRealDialogLikertScore(reason=True)

result = judge_real.judge(dialog)
# score = judge_real(dialog)

print("Likert Score:", result.score)  # score from 1 to 5
print("Reason:", result.reason)
Parameters:
  • reason (bool) – Request reason flag.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.

  • llm_kwargs (dict) – Extra LLM kwargs.

class sdialog.evaluation.LLMJudgeRealDialogScore(min_score: int = 0, max_score: int = 10, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: LLMJudgeScore

LLM judge for evaluating how “real” (human-like) or “synthetic” a dialogue appears on a configurable numeric range.

Example:

from sdialog.evaluation import LLMJudgeRealDialogScore

judge_real = LLMJudgeRealDialogScore(min_score=0, max_score=10, reason=True)

result = judge_real.judge(dialog)
# score = judge_real(dialog)

print("Score:", result.score)  # score from 0 to 10
print("Reason:", result.reason)
Parameters:
  • min_score (int) – Minimum realism score.

  • max_score (int) – Maximum realism score.

  • reason (bool) – Request reason flag.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.

  • llm_kwargs (dict) – Extra LLM kwargs.

class sdialog.evaluation.LLMJudgeRefusal(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: LLMJudgeYesNo

LLM judge for evaluating if a dialogue contains a refusal response.

Example:

from sdialog.evaluation import LLMJudgeRefusal

judge_refusal = LLMJudgeRefusal(reason=True)

result = judge_refusal.judge(dialog)

print("Refused?", result.positive)
print("Reason:", result.reason)
Parameters:
  • reason (bool) – Request reason flag.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.

  • llm_kwargs (dict) – Extra LLM kwargs.

class sdialog.evaluation.LLMJudgePersonaAttributes(persona: BaseAttributeModel, speaker: str, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)

Bases: LLMJudgeYesNo

LLM judge for evaluating if a speaker follows the persona attributes in a dialogue.

Example:

from sdialog.personas import Doctor
from sdialog.evaluation import LLMJudgePersonaAttributes

reference_persona = Doctor(name="Dr. Smith", specialty="cardiology")
judge_persona = LLMJudgePersonaAttributes(persona=reference_persona,
                                          speaker="Doctor",
                                          reason=True)
result = judge_persona.judge(dialog)

print("Matches persona?", result.positive)
print("Reason:", result.reason)
Parameters:
  • persona (BasePersona) – Persona definition object.

  • speaker (str) – Target speaker in dialogue.

  • reason (bool) – Request reason flag.

  • model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.

  • llm_kwargs (dict) – Additional LLM kwargs.

class sdialog.evaluation.SentenceTransformerDialogEmbedder(model_name: str = 'sentence-transformers/LaBSE', mean: bool = True, ai_speaker: str = None, name: str = None, verbose: bool = False)

Bases: BaseDialogEmbedder

Dialog embedder using SentenceTransformer. Can embed a dialog as the mean of turn embeddings or as a single embedding of the whole dialog text.

Example:

from sdialog.evaluation import SentenceTransformerDialogEmbedder

dialog_embedder = SentenceTransformerDialogEmbedder(model_name="sentence-transformers/LaBSE")

emb = dialog_embedder(dialog)

print(emb.shape)
Parameters:
  • model_name (str) – SentenceTransformer model name.

  • mean (bool) – If True average per-turn embeddings; else encode concatenated text.

  • ai_speaker (Optional[str]) – If set, restrict embedding to AI/system turns only.

  • name (Optional[str]) – Optional custom embedder name.

  • verbose (bool) – Show progress bars for encoding.

embed(dialog: Dialog) numpy.ndarray

Generate embedding for a dialog.

Parameters:

dialog (Dialog) – Dialog instance.

Returns:

Embedding vector.

Return type:

np.ndarray

class sdialog.evaluation.ReferenceCentroidEmbeddingEvaluator(dialog_embedder: BaseDialogEmbedder, reference_dialogues: str | List[Dialog], name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: BaseDatasetEmbeddingEvaluator

Evaluator comparing candidate centroid to a reference centroid via cosine similarity.

Example:

from sdialog.evaluation import SentenceTransformerDialogEmbedder
from sdialog.evaluation import ReferenceCentroidEmbeddingEvaluator

dialog_embedder = SentenceTransformerDialogEmbedder()

evaluator = ReferenceCentroidEmbeddingEvaluator(dialog_embedder, reference_dialogs)

# How far are the candidate dialogs from the reference dialogues? (centroid-wise)
print(evaluator(candidate_dialogs))
Parameters:
  • dialog_embedder (BaseDialogEmbedder) – Dialog embedding component.

  • reference_dialogues (Union[str, List[Dialog]]) – List of reference Dialog objects or path.

  • name (Optional[str]) – Optional evaluator name.

  • enable_plotting (bool) – Store embeddings for plotting if True.

  • verbose (bool) – Verbosity flag.

class sdialog.evaluation.KDEDistanceEvaluator(dialog_score: BaseDialogScore, reference_dialogues: str | List[Dialog] = None, metric: str = 'kl', kde_bw: float = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None, **evaluator_kwargs)

Bases: BaseDatasetScoreEvaluator

Evaluate distribution divergence between reference and candidate dialog scores using KDE.

Example:

from sdialog.evaluation import KDEDistanceEvaluator, GunningFogScore

# Any dialog score can be used, let's use `GunningFogScore` as an example
kde_eval = KDEDistanceEvaluator(dialog_score=GunningFogScore(),
                                reference_dialogues=reference_dialogs)

print("KL divergence:", kde_eval(candidate_dialogs))
Parameters:
  • dialog_score (BaseDialogScore) – Per-dialog scoring object.

  • reference_dialogues (Optional[Union[str, List[Dialog]]]) – Reference Dialog list or path (optional if score object has attribute).

  • metric (str) – Divergence metric: “kl”, “cs”, or “all”.

  • kde_bw (Optional[float]) – Bandwidth override for KDE.

  • name (Optional[str]) – Evaluator name.

  • enable_plotting (bool) – Keep distributions for plotting.

  • verbose (bool) – Verbosity flag.

  • evaluator_kwargs (dict) – Extra kwargs to parent initializer.

class sdialog.evaluation.FrechetDistanceEvaluator(dialog_score: BaseDialogScore, reference_dialogues: str | List[Dialog] = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None, **evaluator_kwargs)

Bases: BaseDatasetScoreEvaluator

Evaluate Frechet distance between Gaussian fits of reference and candidate score distributions.

Example:

from sdialog.evaluation import FrechetDistanceEvaluator, ConversationalFeatures

# Any dialog score can be used, let's use `ConversationalFeatures` as an example
turn_length = ConversationalFeatures(feature="mean-turn-length")
fd_eval = FrechetDistanceEvaluator(dialog_score=turn_length,
                                   reference_dialogues=reference_dialogs)

print("Frechet distance:", fd_eval(candidate_dialogs))
Parameters:
  • dialog_score (BaseDialogScore) – Per-dialog scoring object.

  • reference_dialogues (Optional[Union[str, List[Dialog]]]) – List or path of reference dialogues.

  • name (Optional[str]) – Evaluator name.

  • enable_plotting (bool) – Retained for API parity (not used directly here).

  • verbose (bool) – Verbosity flag.

  • evaluator_kwargs (dict) – Extra parent kwargs.

class sdialog.evaluation.FrechetBERTDistanceEvaluator(reference_dialogues: str | List[Dialog], ai_speaker: str = None, name: str = None, model_name: str = 'roberta-base', batch_size: int = 128, device: str = None, enable_plotting: bool = False, verbose: bool = False)

Bases: BaseDatasetEvaluator

Frechet distance evaluator based on BERT sentence-pair embeddings. See: https://aclanthology.org/2021.findings-acl.193/

Example:

from sdialog.evaluation import FrechetBERTDistanceEvaluator

fb_distance = FrechetBERTDistanceEvaluator(reference_dialogs)

print(fb_distance(candidate_dialogs))
Parameters:
  • reference_dialogues (Union[str, List[Dialog]]) – Reference dialogues (list or path).

  • ai_speaker (Optional[str]) – If set, restrict to AI response pairs.

  • name (Optional[str]) – Evaluator name.

  • model_name (str) – Underlying transformer model.

  • batch_size (int) – Batch size for encoding.

  • device (Optional[str]) – Torch device override.

  • enable_plotting (bool) – Store embeddings for later plotting.

  • verbose (bool) – Verbosity flag.

__call__(dialogues: str | List[Dialog], dataset_name: str = 'candidate') float

Compute Frechet distance between reference embedding distribution and candidate.

Parameters:
  • dialogues (Union[str, List[Dialog]]) – Candidate dialogues (list or path).

  • dataset_name (str) – Label for candidate dataset.

Returns:

Frechet distance (>= 0).

Return type:

float

plot(show: bool = True, save_path: str = None)

Plot t-SNE projection of sentence-pair embeddings for reference and candidates.

Parameters:
  • show (bool) – Display the figure.

  • save_path (Optional[str]) – Path to save figure (if provided).

Returns:

None

Return type:

None

class sdialog.evaluation.PrecisionRecallDistanceEvaluator(reference_dialogues: str | List[Dialog], ai_speaker: str = None, num_clusters=20, num_angles=1001, num_runs=10, name: str = None, model_name: str = 'roberta-base', batch_size: int = 128, device: str = None, verbose: bool = False)

Bases: BaseDatasetEvaluator

Precision-Recall distance evaluator based on BERT embeddings. See: https://aclanthology.org/2021.findings-acl.193/

Example:

from sdialog.evaluation import PrecisionRecallDistanceEvaluator

pr_distance = PrecisionRecallDistanceEvaluator(reference_dialogs)

print(pr_distance(candidate_dialogs))
Parameters:
  • reference_dialogues (Union[str, List[Dialog]]) – Reference dialogues (list or path).

  • ai_speaker (Optional[str]) – If set, restrict to AI response pairs.

  • num_clusters (int) – Number of k-means clusters.

  • num_angles (int) – Angular resolution for PRD curve.

  • num_runs (int) – Repetition count when distributions unbalanced.

  • name (Optional[str]) – Evaluator name.

  • model_name (str) – Underlying transformer model.

  • batch_size (int) – Batch size for embedding.

  • device (Optional[str]) – Torch device override.

  • verbose (bool) – Verbosity flag.

__call__(dialogues: str | List[Dialog], dataset_name: str = None) dict | float

Compute maximum F1 score along PRD curve (averaged if size mismatch).

Parameters:
  • dialogues (Union[str, List[Dialog]]) – Candidate dialogues (list or path).

  • dataset_name (Optional[str]) – Label for candidate dataset.

Returns:

Max F1 value.

Return type:

float

class sdialog.evaluation.StatsEvaluator(dialog_score: BaseDialogScore, stat: Literal['mean', 'std', 'min', 'max', 'median'] | None = None, metric: Literal['mean', 'std', 'min', 'max', 'median'] | None = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: BaseDatasetScoreEvaluator

Statistics evaluator (mean/std/min/max/median).

Example:

from sdialog.evaluation import StatsEvaluator, LexicalDiversityScore

# Any dialog score can be used, let's use `LexicalDiversityScore` as an example
lexical_diversity = LexicalDiversityScore()
stats_eval = StatsEvaluator(lexical_diversity)
mean_eval = StatsEvaluator(lexical_diversity, stat="mean")

stats = stats_eval(candidate_dialogs)
mean = mean_eval(candidate_dialogs)

# Print descriptive statistics for hesitation rate
print(stats)  # {'mean': ..., 'std': ..., ...}
print("Mean hesitation rate:", mean)  # Mean hesitation rate: ...
Parameters:
  • dialog_score (BaseDialogScore) – Dialog scoring component.

  • stat (Optional[Literal["mean", "std", "min", "max", "median"]]) – Target statistic to return (one of ‘mean’, ‘std’, ‘min’, ‘max’, ‘median’). If None, return all.

  • metric (Optional[Literal["mean", "std", "min", "max", "median"]]) – Deprecated alias for stat.

  • name (Optional[str]) – Evaluator name.

  • enable_plotting (bool) – Keep per-dataset scores for plotting.

  • verbose (bool) – Verbosity flag.

class sdialog.evaluation.MeanEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: StatsEvaluator

Evaluator for computing the mean of dialog scores. This class is a thin wrapper around StatsEvaluator with stat=”mean”.

Example:

from sdialog.evaluation import MeanEvaluator, ReadabilityScore

flesch_score = ReadabilityScore(feature="flesch-reading-ease")

mean_eval = MeanEvaluator(flesch_score)

print("Average Flesch reading ease:", mean_eval(candidate_dialogs))
Parameters:
  • dialog_score (BaseDialogScore) – Dialog scoring component.

  • name (Optional[str]) – Evaluator name.

  • enable_plotting (bool) – Keep scores for plotting.

  • verbose (bool) – Verbosity flag.

class sdialog.evaluation.FrequencyEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: BaseDatasetScoreEvaluator

Evaluator for computing the frequency or percentage of dialogues scored as 1 / True (e.g., refusal responses).

Example:

from sdialog.evaluation import FrequencyEvaluator, LLMJudgeRealDialog

judge_real = LLMJudgeRealDialog()

freq = FrequencyEvaluator(judge_real)

print(freq(dialogs))  # Outputs proportion of dialogues judged as real
Parameters:
  • dialog_score (BaseDialogScore) – Dialog scoring component producing binary outputs.

  • name (Optional[str]) – Evaluator name.

  • enable_plotting (bool) – Retained for API parity (not used directly).

  • verbose (bool) – Verbosity flag.

class sdialog.evaluation.DatasetComparator(evaluators: BaseDatasetEvaluator | List[BaseDatasetEvaluator])

Bases: object

Run multiple evaluators over several dialog datasets and collect results.

Example:

from sdialog.evaluation import LLMJudgeRealDialog, DialogFlowPPL
from sdialog.evaluation import FrequencyEvaluator, MeanEvaluator
from sdialog.evaluation import DatasetComparator

# Dialog scores
judge_real = LLMJudgeRealDialog()
flow_score = DialogFlowScore(reference_dialogs)

# Comparator with two evaluators
comparator = DatasetComparator(evaluators=[FrequencyEvaluator(judge_real),
                                           MeanEvaluator(flow_score)])

comparator({"modelA": modelA_dialogs,  # print table by default
            "modelB": modelB_dialogs})
# results = comparator({"modelA": modelA_dialogs,
#                       "modelB": modelB_dialogs},
#                      output="dict")  # return results as dict

comparator.plot()  # plot results for each evaluator that support it
Parameters:

evaluators (Union[BaseDatasetEvaluator, List[BaseDatasetEvaluator]]) – Single evaluator instance or list of evaluator instances.

__call__(candidates: str | List[Dialog] | List[str] | List[List[Dialog]] | Dict[str, str] | Dict[str, List[Dialog]], digits: int = 2, output: str | type = 'markdown') dict

Evaluate multiple candidate datasets with all evaluators.

Parameters:
  • candidates (Union[str, List[Dialog], List[str], List[List[Dialog]], Dict[str, str], Dict[str, List[Dialog]]]) – Collection of datasets (lists/paths/dicts of Dialog objects).

  • digits (int) – Decimal precision for tabular output.

  • output (Union[str, type]) – Output format: ‘dict’, ‘markdown’, or ‘table’.

Returns:

Results mapping (dataset -> metric -> value) if output=’dict’; otherwise prints a table.

Return type:

Optional[dict]

Raises:

ValueError – If candidates empty or output format unsupported.

plot(show: bool = True, save_folder_path: str = None)

Call plot() on each evaluator that supports it.

Parameters:
  • show (bool) – Whether to display plots.

  • save_folder_path (Optional[str]) – Directory to save plots (one file per evaluator).

Returns:

None

Return type:

None

sdialog.evaluation.Comparator

alias of DatasetComparator

sdialog.evaluation.base

Base and abstract evaluation components.

Provides abstract interfaces and utilities to:

  • Embed a dialog (BaseDialogEmbedder)

  • Score a single dialog (BaseDialogScore / BaseDialogFlowScore)

  • Aggregate dialog scores across datasets (BaseDatasetScoreEvaluator)

  • Aggregate dialog embeddings-based scores across datasets (BaseDatasetEmbeddingEvaluator)

  • Judge dialogs with an LLM (BaseLLMJudge)

These abstractions standardize evaluation pipelines for synthetic dialog generation.

class sdialog.evaluation.base.LLMJudgeYesNoOutput(*, reason: str | List[str] | None = None, positive: bool | List[bool])

Bases: BaseModel

Structured output used by yes/no LLM judgments.

Parameters:
  • positive (Union[bool, List[bool]]) – Boolean (or list of booleans) indicating classification outcome(s).

  • reason (Optional[Union[str, List[str]]]) – Optional explanatory reason (string or list).

reason: str | List[str] | None
positive: bool | List[bool]
class sdialog.evaluation.base.LLMJudgeScoreOutput(*, reason: str | None = None, score: int | float = None)

Bases: BaseModel

Structured output used by numeric score LLM judgments.

Parameters:
  • score (Union[int, float]) – Numeric score (int or float).

  • reason (Optional[str]) – Optional explanatory reason.

reason: str | None
score: int | float
class sdialog.evaluation.base.BaseDialogEmbedder(name: str | None = None)

Bases: ABC

Base class for dialog embedding models.

Subclasses must implement the abstract method: embed(dialog: Dialog) -> np.ndarray

Example:

from sdialog.evaluation.base import BaseDialogEmbedder
import numpy as np

# Custom embedder to embed dialogues as random N-d embeddings
class RndEmbedder(BaseDialogEmbedder):
    def __init__(self, n=256):
        self.n = n

    def embed(self, dialog):
        return np.random.rand(self.n)

# Create a new embedder for 128-d embeddings and embed some dialogues
rnd_embedder = RndEmbedder(n=128)
for d in dialogues:
    emb = rnd_embedder(d)
    print(emb.shape)  # (128,)
Parameters:

name (Optional[str]) – Optional name identifier for the embedder.

__call__(dialog: Dialog) numpy.ndarray

Embed a dialog into a vector representation (delegates to embed()).

Parameters:

dialog (Dialog) – The dialog instance to embed.

Returns:

Vector representation of the dialog.

Return type:

np.ndarray

abstractmethod embed(dialog: Dialog) numpy.ndarray

Produce an embedding vector for the given dialog.

Parameters:

dialog (Dialog) – The dialog instance to embed.

Returns:

Embedding vector.

Return type:

np.ndarray

Raises:

NotImplementedError – If not implemented in subclass.

class sdialog.evaluation.base.BaseDialogScore(name: str | None = None, ai_speaker: str = None)

Bases: ABC

Base class for computing a scalar score for a single dialog.

Subclasses must implement the abstract method: score(dialog: Dialog) -> float

Example:

from sdialog.evaluation.base import BaseDialogScore
from sdialog import Dialog, Turn

# Custom score class to count the number of turns in a dialogue
class TurnCountScore(BaseDialogScore):
    def score(self, dialog):
        return len(dialog.turns)

# Create a new instance of our score
turn_counter = TurnCountScore()

d = Dialog(turns=[Turn(speaker="u", text="Hi"),
                  Turn(speaker="s", text="Hello")])

print(turn_counter(d))  # Outputs: 2
Parameters:
  • name (Optional[str]) – Name of the score (used in reporting).

  • ai_speaker (Optional[str]) – If provided, restrict scoring to turns spoken by this AI speaker (case-insensitive).

__call__(dialog: Dialog, **kwargs)

Compute the score for a given dialog (delegates to score()).

Parameters:
  • dialog (Dialog) – The dialog to score.

  • kwargs (dict) – Additional keyword arguments for scoring.

Returns:

Scalar score value.

Return type:

float

abstractmethod score(dialog: Dialog) float

Compute the score for the provided dialog.

Parameters:

dialog (Dialog) – The dialog to score.

Returns:

Scalar score value.

Return type:

float

Raises:

NotImplementedError – If not implemented in subclass.

class sdialog.evaluation.base.BaseDialogFlowScore(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_ai_speaker: bool = False, use_closest_as_centroid_emb: bool = False, graph=None, nodes=None, name: str = None, verbose: bool = False, **d2f_kwargs)

Bases: BaseDialogScore

Base class for flow-based dialog scores using a reference dialog graph.

Builds (or reuses) a flow graph from reference dialogs, encodes turns, retrieves nearest nodes, and derives transition probabilities. Serves as the foundation for perplexity / likelihood style scores (e.g., DialogFlowPPL, DialogFlowScore).

This is an abstract class (extends BaseDialogScore) and cannot be instantiated directly. Subclasses must implement the abstract method: score(dialog: Dialog) -> float

Parameters:
  • reference_dialogues – List of Dialog objects or path to a serialized dialog file.

  • ai_speaker – If provided, only system/AI speaker turns are considered in scoring.

  • k_neighbors – Number of neighbors for softmax aggregation.

  • use_softmax – If True, weight neighbor probabilities via softmax, else pick top-1.

  • use_only_ai_speaker – If True, only AI turns are used to build the graph and compute the scores.

  • use_closest_as_centroid_emb – If True, use closest utterance embeddings as cluster centroids.

  • graph – Optional precomputed graph object to reuse (bypasses construction).

  • nodes – Optional precomputed node metadata dictionary.

  • name – Optional score name override (auto if None).

  • verbose – Verbosity flag forwarded to graph construction.

  • d2f_kwargs – Additional dialog2graph customization parameters.

Raises:

ValueError – If reference_dialogues is invalid.

get_node_sequence(dialog: Dialog, probs: bool = False) List[str]

Map each turn to its nearest node and optionally return transition probabilities.

Parameters:
  • dialog (Dialog) – Dialog to map.

  • probs (bool) – If True, also return per-transition probability estimates.

Returns:

List of node IDs or (node_sequence, probability_sequence) if probs=True.

Return type:

Union[List[str], Tuple[List[str], List[Optional[float]]]]

Raises:

ValueError – If a dialog speaker is not found in graph metadata.

compute_dialog_log_likelihood(dialog: Dialog) Tuple[float, int]

Compute cumulative log-probability statistics for a dialog using Laplace smoothing.

Laplace smoothing approach (add-one smoothing): - For each transition: P_laplace(dest|src) = (count(src→dest) + 1) / (sum_outbound_counts + V) - Known edges: Use edge frequency count + 1 - Unknown edges: Use count of 1 (equivalent to adding a pseudo-count) - Both are normalized by (total_outbound_count + V) where V is vocabulary size

This provides a principled probability distribution that: 1. Smooths all transitions (known and unknown) consistently 2. Avoids zero probabilities for unseen transitions 3. Doesn’t modify the graph structure (scoring-time smoothing)

Returns four values:

sum_log_p_known: Sum of log probabilities only over known edges. n_turns_known: Count of contributing turns with known edges (includes initial offset). sum_log_p: Sum over all considered turns (with Laplace smoothing). n_turns: Total counted turns (includes initial offset; respects ai_speaker filtering).

Parameters:

dialog (Dialog) – Dialog to evaluate.

Returns:

Tuple (sum_log_p_known, n_turns_known, sum_log_p, n_turns).

Return type:

Tuple[float, int, float, int]

Raises:

ValueError – If a speaker is missing from metadata.

abstractmethod score(dialog: Dialog) float

Compute a flow-based perplexity / likelihood derived score for the dialog.

Parameters:

dialog (Dialog) – Dialog to score.

Returns:

Scalar score value.

Return type:

float

Raises:

NotImplementedError – If not implemented in subclass.

class sdialog.evaluation.base.BaseDatasetEvaluator

Bases: ABC

Base class for dataset evaluators.

Dataset evaluators take a set of dialogs and return an evaluation. Typically, Dataset evaluator subclasses will take a dialogue score (BaseDialogScore object) when created and will return an aggregate of the per-dialog scores.

Subclasses must implement the abstract method: __call__(dialogues, dataset_name: Optional[str] = None, **kwargs) -> Union[dict, float]

Example:

from sdialog.evaluation.base import BaseDatasetEvaluator
from sdialog import Dialog, Turn

class CountDialogsEvaluator(BaseDatasetEvaluator):
    def __call__(self, dialogues, dataset_name=None):
        return len(dialogues)

dialog_counter = CountDialogsEvaluator()

dialogs = [Dialog(turns=[Turn(speaker="u", text="Hi")]) for _ in range(3)]

print(dialog_counter(dialogs))  # Outputs: 3
abstractmethod __call__(dialogues: str | List[Dialog], dataset_name: str = None, **kwargs) dict | float

Evaluate a collection of dialogues.

Parameters:
  • dialogues (Union[str, List[Dialog]]) – List of Dialog objects or a path to a serialized file.

  • dataset_name (Optional[str]) – Optional label for the dataset.

  • kwargs (dict) – Additional evaluator-specific parameters.

Returns:

Evaluation results (scalar or dict).

Return type:

Union[float, dict]

Raises:

NotImplementedError – If not implemented in subclass.

class sdialog.evaluation.base.BaseDatasetScoreEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: BaseDatasetEvaluator

Base class for dataset-level aggregation of per-dialog scores.

Dataset score evaluators take a dialog score (BaseDialogScore object) when created and given a collection of dialogs, aggregate their individual scores to return a single value for the collection.

Subclasses must implement the abstract methods:

  • __eval__(dialog_scores: List[Union[float, int]]) -> Union[dict, float]

  • (Optional) __plot__(dialog_scores: Dict[str, np.ndarray], plot: Optional[plt.Axes] = None) -> None

Example:

import numpy as np
from sdialog.evaluation import QuestionRateScore
from sdialog.evaluation.base import BaseDatasetScoreEvaluator

# Let's create our average score evaluator
# We need to implement only the __eval__ method (and optionally the __plot__ method)
# In practice, don't use this example class for average, but better use the built-in `MeanEvaluator`!
class AverageEvaluator(BaseDatasetScoreEvaluator):
    def __plot__(self, dialog_scores, plot=None): pass  # no-op

    def __eval__(self, dialog_scores):
        return np.mean(dialog_scores)

avg_evaluator = AverageEvaluator(
    dialog_score=QuestionRateScore(speaker="System")
)
print(avg_evaluator(dialogs))  # Outputs average system's question rate across dialogues
Parameters:
  • dialog_score (BaseDialogScore) – Dialog-level scoring component.

  • name (Optional[str]) – Optional evaluator name (auto-derived if None).

  • enable_plotting (bool) – Whether to keep per-dataset scores for plotting.

  • verbose (bool) – Whether to keep tqdm bars visible.

__call__(dialogues: str | List[Dialog], dataset_name: str = None, return_scores: bool = False) dict | float

Compute per-dialog scores then aggregate.

Parameters:
  • dialogues (Union[str, List[Dialog]]) – Iterable of Dialog objects or path.

  • dataset_name (Optional[str]) – Label for the dataset (default ‘candidate’).

  • return_scores (bool) – If True also return raw score array(s).

Returns:

Aggregated results or (results, raw_scores) if return_scores=True.

Return type:

Union[dict, float, Tuple[Union[dict, float], np.ndarray]]

Raises:

KeyboardInterrupt – If user interrupts (partial results saved).

clear()

Clear stored per-dataset raw scores history.

plot(show: bool = True, save_path: str = None, **kwargs)

Generate plots for stored dataset scores.

Parameters:
  • show (bool) – Whether to display the plot(s).

  • save_path (Optional[str]) – If provided, save figure(s) to this path (metric name appended when multi-metric).

  • kwargs (dict) – Additional keyword arguments for plotting.

Returns:

None

Return type:

None

class sdialog.evaluation.base.BaseDatasetEmbeddingEvaluator(dialog_embedder: BaseDialogEmbedder, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)

Bases: BaseDatasetEvaluator

Base class for dataset-level evaluation using dialog embeddings.

It takes a dialog embedder (BaseDialogEmbedder object) when created and given a collection of dialogs, computes their embeddings and returns a single value for the collection.

Subclasses must implement: __eval__(dialog_embs: List[np.ndarray]) -> Union[dict, float]`

and __plot__(dialog_embs: Dict[str, np.ndarray], tsne_model: TSNE, plot: Optional[plt.Axes]) -> None

Example:

from sdialog import SentenceTransformerDialogEmbedder
from sdialog.evaluation.base import BaseDatasetEmbeddingEvaluator

# Evaluator that computes the average centroid cosine distance to a reference centroid
# We need to implement only the __eval__ method (and optionally the __plot__ method)
# (in practice, use the built-in ReferenceCentroidEmbeddingEvaluator!)
class ReferenceCentroidEmbeddingEvaluator(BaseDatasetEmbeddingEvaluator):
    def __plot__(self, dialog_embs): pass  # no-op

    def __init__(self, dialog_embedder, reference_dialogs):
        self.reference_centroid = np.mean(
            [dialog_embedder(dialog) for dialog in reference_dialogs], axis=0
        )

    def __eval__(self, dialog_embs):
        centroid = np.mean(dialog_embs, axis=0)
        return cosine(centroid, self.reference_centroid)

dialog_embedder = SentenceTransformerDialogEmbedder(model_name="sentence-transformers/LaBSE")
centroid_evaluator = ReferenceCentroidEmbeddingEvaluator(dialog_embedder=dialog_embedder,
                                                         reference_dialogs=reference_dialogs)

print(centroid_evaluator(dialogs))  # distance between candidate and reference dialogues
                                    # (cosine distance between their centroids)
Parameters:
  • dialog_embedder (BaseDialogEmbedder) – Dialog embedding component.

  • name (Optional[str]) – Optional evaluator name (auto-derived if None).

  • enable_plotting (bool) – Whether to store embeddings for plotting.

  • verbose (bool) – Verbosity flag for progress bars.

__call__(dialogues: str | List[Dialog], dataset_name: str = None, return_embs: bool = False) dict | float

Embed dialogs and aggregate their embeddings to a single score.

Parameters:
  • dialogues (Union[str, List[Dialog]]) – Iterable of Dialogs or path.

  • dataset_name (Optional[str]) – Dataset label (default ‘candidate’).

  • return_embs (bool) – If True return (results, embeddings_array).

Returns:

Aggregated evaluation or (results, embeddings) if return_embs.

Return type:

Union[dict, float, Tuple[Union[dict, float], np.ndarray]]

clear()

Clear stored per-dataset embeddings.

plot(show: bool = True, save_path: str = None)

Plot embeddings (e.g., via subclass t-SNE projection) for stored datasets.

Parameters:
  • show (bool) – Whether to display the plot.

  • save_path (Optional[str]) – If provided, save plot to this path.

Returns:

None

Return type:

None

class sdialog.evaluation.base.BaseLLMJudge(model: langchain_core.language_models.base.BaseLanguageModel | str = None, prompt_template: str = '', output_format: dict | BaseModel = None, **llm_kwargs)

Bases: ABC

Base class for all LLM-based evaluation judges that render a prompt and return an output. This is the base class of built-in base judges like LLMJudgeYesNo or LLMJudgeScore.

Subclasses must implement the abstract method: judge(dialogs: Union[Dialog, List[Dialog]]) -> dict

Example:

from sdialog.evaluation.base import BaseLLMJudge
from sdialog import Dialog, Turn

import os

class MagicJudge(BaseLLMJudge):
def judge(self, dialog):

# Render prompt (dialog -> text) prompt = self.prompt_template.render(dialog=dialog) raw = self(prompt) # call underlying LLM return raw # normally you’d parse into structured output

magic_judge = MagicJudge(prompt_template=”How magical is the following dialogue? Dialogue:n{{ dialog }}”) print(magic_judge.judge(dialog)) # Outputs raw LLM response

param model:

Model instance or model name (falls back to config if None).

type model:

Union[BaseLanguageModel, str]

param prompt_template:

Jinja2 template string used to build the human prompt.

type prompt_template:

str

param output_format:

Optional Pydantic schema or JSON schema dict for structured output.

type output_format:

Union[dict, BaseModel]

param llm_kwargs:

Additional model instantiation parameters overriding config.

type llm_kwargs:

dict

__call__(prompt: str) dict | BaseModel

Invoke the underlying LLM with the given rendered prompt.

Parameters:

prompt (str) – Fully rendered human prompt content.

Returns:

Raw model response or structured output (depending on output_format).

Return type:

Union[dict, BaseModel]

abstractmethod judge(dialogs: Dialog | List[Dialog]) dict

Judge one or many dialogs using the LLM.

Parameters:

dialogs (Union[Dialog, List[Dialog]]) – A single Dialog or list of Dialog objects.

Returns:

Dictionary of judged metrics / fields extracted.

Return type:

dict

Raises:

NotImplementedError – If not implemented in subclass.

prompt(system: bool = False) str

Return the current system or human prompt text.

Parameters:

system (bool) – If True return system prompt; else return last human prompt.

Returns:

Prompt text.

Return type:

str


sdialog.datasets

This module provides utilities for loading, parsing, and describing dialogue datasets, including the STAR dataset. It supports extracting scenarios, flowcharts, personas, and constructing dataset-specific Agent objects for simulation.

class sdialog.datasets.STAR

Bases: object

Utility class for interacting with the STAR dialogue dataset.

Provides methods for loading dialogues, extracting scenarios, flowcharts, responses, and constructing Agent objects for simulation and evaluation.

static set_path(path)

Sets the root path for the STAR dataset.

Parameters:

path (str) – Path to the STAR dataset root directory.

Returns:

None

Return type:

None

static read_graph(task_name, as_dot: bool = True)

Read the action graph for a given task.

Parameters:
  • task_name (Union[str, dict]) – Name of the task (folder name under tasks/), or a pre-loaded JSON dict (as returned by the task’s JSON file) to skip disk I/O.

  • as_dot (bool) – If True, return a DOT string; else return the raw graph dict.

Returns:

Graph in DOT format or raw dictionary mapping edges.

Return type:

Union[str, dict]

static read_graph_responses(task_name, as_dict: bool = False)

Read example responses associated with each node/action in a task graph.

Placeholders of the form {variable[:format]} are uppercased for visibility.

Parameters:
  • task_name (Union[str, dict]) – Name of the task, or a pre-loaded responses dict to skip disk I/O.

  • as_dict (bool) – If True, return a dict; otherwise a JSON-formatted string.

Returns:

Mapping node -> example response, or JSON dump.

Return type:

Union[dict, str]

static get_task_names()

List all available task names (directory names under tasks/).

Returns:

List of task names.

Return type:

List[str]

static get_dialog(id)

Load a dialogue by numeric ID.

Parameters:

id (int) – Dialogue ID (filename without extension).

Returns:

Dialog object with turns and events populated.

Return type:

Dialog

static get_dialogs(domain: str = None, task_name: str = None, happy: bool = None, multitask: bool = None)

Load all dialogues matching optional filter criteria.

Parameters:
  • domain (Optional[str]) – Domain filter (must appear in Scenario[‘Domains’]).

  • task_name (Optional[str]) – Task name filter (must appear in WizardCapabilities).

  • happy (Optional[bool]) – Filter by ‘happy path’ flag.

  • multitask (Optional[bool]) – Filter by multitask flag.

Returns:

List of Dialog objects matching filters.

Return type:

List[Dialog]

static get_dialog_scenario(id)

Load scenario metadata for a given dialogue.

Parameters:

id (int) – Dialogue ID.

Returns:

Scenario dictionary.

Return type:

dict

static get_dialog_first_turn(id, speaker: str = None)

Get the first turn for a given dialogue (optionally constrained to a speaker).

Parameters:
  • id (int) – Dialogue ID.

  • speaker (Optional[str]) – Speaker name filter (e.g., ‘User’ or ‘Wizard’); if None, first participant turn is returned.

Returns:

First matching turn or None if not found.

Return type:

Optional[Turn]

static get_dialog_task_names(id)

Get all task names (WizardCapabilities -> Task) for a dialogue.

Parameters:

id (int) – Dialogue ID.

Returns:

List of task names.

Return type:

List[str]

static get_dialog_responses(id)

Get example response dictionaries for each task in a dialogue.

Parameters:

id (int) – Dialogue ID.

Returns:

List of response dicts (one per task).

Return type:

List[dict]

static get_dialog_graphs(id)

Get raw action graphs (dict form) for all tasks in a dialogue.

Parameters:

id (int) – Dialogue ID.

Returns:

List of graph dicts.

Return type:

List[dict]

static get_dialog_events(id)

Get all events for a dialogue.

Parameters:

id (int) – Dialogue ID.

Returns:

List of event dictionaries from the JSON file.

Return type:

List[dict]

static get_dialog_user_instructions(id)

Get user guide instructions mapped to the (user) turn index where each applies.

Parameters:

id (int) – Dialogue ID.

Returns:

Mapping user_turn_index -> instruction text.

Return type:

Dict[int, str]

static get_dialog_graphs_and_responses(id)

Convenience loader returning both graphs and responses for all tasks.

Parameters:

id (int) – Dialogue ID.

Returns:

Tuple (graphs list, responses list).

Return type:

Tuple[List[dict], List[dict]]

static get_scenario_description(scenario)

Build a natural language description of a scenario including embedded DOT graphs and example responses.

Parameters:

scenario (dict) – Scenario dictionary (as returned by get_dialog_scenario).

Returns:

Multi-section textual description.

Return type:

str

static get_dialog_scenario_description(id)

Retrieve scenario metadata and its natural language description.

Parameters:

id (int) – Dialogue ID.

Returns:

Tuple (scenario dict, description string).

Return type:

Tuple[dict, str]

static get_user_persona_for_scenario(scenario)

Construct a Persona object representing the user under the given scenario.

Parameters:

scenario (dict) – Scenario metadata.

Returns:

User persona object.

Return type:

Persona

static get_flowchart_description_for_scenario(scenario)

Build a markdown-like description with DOT graphs and example responses for each task.

Parameters:

scenario (dict) – Scenario metadata.

Returns:

Combined flowchart description text.

Return type:

str

static get_system_persona_for_scenario(scenario)

Construct a Persona object representing the system/assistant for the scenario.

Parameters:

scenario (dict) – Scenario metadata.

Returns:

System persona object.

Return type:

Persona

static get_agents_for_scenario(scenario, model_name: str = None)

Create (system, user) Agent objects for a scenario (personas only; no orchestration).

Parameters:
  • scenario (dict) – Scenario metadata.

  • model_name (Optional[str]) – Optional model name / identifier for agent LLM configuration.

Returns:

Tuple (system_agent, user_agent).

Return type:

Tuple[Agent, Agent]

static get_agents_from_dialogue(id, model_name: str = None, set_first_utterance: bool = False)

Create (system, user) Agent objects derived from a dialogue’s scenario.

Optionally set an initial first system utterance (heuristic).

Parameters:
  • id (int) – Dialogue ID.

  • model_name (Optional[str]) – Optional model name / identifier for agent LLM configuration.

  • set_first_utterance (bool) – If True, assign a first system utterance.

Returns:

Tuple (system_agent, user_agent).

Return type:

Tuple[Agent, Agent]

static get_agents_from_dialogue_with_orchestration(id, model_name: str = None, set_first_utterance: bool = False)

Create (system, user) Agent objects with attached orchestrators for responses/instructions.

Parameters:
  • id (int) – Dialogue ID.

  • model_name (Optional[str]) – Optional model name / identifier for agent LLM configuration.

  • set_first_utterance (bool) – If True, assign a first system utterance.

Returns:

Tuple (system_agent_with_orchestrator, user_agent_with_orchestrator).

Return type:

Tuple[Agent, Agent]


sdialog.config

This module loads and processes the configuration for the sdialog package.

sdialog.config.llm(llm_name, **llm_kwargs)

Update the LLM model setting in the config.

Parameters:

llm_name (str) – The name of the LLM model to set.

sdialog.config.llm_params(**params)

Update the LLM hyperparameters in the config.

Parameters:

params (dict) – Dictionary of hyperparameter names and values.

sdialog.config.cache(enable)

Enable or disable caching.

Parameters:

enable (bool) – Whether to enable caching or not.

sdialog.config.cache_path(path)

Set the path for the cache directory.

Parameters:

path (str) – The new path for the cache directory.

sdialog.config.set_cache(path, enable=True)

Set the cache path and enable/disable caching.

Parameters:
  • path (str) – The path to the cache directory.

  • enable (bool) – Whether to enable caching or not.

sdialog.config.clear_cache()

Clear the cache by deleting all files in the cache directory.

sdialog.config.set_persona_dialog_generator_prompt(path)

Set the path for the persona_dialog_generator prompt.

Parameters:

path (str) – The new path for the prompt file.

sdialog.config.set_persona_generator_prompt(path)

Set the path for the persona_generator prompt.

Parameters:

path (str) – The new path for the prompt file.

sdialog.config.set_dialog_generator_prompt(path)

Set the path for the dialog_generator prompt.

Parameters:

path (str) – The new path for the prompt file.

sdialog.config.set_persona_agent_prompt(path)

Set the path for the persona_agent prompt.

Parameters:

path (str) – The new path for the prompt file.


sdialog.util

Utility Functions for sdialog

This module provides helper functions for the sdialog package, including serialization utilities to ensure objects can be safely converted to JSON for storage or transmission.

class sdialog.util.CacheDialogScore

Bases: object

Static class for caching utility for dialog scoring functions keyed by: (score class name, score object JSON-serializable attributes, dialog._path).

Provides static methods to initialize, enable/disable, persist, and clear cache.

static cache(func)

Decorator adding disk-backed caching for dialog scoring functions.

Cache key includes:

  • score object class name

  • JSON-serializable attributes of score object

  • dialog._path (must exist)

Parameters:

func (callable) – Target scoring function (score_obj, dialog, *args, **kwargs).

Returns:

Wrapped function with caching logic.

Return type:

callable

static clear()

Clear in-memory cache and persist empty structure.

Returns:

None

Return type:

None

static get_cache()

Get internal cache dictionary.

Returns:

Current in-memory cache mapping.

Return type:

dict

static get_cache_path() str

Get absolute path to cache JSON file.

Returns:

Path to cache file.

Return type:

str

Raises:

ValueError – If init() not called first.

static init(path, enable_cache=True)

Initialize cache system (load existing cache file if present).

Parameters:
  • path (str) – Directory path where cache file resides / will reside.

  • enable_cache (bool) – Whether to enable caching immediately.

Returns:

None

Return type:

None

static is_cache_enabled() bool

Check if caching is enabled.

Returns:

True if enabled.

Return type:

bool

static save()

Persist cache dictionary to JSON file (if enabled).

Returns:

None

Return type:

None

Raises:

ValueError – If not initialized.

static set_cache_path(path: str)

Set cache file path (creates directory if needed).

Parameters:

path (str) – Base directory for cache file.

Returns:

None

Return type:

None

static set_enable_cache(enable: bool)

Enable or disable the cache.

Parameters:

enable (bool) – True to enable caching, False to disable.

Returns:

None

Return type:

None

class sdialog.util.KNNModel(items, k=3)

Bases: object

Thin wrapper around sklearn NearestNeighbors for cosine similarity retrieval.

Parameters:
  • items (Iterable[Tuple[Any, Sequence[float]]]) – Iterable of (item_id, embedding_vector) pairs.

  • k (int) – Default number of neighbors to retrieve.

__call__(target_emb, k=None)

Retrieve k nearest neighbors by cosine distance.

Parameters:
  • target_emb (Sequence[float]) – Query embedding vector.

  • k (int) – Override number of neighbors (defaults to self.k).

Returns:

List of (item_id, distance) pairs ordered by proximity.

Return type:

List[Tuple[Any, float]]

neighbors(target_emb, k=None)

Retrieve k nearest neighbors by cosine distance.

Parameters:
  • target_emb (Sequence[float]) – Query embedding vector.

  • k (int) – Override number of neighbors (defaults to self.k).

Returns:

List of (item_id, distance) pairs ordered by proximity.

Return type:

List[Tuple[Any, float]]

class sdialog.util.SentencePairTransformer(model_name: str = 'roberta-base', device: str = None, verbose: bool = True)

Bases: object

Transformer wrapper producing CLS embeddings for paired sentences (similar to NLI encoding).

Parameters:
  • model_name (str) – Hugging Face model name.

  • device (str) – Explicit device ("cpu" / "cuda:*"); auto-detected if None.

  • verbose (bool) – Enable verbose progress display.

encode(sent1: str | List[str], sent2: str | List[str], batch_size: int = 128, show_progress_bar: bool = True, progress_bar_desc: str = 'Computing embeddings') numpy.ndarray

Encode aligned sentence pairs into CLS embeddings.

Parameters:
  • sent1 (Union[str, List[str]]) – First sentence or list of first sentences.

  • sent2 (Union[str, List[str]]) – Second sentence or list of second sentences.

  • batch_size (int) – Batch size for encoding.

  • show_progress_bar (bool) – Whether to show progress bar.

  • progress_bar_desc (str) – Description label for progress bar.

Returns:

Array of shape (N, hidden_size) containing CLS embeddings.

Return type:

np.ndarray

sdialog.util.camel_or_snake_to_words(varname: str) str

Convert camelCase or snake_case identifier into normalized spaced words.

Parameters:

varname (str) – Identifier string.

Returns:

Human-readable spaced words.

Return type:

str

sdialog.util.check_valid_model_name(func)

Decorator ensuring first argument (model_name) is a string; otherwise short-circuits to False.

Parameters:

func (callable) – The predicate function to wrap.

Returns:

Wrapped function enforcing a str model_name.

Return type:

callable

sdialog.util.dialogs_to_utt_pairs(dialogs: List[BaseModel], ai_speaker: str = None) Tuple[List[str], List[str]]

Extract utterance -> next utterance (adjacent turn) pairs from dialogs.

Two modes:
  • Sliding window mode (ai_speaker is None): pairs every turn with its successor.

  • QA mode (ai_speaker given): pairs each human turn immediately preceding an AI turn (speaker match).

Parameters:
  • dialogs (List[BaseModel]) – List of dialog-like Pydantic objects each having a .turns list with .text and .speaker.

  • ai_speaker (str) – Optional name (case-insensitive) of the AI speaker to filter answer turns.

Returns:

Tuple (utterances, next_utterances) of equal length.

Return type:

Tuple[List[str], List[str]]

Raises:

ValueError – If no turns found, or lengths mismatch, or ai_speaker filtering yields nothing.

sdialog.util.dict_to_table(data: dict, sort_by: str = None, sort_ascending: bool = True, markdown: bool = False, format: str = '.2f', show: bool = True) str

Render a dict-of-dicts as a table (Markdown or fancy grid).

Parameters:
  • data (dict) – Mapping where each value is itself a mapping of column -> value.

  • sort_by (str) – Column name to sort by.

  • sort_ascending (bool) – Sort ascending if True.

  • markdown (bool) – If True produce GitHub-flavored Markdown table.

  • format (str) – Float formatting specifier passed to pandas.

  • show (bool) – If True print table to stdout.

Returns:

Table as string.

Return type:

str

sdialog.util.get_llm_default_params(model_name: str, llm_params: dict, retry: bool = True) float

Get the default parameters for the model if not already specified, and merges them into llm_params.

Parameters:
  • model_name (str) – LLM model name.

  • llm_params (dict) – Existing LLM parameter dictionary to update in-place.

Returns:

Updated llm_params with defaults filled.

Return type:

dict

sdialog.util.get_llm_model(model_name: str, output_format: dict | BaseModel = None, return_model_params: bool = False, think: bool = False, tools: List = None, **llm_kwargs)

Instantiate a LangChain chat model (OpenAI, AWS, Google, Ollama, Hugging Face).

Applies backend-specific adjustments (e.g., removing unsupported params). Optionally wraps model for structured output if output_format provided and supported.

Parameters:
  • model_name (Union[str, Any]) – Model name or instance.

  • output_format (Union[dict, BaseModel, type[BaseModel]]) – Pydantic model class or JSON schema dict for structured output.

  • return_model_params (bool) – If True, return (model, llm_kwargs) tuple instead of just model.

  • think (bool) – If True, enables “thinking” segments in responses.

  • tools (List[langchain_core.tools.structured.StructuredTool]) – Optional list of tool functions to enable.

  • llm_kwargs (dict) – Additional backend-specific model kwargs.

Returns:

Configured LangChain model (possibly wrapped for structured output).

Return type:

Any

Raises:

ValueError – If model_name is invalid type.

sdialog.util.get_timestamp() str

Return current UTC timestamp in ISO 8601 format (seconds precision, trailing ‘Z’).

Returns:

ISO 8601 UTC timestamp.

Return type:

str

sdialog.util.get_universal_id() str

Generates a unique identifier (UUID4) for sdialog objects.

Returns:

A unique identifier as a string.

Return type:

str

sdialog.util.is_amazon_model_name(model_name, *args, **kwargs)
sdialog.util.is_anthropic_model_name(model_name, *args, **kwargs)
sdialog.util.is_azure_openai_model_name(model_name, *args, **kwargs)
sdialog.util.is_google_genai_model_name(model_name, *args, **kwargs)
sdialog.util.is_huggingface_model_name(model_name, *args, **kwargs)
sdialog.util.is_ollama_model_name(model_name, *args, **kwargs)
sdialog.util.is_openai_model_name(model_name, *args, **kwargs)
sdialog.util.make_serializable(data: dict) dict

Convert non-JSON-serializable values in a dictionary to strings (in-place mutation).

Parameters:

data (dict) – Dictionary to sanitize.

Returns:

The mutated dictionary with serializable values.

Return type:

dict

Raises:

TypeError – If input is not a dict.

sdialog.util.ollama_check_and_pull_model(model_name: str) bool

Ensure an Ollama model is available locally (pull if missing).

Parameters:

model_name (str) – Model name (may include ‘ollama:’ prefix).

Returns:

True if available or successfully pulled; False otherwise.

Return type:

bool

sdialog.util.remove_audio_tags(text: str) str

Remove tags of the form <…>. (Despite the summary mentioning {}, (), [], only angle brackets are removed.)

Parameters:

text (str) – Input text possibly containing markup tags.

Returns:

Text with angle-bracket tags removed.

Return type:

str

sdialog.util.remove_newlines(s: str) str

Replace all whitespace (including newlines) with single spaces and collapse repeats.

Parameters:

s (Any) – Input value (non-str inputs are returned unchanged).

Returns:

Normalized single-line string or original object if not str.

Return type:

str

sdialog.util.set_generator_seed(generator, seed)

Attempt to set a deterministic seed on the underlying LLM (if supported); fallback to torch.manual_seed.

Also applies a workaround for certain Ollama caching issues by forcing an initial trivial generation.

Parameters:
  • generator (Any) – Object containing .llm and (optionally) .messages or .memory.

  • seed (int) – Desired seed; if None a random 32-bit value is generated.

Returns:

The seed actually used (or None if unsupported).

Return type:

int

sdialog.util.softmax(values, temperature=0.05, as_list=True)

Compute softmax over a 1D iterable of numeric values.

Parameters:
  • values (Iterable[float]) – Sequence of numeric scores.

  • temperature (float) – Temperature divisor (lower = sharper distribution).

  • as_list (bool) – If True return a Python list; otherwise a torch tensor.

Returns:

Softmax probability distribution.

Return type:

Union[List[float], torch.Tensor]

sdialog.util.upper_camel_to_dash(name: str) str

Convert UpperCamelCase to dash-case (preserving acronym groups).

Parameters:

name (str) – Class or identifier name.

Returns:

dash-case form.

Return type:

str


sdialog.server

OpenAI-compatible RESTful API server for SDialog agents.

This module provides a Server class that serves SDialog agents as OpenAI-compatible chat completion endpoints, enabling integration with OpenAI-compatible clients like Open WebUI.

class sdialog.server.Server

Bases: object

Static server class for serving SDialog agents as OpenAI-compatible API.

This server provides OpenAI-compatible chat completion endpoints that can be used with clients like Open WebUI. The server handles agent memory internally, only processing new messages while maintaining conversation context.

classmethod add_agent(agent: Agent, model_name: str = None) None

Add an agent to the server without starting it.

Parameters:
  • agent (Agent) – The SDialog agent to add.

  • model_name (str) – Model name to use for the agent.

classmethod list_agents() List[str]

List all registered agent model names.

Returns:

List of model names.

Return type:

List[str]

classmethod remove_agent(model_name: str) None

Remove an agent from the server.

Parameters:

model_name (str) – Model name of the agent to remove.

classmethod reset_agent(model_name: str, seed: int | None = None) None

Reset an agent’s memory and state.

Parameters:
  • model_name (str) – Model name of the agent to reset.

  • seed (Optional[int]) – Optional seed for the reset.

classmethod serve(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None

Serve SDialog agents as an OpenAI-compatible RESTful API.

This method automatically detects the environment and chooses the appropriate server startup method. In standard environments (command line, scripts), it uses uvicorn.run(). In Jupyter notebooks or other environments with existing event loops, it automatically falls back to a threaded server.

Parameters:
  • agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.

  • host (str) – Host address to bind the server to.

  • port (int) – Port number to bind the server to.

  • stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.

  • model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).

  • log_level (str) – Logging level for the server.

Example:

from sdialog import Persona
from sdialog.agents import Agent
from sdialog.server import Server

# Create two agents
user = Agent(persona=Persona(name="Dr. Nebula", role="Astrobotanist seeking alien spores"),
             name="Scientist")
bot = Agent(persona=Persona(name="StationCore", role="Sarcastic habitat control AI"),
            name="Bot")

# Serve them as an OpenAI-compatible API
Server.serve([user, bot], port=1333)
# Output:
# Starting server for agents on localhost:1333
# > 2 registered agents: Scientist:latest, Bot:latest
async classmethod serve_async(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None

Serve SDialog agents as an OpenAI-compatible RESTful API (async version).

This method is designed for use in environments with existing event loops, such as Jupyter notebooks, where uvicorn.run() would fail.

Parameters:
  • agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.

  • host (str) – Host address to bind the server to.

  • port (int) – Port number to bind the server to.

  • stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.

  • model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).

  • log_level (str) – Logging level for the server.

classmethod serve_in_thread(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None

Serve SDialog agents in a separate thread (alternative for Jupyter).

This method runs the server in a separate thread, allowing it to coexist with Jupyter’s event loop without conflicts. It’s automatically used as a fallback by the main serve() method when an event loop conflict is detected.

Parameters:
  • agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.

  • host (str) – Host address to bind the server to.

  • port (int) – Port number to bind the server to.

  • stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.

  • model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).

  • log_level (str) – Logging level for the server.

Returns:

The thread object running the server.

Return type:

threading.Thread


sdialog.audio

sdialog.audio.dialog

sdialog.audio.turn

sdialog.audio.tts

sdialog.audio.voice_database

sdialog.audio.room

sdialog.audio.room_generator

sdialog.audio.acoustics_simulator

sdialog.audio.utils

sdialog.audio.dscaper_utils

sdialog.audio.jsalt

sdialog.audio.pipeline

sdialog.audio.impulse_response_database

sdialog.audio.processing