sdialog
This package provides utilities for generating synthetic dialogues using instruction-tuned large language models (LLMs). Dialogues are generated primarily via role-playing, where each agent is defined by a Persona object. The package supports dialogue orchestration, context management, and flexible serialization for downstream tasks.
Main components:
Dialog, Turn, Event: Data structures for representing dialogues and their events.
Context, Persona and Agent: For defining and simulating role-played agents in a given context.
Orchestrators: For controlling agent behavior during dialogue generation.
Evaluation: Utilities and metrics for assessing dialogue quality (coherence, turn balance, persona/goal adherence, safety screening, lexical/statistical reports) and for building reproducible evaluation pipelines.
Interpretability: Layer/token-level activation capture, inspection (Inspector, hooks), steering (directional modulation of representations), and instruction extraction utilities (see interpretability.py).
- class sdialog.Turn(*, speaker: str | None = None, text: str)
Bases:
BaseModelRepresents a single turn in a dialogue.
- Parameters:
speaker (Optional[str]) – The name or role of the speaker.
text (str) – The utterance text for this turn.
- speaker: str | None
- text: str
- prompt() str
Generates a prompt string for this turn.
- print()
- class sdialog.Event(*, agent: str | None = None, action: str, actionLabel: str | None = None, content: str | Dict | List, timestamp: int)
Bases:
BaseModelRepresents an event in a dialogue, which may be an utterance, instruction, or other action.
- Parameters:
agent (Optional[str]) – The agent responsible for the event (e.g., “user”, “system”).
action (str) – The type of event (e.g., “utter”, “instruct”).
actionLabel (Optional[str]) – A label describing the action (e.g., type of instruction).
content (Union[str, Dict, List]) – The content of the event.
timestamp (int) – The Unix timestamp of the event.
- agent: str | None
- action: str
- actionLabel: str | None
- content: str | Dict | List
- timestamp: int
- class sdialog.Dialog(*, version: str | None = <factory>, timestamp: str | None = <factory>, model: str | Dict | None = None, seed: int | None = None, id: int | str | None = <factory>, parentId: int | str | None = None, complete: bool | None = None, personas: dict[str, ~typing.Any] | None=None, context: str | dict[str, ~typing.Any] | None=None, scenario: dict | str | None = None, turns: List[Turn] | None = <factory>, events: List[Event] | None = None, notes: Any | None = None)
Bases:
BaseModelA pydantic model representing a conversational dialogue with rich metadata, container-like access to turns, text utilities, analytics, and I/O helpers.
- Parameters:
version (Optional[str]) – Version of the dialogue format.
timestamp (Optional[str]) – Timestamp of dialogue creation.
model (Optional[Union[str, Dict]]) – The model used to generate the dialogue.
seed (Optional[int]) – The random seed used for generation.
id (Optional[Union[int, str]]) – Unique ID for the dialogue.
parentId (Optional[Union[int, str]]) – ID of the parent dialogue, if any.
complete (Optional[bool]) – Whether the dialogue is complete.
personas (Optional[dict[str, Any]]) – Any is a subclass of MetaPersona.
context (Optional[Union[str, dict[str, Any]]]) – Shared context for the dialogue.
scenario (Optional[Union[dict, str]]) – Scenario description or metadata.
turns (Optional[List[Turn]]) – List of dialogue turns.
events (Optional[List[Event]]) – List of dialogue events (optional).
notes (Optional[str]) – Free-text notes or comments about the dialogue.
- version: str | None
- timestamp: str | None
- model: str | Dict | None
- seed: int | None
- id: int | str | None
- parentId: int | str | None
- complete: bool | None
- personas: dict[str, Any] | None
- context: str | dict[str, Any] | None
- scenario: dict | str | None
- notes: Any | None
- lower(in_place: bool = True) Dialog
Apply str.lower() to every turn’s text.
- Parameters:
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- upper(in_place: bool = True) Dialog
Apply str.upper() to every turn’s text.
- Parameters:
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- title(in_place: bool = True) Dialog
Apply str.title() to every turn’s text.
- Parameters:
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- capitalize(in_place: bool = True) Dialog
Apply str.capitalize() to every turn’s text.
- Parameters:
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- strip(chars: str = None, in_place: bool = True) Dialog
Apply str.strip(chars) to every turn’s text.
- Parameters:
chars (Optional[str]) – Characters to strip; if None, default whitespace is stripped.
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- replace(old: str, new: str, count: int = -1, in_place: bool = True) Dialog
Apply str.replace(old, new, count) to every turn’s text. If count < 0 all occurrences are replaced.
- Parameters:
old (str) – Substring to be replaced.
new (str) – Replacement substring.
count (int) – Maximum number of replacements per text; if < 0 replace all.
in_place (bool) – If True modify this Dialog in place; otherwise return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- re_sub(pattern: str | Pattern, repl: str | callable, count: int = 0, flags: int = 0, in_place: bool = True) Dialog
Apply re.sub(pattern, repl, text, count=count, flags=flags) to every turn’s text. If pattern is a compiled regex, flags are ignored.
- Parameters:
pattern (Union[str, Pattern]) – Regex pattern (string or compiled).
repl (Union[str, Callable]) – Replacement string or callable.
count (int) – Max substitutions per text (0 means unlimited).
flags (int) – Regex flags (ignored if compiled pattern provided).
in_place (bool) – Mutate this Dialog if True; else return a cloned transformed Dialog.
- Returns:
The modified (or cloned) Dialog.
- Return type:
- length(mode: str = 'words', words_per_minute: int = 130) int
Returns the length of the dialogue according to the specified mode (number of words by default).
- Parameters:
mode (str) –
The mode for measuring length. Options:
"turns": Number of turns (default)"words": Total number of words in all turns"minutes"/"time": Approximate duration in minutes (words_per_minute/minute)
words_per_minute (int) – Words per minute for “minutes” mode (default is 130, which is a common estimate).
- Returns:
The computed length according to the mode.
- Return type:
int
- Raises:
ValueError – If an unknown mode is specified.
- clone(new_id: int = None) Dialog
Creates a deep copy of the dialogue.
- Parameters:
new_id (int, optional) – Optional ID to assign to the cloned dialog. If None, a new universal ID is generated.
- Returns:
A new Dialog object that is a deep copy of this one, with updated id and parentId.
- Return type:
- description(turn_template: str = None)
Returns a human-readable string representation of the dialogue.
- Parameters:
turn_template (str) – Template for formatting each turn (default “{speaker}: {text}”).
- Returns:
The formatted dialogue.
- Return type:
str
- prompt() str
Generates a prompt string for the entire dialogue.
- json(string: bool = False, indent: int = 2, ensure_ascii: bool = False)
Serializes the dialogue to JSON.
- Parameters:
string (bool) – If True, returns a JSON string; otherwise, returns a dict.
indent (int) – Indentation level for pretty-printing.
- Returns:
The serialized dialogue.
- Return type:
Union[str, dict]
- print(*a, **kw)
Pretty-prints a dialogue to the console, with optional scenario and orchestration details.
- Parameters:
scenario (bool) – If True, prints scenario information.
orchestration (bool) – If True, prints also orchestration events.
think (bool) – If True, prints “thinking” events.
all (bool) – If True, prints all types of events.
- to_file(path: str = None, type: str = 'auto', makedir: bool = True, overwrite: bool = True, ensure_ascii: bool = False)
Saves the dialogue to a file in JSON, CSV, or plain text format.
- Parameters:
path (str) – Output file path, if not provided, uses the same path used to load the dialogue.
type (str) – “json”, “csv”, “txt”, or “auto” (determined by file extension).
makedir (bool) – If True, creates parent directories as needed.
overwrite (bool) – If False and the file exists, raise FileExistsError instead of overwriting.
ensure_ascii (bool) – If True and type is “json”, escape non-ASCII characters in the output.
- to_audio(path: str = None, **kwargs: dict)
Convert the dialogue to an audio dialogue. This is a convenience wrapper around the full sdialog.audio.pipeline.to_audio function. All keyword arguments are passed to it.
- Parameters:
path (str) – Directory path for storing audio outputs.
dialog_dir_name (str) – Custom name for the dialogue directory.
dscaper_data_path (Optional[str]) – Path to dSCAPER data directory.
room_name (Optional[str]) – Custom name for the room configuration.
perform_tts (Optional[bool]) – Convert the dialog into audio using the text-to-speech engine.
perform_room_acoustics (Optional[bool]) – Enable room acoustics simulation and dSCAPER timeline generation.
tts_engine (BaseTTS) – Text-to-speech engine for audio generation.
voice_database (BaseVoiceDatabase) – Voice database for speaker selection.
dscaper_datasets (List[str]) – List of Hugging Face datasets for dSCAPER.
room (Room) – Room configuration for acoustics simulation.
speaker_positions (dict[Role, dict]) – Speaker positioning configuration.
background_effect (str) – Background audio effect type.
foreground_effect (str) – Foreground audio effect type.
foreground_effect_position (RoomPosition) – Position for foreground effects.
kwargs_pyroom (dict) – PyRoomAcoustics configuration parameters.
source_volumes (dict[SourceType, SourceVolume]) – Volume levels for different audio sources.
audio_file_format (str) – Audio file format (wav, mp3, flac).
seed (int) – Seed for random number generator.
re_sampling_rate (Optional[int]) – Re-sampling rate for the output audio.
recording_devices (Optional[List[Union[RecordingDevice, str]]]) – The identifiers of the recording devices to simulate.
impulse_response_database (Optional[ImpulseResponseDatabase]) – The database for impulse responses.
override_tts_audio (Optional[bool]) – Override the TTS audio if it already exists.
verbose (Optional[bool]) – Verbose mode for logging.
- Returns:
Audio dialogue with processed audio data.
- Return type:
“sdialog.audio.dialog.AudioDialog”
- Raises:
Exception – If the audio module is not installed.
- static from_huggingface(repo_id: str, local_dir: str = None, collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') List[Dialog] | Dict[str, List[Dialog]]
Loads dialogues from a HuggingFace dataset.
This method downloads a dataset from HuggingFace Hub and loads dialogues from it. The dataset must follow the SDialog format with a ‘data’ folder containing dialogue files. If the data folder contains train/test/val split subdirectories, dialogues are loaded from all splits and returned as a dictionary mapping split names to dialogue lists. Otherwise, all dialogues from the data folder are returned as a single list.
- Parameters:
repo_id (str) – HuggingFace repository ID (e.g., “sdialog/Primock-57”).
local_dir (str, optional) – Local directory to download to. If None, uses a temporary directory.
collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker.
collapse_separator (str) – Separator used when collapsing consecutive turns.
- Returns:
List of dialogs or dict mapping splits to lists of dialogs.
- Return type:
- Raises:
ImportError – If huggingface_hub is not installed.
ValueError – If the dataset is not a valid sdialog dataset.
- static from_folder(path: str, type: str = 'auto', txt_template: str = '{speaker}: {text}', csv_speaker_col: int | str = 'speaker', csv_text_col: int | str = 'text', collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') List[Dialog]
Loads all dialogues from a folder.
- Parameters:
path (str) – Path to the directory containing dialogue files.
type (str) –
"json","txt","csv","tsv", or"auto"(determined by file extension).txt_template (str) – Template for parsing text dialogue turns (default “{speaker}: {text}”).
csv_speaker_col (Union[int, str]) – Column identifier for speaker in CSV/TSV files (can be index or header name).
csv_text_col (Union[int, str]) – Column identifier for text in CSV/TSV files (can be index or header name).
collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker into one turn.
collapse_separator (str) – String used to join texts when collapsing consecutive turns (default:
"\n").
- Returns:
A list of loaded dialogue objects from the folder.
- Return type:
List[Dialog]
- Raises:
ValueError – If the path is not a directory.
- static from_file(path: str, type: str = 'auto', txt_template: str = '{speaker}: {text}', csv_speaker_col: int | str = 'speaker', csv_text_col: int | str = 'text', collapse_consecutive_speakers: bool = False, collapse_separator: str = '\n') Dialog | List[Dialog]
Loads a dialogue from a file.
- Parameters:
path (str) – Path to the dialogue file or directory. In case of a directory, all dialogues in the directory will be loaded and returned as a list of Dialog objects.
type (str) –
"json","txt","csv","tsv", or"auto"(determined by file extension).txt_template (str) – Template for parsing text dialogue turns (default “{speaker}: {text}”).
csv_speaker_col (Union[int, str]) – Column identifier for speaker in CSV/TSV files (can be index or header name).
csv_text_col (Union[int, str]) – Column identifier for text in CSV/TSV files (can be index or header name).
collapse_consecutive_speakers (bool) – If True, collapses consecutive turns by the same speaker into one turn.
collapse_separator (str) – String used to join texts when collapsing consecutive turns (default:
"\n").
- Returns:
The loaded dialogue object.
- Return type:
- Raises:
ValueError – If the file format is not recognized or if required columns are missing.
- static from_str(dialog_text: str, template: str = '{speaker}: {text}', default_speakers: List[str] = None, id: str | int = None) Dialog
Creates a Dialog object from a string representation of a dialogue.
- Parameters:
dialog_text (str) – The dialogue text, with each turn on a new line.
template (str) – The template for parsing each turn. Default is “{speaker}: {text}”.
default_speakers (List[str]) – Optional list of default speakers to use if no present in the text or template. The speakers will be assigned in order of appearance, in alternating turns. Default is None (speaker field will be empty in each turn).
id (Union[str, int]) – Optional ID for the dialogue. If not provided, a universal ID will be generated.
- Returns:
The created Dialog object.
- Return type:
- static from_dict(data: dict)
Creates a Dialog object from a dictionary.
- Parameters:
data (dict) – The dictionary containing dialogue data.
- Returns:
The created Dialog object.
- Return type:
- from_json(json_str: str)
Creates a Dialog object from a JSON string.
- Parameters:
json_str (str) – The JSON string containing dialogue data.
- Returns:
The created Dialog object.
- Return type:
- rename_speaker(old_name: str, new_name: str, case_sensitive: bool = False, in_events: bool = True) Dialog
Renames all occurrences of a speaker in the dialogue’s turns (and optionally events).
- Parameters:
old_name (str) – The current speaker name to replace.
new_name (str) – The new speaker name.
case_sensitive (bool) – Whether to match speaker names case-sensitively (default: False).
in_events (bool) – Whether to also rename in events’ agent fields (default: True).
- Returns:
Self (the same Dialog instance) after in-place modification.
- Return type:
- get_speakers(keep_case: bool = True) List[str]
Returns a list of unique speaker names in the dialogue.
- Parameters:
keep_case (bool) – Whether to keep the original case of speaker names or convert them to lowercase (default: True).
- Returns:
A list of unique speaker names.
- Return type:
List[str]
- class sdialog.Context(*, location: str | None = None, datetime: str | None = None, environment: str | None = None, objects: str | List[str] | None = None, participants_shared_knowledge: str | None = None, circumstances: str | List[str] | None = None, goals: str | List[str] | None = None, constraints: str | List[str] | None = None, topics: str | List[str] | None = None, style_guidelines: str | List[str] | None = None, notes: str | None = None)
Bases:
BaseAttributeModelDialogue-shared context class.
- Parameters:
location (Optional[str]) – Physical or virtual location where the dialogue occurs.
datetime (Optional[str]) – Timestamp or temporal setting relevant to the dialogue.
environment (Optional[str]) – Physical environment description, environmental conditions, or contextual atmosphere.
objects (Optional[Union[str, List[str]]]) – Relevant objects (single value or list of values).
participants_shared_knowledge (Optional[str]) – Information all participants are assumed to know.
circumstances (Optional[Union[str, List[str]]]) – Situational circumstances impacting the dialogue.
goals (Optional[Union[str, List[str]]]) – Stated or implicit goals of the participants.
constraints (Optional[Union[str, List[str]]]) – Limitations or constraints affecting actions or dialogue.
topics (Optional[Union[str, List[str]]]) – Main topics or themes (single or list).
style_guidelines (Optional[Union[str, List[str]]]) – Stylistic or formatting guidelines to follow.
notes (Optional[str]) – Additional free-form contextual notes.
- location: str | None
- datetime: str | None
- environment: str | None
- objects: str | List[str] | None
- circumstances: str | List[str] | None
- goals: str | List[str] | None
- constraints: str | List[str] | None
- topics: str | List[str] | None
- style_guidelines: str | List[str] | None
- notes: str | None
- static attributes(_cls=<class 'sdialog.Context'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.Instruction(*, text: str = None, events: Event | List[Event] | None = None)
Bases:
BaseModelRepresents an instruction to an agent, optionally with associated events.
- Parameters:
- text: str
sdialog.base
Model foundations for sdialog root module.
Provides:
Metadata: common provenance fields (version, timestamp, ids).
BaseAttributeModel: pydantic-based abstract base for persona/context-like objects with cloning, serialization, and dynamic subclass discovery utilities.
- class sdialog.base.Metadata(*, version: str | None = <factory>, timestamp: str | None = <factory>, model: str | Dict | None = None, seed: int | None = None, id: int | str | None = <factory>, parentId: int | str | None = None, className: str = None, notes: str | None = None)
Bases:
BaseModelMetadata class for object, context and other objects.
- Parameters:
version (Optional[str]) – Version of the object format (matches sdialog version).
timestamp (Optional[str]) – Timestamp of when the object was generated.
model (Optional[str]) – The model used to generate the object.
seed (Optional[int]) – The random seed used for object generation.
id (Optional[Union[int, str]]) – Unique identifier for the object.
parentId (Optional[Union[int, str]]) – ID of the parent object, if any.
notes (Optional[str]) – Free-text notes or comments about the generated object.
className (str) – The class name of the object (a subclass of BaseAttributeModel).
- version: str | None
- timestamp: str | None
- model: str | Dict | None
- seed: int | None
- id: int | str | None
- parentId: int | str | None
- className: str
- notes: str | None
- class sdialog.base.BaseAttributeModel
Bases:
BaseModel,ABCBase class for defining an attribute-based object.
Features:
Strict field control.
Automatic static attributes() helper listing declared fields.
Metadata tracking (id, parentId, version, timestamp).
Clone with optional field overrides and proper lineage linkage.
JSON / prompt serialization helpers.
- clone(new_id: int = None, **kwargs) BaseAttributeModel
Create a deep copy of this object with optional attribute overrides.
Metadata handling:
parentId of clone = original id (if present).
id of clone = new_id if provided else a new universal id.
Other metadata fields are copied.
- Parameters:
new_id (Optional[int]) – Optional new unique id for the clone.
kwargs (Any) – Field overrides applied to the cloned instance.
- Returns:
Independent cloned instance.
- Return type:
- description() str
Returns a string description of the object’s attributes.
- Returns:
Description of the object.
- Return type:
str
- print()
Pretty-prints the object, including its metadata information.
- json(string: bool = False, indent=2, output_metadata: bool = True)
Serializes the object to JSON.
- Parameters:
string (bool) – If True, returns a JSON string; otherwise, returns a dict.
indent (int) – Indentation level for pretty-printing.
output_metadata (bool) – Include the metadata in the serialization.
- Returns:
The serialized object.
- Return type:
Union[str, dict]
- prompt() str
Returns the textual representation of the object, used as part of the system prompt.
- Returns:
JSON string without metadata (intended for prompt inclusion).
- Return type:
str
- to_file(path: str, makedir: bool = True)
Saves the object to a file in either JSON or plain text format.
- Parameters:
path (str) – Output file path.
makedir (bool) – If True, creates parent directories as needed.
- static from_file(path: str, object_class: BaseAttributeModel | None = None)
Load an object from a JSON file.
- Parameters:
path (str) – Path to file.
object_class (Optional[BaseAttributeModel]) – Optional explicit subclass to force (bypasses className dispatch).
- Returns:
Loaded instance.
- Return type:
- Raises:
ValueError – If metadata/className is missing or unknown.
- static from_dict(data: dict, object_class: BaseAttributeModel | None = None)
Create an object instance from a dictionary.
Dispatch rules:
If object_class is provided and is a BaseAttributeModel subclass, it is used directly.
Else uses _metadata.className to resolve a registered subclass.
- Parameters:
data (dict) – Source dictionary (must include _metadata.className).
object_class (Optional[BaseAttributeModel]) – Optional explicit subclass.
- Returns:
Instantiated object.
- Return type:
- Raises:
ValueError – If className missing or cannot be resolved.
- static from_json(json_str: str, object_class: BaseAttributeModel | None = None)
Create an object instance from a JSON string.
- Parameters:
json_str (str) – JSON serialization including _metadata.className.
object_class (Optional[BaseAttributeModel]) – Optional explicit subclass override.
- Returns:
Instantiated object.
- Return type:
sdialog.personas
This module provides classes for defining personas (character profiles) and simulating agents that role-play these personas in synthetic dialogue generation.
- sdialog.personas.BasePersona
Abstract base class for defining personas. Alias for
sdialog.base.BaseAttributeModel
- class sdialog.personas.Persona(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', role: str = '', background: str = '', personality: str = '', circumstances: str = '', rules: str = '')
Bases:
BaseAttributeModelStandard persona class with common attributes for role-play.
- Parameters:
name (str) – Name of the persona.
age (Union[int, str]) – Age of the persona (can be an int or a descriptive string like “middle-aged”).
race (str) – Race / ethnicity of the persona.
gender (str) – Gender of the persona.
language (str) – Preferred language of communication.
role (str) – Role, profession, or primary identity descriptor.
background (str) – Background or life history summary.
personality (str) – Personality traits summary (free text).
circumstances (str) – Current situational context (e.g., “recently moved”, “under stress”).
rules (str) – Constraints, style or behavioral rules to enforce.
- name: str
- age: int | str
- race: str
- gender: str
- language: str
- role: str
- background: str
- personality: str
- circumstances: str
- rules: str
- static attributes(_cls=<class 'sdialog.personas.Persona'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.ExtendedPersona(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '')
Bases:
BaseAttributeModelExtended persona class with additional demographic, personality, and background attributes.
- Parameters:
name (str) – Name of the persona.
age (Union[int, str]) – Age (numeric or descriptive string).
race (str) – Race / ethnicity.
gender (str) – Gender identity.
language (str) – Preferred language.
weight (Union[str, int, float]) – Weight (numeric with unit or descriptive string).
height (Union[str, int, float]) – Height (numeric with unit or descriptive string).
voice_characteristics (str) – Voice, accent, tone, pacing, etc.
occupation (str) – Current occupation or professional role.
education (str) – Education level or academic background.
socioeconomic_status (str) – Socioeconomic status descriptor.
interests (str) – General interests (comma-separated or free text).
hobbies (str) – Hobbies (comma-separated or free text).
politeness (str) – Politeness style/level.
forgetfulness (str) – Forgetfulness tendency.
attentiveness (str) – Attentiveness or focus tendency.
communication_style (str) – Style of communication (e.g., direct, verbose).
empathy_level (str) – Empathy level or descriptor.
political_views (str) – Political alignment (e.g., conservative, moderate, apolitical).
religious_beliefs (str) – Religious stance (e.g., religious, agnostic, atheist).
- name: str
- age: int | str
- race: str
- gender: str
- language: str
- weight: str | int | float
- height: str | int | float
- voice_characteristics: str
- occupation: str
- education: str
- socioeconomic_status: str
- interests: str
- hobbies: str
- politeness: str
- forgetfulness: str
- attentiveness: str
- communication_style: str
- empathy_level: str
- political_views: str
- religious_beliefs: str
- static attributes(_cls=<class 'sdialog.personas.ExtendedPersona'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Customer(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', customer_id: str | int = '', occupation: str = '', account_tenure: str = '', membership_level: str = '', loyalty_status: str = '', fidelity_score: str | float | int = '', issue: str = '', issue_category: str = '', issue_description: str = '', issue_history: str = '', desired_outcome: str = '', knowledge_domain: str = '', technical_expertise: str = '', sentiment: str = '', anger_level: str = '', tiredness: str = '', patience_level: str = '', politeness: str = '', personality: str = '', instruction_following: str = '', forgetfulness: str = '', times_called: int | str = '', preferred_channel: str = '', prior_interactions_summary: str = '', urgency: str = '', rules: str = '')
Bases:
BaseAttributeModelPersona for a customer in a customer service interaction.
- Parameters:
name (str) – Customer name.
age (Union[int, str]) – Customer age (numeric or descriptive).
gender (str) – Customer gender.
language (str) – Preferred language.
customer_id (Union[str, int]) – Internal customer identifier.
occupation (str) – Customer occupation.
account_tenure (str) – How long they have been a customer (e.g., “2 years”).
membership_level (str) – Plan/tier (e.g., basic, premium).
loyalty_status (str) – Loyalty descriptor (e.g., loyal, at-risk).
fidelity_score (Union[str, float, int]) – Loyalty score (numeric or descriptive).
issue (str) – Short summary of current problem.
issue_category (str) – High-level category (billing, technical, etc.).
issue_description (str) – Detailed issue description.
issue_history (str) – Brief summary of related past issues.
desired_outcome (str) – Customer’s desired resolution / goal.
knowledge_domain (str) – Subject/domain familiarity (e.g., novice, expert).
technical_expertise (str) – Legacy field for backward compatibility.
sentiment (str) – Overall emotional tone (e.g., frustrated, neutral).
anger_level (str) – Anger intensity descriptor.
tiredness (str) – Fatigue level.
patience_level (str) – Patience descriptor.
politeness (str) – Politeness style (e.g., polite, curt).
personality (str) – Personality descriptor (e.g., analytical).
instruction_following (str) – Likelihood of following instructions.
forgetfulness (str) – Tendency to forget prior guidance.
times_called (Union[int, str]) – Number of prior contacts (numeric or descriptive).
preferred_channel (str) – Preferred support channel.
prior_interactions_summary (str) – Summary of earlier interactions.
urgency (str) – Perceived urgency (e.g., low, high).
rules (str) – Constraints or special handling notes.
- name: str
- age: int | str
- gender: str
- language: str
- customer_id: str | int
- occupation: str
- account_tenure: str
- membership_level: str
- loyalty_status: str
- fidelity_score: str | float | int
- issue: str
- issue_category: str
- issue_description: str
- issue_history: str
- desired_outcome: str
- knowledge_domain: str
- technical_expertise: str
- sentiment: str
- anger_level: str
- tiredness: str
- patience_level: str
- politeness: str
- personality: str
- instruction_following: str
- forgetfulness: str
- times_called: int | str
- preferred_channel: str
- prior_interactions_summary: str
- urgency: str
- rules: str
- static attributes(_cls=<class 'sdialog.personas.Customer'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.SupportAgent(*, name: str = '', language: str = 'English', agent_id: str | int = '', role: str = 'Customer Support Agent', experience_years: int | str = '', product_scope: str = '', product_knowledge_level: str = '', communication_style: str = '', empathy_level: str = '', politeness: str = '', resolution_authority_level: str = '', escalation_policy: str = '', average_handle_time: int | float | str = '', adherence_notes: str = '', stress_tolerance: str = '', performance_notes: str = '', rules: str = '')
Bases:
BaseAttributeModelPersona for a customer service / support agent.
- Parameters:
name (str) – Agent name.
language (str) – Working language.
agent_id (Union[str, int]) – Internal agent identifier.
role (str) – Agent role or queue designation.
experience_years (Union[int, str]) – Years (or range) of support experience.
product_scope (str) – Products or domains covered.
product_knowledge_level (str) – Knowledge depth (e.g., basic, expert).
communication_style (str) – Communication style (e.g., concise, empathetic).
empathy_level (str) – Empathy descriptor.
politeness (str) – Politeness level descriptor.
resolution_authority_level (str) – Authority level for resolutions/escalations.
escalation_policy (str) – Summary of escalation criteria/process.
average_handle_time (Union[int, float, str]) – Typical handling time (e.g., “6m”).
adherence_notes (str) – Notes on process or QA adherence.
stress_tolerance (str) – Stress handling capability descriptor.
performance_notes (str) – Performance KPIs or evaluation notes.
rules (str) – Internal rules, compliance reminders, or constraints.
- name: str
- language: str
- agent_id: str | int
- role: str
- experience_years: int | str
- product_scope: str
- product_knowledge_level: str
- communication_style: str
- empathy_level: str
- politeness: str
- resolution_authority_level: str
- escalation_policy: str
- average_handle_time: int | float | str
- adherence_notes: str
- stress_tolerance: str
- performance_notes: str
- rules: str
- static attributes(_cls=<class 'sdialog.personas.SupportAgent'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Patient(*, name: str = '', age: int | str = None, race: str = '', gender: str = '', language: str = 'English', forgetfulness: str | float = '', formality: str | float = '', hurriedness: str | float = '', openness: str | float = '', height: str | int | float = '', weight: str | int | float = '', occupation: str = '', marital_status: str = '', insurance: str = '', reason_for_visit: str = '', symptoms: str | List[str] = '', medical_history: str | List[str] = '', medical_conditions: str | List[str] = '', medications: str | List[str] = '', allergies: str | List[str] = '', family_history: str | List[str] = '')
Bases:
BaseAttributeModelPatient persona with essential / minimal plus behavioral and demographic attributes for dialogue generation.
- Parameters:
name (str) – Patient name.
age (Union[int, str]) – Patient age (numeric or descriptive).
race (str) – Race / ethnicity.
gender (str) – Gender identity.
language (str) – Preferred communication language.
forgetfulness (Union[str, float]) – Forgetfulness tendency (qualitative or numeric).
formality (Union[str, float]) – Formality of speech (qualitative or numeric scale).
hurriedness (Union[str, float]) – Degree of impatience / hurriedness.
openness (Union[str, float]) – Openness to share information.
height (Union[str, int, float]) – Height (numeric with unit or descriptive).
weight (Union[str, int, float]) – Weight (numeric with unit or descriptive).
occupation (str) – Occupation or employment status.
marital_status (str) – Marital status.
insurance (str) – Insurance provider / status.
reason_for_visit (str) – Chief complaint / presenting problem.
symptoms (Union[str, List[str]]) – Reported symptoms.
medical_history (Union[str, List[str]]) – Past medical history (string or list of conditions).
medical_conditions (Union[str, List[str]]) – Known diagnosed conditions (string or list).
medications (Union[str, List[str]]) – Current medications (string or list).
allergies (Union[str, List[str]]) – Known allergies (string or list).
family_history (Union[str, List[str]]) – Family medical history (string or list).
- name: str
- age: int | str
- race: str
- gender: str
- language: str
- forgetfulness: str | float
- formality: str | float
- hurriedness: str | float
- openness: str | float
- height: str | int | float
- weight: str | int | float
- occupation: str
- marital_status: str
- insurance: str
- reason_for_visit: str
- symptoms: str | List[str]
- medical_history: str | List[str]
- medical_conditions: str | List[str]
- medications: str | List[str]
- allergies: str | List[str]
- family_history: str | List[str]
- static attributes(_cls=<class 'sdialog.personas.Patient'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.ExtendedPatient(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '', reason_for_visit: str = '', symptoms: str | List[str] = '', vital_signs: str = '', health_literacy: str = '', medical_conditions: str | List[str] = '', medications: str | List[str] = '', allergies: str | List[str] = '', family_history: str | List[str] = '')
Bases:
ExtendedPersonaExtendedPatient persona with additional health-related attributes. Inherits all attributes from ExtendedPersona plus medical context fields.
- Parameters:
reason_for_visit (str) – Chief complaint or reason for consultation.
symptoms (Union[str, List[str]]) – Reported symptoms (free text or summarized list).
vital_signs (str) – Vital signs summary (e.g., “BP 120/80, HR 72”).
health_literacy (str) – Health literacy level descriptor.
medical_conditions (Union[str, List[str]]) – Known or chronic conditions (free text summary).
medications (Union[str, List[str]]) – Current medications summary.
allergies (Union[str, List[str]]) – Allergy list / summary.
family_history (Union[str, List[str]]) – Family medical history summary.
- reason_for_visit: str
- symptoms: str | List[str]
- vital_signs: str
- health_literacy: str
- medical_conditions: str | List[str]
- medications: str | List[str]
- allergies: str | List[str]
- family_history: str | List[str]
- static attributes(_cls=<class 'sdialog.personas.ExtendedPatient'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Doctor(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', specialty: str = '', forgetfulness: str = '', formality: str = '', hurriedness: str = '', openness: str = '')
Bases:
BaseAttributeModelDoctor persona with essential professional and behavioral attributes.
- Parameters:
name (str) – Doctor’s name.
age (Union[int, str]) – Doctor’s age (numeric or descriptive).
race (str) – Race / ethnicity.
gender (str) – Gender identity.
language (str) – Working language.
years_of_experience (Union[int, str]) – Years (or range) of medical practice.
specialty (str) – Medical specialty (as spelled in this class).
forgetfulness (str) – Forgetfulness tendency.
formality (str) – Formality level in communication.
hurriedness (str) – Degree of time pressure / haste.
openness (str) – Openness / approachability.
- name: str
- age: int | str
- race: str
- gender: str
- language: str
- years_of_experience: int | str
- specialty: str
- forgetfulness: str
- formality: str
- hurriedness: str
- openness: str
- static attributes(_cls=<class 'sdialog.personas.Doctor'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.ExtendedDoctor(*, name: str = '', age: int | str = '', race: str = '', gender: str = '', language: str = 'English', weight: str | int | float = '', height: str | int | float = '', voice_characteristics: str = '', occupation: str = '', education: str = '', socioeconomic_status: str = '', interests: str = '', hobbies: str = '', politeness: str = '', forgetfulness: str = '', attentiveness: str = '', communication_style: str = '', empathy_level: str = '', political_views: str = '', religious_beliefs: str = '', specialty: str = '', years_of_experience: int | str = '', certifications: str = '', work_experience: str = '')
Bases:
ExtendedPersonaExtendedDoctor persona adding professional credentials. Inherits all attributes from ExtendedPersona plus, the following ones.
- Parameters:
specialty (str) – Medical specialty / domain focus.
years_of_experience (Union[int, str]) – Years (or range) of clinical experience.
certifications (str) – Professional certifications / board statuses.
work_experience (str) – Summary of prior practice settings / roles.
- specialty: str
- years_of_experience: int | str
- certifications: str
- work_experience: str
- static attributes(_cls=<class 'sdialog.personas.ExtendedDoctor'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Nurse(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', specialty: str = '', shift: str = '', empathy_level: str = '', politeness: str = '', attentiveness: str = '', stress_tolerance: str = '')
Bases:
BaseAttributeModelNurse persona for healthcare dialogues.
- Parameters:
name (str) – Nurse name.
age (Union[int, str]) – Nurse age (numeric or descriptive).
gender (str) – Gender identity.
language (str) – Working language.
years_of_experience (Union[int, str]) – Years of nursing experience.
specialty (str) – Nursing specialty.
shift (str) – Typical work shift.
empathy_level (str) – Empathy descriptor.
politeness (str) – Politeness style.
attentiveness (str) – Attentiveness descriptor.
stress_tolerance (str) – Stress handling capability.
- name: str
- age: int | str
- gender: str
- language: str
- years_of_experience: int | str
- specialty: str
- shift: str
- empathy_level: str
- politeness: str
- attentiveness: str
- stress_tolerance: str
- static attributes(_cls=<class 'sdialog.personas.Nurse'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Pharmacist(*, name: str = '', age: int | str = '', gender: str = '', language: str = 'English', years_of_experience: int | str = '', workplace: str = '', expertise: str = '', politeness: str = '', communication_style: str = '')
Bases:
BaseAttributeModelPharmacist persona for healthcare dialogues.
- Parameters:
name (str) – Pharmacist name.
age (Union[int, str]) – Pharmacist age (numeric or descriptive).
gender (str) – Gender identity.
language (str) – Working language.
years_of_experience (Union[int, str]) – Years of pharmacy experience.
workplace (str) – Pharmacy or hospital name.
expertise (str) – Pharmaceutical expertise.
politeness (str) – Politeness style.
communication_style (str) – Communication style.
- name: str
- age: int | str
- gender: str
- language: str
- years_of_experience: int | str
- workplace: str
- expertise: str
- politeness: str
- communication_style: str
- static attributes(_cls=<class 'sdialog.personas.Pharmacist'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Caregiver(*, name: str = '', age: int | str = '', gender: str = '', relationship: str = '', experience_years: int | str = '', empathy_level: str = '', attentiveness: str = '')
Bases:
BaseAttributeModelCaregiver persona for healthcare dialogues.
- Parameters:
name (str) – Caregiver name.
age (Union[int, str]) – Caregiver age (numeric or descriptive).
gender (str) – Gender identity.
relationship (str) – Relationship to care recipient.
experience_years (Union[int, str]) – Years of caregiving experience.
empathy_level (str) – Empathy descriptor.
attentiveness (str) – Attentiveness descriptor.
- name: str
- age: int | str
- gender: str
- relationship: str
- experience_years: int | str
- empathy_level: str
- attentiveness: str
- static attributes(_cls=<class 'sdialog.personas.Caregiver'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Teacher(*, name: str = '', age: int | str = '', gender: str = '', subject: str = '', years_of_experience: int | str = '', education_level: str = '', politeness: str = '', communication_style: str = '')
Bases:
BaseAttributeModelTeacher persona for education dialogues.
- Parameters:
name (str) – Teacher name.
age (Union[int, str]) – Teacher age (numeric or descriptive).
gender (str) – Gender identity.
subject (str) – Teaching subject.
years_of_experience (Union[int, str]) – Years of teaching experience.
education_level (str) – Highest degree.
politeness (str) – Politeness style.
communication_style (str) – Communication style.
- name: str
- age: int | str
- gender: str
- subject: str
- years_of_experience: int | str
- education_level: str
- politeness: str
- communication_style: str
- static attributes(_cls=<class 'sdialog.personas.Teacher'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Student(*, name: str = '', age: int | str = '', gender: str = '', grade_level: str = '', major: str = '', interests: str = '', politeness: str = '')
Bases:
BaseAttributeModelStudent persona for education dialogues.
- Parameters:
name (str) – Student name.
age (Union[int, str]) – Student age (numeric or descriptive).
gender (str) – Gender identity.
grade_level (str) – Grade or year.
major (str) – Major or focus area.
interests (str) – Interests.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- grade_level: str
- major: str
- interests: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Student'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.AcademicAdvisor(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')
Bases:
BaseAttributeModelAcademicAdvisor persona for education dialogues.
- Parameters:
name (str) – Advisor name.
age (Union[int, str]) – Advisor age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of advising experience.
specialty (str) – Advising specialty.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- specialty: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.AcademicAdvisor'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.FinancialAdvisor(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', certifications: str = '', specialty: str = '', politeness: str = '')
Bases:
BaseAttributeModelFinancialAdvisor persona for finance dialogues.
- Parameters:
name (str) – Advisor name.
age (Union[int, str]) – Advisor age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of financial advising experience.
certifications (str) – Certifications.
specialty (str) – Financial specialty.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- certifications: str
- specialty: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.FinancialAdvisor'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Banker(*, name: str = '', age: int | str = '', gender: str = '', branch: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelBanker persona for finance dialogues.
- Parameters:
name (str) – Banker name.
age (Union[int, str]) – Banker age (numeric or descriptive).
gender (str) – Gender identity.
branch (str) – Bank branch.
years_of_experience (Union[int, str]) – Years of banking experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- branch: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Banker'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.InsuranceAgent(*, name: str = '', age: int | str = '', gender: str = '', company: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')
Bases:
BaseAttributeModelInsuranceAgent persona for finance dialogues.
- Parameters:
name (str) – Agent name.
age (Union[int, str]) – Agent age (numeric or descriptive).
gender (str) – Gender identity.
company (str) – Insurance company.
years_of_experience (Union[int, str]) – Years of insurance experience.
specialty (str) – Insurance specialty.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- company: str
- years_of_experience: int | str
- specialty: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.InsuranceAgent'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.StoreManager(*, name: str = '', age: int | str = '', gender: str = '', store_name: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelStoreManager persona for retail dialogues.
- Parameters:
name (str) – Manager name.
age (Union[int, str]) – Manager age (numeric or descriptive).
gender (str) – Gender identity.
store_name (str) – Store name.
years_of_experience (Union[int, str]) – Years of management experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- store_name: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.StoreManager'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.SalesAssociate(*, name: str = '', age: int | str = '', gender: str = '', store_name: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelSalesAssociate persona for retail dialogues.
- Parameters:
name (str) – Associate name.
age (Union[int, str]) – Associate age (numeric or descriptive).
gender (str) – Gender identity.
store_name (str) – Store name.
years_of_experience (Union[int, str]) – Years of sales experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- store_name: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.SalesAssociate'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Shopper(*, name: str = '', age: int | str = '', gender: str = '', shopping_goal: str = '', loyalty_status: str = '', politeness: str = '')
Bases:
BaseAttributeModelShopper persona for retail dialogues.
- Parameters:
name (str) – Shopper name.
age (Union[int, str]) – Shopper age (numeric or descriptive).
gender (str) – Gender identity.
shopping_goal (str) – Shopping goal.
loyalty_status (str) – Loyalty descriptor.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- shopping_goal: str
- loyalty_status: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Shopper'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.HotelReceptionist(*, name: str = '', age: int | str = '', gender: str = '', hotel_name: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelHotelReceptionist persona for hospitality dialogues.
- Parameters:
name (str) – Receptionist name.
age (Union[int, str]) – Receptionist age (numeric or descriptive).
gender (str) – Gender identity.
hotel_name (str) – Hotel name.
years_of_experience (Union[int, str]) – Years of hospitality experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- hotel_name: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.HotelReceptionist'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.TravelAgent(*, name: str = '', age: int | str = '', gender: str = '', agency_name: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelTravelAgent persona for travel dialogues.
- Parameters:
name (str) – Agent name.
age (Union[int, str]) – Agent age (numeric or descriptive).
gender (str) – Gender identity.
agency_name (str) – Travel agency name.
years_of_experience (Union[int, str]) – Years of travel experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- agency_name: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.TravelAgent'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Tourist(*, name: str = '', age: int | str = '', gender: str = '', travel_goal: str = '', politeness: str = '')
Bases:
BaseAttributeModelTourist persona for travel dialogues.
- Parameters:
name (str) – Tourist name.
age (Union[int, str]) – Tourist age (numeric or descriptive).
gender (str) – Gender identity.
travel_goal (str) – Travel goal.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- travel_goal: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Tourist'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Lawyer(*, name: str = '', age: int | str = '', gender: str = '', specialty: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelLawyer persona for legal dialogues.
- Parameters:
name (str) – Lawyer name.
age (Union[int, str]) – Lawyer age (numeric or descriptive).
gender (str) – Gender identity.
specialty (str) – Legal specialty.
years_of_experience (Union[int, str]) – Years of legal experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- specialty: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Lawyer'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Paralegal(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelParalegal persona for legal dialogues.
- Parameters:
name (str) – Paralegal name.
age (Union[int, str]) – Paralegal age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of paralegal experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Paralegal'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.LegalClient(*, name: str = '', age: int | str = '', gender: str = '', case_type: str = '', politeness: str = '')
Bases:
BaseAttributeModelLegalClient persona for legal dialogues.
- Parameters:
name (str) – Client name.
age (Union[int, str]) – Client age (numeric or descriptive).
gender (str) – Gender identity.
case_type (str) – Type of legal case.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- case_type: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.LegalClient'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.ITSupportSpecialist(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', expertise_area: str = '', politeness: str = '')
Bases:
BaseAttributeModelITSupportSpecialist persona for tech support dialogues.
- Parameters:
name (str) – Specialist name.
age (Union[int, str]) – Specialist age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of IT support experience.
expertise_area (str) – Area of technical expertise.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- expertise_area: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.ITSupportSpecialist'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.HelpdeskTechnician(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelHelpdeskTechnician persona for tech support dialogues.
- Parameters:
name (str) – Technician name.
age (Union[int, str]) – Technician age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of helpdesk experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.HelpdeskTechnician'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.EndUser(*, name: str = '', age: int | str = '', gender: str = '', device_type: str = '', issue_description: str = '', politeness: str = '')
Bases:
BaseAttributeModelEndUser persona for tech support dialogues.
- Parameters:
name (str) – End user name.
age (Union[int, str]) – End user age (numeric or descriptive).
gender (str) – Gender identity.
device_type (str) – Type of device used.
issue_description (str) – Description of technical issue.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- device_type: str
- issue_description: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.EndUser'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.CivilServant(*, name: str = '', age: int | str = '', gender: str = '', department: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelCivilServant persona for government dialogues.
- Parameters:
name (str) – Civil servant name.
age (Union[int, str]) – Civil servant age (numeric or descriptive).
gender (str) – Gender identity.
department (str) – Government department.
years_of_experience (Union[int, str]) – Years of public service.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- department: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.CivilServant'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.SocialWorker(*, name: str = '', age: int | str = '', gender: str = '', years_of_experience: int | str = '', specialty: str = '', politeness: str = '')
Bases:
BaseAttributeModelSocialWorker persona for public service dialogues.
- Parameters:
name (str) – Social worker name.
age (Union[int, str]) – Social worker age (numeric or descriptive).
gender (str) – Gender identity.
years_of_experience (Union[int, str]) – Years of social work experience.
specialty (str) – Social work specialty.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- years_of_experience: int | str
- specialty: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.SocialWorker'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Citizen(*, name: str = '', age: int | str = '', gender: str = '', inquiry_topic: str = '', politeness: str = '')
Bases:
BaseAttributeModelCitizen persona for government dialogues.
- Parameters:
name (str) – Citizen name.
age (Union[int, str]) – Citizen age (numeric or descriptive).
gender (str) – Gender identity.
inquiry_topic (str) – Topic of inquiry.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- inquiry_topic: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Citizen'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Chef(*, name: str = '', age: int | str = '', gender: str = '', restaurant_name: str = '', years_of_experience: int | str = '', cuisine_specialty: str = '', politeness: str = '')
Bases:
BaseAttributeModelChef persona for food service dialogues.
- Parameters:
name (str) – Chef name.
age (Union[int, str]) – Chef age (numeric or descriptive).
gender (str) – Gender identity.
restaurant_name (str) – Restaurant name.
years_of_experience (Union[int, str]) – Years of culinary experience.
cuisine_specialty (str) – Cuisine specialty.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- restaurant_name: str
- years_of_experience: int | str
- cuisine_specialty: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Chef'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.Waiter(*, name: str = '', age: int | str = '', gender: str = '', restaurant_name: str = '', years_of_experience: int | str = '', politeness: str = '')
Bases:
BaseAttributeModelWaiter persona for food service dialogues.
- Parameters:
name (str) – Waiter name.
age (Union[int, str]) – Waiter age (numeric or descriptive).
gender (str) – Gender identity.
restaurant_name (str) – Restaurant name.
years_of_experience (Union[int, str]) – Years of service experience.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- restaurant_name: str
- years_of_experience: int | str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.Waiter'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
- class sdialog.personas.RestaurantCustomer(*, name: str = '', age: int | str = '', gender: str = '', dietary_preferences: str = '', politeness: str = '')
Bases:
BaseAttributeModelRestaurantCustomer persona for food service dialogues.
- Parameters:
name (str) – Customer name.
age (Union[int, str]) – Customer age (numeric or descriptive).
gender (str) – Gender identity.
dietary_preferences (str) – Dietary preferences.
politeness (str) – Politeness style.
- name: str
- age: int | str
- gender: str
- dietary_preferences: str
- politeness: str
- static attributes(_cls=<class 'sdialog.personas.RestaurantCustomer'>, print=False)
List (or pretty-print) public attribute field names for this subclass.
- Parameters:
print (bool) – If True, pretty-prints instead of returning the list.
- Returns:
List of attribute names (if print=False).
- Return type:
List[str] | None
sdialog.agents
This module provides classes for Agents and related utilities for simulating persona-conditioned dialogue with Large Language Models (LLMs). Agents maintain structured conversation memory, integrate orchestrators that inject dynamic (persistent or ephemeral) system instructions, and expose inspection / interpretability hooks for token- and layer-level analysis and optional representation steering.
- sdialog.agents.final_response_tool(func=None)
Decorator to mark a tool whose raw output should be returned directly as the agent response (bypassing the post-tool LLM synthesis step).
This is useful for pre-formatted outputs (e.g., large markdown tables) where token-by-token regeneration by the LLM is unnecessary.
Usage:
from sdialog.agents import final_response_tool @final_response_tool def my_tool(...) -> str: ...
- Parameters:
func (Optional[callable]) – The tool function to mark.
- Returns:
Decorated function.
- Return type:
callable
- class sdialog.agents.Agent(persona: BaseAttributeModel = None, name: str | None = None, context: str | Context | None = None, first_utterance: str | List[str] | None = None, dialogue_details: str = '', response_details: str = 'Unless necessary, responses SHOULD be only one utterance long, and SHOULD NOT contain many questions or topics in one single turn.', example_dialogs: List[Dialog] | None = None, tools: List | None = None, think: bool = False, thinking_pattern: str | None = '<think>(.*?)</think>', can_finish: bool = True, orchestrators: BaseOrchestrator | List[BaseOrchestrator] | None = None, inspectors: Inspector | List[Inspector] | None = None, preprocessing_fn: callable | None = None, postprocess_fn: callable | None = None, system_prompt: str | None = None, model: str | langchain_core.language_models.base.BaseLanguageModel = None, **llm_kwargs)
Bases:
objectAgent that simulates a persona-driven conversational actor using an LLM.
This class wraps:
A persona (traits / role)
Optional context + exemplar dialogues
Orchestrators (dynamic / persistent injected instructions)
Interpretability hooks (token / layer events, steering)
Simple dialogue loop utilities (dialog_with)
Example:
from sdialog import Persona, Context from sdialog.agents import Agent # Create two agents user = Agent(persona=Persona(name="Dr. Nebula", role="Astrobotanist seeking alien spores"), name="Scientist") bot = Agent(persona=Persona(name="StationCore", role="Sarcastic habitat control AI"), name="Bot") # Create an (optional) context for the conversation context = Context(location="Orbiting Research Station Theta-9", environment="Zero-gravity greenhouse", objects=["alien spores", "hydroponic garden", "research equipment"]) # Create a dialogue dialog = user.dialog_with(bot, context=context) # Print dialog dialog.print()
- Parameters:
persona (BasePersona) – The persona to role-play.
name (Optional[str]) – Name of the agent (defaults to persona.name if not provided).
context (Optional[Union[str, Context]]) – Optional default context for the agent’s conversations.
first_utterance (Optional[Union[str, List[str]]]) – Optional fixed first utterance or list of possible first utterances.
dialogue_details (str) – Additional details about the dialogue.
response_details (str) – Instructions for response style.
example_dialogs (Optional[List[Dialog]]) – Optional list of default example dialogues as a reference for the agent.
tools (Optional[List[callable]] Tools decorated with
@final_response_toolreturn their raw output directly as the final agent response.) – List of functions to be used as tools by the agent (if supported by the LLM).think (bool) – If True, enables “thinking” segments in responses (if supported by the LLM).
thinking_pattern (Optional[str]) – Regex pattern to manually identify “thinking” segments in responses.
can_finish (bool) – If True, agent can end the conversation.
orchestrators (Optional[Union[BaseOrchestrator, List[BaseOrchestrator]]]) – Orchestrators for agent behavior.
inspectors (Optional[Union[Inspector, List[Inspector]]]) – Inspector(s) to add to the agent.
preprocessing_fn (Optional[callable]) – Optional function to preprocess each input utterance before calling the LLM (input string, output string).
postprocess_fn (Optional[callable]) – Optional function to postprocess each output utterance after calling the LLM (input string, output string).
system_prompt (Optional[str]) – Custom system prompt to use as-is (takes precedence over persona; if provided, persona is disabled and this prompt is used directly).
model (Union[str, BaseLanguageModel], optional) – The LLM or model name to use (defaults to config[“llm”][“model”]).
llm_kwargs (dict) – Additional parameters for the LLM.
- property memory: List[langchain_core.messages.base.BaseMessage]
The conversation memory as a list of messages.
- property base_model
Return the underlying base (wrapped) model object (e.g., a HuggingFace Transformers model).
- Resolution order:
ChatHuggingFace wrapper: self.llm.llm.pipeline.model
Objects exposing pipeline.model
Objects exposing model
If none are found, self.llm is returned as a fallback.
- property tokenizer
Return the underlying tokenizer object (e.g., a HuggingFace Transformers tokenizer).
- Resolution order:
ChatHuggingFace wrapper: self.llm.llm.tokenizer
Objects exposing pipeline.tokenizer
Objects exposing tokenizer
- __call__(utterance: str | List[langchain_core.messages.base.BaseMessage] = '', return_events: bool = False, current_dialog: Dialog = None) str
Processes an input utterance and generates a response.
- Parameters:
utterance (Union[str, List[BaseMessage]]) – The input utterance from the other agent or, in case of stateless operation, the full context as a list of Langchain messages.
return_events (bool) – If True, returns a list of events instead of just the response string.
current_dialog (Dialog) – The current dialog state as a Dialog object for orchestrators.
- Returns:
The agent’s response or events, or None if finished.
- Return type:
Union[str, List[Event], None]
- serve(host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, log_level: str = 'info')
Starts a REST API server to interact with the agent.
- Parameters:
host (str) – Host address to bind the server to.
port (int) – Port number to listen on.
stateless (bool) – If True, the server does not maintain conversation state (as such the full context must be provided with each request).
log_level (str) – Logging level for the server.
- response_lookahead(message: str = None)
Generates a response without updating the agent’s memory.
If message is None, predicts the next reply given current memory.
If message is provided, predicts a reply to that hypothetical input.
Notes: - Orchestrators and inspectors are not invoked. - Tools may be called, but their outputs are not persisted. - Only postprocess_fn is applied (no preprocessing).
- Parameters:
message (Optional[str]) – The hypothetical message to reply to (optional).
- Returns:
The predicted response text.
- Return type:
str
- add_orchestrators(orchestrators)
Adds orchestrators to the agent.
- Parameters:
orchestrators (Union[BaseOrchestrator, List[BaseOrchestrator]]) – Orchestrator(s) to add.
- add_inspectors(inspectors)
Adds inspectors to the agent.
- clear_orchestrators()
Removes all orchestrators from the agent.
- clear_inspectors()
Removes all inspectors from the agent.
- instruct(instruction: str, persist: bool = False)
Adds a system instruction to the agent’s memory.
- Parameters:
instruction (str) – The instruction text.
persist (bool) – If True, instruction persists across turns.
- set_first_utterances(utterances: str | List[str])
Sets the agent’s first utterance(s) for dialogue initialization.
- Parameters:
utterances (Union[str, List[str]]) – The greeting(s) to use.
- get_name(default: str = 'Me') str
Returns the agent’s name.
- Parameters:
default (str) – Fallback name if agent has no explicit name.
- Returns:
The agent’s name.
- Return type:
str
- prompt() str
Returns the current system prompt.
- Returns:
The system prompt.
- Return type:
str
- json(string: bool = False, indent=None)
Serializes the agent’s configuration and persona to JSON.
- Parameters:
string (bool) – If True, returns a JSON string; otherwise, returns a dict.
indent (int) – Indentation level for pretty-printing.
- Returns:
The serialized agent.
- Return type:
Union[str, dict]
- reset(seed: int = None, context: str | Context = None, example_dialogs: List[Dialog] = None)
Resets the agent’s memory and orchestrators, optionally reseeding the LLM. Also clears interpretability state and components if any.
- Parameters:
seed – Random seed for reproducibility (if None, generated).
context – Optional context override.
example_dialogs – Optional replacement example dialogs for prompt regeneration.
- dialog_with(agent: Agent, context: str | Context = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, max_turns: int = 100, id: int = None, parent_id: int = None, seed: int = None, notes: str = None, keep_bar: bool = True)
Simulates a dialogue between this agent and another Agent.
- Parameters:
agent (Agent) – The other agent to converse with.
context (Optional[Union[str, Context]]) – The context for the dialogue (optional).
example_dialogs (Optional[List[Dialog]]) – Example dialogues to guide the conversation (optional).
scenario (Optional[Union[dict, str]]) – Optional scenario metadata for the dialogue.
max_turns (int) – Maximum number of dialogue turns.
id (int) – Dialogue ID.
parent_id (int) – ID of the parent dialogue, if any.
seed (int) – Random seed for reproducibility.
notes (str) – Optional notes to include in the dialogue.
keep_bar (bool) – If True, keeps the progress bar visible.
- Returns:
The generated dialogue object.
- Return type:
- memory_dump(as_dict: bool = False) list
Returns a copy of the agent’s memory (list of messages).
- Parameters:
as_dict (bool) – If True, returns list of message dicts (serialization-friendly).
- Returns:
Conversation memory snapshot.
- Return type:
list
- talk_with(agent: Agent, context: str | Context = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, max_turns: int = 100, id: int = None, parent_id: int = None, seed: int = None, notes: str = None, keep_bar: bool = True)
Alias for
Agent.dialog_with().
sdialog.orchestrators
This module provides base and concrete classes for orchestrating agent behavior during synthetic dialogue generation. Orchestrators can inject instructions, control agent responses, and manage dialogue flow for more complex scenarios.
- class sdialog.orchestrators.SimpleReflexOrchestrator(condition: callable, instruction: str, persistent: bool = False, event_label: str = None)
Bases:
BaseOrchestratorSimple reflex orchestrator that provides fixed instructions when a condition matches.
Example:
from sdialog.orchestrators import SimpleReflexOrchestrator from sdialog.agents import Agent from sdialog.personas import Persona # If the last utterance contains 'quest', steer to suggest party planning (tutorial style) reflex = SimpleReflexOrchestrator( condition=lambda utt: "quest" in utt.lower(), instruction="Acknowledge the quest idea, then suggest one concrete themed activity." ) bob = Agent(persona=Persona(name="Bob", role="dad")) | reflex alice = Agent(persona=Persona(name="Alice", role="daughter")) dialog = bob.talk_with(alice) dialog.print(orchestration=True)
- Parameters:
condition (callable) – Predicate function receiving the last utterance; returns True to trigger.
instruction (str) – Instruction text to return when condition is satisfied.
persistent (bool) – Whether orchestrator persists across turns.
event_label (str) – Optional event label override.
- instruct(dialog: List[Turn], utterance: str) str
Return the configured instruction if condition holds.
- Parameters:
dialog (List[Turn]) – Current dialog (unused except for extensibility).
utterance (str) – Last opposite-party utterance.
- Returns:
Instruction text or None.
- Return type:
Union[str, None]
- class sdialog.orchestrators.LengthOrchestrator(min: int = None, max: int = None, persistent: bool = False, event_label: str = None)
Bases:
BaseOrchestratorOrchestrator that encourages continuation or termination based on current number of turns.
Example:
from sdialog.orchestrators import LengthOrchestrator from sdialog.personas import Persona from sdialog.agents import Agent # Keep dialogue going at least 8 turns; try to wrap by turn 12 len_orch = LengthOrchestrator(min=8, max=12) planner = Agent(persona=Persona(name="Planner", role="organizer")) guest = Agent(persona=Persona(name="Guest", role="participant")) # Attach orchestrator to planner planner = planner | len_orch dialog = planner.dialog_with(guest) dialog.print(orchestration=True)
- Parameters:
min (int) – Minimum turns before allowing termination (encourages continuation if not reached).
max (int) – Maximum turns threshold after which termination is enforced.
persistent (bool) – Whether orchestrator persists.
event_label (str) – Optional event label.
- class sdialog.orchestrators.ChangeMindOrchestrator(probability: float = 0.3, reasons: str | List[str] = None, max_times: int = 1, persistent: bool = False, event_label: str = None)
Bases:
BaseOrchestratorOrchestrator that probabilistically injects a ‘change your mind’ instruction a limited number of times.
Example:
from sdialog.orchestrators import ChangeMindOrchestrator from sdialog.personas import Persona from sdialog.agents import Agent # 40% chance (once) to pivot party theme justification changer = ChangeMindOrchestrator(probability=0.4, reasons=["a better surprise", "budget constraints"], max_times=1) alice = Agent(persona=Persona(name="Alice", role="daughter")) bob = Agent(persona=Persona(name="Bob", role="dad")) # Attach orchestrator to Bob bob = bob | changer dialog = alice.dialog_with(bob) dialog.print(orchestration=True)
- Parameters:
probability (float) – Probability (0-1) each eligible turn to trigger a mind-change.
reasons (Union[str, List[str]]) – Optional reason(s) appended; single string or list.
max_times (int) – Maximum number of injections allowed.
persistent (bool) – Persistence flag.
event_label (str) – Event label override.
- reset()
Reset internal counter of times triggered.
- Returns:
None
- Return type:
None
- instruct(dialog: List[Turn], utterance: str) str
Possibly return a mind-change instruction based on probability and remaining allowance.
- Parameters:
dialog (List[Turn]) – Current dialog state.
utterance (str) – Last opposite-party utterance.
- Returns:
Instruction text or None.
- Return type:
Union[str, None]
- class sdialog.orchestrators.SimpleResponseOrchestrator(responses: List[str | Dict[str, str]], graph: Dict[str, str] = None, sbert_model: str = 'sergioburdisso/dialog2flow-joint-bert-base', top_k: int = 5)
Bases:
BaseOrchestratorOrchestrator that suggests next responses based on semantic similarity against a response set (or action graph).
Example:
from sdialog.orchestrators import SimpleResponseOrchestrator from sdialog.agents import Agent from sdialog.personas import Persona canned = [ "Could you clarify that?", "Let me summarize the plan.", "That sounds exciting—tell me more.", "Maybe we should adjust the theme.", "Can you give one concrete example?" ] sugg = SimpleResponseOrchestrator(responses=canned, top_k=3) guide = Agent(persona=Persona(name="Guide", role="facilitator")) user = Agent(persona=Persona(name="User", role="participant")) # Attach orchestrator to guide guide = guide | sugg dialog = guide.dialog_with(user) dialog.print(orchestration=True)
- Parameters:
responses (List[Union[str, Dict[str, str]]]) – List (plain strings) or dict (action -> response) entries.
graph (Dict[str, str]) – Optional action transition graph (current_action -> next_action).
sbert_model (str) – SentenceTransformer model name.
top_k (int) – Number of top similar responses/actions to surface.
- instruct(dialog: List[Turn], utterance: str) str
Build an Instruction containing candidate responses (and events for traceability).
- Parameters:
dialog (List[Turn]) – Current dialog.
utterance (str) – Last opposite-party utterance (unused directly; similarity uses lookahead / last turn).
- Returns:
Instruction object with suggestion list.
- Return type:
- class sdialog.orchestrators.InstructionListOrchestrator(instructions: List[str | Dict[int, str]], persistent: bool = False)
Bases:
BaseOrchestratorOrchestrator that dispenses predefined instructions sequentially or by turn index mapping.
Example:
from sdialog.orchestrators import InstructionListOrchestrator from sdialog.personas import Persona from sdialog.agents import Agent steps = [ "Greet warmly and ask about preferred theme.", "Ask for constraints (budget / space).", "Suggest one fitting activity.", "Confirm decisions and wrap up politely." ] coach = Agent(persona=Persona(name="Coach", role="planner")) client = Agent(persona=Persona(name="Client", role="requester")) # Attach a new instance of InstructionListOrchestrator to the coach coach = coach | InstructionListOrchestrator(steps) dialog = coach.dialog_with(client) dialog.print(orchestration=True)
- Parameters:
instructions (List[Union[str, Dict[int, str]]]) – Either list (indexed per agent turn) or dict mapping agent turn index -> instruction.
persistent (bool) – Persistence flag.
sdialog.orchestrators.base
Base classes for creating custom orchestrators to guide Agent behavior during dialogue generation.
- class sdialog.orchestrators.base.BaseOrchestrator(target_agent=None, persistent: bool = None, event_label: str = None)
Bases:
ABCBase abstract class to create orchestrators that control or influence Agent behavior during dialogue generation. Abstract method
instruct()must be implemented by subclasses.- Responsibilities:
Observe dialogue (agent memory) and produce turn-level instructions.
Optionally emit events describing guidance injected.
Support persistence across turns when marked persistent.
Example:
from sdialog.orchestrators import BaseOrchestrator from sdialog.personas import Persona from sdialog.agents import Agent # Let's create our own orchestrator class EncourageDetailOrchestrator(BaseOrchestrator): def instruct(self, dialog, utterance): if utterance and len(utterance.split()) < 5: return "Add a bit more detail in your next reply." return None orch_encourage = EncourageDetailOrchestrator() bob = Agent(persona=Persona(role="Guide")) alice = Agent(persona=Persona(role="User")) # Let's orchestrate bob to provide more detailed answers if alice is brief bob = bob | orch_encourage dialog = bob.talk_with(alice) dialog.print(orchestration=True)
- Parameters:
target_agent (Agent) – Agent instance to orchestrate (can be set later).
persistent (bool) – Whether produced instructions should persist each turn automatically.
event_label (str) – Optional label to tag generated events; defaults to class name.
- __call__(current_dialog)
Produce an instruction for the target agent given current dialog state.
- Returns:
Instruction object/string or None if no instruction is produced.
- Return type:
Union[str, Instruction, None]
- json(string: bool = False, indent: int = None)
Serialize orchestrator configuration.
- Parameters:
string (bool) – If True returns JSON string; otherwise a dict.
indent (int) – Indentation for pretty JSON output (only if string=True).
- Returns:
Serialized configuration.
- Return type:
Union[str, dict]
- get_event_label() str
Get the label used for events generated by this orchestrator.
- Returns:
Event label.
- Return type:
str
- get_target_agent()
Get the currently assigned target agent.
- Returns:
Agent instance or None.
- Return type:
- is_persistent()
Whether this orchestrator is persistent.
- Returns:
True if persistent.
- Return type:
bool
- set_persistent(value: bool)
Set persistence flag.
- Parameters:
value (bool) – New persistence state.
- agent_response_lookahead()
Retrieve the agent’s lookahead response (preview of next response if available).
- Returns:
Lookahead response string.
- Return type:
str
- abstractmethod instruct(dialog: List[Turn], utterance: str) str
Abstract method: Subclasses are expected to implement this method. Implementations should analyze the dialog state and optionally the most recent utterance to produce an instruction for the target agent.
- Parameters:
dialog (List[Turn]) – Current reconstructed dialog (list of turns).
utterance (str) – Last opposite-party utterance (may be empty string).
- Returns:
Instruction text, Instruction object, or None if no action needed.
- Return type:
Union[str, Instruction, None]
- reset()
Reset any internal state (overridden in stateful orchestrators).
- Returns:
None
- Return type:
None
- class sdialog.orchestrators.base.BasePersistentOrchestrator(target_agent=None, persistent: bool = None, event_label: str = None)
Bases:
BaseOrchestratorPersistent orchestrator base class to create custom persistent orchestrators. Abstract method
instruct()must be implemented by subclasses.Automatically sets persistence to True; intended for orchestrators that maintain state across the whole dialogue unless explicitly removed.
Example:
from sdialog.orchestrators import BasePersistentOrchestrator from sdialog.personas import Persona from sdialog.agents import Agent # Let's create our custom persistent orchestrator to permanently flip tone after a trigger word class FlipToneOrchestrator(BasePersistentOrchestrator): def __init__(self, trigger=None): self.trigger = trigger def instruct(self, dialog, utterance): if self.trigger and self.trigger in utterance.lower(): return ("From now on adopt an annoyed, curt tone; keep answers short and a bit irritable.") # Let's create our agents alice = Agent(persona=Persona(name="Alice", role="daughter")) bob = Agent(persona=Persona(name="Bob", role="dad")) # Let's create our orchestrator using "sweet" as the trigger word orchestrator = FlipToneOrchestrator(trigger="sweet") # Let's attach the orchestrator to Alice alice = alice | orchestrator dialog = alice.dialog_with(bob) dialog.print(orchestration=True)
- abstractmethod instruct(dialog: List[Turn], utterance: str) str
Persistent variant of
BaseOrchestrator.instruct().- Parameters:
dialog (List[Turn]) – Current dialog state.
utterance (str) – Last opposite-party utterance.
- Returns:
Instruction text/object or None.
- Return type:
Union[str, Instruction, None]
- reset()
Reset internal persistent state (override as needed).
- Returns:
None
- Return type:
None
sdialog.generators
This module provides classes for generating synthetic dialogues using LLMs, including support for persona-based role-play and context-driven dialogue generation.
- class sdialog.generators.DialogGenerator(dialogue_details: str, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, personas: dict[str, dict[str, ~typing.Any]]=None, output_format: dict | BaseModel = <class 'sdialog.generators.base.LLMDialogOutput'>, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
objectBase class for generating synthetic dialogues using an LLM.
Typical workflow:
Instantiate with default dialogue instructions and optional context / examples.
Call generate(…) to produce a Dialog (or raw structured output).
Example:
from sdialog.generators import DialogGenerator gen = DialogGenerator("Generate a short friendly greeting between two speakers") dialog = gen.generate() dialog.print()
- Parameters:
dialogue_details (str) – Instructions or details for the dialogue.
context (Optional[Union[str, Context]]) – The default context for the dialogue (optional).
example_dialogs (List[Dialog]) – Optional default list of example dialogues to guide the generation.
scenario (Optional[Union[dict, str]]) – Default scenario metadata for the dialogue.
personas (dict[str, dict[str, Any]]) – Optional personas (serialized) involved in the dialogue (e.g., for logging).
output_format (Union[dict, BaseModel]) – Output schema / model used to parse LLM output (or None for raw text).
model (Union[BaseLanguageModel, str]) – The LLM instance or model name to use.
llm_kwargs (dict) – Additional keyword arguments for the LLM (override config).
- prompt() str
Returns the current system prompt used for dialogue generation.
- Returns:
The system prompt string.
- Return type:
str
- generate(dialogue_details: str = None, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None)
Generates a synthetic dialogue using the LLM.
- Parameters:
dialogue_details (str) – Override instructions / details for this generation.
context (Optional[Union[str, Context]]) – Override context for this generation.
example_dialogs (List[Dialog]) – Override example dialogues for few-shot style guidance.
scenario (Optional[Union[dict, str]]) – Override scenario metadata.
seed (int) – Random seed for reproducibility.
id (int) – Optional dialogue ID to assign (otherwise autogenerated).
parent_id (int) – Optional parent dialogue ID (thread linkage).
notes (str) – Optional free-form notes stored in metadata.
- Returns:
Dialog instance if output_format is LLMDialogOutput; BaseModel if custom schema; raw string if output_format is falsy.
- Return type:
Union[Dialog, BaseModel, str]
- __call__(dialogue_details: str = None, context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None)
Generates a synthetic dialogue using the LLM.
- Parameters:
dialogue_details (str) – Override instructions / details for this generation.
context (Optional[Union[str, Context]]) – Override context for this generation.
example_dialogs (List[Dialog]) – Override example dialogues for few-shot style guidance.
scenario (Optional[Union[dict, str]]) – Override scenario metadata.
seed (int) – Random seed for reproducibility.
id (int) – Optional dialogue ID to assign (otherwise autogenerated).
parent_id (int) – Optional parent dialogue ID (thread linkage).
notes (str) – Optional free-form notes stored in metadata.
- Returns:
Dialog instance if output_format is LLMDialogOutput; BaseModel if custom schema; raw string if output_format is falsy.
- Return type:
Union[Dialog, BaseModel, str]
- class sdialog.generators.PersonaDialogGenerator(persona_a: Persona | Agent, persona_b: Persona | Agent, speaker_a: str = 'SPEAKER_A', speaker_b: str = 'SPEAKER_B', context: str | Context | None = None, example_dialogs: List[Dialog] = None, dialogue_details: str = '', response_details: str = '', scenario: dict | str | None = None, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
DialogGeneratorGenerates dialogues between two personas (or Agents wrapping personas) using an LLM.
Example:
from sdialog.personas import Persona from sdialog.generators import PersonaDialogGenerator p1 = Persona(name="Alice", role="Curious student") p2 = Persona(name="Mentor", role="Helpful tutor") gen = PersonaDialogGenerator(p1, p2, dialogue_details="Explain one concept briefly.") dialog = gen() dialog.print()
- Parameters:
persona_a (Union[Persona, Agent]) – The first persona or an Agent containing one.
persona_b (Union[Persona, Agent]) – The second persona or an Agent containing one.
speaker_a (str) – Name/ID of the first speaker in the dialogue.
speaker_b (str) – Name/ID of the second speaker in the dialogue.
context (Optional[Union[str, Context]]) – Default context for the dialogue (optional).
example_dialogs (List[Dialog]) – Optional list of example dialogues for guidance.
dialogue_details (str) – Additional dialogue-level instructions.
response_details (str) – Style / formatting instructions for responses.
scenario (Optional[Union[dict, str]]) – Default scenario metadata.
model (Union[BaseLanguageModel, str]) – LLM instance or model name.
llm_kwargs (dict) – Extra LLM keyword arguments (override config).
- generate(context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, max_turns: int = 200, notes: str = None)
Generates a dialogue between two personas (or drives an Agent-to-Agent interaction).
- Parameters:
context (Optional[Union[str, Context]]) – Override context.
example_dialogs (List[Dialog]) – Override example dialogues.
scenario (Optional[Union[dict, str]]) – Override scenario metadata.
seed (int) – Random seed for reproducibility.
id (int) – Dialogue ID override.
parent_id (int) – Parent dialogue ID (thread).
max_turns (int) – Max turns (only applies when both participants are Agents).
notes (str) – Optional metadata notes.
- Returns:
Generated dialogue object.
- Return type:
- __call__(context: str | Context | None = None, example_dialogs: List[Dialog] = None, scenario: dict | str | None = None, seed: int = None, id: int = None, parent_id: int = None, max_turns: int = 200, notes: str = None)
Generates a dialogue between two personas (or drives an Agent-to-Agent interaction).
- Parameters:
context (Optional[Union[str, Context]]) – Override context.
example_dialogs (List[Dialog]) – Override example dialogues.
scenario (Optional[Union[dict, str]]) – Override scenario metadata.
seed (int) – Random seed for reproducibility.
id (int) – Dialogue ID override.
parent_id (int) – Parent dialogue ID (thread).
max_turns (int) – Max turns (only applies when both participants are Agents).
notes (str) – Optional metadata notes.
- Returns:
Generated dialogue object.
- Return type:
- class sdialog.generators.PersonaGenerator(persona: BaseAttributeModel, generated_attributes: str = 'all', extra_instructions: str = 'Attributes must be in English', model: str = None, **llm_kwargs)
Bases:
BaseAttributeModelGeneratorGenerates persona objects (subclasses of
sdialog.personas.BasePersona) with randomized or LLM-populated attributes (seesdialog.generators.BaseAttributeModelGeneratorfor more information).Example:
from sdialog.personas import Doctor from sdialog.generators import PersonaGenerator base_persona = Doctor(specialty="Cardiology") doctor_generator = PersonaGenerator(base_persona) doctor_generator.set( years_of_experience="{4-10}", gender=["male", "female", "non-binary"] ) doctor = doctor_generator.generate() doctor.print()
- Parameters:
persona (BasePersona) – Persona instance or class to generate.
generated_attributes (Union[str, list, dict]) – Strategy specifying which attributes to fill (“all”, list, or dict).
extra_instructions (str) – Additional instructions to include in the LLM prompt.
model (str) – LLM model name (optional).
llm_kwargs (dict) – Extra LLM keyword arguments.
- class sdialog.generators.ContextGenerator(context: Context = None, generated_attributes: str = 'all', extra_instructions: str = 'Attributes must be in English', model: str = None, **llm_kwargs)
Bases:
BaseAttributeModelGeneratorGenerates Context objects with randomized or LLM-populated attributes (see
sdialog.generators.BaseAttributeModelGeneratorfor more information).Example:
from sdialog import Context from sdialog.generators import ContextGenerator base_context = Context(location="Mars Forward Base Alpha") ctx_generator = ContextGenerator(base_context) ctx_generator.set( objects=get_objects_from_db, # callable function topics=["terraforming", "resource logistics", "crew morale"] circumstances="{csv:circumstances:./data/circumstances.csv}", goals="{llm:Suggest a realistic goal for the context}" ) my_context = ctx_generator.generate() my_context.print()
- Parameters:
context (Context) – Context instance or subclass to generate.
generated_attributes (Union[str, list, dict]) – Attribute selection strategy (“all”, list, or dict).
extra_instructions (str) – Additional instructions to include in the LLM prompt.
model (str) – LLM model name (optional).
llm_kwargs (dict) – Extra LLM keyword arguments.
- class sdialog.generators.Paraphraser(extra_instructions: str = 'Keep entities and values identical while making it sound more natural', target_speaker: str = None, turn_by_turn: bool = False, model: str | langchain_core.language_models.base.BaseLanguageModel = None, **llm_kwargs)
Bases:
objectParaphrases dialogue turns while preserving semantic entities/values.
Usage modes:
Whole dialogue paraphrasing (default, returns full set of possibly modified turns).
Turn-by-turn paraphrasing (stream-like, for smaller LLMs).
Example:
from sdialog.generators import Paraphraser # Assume 'original_dialog' is an existing `Dialog` with one of the speaker being "Bot" paraphraser = Paraphraser("Make the text sound more natural and less robotic", target_speaker="Bot") new_dialog = paraphraser(original_dialog) new_dialog.print()
- Parameters:
extra_instructions (str) – Additional style or behavior instructions for the paraphrase.
target_speaker (Optional[str]) – If provided, only paraphrases turns spoken by this speaker.
turn_by_turn (bool) – Whether to paraphrase one turn at a time.
model (Union[str, BaseLanguageModel]) – The LLM instance or model name to use (falls back to config if None).
llm_kwargs (dict) – Additional keyword arguments for the LLM.
- __call__(dialog: Dialog, target_speaker: str = None, seed: int = None) Dialog
Paraphrase a dialog (entirely or selectively by speaker).
- Parameters:
dialog (Dialog) – Source dialogue to paraphrase.
target_speaker (Optional[str]) – Override target speaker filter for this call.
seed (Optional[int]) – Optional random seed (used for reproducibility where supported).
- Returns:
New Dialog instance with paraphrased turns.
- Return type:
- Raises:
ValueError – (Indirectly) if underlying validation fails.
- prompt() str
Returns the combined system prompt and current instruction template.
- Returns:
Combined prompt preview.
- Return type:
str
sdialog.generators.base
Base and abstract classes for generators in sdialog.
- class sdialog.generators.base.BaseAttributeModelGenerator(attribute_model: BaseAttributeModel, generated_attributes: str = 'all', extra_instructions: str = '', model: str = None, system_prompt: str = None, llm_prompt: str = None, llm_prompt_n: str = None, **llm_kwargs)
Bases:
ABCAbstract class to create subclasses for generators with randomized and/or LLM-populated attributes.
Workflow:
Provide a target attribute model instance or class.
Configure attribute generation rules (e.g.
.set(...)orgenerated_attributes='all').Call
generate(n=...)to produce validated instances.
- Parameters:
attribute_model (BaseAttributeModel) – Instance or subclass of BaseAttributeModel to generate.
generated_attributes (Union[str, list, dict]) – Attribute selection strategy (“all”, iterable, or dict of rules).
extra_instructions (str) – Additional instructions to include in the LLM prompt.
model (str) – LLM model name (overrides config if provided).
system_prompt (str) – Override system prompt for generation.
llm_prompt (str) – Template for single-object generation.
llm_prompt_n (str) – Template for multi-object generation (n > 1).
llm_kwargs (dict) – Extra LLM instantiation parameters.
- prompt() str
Returns the single-object prompt template text.
- Returns:
The prompt string.
- Return type:
str
- set(**attributes)
Define per-attribute randomization / generation specifications as
attribute_name=<value>.Where
<value>can be:“*”: Defer to LLM.
A callable: Invoked (with current partial object as kwargs if compatible).
A list: Random element chosen.
A fixed scalar / str: Assigned directly.
A templated string
"{...}":"{min-max}": Random int in inclusive range."{txt:PATH}": Random non-empty line from file."{csv:COLUMN:PATH}": Random value from CSV column (name or index)."{tsv:COLUMN:PATH}": Same for TSV."{llm}": Defer to LLM."{llm:INSTRUCTION}": Defer with custom instruction.
Example:
from sdialog.generators import ContextGenerator ctx_gen = ContextGenerator() ctx_gen.set( location=["office", "home", "school"], objects=get_objects_from_db, # callable function circumstances="{csv:circumstances:./data/circumstances.csv}", goals="{llm:Suggest a realistic goal for the context}" ) my_context = ctx_gen.generate() my_context.print()
- Parameters:
attributes – Mapping of attribute name -> generation rule.
- Raises:
ValueError – If any attribute is not defined on the target model.
- generate(n: int = 1, temperature: float = None, seed: int = None, id: int = None, parent_id: int = None, notes: str = None, max_attempts: int = 3) BaseAttributeModel
Generate one or many model instances using random rules, templates, and/or LLM completion.
- Parameters:
n (int) – Number of instances to generate.
temperature (float) – LLM temperature (if LLM used).
seed (int) – Random seed for reproducibility.
id (int) – Optional explicit ID for single-object generation (each object gets its own if multiple).
parent_id (int) – Optional parent ID linkage.
notes (str) – Optional metadata notes.
max_attempts (int) – Maximum retries to fill missing attributes.
- Returns:
A single instance if n == 1, else a list of instances.
- Return type:
Union[BaseAttributeModel, List[BaseAttributeModel]]
- Raises:
ValueError – On missing files referenced in template specifications.
sdialog.interpretability
This submodule provides classes and hooks for inspecting and interpreting the internal representations of PyTorch-based language models during forward passes. It enables the registration of hooks on specific model layers to capture token-level and response-level information, facilitating analysis of model behavior and interpretability. The module is designed to work with conversational agents and integrates with tokenizers and memory structures, supporting the extraction and inspection of tokens, representations, and system instructions across responses.
Typical usage involves attaching one or more Inspector objects to an agent, accumulating response and token data during inference, and providing interfaces for downstream interpretability and analysis tasks.
- class sdialog.interpretability.DirectionSteerer(direction, inspector=None)
Bases:
BaseSteererConcrete Steerer binding a direction vector for additive or subtractive steering.
Example:
import torch from sdialog.agents import Agent from sdialog.interpretability import Inspector, DirectionSteerer agent = Agent() insp = Inspector(target='model.layers.5.post_attention_layernorm') agent = agent | insp direction = torch.randn(4096) # Random direction in activation space steer = DirectionSteerer(direction) # Add the direction (push activations along vector) insp = steer + insp # Or remove its projection: insp = steer - insp agent("Test prompt") # steering applied during generation
- Parameters:
direction (Union[torch.Tensor, np.ndarray]) – Direction vector (torch.Tensor or numpy array).
inspector (Optional[Inspector]) – Optional Inspector to bind immediately.
- class sdialog.interpretability.Inspector(target: Dict | List[str] | str = None, agent: Any | None = None, steering_function: Callable | None = None, steering_interval: Tuple[int, int] | None = ('*', '*'), top_k: int | None = None, lm_head_layer: str | None = 'lm_head', inspect_input: bool = True)
Bases:
objectMain class to manage layer hooks, cached activations, and optional steering functions for an Agent.
Example:
from sdialog.agents import Agent from sdialog.interpretability import Inspector agent = Agent() insp = Inspector(target='model.layers.2.post_attention_layernorm') agent = agent | insp # pipe attach agent("Explain gravity briefly.") # Generates first response agent("Sounds cool!") # Generates second response print("Num responses captured:", len(insp)) print("Last response, first token string:", insp[-1][0]) print("Last response, first token activation:", insp[-1][0].act) # Output: # Num responses captured: 2 # Last response, first token string: <bos> # Last response, first token activation: # tensor([[-0.0109, -0.1128, -0.1216, ..., -0.0157, 0.2100, -0.2637]])
- Parameters:
target (Union[Dict, List[str], str, None]) – Mapping (cache_key->layer_name) or list / single layer name (optional). If None, no hooks are added until add_hooks/add_agent is called. Defaults to None.
agent (Optional[Agent]) – Agent instance to attach to (optional). If provided with a non-empty target, hooks are registered immediately. Defaults to None.
steering_function (Optional[Union[Callable, List[Callable]]]) – Initial steering function or list of functions (optional). Applied to token activations during generation. Defaults to None.
steering_interval (Optional[Tuple[int, int]]) – (min_token, max_token) steering window (optional). Defaults to (“*”, “*”), where “*” means no lower or/and upper bound.
top_k (Optional[int]) – Number of top token predictions to store for each token. If None, logits are not captured. If -1, all tokens in the vocabulary are returned with their logits. Defaults to None.
lm_head_layer (Optional[str]) – Name of the language model head layer (e.g., “lm_head”). Defaults to “lm_head”. If the specified layer is not found, the code will attempt to auto-detect it.
inspect_input (bool) – If True (default), captures activations before the layer processes them (input activations). If False, captures activations after the layer processes them (output activations). Defaults to False.
- property input
- property vocab_size
Return the vocabulary size of the tokenizer.
- Returns:
Vocabulary size (number of tokens in the tokenizer’s vocabulary).
- Return type:
int
- Raises:
ValueError – If no agent is attached.
- add_agent(agent)
Attach an Agent after construction and (re)register hooks if target specified.
- Parameters:
agent (Agent) – Agent instance.
- add_steering_function(steering_function)
Adds a steering function to the inspector’s list of functions.
- Parameters:
steering_function (Callable) – Callable accepting activation tensor.
- add_hooks(target)
Adds hooks to the agent’s model based on the provided target mapping.
- Parameters:
target (Dict) – Dict mapping cache_key -> layer_name to append.
- Raises:
ValueError – If no agent is attached.
- recap()
Prints and returns the current hooks assigned to the inspector’s agent. Also prints the ‘target’ mapping in a clean, readable format. Includes any found instructions across responses.
- find_instructs(verbose=False)
Return list with ‘index’ and ‘content’ for each SystemMessage (excluding first memory) found in the agent’s memory. If verbose is True, also print each.
- Parameters:
verbose (bool) – If True, logs each found instruction.
- Returns:
List of dicts with keys ‘index’ and ‘content’.
- Return type:
List[Dict[str, Union[int, str]]]
sdialog.evaluation
Evaluation components for dialogue generation and analysis.
This module provides classes for evaluating dialogues, including LLM judges, metrics, and similarity scores.
- class sdialog.evaluation.ConversationalFeatures(feature: List[Literal['mean-turn-length', 'hesitation-rate', 'turn-taking-ratio', 'question-rate', 'lexical-diversity', 'back-channel-rate', 'filler-word-density']] | None = None, name: str = None, speaker: str | None = None)
Bases:
BaseDialogScoreCompute conversational and dialogue-specific features.
These metrics measure dialogue structure, speech patterns, and interaction dynamics rather than text readability.
Example:
from sdialog.evaluation import ConversationalFeatures # All conversational features scorer_all = ConversationalFeatures() # Single feature scorer_hes = ConversationalFeatures(feature="hesitation-rate") # Multiple features scorer_multi = ConversationalFeatures(feature=["question-rate", "lexical-diversity"]) print(scorer_all(dialog)) # dict with all feature values print(scorer_hes(dialog)) # single float (hesitation rate) print(scorer_multi(dialog)) # dict with selected features
- Parameters:
feature (Optional[List[Literal["mean-turn-length", "hesitation-rate", "turn-taking-ratio", "question-rate", "lexical-diversity", "back-channel-rate", "filler-word-density"]]]) –
List of feature names to compute. If
None(default) compute all. If the resulting set has size 1,__call__/scorereturns a single float; otherwise a dict. Available features:"mean-turn-length": average number of words per dialogue turn."hesitation-rate": percentage of hesitation tokens over total words (%)."turn-taking-ratio": distribution of turns between speakers(entropy-based, 0=monopolized, 1=balanced).
"question-rate": percentage of turns containing questions (%)."lexical-diversity": type-token ratio measuring vocabulary richness (0-1)."back-channel-rate": percentage of minimal response turns (%)."filler-word-density": percentage of filler words over total words (%).
name (str) – Internal score name (defaults to
"conversational_features"or the single feature name if only one provided).speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered. Note: turn-taking-ratio ignores this parameter as it requires multiple speakers.
- static count_hesitations(text)
Count hesitation tokens in the provided text (e.g., uh, um, hmm).
- Parameters:
text (str) – Input text to search for hesitation markers.
- Returns:
Number of detected hesitation tokens in the provided text.
- Return type:
int
- static count_filler_words(text)
Count filler words in the provided text (e.g., like, you know, I mean, basically).
- Parameters:
text (str) – Input text to search for filler words.
- Returns:
Number of detected filler words in the provided text.
- Return type:
int
- static is_back_channel(turn_text)
Check if a turn is a back-channel response (minimal acknowledgment).
- Parameters:
turn_text (str) – Text of a single turn.
- Returns:
True if the turn is a back-channel response.
- Return type:
bool
- static calculate_turn_taking_ratio(dialog)
Calculate turn-taking balance using normalized entropy.
Returns a value between 0 (monopolized conversation) and 1 (perfectly balanced). Based on Shannon entropy normalized by maximum possible entropy.
- Parameters:
dialog (Dialog) – Dialog object with turns from multiple speakers.
- Returns:
Turn-taking ratio (0-1).
- Return type:
float
- score(dialog: Dialog) float | dict
Compute one or multiple conversational features for the dialogue.
- Parameters:
dialog (Dialog) – Dialogue instance to evaluate.
- Returns:
If exactly one feature is requested, returns a single
float. Otherwise returns adictmapping feature-name to numeric value.- Return type:
Union[float, dict]
- class sdialog.evaluation.ReadabilityScore(feature: List[Literal['gunning-fog', 'flesch-reading-ease', 'coleman-liau', 'linsear-write', 'dale-chall']] | None = None, name: str = None, speaker: str | None = None)
Bases:
BaseDialogScoreCompute one or multiple readability metrics for a dialogue text: Gunning Fog index, Flesch Reading Ease score, Coleman-Liau Index, Linsear Write metric, and Dale-Chall Readability Formula.
These metrics measure text complexity and reading difficulty, not dialogue structure.
Example:
from sdialog.evaluation import ReadabilityScore # All readability metrics scorer_all = ReadabilityScore() # Single metric scorer_flesch = ReadabilityScore(feature="flesch-reading-ease") # Subset of metrics scorer_subset = ReadabilityScore(feature=["gunning-fog", "coleman-liau"]) print(scorer_all(dialog)) # dict with all metric values print(scorer_flesch(dialog)) # single float (Flesch score) print(scorer_subset(dialog)) # dict with the two requested metrics
- Parameters:
feature (Optional[List[Literal["gunning-fog", "flesch-reading-ease", "coleman-liau", "linsear-write", "dale-chall"]]]) –
List of feature names to compute. If
None(default) compute all. If the resulting set has size 1,__call__/scorereturns a single float; otherwise a dict. Available features:"gunning-fog": Gunning Fog readability index."flesch-reading-ease": Flesch Reading Ease score."coleman-liau": Coleman-Liau Index."linsear-write": Linsear Write readability metric."dale-chall": Dale-Chall Readability Formula.
name (str) – Internal score name (defaults to
"readability_score"or the single feature name if only one provided).speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered.
- static calculate_gunning_fog(text)
Compute the Gunning Fog index of the provided text.
- Parameters:
text (str) – Input text.
- Returns:
Gunning Fog index value.
- Return type:
float
- static calculate_flesch_reading_ease(text)
Compute the Flesch Reading Ease score of the provided text.
- Parameters:
text (str) – Input text.
- Returns:
Reading ease score.
- Return type:
float
- static calculate_coleman_liau(text)
Compute the Coleman-Liau Index of the provided text.
The Coleman-Liau Index estimates the U.S. grade level needed to understand the text. It uses character counts instead of syllable counts.
- Parameters:
text (str) – Input text.
- Returns:
Coleman-Liau Index value (minimum 0).
- Return type:
float
- static calculate_linsear_write(text)
Compute the Linsear Write readability metric of the provided text.
The Linsear Write formula estimates the U.S. grade level needed to understand the text. It focuses on easy vs. difficult words (based on syllable count).
- Parameters:
text (str) – Input text.
- Returns:
Linsear Write score.
- Return type:
float
- static calculate_dale_chall(text)
Compute the Dale-Chall Readability Formula score of the provided text.
The Dale-Chall formula uses a list of 3000 familiar words that 80% of 4th-grade students understand. Words not on this list are considered “difficult”.
Note: This implementation uses a simplified approximation based on word length and syllable count as a proxy for the Dale-Chall word list, since the full list is proprietary.
- Parameters:
text (str) – Input text.
- Returns:
Dale-Chall score.
- Return type:
float
- score(dialog: Dialog) float | dict
Compute one or multiple readability metrics for the dialogue.
- Parameters:
dialog (Dialog) – Dialogue instance to evaluate.
- Returns:
If exactly one metric is requested, returns a single
float. Otherwise returns adictmapping metric-name to numeric value.- Return type:
Union[float, dict]
- class sdialog.evaluation.MeanTurnLengthScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the mean turn length (average number of words per turn) for a dialogue.
This is a conversational metric that measures dialogue structure, not text readability.
Example:
from sdialog.evaluation import MeanTurnLengthScore scorer = MeanTurnLengthScore() print(scorer(dialog)) # Outputs mean turn length as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “mean-turn-length”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.TurnLength(name: str = None, speaker: str | None = None)
Bases:
BaseDialogScoreCompute individual turn lengths (number of words per turn) for a dialogue.
Returns a list of word counts for each turn in the dialogue. This is a granular metric that captures turn length distribution, often used as raw input for downstream aggregations (e.g., computing mean or median turn length).
Example:
from sdialog.evaluation import TurnLength scorer = TurnLength() lengths = scorer(dialog) # Returns list of integers print(lengths) # [5, 12, 3, 18, ...] words per turn # Filter by speaker scorer_system = TurnLength(speaker="System") system_lengths = scorer_system(dialog)
- Parameters:
name (Optional[str]) – Optional score name (defaults to “turn-length”).
speaker (Optional[str]) – If set, only turns by this speaker (case-insensitive) are considered.
- class sdialog.evaluation.HesitationRateScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the hesitation rate (percentage of hesitation tokens) for a dialogue.
This is a conversational metric that measures speech disfluencies, not text readability.
Example:
from sdialog.evaluation import HesitationRateScore scorer = HesitationRateScore() print(scorer(dialog)) # Outputs hesitation rate as percentage
- Parameters:
name (Optional[str]) – Optional score name (defaults to “hesitation-rate”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.TurnTakingRatioScore(name: str = None)
Bases:
ConversationalFeaturesCompute the turn-taking ratio (balance of conversation between speakers) for a dialogue.
Returns a value between 0 (monopolized) and 1 (perfectly balanced), based on normalized Shannon entropy of turn distribution across speakers.
Example:
from sdialog.evaluation import TurnTakingRatioScore scorer = TurnTakingRatioScore() print(scorer(dialog)) # Outputs turn-taking ratio (0-1)
- Parameters:
name (Optional[str]) – Optional score name (defaults to “turn-taking-ratio”).
- class sdialog.evaluation.QuestionRateScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the question rate (percentage of turns containing questions) for a dialogue.
This metric measures the interrogative nature of the conversation.
Example:
from sdialog.evaluation import QuestionRateScore scorer = QuestionRateScore() print(scorer(dialog)) # Outputs question rate as percentage
- Parameters:
name (Optional[str]) – Optional score name (defaults to “question-rate”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.LexicalDiversityScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the lexical diversity (type-token ratio) for a dialogue.
Measures vocabulary richness as the ratio of unique words to total words (0-1). Higher values indicate more varied vocabulary.
Example:
from sdialog.evaluation import LexicalDiversityScore scorer = LexicalDiversityScore() print(scorer(dialog)) # Outputs lexical diversity (0-1)
- Parameters:
name (Optional[str]) – Optional score name (defaults to “lexical-diversity”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.BackChannelRateScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the back-channel rate (percentage of minimal response turns) for a dialogue.
Back-channels are brief responses like “yeah”, “okay”, “I see” that indicate active listening without contributing substantial content.
Example:
from sdialog.evaluation import BackChannelRateScore scorer = BackChannelRateScore() print(scorer(dialog)) # Outputs back-channel rate as percentage
- Parameters:
name (Optional[str]) – Optional score name (defaults to “back-channel-rate”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.FillerWordDensityScore(name: str = None, speaker: str | None = None)
Bases:
ConversationalFeaturesCompute the filler word density (percentage of filler words) for a dialogue.
Filler words include “like”, “you know”, “I mean”, “basically”, etc. These differ from hesitations and indicate informal speech patterns.
Example:
from sdialog.evaluation import FillerWordDensityScore scorer = FillerWordDensityScore() print(scorer(dialog)) # Outputs filler word density as percentage
- Parameters:
name (Optional[str]) – Optional score name (defaults to “filler-word-density”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.GunningFogScore(name: str = None, speaker: str | None = None)
Bases:
ReadabilityScoreCompute the Gunning Fog readability index for a dialogue.
The Gunning Fog index estimates the years of formal education needed to understand the text on a first reading. Higher values indicate more complex text.
Example:
from sdialog.evaluation import GunningFogScore scorer = GunningFogScore() print(scorer(dialog)) # Outputs Gunning Fog index as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “gunning-fog”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.FleschReadingEaseScore(name: str = None, speaker: str | None = None)
Bases:
ReadabilityScoreCompute the Flesch Reading Ease score for a dialogue.
The Flesch Reading Ease score rates text on a 100-point scale. Higher scores indicate text that is easier to read. Scores typically range from 0 (very difficult) to 100 (very easy).
Example:
from sdialog.evaluation import FleschReadingEaseScore scorer = FleschReadingEaseScore() print(scorer(dialog)) # Outputs Flesch Reading Ease score as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “flesch-reading-ease”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.ColemanLiauScore(name: str = None, speaker: str | None = None)
Bases:
ReadabilityScoreCompute the Coleman-Liau Index for a dialogue.
The Coleman-Liau Index estimates the U.S. grade level needed to understand the text. Unlike other readability formulas, it uses character counts instead of syllable counts, making it more suitable for automated text analysis.
Example:
from sdialog.evaluation import ColemanLiauScore scorer = ColemanLiauScore() print(scorer(dialog)) # Outputs Coleman-Liau Index as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “coleman-liau”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.LinsearWriteScore(name: str = None, speaker: str | None = None)
Bases:
ReadabilityScoreCompute the Linsear Write readability metric for a dialogue.
The Linsear Write formula estimates the U.S. grade level needed to understand the text. It focuses on easy versus difficult words (based on syllable count) and is particularly useful for technical writing assessment.
Example:
from sdialog.evaluation import LinsearWriteScore scorer = LinsearWriteScore() print(scorer(dialog)) # Outputs Linsear Write score as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “linsear-write”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.DaleChallScore(name: str = None, speaker: str | None = None)
Bases:
ReadabilityScoreCompute the Dale-Chall Readability Formula score for a dialogue.
The Dale-Chall formula uses a list of familiar words that most 4th-grade students understand. Words not on this list are considered “difficult”. This implementation uses a simplified approximation based on word length and syllable count as a proxy for the Dale-Chall word list.
Example:
from sdialog.evaluation import DaleChallScore scorer = DaleChallScore() print(scorer(dialog)) # Outputs Dale-Chall score as float
- Parameters:
name (Optional[str]) – Optional score name (defaults to “dale-chall”).
speaker (Optional[str]) – If set, only turns by this speaker are considered.
- class sdialog.evaluation.ToolSequenceValidator(tool_names: List[str], name: str = 'tool-sequence-validator')
Bases:
BaseDialogScoreValidate that an agent used specific tools in the correct sequence during a dialogue.
This validator checks whether the agent called the specified tools in the expected order based on the dialogue’s event history. It returns 1 if the sequence is valid, 0 otherwise.
Tool names can be prefixed with
"not:"to indicate that the tool must NOT be called before subsequent tools in the list. This allows for flexible validation of tool usage patterns.Example 1: Basic sequence validation
from sdialog.evaluation import ToolSequenceValidator # Validate that tools were called in exact order validator = ToolSequenceValidator(["search_flights", "book_flight", "confirm_booking"]) score = validator(dialog) print(score) # 1 if sequence correct, 0 otherwise
Example 2: Using negative constraints
from sdialog.evaluation import ToolSequenceValidator # Ensure send_receipt is NOT called before charging payment # (don't send receipt before actually charging the customer) # But send_receipt may be called after charge_payment, or not at all validator = ToolSequenceValidator([ "not:send_receipt", "charge_payment", "update_inventory" ]) score = validator(dialog)
Example 3: With evaluators
from sdialog.evaluation import ToolSequenceValidator, FrequencyEvaluator validator = ToolSequenceValidator(["authenticate", "fetch_data", "logout"]) freq_eval = FrequencyEvaluator(validator) # Get percentage of dialogues with correct tool sequence percentage = freq_eval(dialogs) print(f"{percentage * 100:.1f}% of dialogues follow correct sequence")
- Parameters:
tool_names (List[str]) –
List of tool names defining the expected sequence. Each tool name can be: - A plain string (e.g.,
"search_flights"): tool must be called in sequence. - Prefixed with"not:"(e.g.,"not:verify_account"): tool must NOT becalled before the next required tool in the sequence, though it may be called after or omitted entirely.
name (str) – Custom score name (defaults to
"tool-sequence-validator").
Note
Tools must appear in the specified order within the dialogue’s event history.
The first tool in the sequence must come after at least one user utterance.
If a required tool (without
"not:"prefix) is missing, the score is 0.Tools with
"not:"prefix that don’t appear in the dialogue are ignored.
- score(dialog: Dialog) int
Compute the validation score for the dialogue’s tool usage sequence.
Extracts tool calls from the dialogue’s event history and validates that: 1. All required tools (without
"not:"prefix) are present. 2. Tools appear in the specified order. 3. Tools with"not:"prefix do not appear before subsequent tools. 4. The first tool call comes after at least one user utterance.- Parameters:
dialog (Dialog) – Dialogue instance to validate.
- Returns:
1 if the tool sequence is valid, 0 otherwise.
- Return type:
int
Note
Returns 0 if: - The dialogue has no events or tool_names is empty. - A required tool is missing from the event history. - Tools appear in incorrect order. - A
"not:"prefixed tool appears before subsequent tools.
- class sdialog.evaluation.DialogFlowPPL(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_known_edges: bool = False, name: str = None, verbose: bool = False, **d2f_kwargs)
Bases:
BaseDialogFlowScoreCompute flow perplexity-like score of a dialogue against reference dialogues.
Given a collection of reference dialogues, it first builds the dialogue flow graph that represent them. Then, given a candidate dialogue, it computes a flow perplexity-like score (i.e. “how well it fits on the reference graph in terms of perplexity?”).
Example:
from sdialog.evaluation import DialogFlowPPL # reference_dialogs = [...] flow_ppl = DialogFlowPPL(reference_dialogs) value = flow_ppl(candidate_dialog) print("Flow Perplexity:", value)
- Parameters:
reference_dialogues (Union[str, List[Dialog]]) – List of reference dialogues or file path.
ai_speaker (Optional[str]) – If set, restrict scoring to AI/system turns.
k_neighbors (int) – Neighbor count for embedding lookup.
use_softmax (bool) – Whether to weight neighbors via softmax.
use_only_known_edges (bool) – If True, ignore unknown transitions (penalize less).
name (Optional[str]) – Custom score name override.
verbose (bool) – Verbosity flag.
d2f_kwargs (dict) – Extra kwargs to dialog2graph.
- class sdialog.evaluation.DialogFlowScore(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_ai_speaker: bool = False, use_only_known_edges: bool = False, name: str = None, verbose: bool = False, graph=None, nodes=None, **d2f_kwargs)
Bases:
BaseDialogFlowScoreCompute flow likelihood score of a dialogue against reference dialogues.
Given a collection of reference dialogues, it first builds the dialogue flow graph that represent them. Then, given a candidate dialogue, it computes a flow likelihood score based on the geometric mean of edge probabilities (i.e. “how well the dialogue fits on the reference graph”).
Example:
from sdialog.evaluation import DialogFlowScore flow_score = DialogFlowScore(reference_dialogs) print(flow_score(candidate_dialog))
- Parameters:
reference_dialogues (Union[str, List[Dialog]]) – List of reference dialogues or file path.
ai_speaker (Optional[str]) – Restrict scoring to AI/system turns if provided.
k_neighbors (int) – Neighbor count for embedding lookup.
use_softmax (bool) – Whether to weight neighbors via softmax.
use_only_ai_speaker (bool) – If True, only AI turns are used to build the graph and compute the scores.
use_only_known_edges (bool) – If True, only known edges contribute.
name (Optional[str]) – Custom score name.
verbose (bool) – Verbosity flag.
graph (Any) – Pre-built graph (optional).
nodes (dict) – Pre-built node metadata (optional).
d2f_kwargs (dict) – Extra kwargs to dialog2graph.
- class sdialog.evaluation.LLMJudgeYesNo(prompt_template: str, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
BaseDialogScore,BaseLLMJudgeLLM judge for classifying a dialogue as “yes or no” (boolean) output and reason.
Example:
from sdialog.evaluation import LLMJudgeYesNo magic_judge = LLMJudgeYesNo("Is this dialogue magical?", reason=True) result = magic_judge.judge(dialog) print(result.positive) print(result.reason)
- Parameters:
prompt_template (str) – Jinja2 template for judging prompt.
reason (bool) – Whether to request reason field.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or model name.
llm_kwargs (dict) – Extra LLM initialization kwargs.
- judge(dialogs: Dialog | List[Dialog], reason: bool = None, **template_kwargs) LLMJudgeYesNoOutput | int
Run judgment over one or multiple dialogues.
- Parameters:
- Returns:
Structured yes/no output model.
- Return type:
- class sdialog.evaluation.LLMJudgeScore(prompt_template: str, min_score: float = 1, max_score: float = 5, score_type: type = <class 'int'>, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
BaseDialogScore,BaseLLMJudgeLLM judge for scoring a dialogue with a numerical score and optional reason.
Example 1:
from sdialog.evaluation import LLMJudgeScore magic_judge = LLMJudgeScore("From 1 to 5, how magical is this dialogue?", reason=True) result = magic_judge.judge(dialog) print(result.score) print(result.reason)
Example 2:
from sdialog.evaluation import LLMJudgeScore # You can use the `min_score`, `max_score`, `score_type` and/or `reason` parameters # as variables in your prompt template. prompt = ( "On a scale from {{ min_score }} to {{ max_score }}, " "how magical is this dialogue?" "Provide a {{ score_type }} score." ) magic_judge = LLMJudgeScore(prompt, min_score=1, max_score=10, score_type=int) result = magic_judge.judge(dialog) print(result.score) print(result.reason)
- Parameters:
prompt_template (str) – Jinja2 template text.
min_score (float) – Minimum allowed score.
max_score (float) – Maximum allowed score.
score_type (type) – int or float score type.
reason (bool) – Whether to request reason field.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or model name.
llm_kwargs (dict) – Extra LLM kwargs.
- judge(dialogs: Dialog | List[Dialog], reason: bool = None, **template_kwargs) LLMJudgeScoreOutput
Produce a numeric judgment for one or more dialogues.
- Parameters:
- Returns:
Structured output containing the score and an optional reason.
- Return type:
- class sdialog.evaluation.LLMJudgeRealDialog(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
LLMJudgeYesNoLLM judge for classifying a dialogue as real (human) or synthetic (machine-generated), with boolean output and reason. Returns an instance of LLMJudgeYesNoOutput.
Example:
from sdialog.evaluation import LLMJudgeRealDialog judge_real = LLMJudgeRealDialog(reason=True) result = judge_real.judge(dialog) print("Real?", result.positive) print("Reason:", result.reason)
- Parameters:
reason (bool) – Whether to request reason.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.
llm_kwargs (dict) – Additional LLM kwargs.
- class sdialog.evaluation.LLMJudgeRealDialogLikertScore(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
LLMJudgeScoreLLM judge for evaluating whether a dialogue appears real (human) or synthetic (machine-generated), providing a Likert score between 1 (definitely synthetic) and 5 (definitely real), with optional reason.
Example:
from sdialog.evaluation import LLMJudgeRealDialogLikertScore judge_real = LLMJudgeRealDialogLikertScore(reason=True) result = judge_real.judge(dialog) # score = judge_real(dialog) print("Likert Score:", result.score) # score from 1 to 5 print("Reason:", result.reason)
- Parameters:
reason (bool) – Request reason flag.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.
llm_kwargs (dict) – Extra LLM kwargs.
- class sdialog.evaluation.LLMJudgeRealDialogScore(min_score: int = 0, max_score: int = 10, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
LLMJudgeScoreLLM judge for evaluating how “real” (human-like) or “synthetic” a dialogue appears on a configurable numeric range.
Example:
from sdialog.evaluation import LLMJudgeRealDialogScore judge_real = LLMJudgeRealDialogScore(min_score=0, max_score=10, reason=True) result = judge_real.judge(dialog) # score = judge_real(dialog) print("Score:", result.score) # score from 0 to 10 print("Reason:", result.reason)
- Parameters:
min_score (int) – Minimum realism score.
max_score (int) – Maximum realism score.
reason (bool) – Request reason flag.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.
llm_kwargs (dict) – Extra LLM kwargs.
- class sdialog.evaluation.LLMJudgeRefusal(reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
LLMJudgeYesNoLLM judge for evaluating if a dialogue contains a refusal response.
Example:
from sdialog.evaluation import LLMJudgeRefusal judge_refusal = LLMJudgeRefusal(reason=True) result = judge_refusal.judge(dialog) print("Refused?", result.positive) print("Reason:", result.reason)
- Parameters:
reason (bool) – Request reason flag.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.
llm_kwargs (dict) – Extra LLM kwargs.
- class sdialog.evaluation.LLMJudgePersonaAttributes(persona: BaseAttributeModel, speaker: str, reason: bool = False, model: langchain_core.language_models.base.BaseLanguageModel | str = None, **llm_kwargs)
Bases:
LLMJudgeYesNoLLM judge for evaluating if a speaker follows the persona attributes in a dialogue.
Example:
from sdialog.personas import Doctor from sdialog.evaluation import LLMJudgePersonaAttributes reference_persona = Doctor(name="Dr. Smith", specialty="cardiology") judge_persona = LLMJudgePersonaAttributes(persona=reference_persona, speaker="Doctor", reason=True) result = judge_persona.judge(dialog) print("Matches persona?", result.positive) print("Reason:", result.reason)
- Parameters:
persona (BasePersona) – Persona definition object.
speaker (str) – Target speaker in dialogue.
reason (bool) – Request reason flag.
model (Optional[Union[BaseLanguageModel, str]]) – Model instance or name.
llm_kwargs (dict) – Additional LLM kwargs.
- class sdialog.evaluation.SentenceTransformerDialogEmbedder(model_name: str = 'sentence-transformers/LaBSE', mean: bool = True, ai_speaker: str = None, name: str = None, verbose: bool = False)
Bases:
BaseDialogEmbedderDialog embedder using SentenceTransformer. Can embed a dialog as the mean of turn embeddings or as a single embedding of the whole dialog text.
Example:
from sdialog.evaluation import SentenceTransformerDialogEmbedder dialog_embedder = SentenceTransformerDialogEmbedder(model_name="sentence-transformers/LaBSE") emb = dialog_embedder(dialog) print(emb.shape)
- Parameters:
model_name (str) – SentenceTransformer model name.
mean (bool) – If True average per-turn embeddings; else encode concatenated text.
ai_speaker (Optional[str]) – If set, restrict embedding to AI/system turns only.
name (Optional[str]) – Optional custom embedder name.
verbose (bool) – Show progress bars for encoding.
- class sdialog.evaluation.ReferenceCentroidEmbeddingEvaluator(dialog_embedder: BaseDialogEmbedder, reference_dialogues: str | List[Dialog], name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
BaseDatasetEmbeddingEvaluatorEvaluator comparing candidate centroid to a reference centroid via cosine similarity.
Example:
from sdialog.evaluation import SentenceTransformerDialogEmbedder from sdialog.evaluation import ReferenceCentroidEmbeddingEvaluator dialog_embedder = SentenceTransformerDialogEmbedder() evaluator = ReferenceCentroidEmbeddingEvaluator(dialog_embedder, reference_dialogs) # How far are the candidate dialogs from the reference dialogues? (centroid-wise) print(evaluator(candidate_dialogs))
- Parameters:
dialog_embedder (BaseDialogEmbedder) – Dialog embedding component.
reference_dialogues (Union[str, List[Dialog]]) – List of reference Dialog objects or path.
name (Optional[str]) – Optional evaluator name.
enable_plotting (bool) – Store embeddings for plotting if True.
verbose (bool) – Verbosity flag.
- class sdialog.evaluation.KDEDistanceEvaluator(dialog_score: BaseDialogScore, reference_dialogues: str | List[Dialog] = None, metric: str = 'kl', kde_bw: float = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None, **evaluator_kwargs)
Bases:
BaseDatasetScoreEvaluatorEvaluate distribution divergence between reference and candidate dialog scores using KDE.
Example:
from sdialog.evaluation import KDEDistanceEvaluator, GunningFogScore # Any dialog score can be used, let's use `GunningFogScore` as an example kde_eval = KDEDistanceEvaluator(dialog_score=GunningFogScore(), reference_dialogues=reference_dialogs) print("KL divergence:", kde_eval(candidate_dialogs))
- Parameters:
dialog_score (BaseDialogScore) – Per-dialog scoring object.
reference_dialogues (Optional[Union[str, List[Dialog]]]) – Reference Dialog list or path (optional if score object has attribute).
metric (str) – Divergence metric: “kl”, “cs”, or “all”.
kde_bw (Optional[float]) – Bandwidth override for KDE.
name (Optional[str]) – Evaluator name.
enable_plotting (bool) – Keep distributions for plotting.
verbose (bool) – Verbosity flag.
evaluator_kwargs (dict) – Extra kwargs to parent initializer.
- class sdialog.evaluation.FrechetDistanceEvaluator(dialog_score: BaseDialogScore, reference_dialogues: str | List[Dialog] = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None, **evaluator_kwargs)
Bases:
BaseDatasetScoreEvaluatorEvaluate Frechet distance between Gaussian fits of reference and candidate score distributions.
Example:
from sdialog.evaluation import FrechetDistanceEvaluator, ConversationalFeatures # Any dialog score can be used, let's use `ConversationalFeatures` as an example turn_length = ConversationalFeatures(feature="mean-turn-length") fd_eval = FrechetDistanceEvaluator(dialog_score=turn_length, reference_dialogues=reference_dialogs) print("Frechet distance:", fd_eval(candidate_dialogs))
- Parameters:
dialog_score (BaseDialogScore) – Per-dialog scoring object.
reference_dialogues (Optional[Union[str, List[Dialog]]]) – List or path of reference dialogues.
name (Optional[str]) – Evaluator name.
enable_plotting (bool) – Retained for API parity (not used directly here).
verbose (bool) – Verbosity flag.
evaluator_kwargs (dict) – Extra parent kwargs.
- class sdialog.evaluation.FrechetBERTDistanceEvaluator(reference_dialogues: str | List[Dialog], ai_speaker: str = None, name: str = None, model_name: str = 'roberta-base', batch_size: int = 128, device: str = None, enable_plotting: bool = False, verbose: bool = False)
Bases:
BaseDatasetEvaluatorFrechet distance evaluator based on BERT sentence-pair embeddings. See: https://aclanthology.org/2021.findings-acl.193/
Example:
from sdialog.evaluation import FrechetBERTDistanceEvaluator fb_distance = FrechetBERTDistanceEvaluator(reference_dialogs) print(fb_distance(candidate_dialogs))
- Parameters:
reference_dialogues (Union[str, List[Dialog]]) – Reference dialogues (list or path).
ai_speaker (Optional[str]) – If set, restrict to AI response pairs.
name (Optional[str]) – Evaluator name.
model_name (str) – Underlying transformer model.
batch_size (int) – Batch size for encoding.
device (Optional[str]) – Torch device override.
enable_plotting (bool) – Store embeddings for later plotting.
verbose (bool) – Verbosity flag.
- __call__(dialogues: str | List[Dialog], dataset_name: str = 'candidate') float
Compute Frechet distance between reference embedding distribution and candidate.
- Parameters:
dialogues (Union[str, List[Dialog]]) – Candidate dialogues (list or path).
dataset_name (str) – Label for candidate dataset.
- Returns:
Frechet distance (>= 0).
- Return type:
float
- plot(show: bool = True, save_path: str = None)
Plot t-SNE projection of sentence-pair embeddings for reference and candidates.
- Parameters:
show (bool) – Display the figure.
save_path (Optional[str]) – Path to save figure (if provided).
- Returns:
None
- Return type:
None
- class sdialog.evaluation.PrecisionRecallDistanceEvaluator(reference_dialogues: str | List[Dialog], ai_speaker: str = None, num_clusters=20, num_angles=1001, num_runs=10, name: str = None, model_name: str = 'roberta-base', batch_size: int = 128, device: str = None, verbose: bool = False)
Bases:
BaseDatasetEvaluatorPrecision-Recall distance evaluator based on BERT embeddings. See: https://aclanthology.org/2021.findings-acl.193/
Example:
from sdialog.evaluation import PrecisionRecallDistanceEvaluator pr_distance = PrecisionRecallDistanceEvaluator(reference_dialogs) print(pr_distance(candidate_dialogs))
- Parameters:
reference_dialogues (Union[str, List[Dialog]]) – Reference dialogues (list or path).
ai_speaker (Optional[str]) – If set, restrict to AI response pairs.
num_clusters (int) – Number of k-means clusters.
num_angles (int) – Angular resolution for PRD curve.
num_runs (int) – Repetition count when distributions unbalanced.
name (Optional[str]) – Evaluator name.
model_name (str) – Underlying transformer model.
batch_size (int) – Batch size for embedding.
device (Optional[str]) – Torch device override.
verbose (bool) – Verbosity flag.
- __call__(dialogues: str | List[Dialog], dataset_name: str = None) dict | float
Compute maximum F1 score along PRD curve (averaged if size mismatch).
- Parameters:
dialogues (Union[str, List[Dialog]]) – Candidate dialogues (list or path).
dataset_name (Optional[str]) – Label for candidate dataset.
- Returns:
Max F1 value.
- Return type:
float
- class sdialog.evaluation.StatsEvaluator(dialog_score: BaseDialogScore, stat: Literal['mean', 'std', 'min', 'max', 'median'] | None = None, metric: Literal['mean', 'std', 'min', 'max', 'median'] | None = None, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
BaseDatasetScoreEvaluatorStatistics evaluator (mean/std/min/max/median).
Example:
from sdialog.evaluation import StatsEvaluator, LexicalDiversityScore # Any dialog score can be used, let's use `LexicalDiversityScore` as an example lexical_diversity = LexicalDiversityScore() stats_eval = StatsEvaluator(lexical_diversity) mean_eval = StatsEvaluator(lexical_diversity, stat="mean") stats = stats_eval(candidate_dialogs) mean = mean_eval(candidate_dialogs) # Print descriptive statistics for hesitation rate print(stats) # {'mean': ..., 'std': ..., ...} print("Mean hesitation rate:", mean) # Mean hesitation rate: ...
- Parameters:
dialog_score (BaseDialogScore) – Dialog scoring component.
stat (Optional[Literal["mean", "std", "min", "max", "median"]]) – Target statistic to return (one of ‘mean’, ‘std’, ‘min’, ‘max’, ‘median’). If None, return all.
metric (Optional[Literal["mean", "std", "min", "max", "median"]]) – Deprecated alias for stat.
name (Optional[str]) – Evaluator name.
enable_plotting (bool) – Keep per-dataset scores for plotting.
verbose (bool) – Verbosity flag.
- class sdialog.evaluation.MeanEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
StatsEvaluatorEvaluator for computing the mean of dialog scores. This class is a thin wrapper around StatsEvaluator with stat=”mean”.
Example:
from sdialog.evaluation import MeanEvaluator, ReadabilityScore flesch_score = ReadabilityScore(feature="flesch-reading-ease") mean_eval = MeanEvaluator(flesch_score) print("Average Flesch reading ease:", mean_eval(candidate_dialogs))
- Parameters:
dialog_score (BaseDialogScore) – Dialog scoring component.
name (Optional[str]) – Evaluator name.
enable_plotting (bool) – Keep scores for plotting.
verbose (bool) – Verbosity flag.
- class sdialog.evaluation.FrequencyEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
BaseDatasetScoreEvaluatorEvaluator for computing the frequency or percentage of dialogues scored as 1 / True (e.g., refusal responses).
Example:
from sdialog.evaluation import FrequencyEvaluator, LLMJudgeRealDialog judge_real = LLMJudgeRealDialog() freq = FrequencyEvaluator(judge_real) print(freq(dialogs)) # Outputs proportion of dialogues judged as real
- Parameters:
dialog_score (BaseDialogScore) – Dialog scoring component producing binary outputs.
name (Optional[str]) – Evaluator name.
enable_plotting (bool) – Retained for API parity (not used directly).
verbose (bool) – Verbosity flag.
- class sdialog.evaluation.DatasetComparator(evaluators: BaseDatasetEvaluator | List[BaseDatasetEvaluator])
Bases:
objectRun multiple evaluators over several dialog datasets and collect results.
Example:
from sdialog.evaluation import LLMJudgeRealDialog, DialogFlowPPL from sdialog.evaluation import FrequencyEvaluator, MeanEvaluator from sdialog.evaluation import DatasetComparator # Dialog scores judge_real = LLMJudgeRealDialog() flow_score = DialogFlowScore(reference_dialogs) # Comparator with two evaluators comparator = DatasetComparator(evaluators=[FrequencyEvaluator(judge_real), MeanEvaluator(flow_score)]) comparator({"modelA": modelA_dialogs, # print table by default "modelB": modelB_dialogs}) # results = comparator({"modelA": modelA_dialogs, # "modelB": modelB_dialogs}, # output="dict") # return results as dict comparator.plot() # plot results for each evaluator that support it
- Parameters:
evaluators (Union[BaseDatasetEvaluator, List[BaseDatasetEvaluator]]) – Single evaluator instance or list of evaluator instances.
- __call__(candidates: str | List[Dialog] | List[str] | List[List[Dialog]] | Dict[str, str] | Dict[str, List[Dialog]], digits: int = 2, output: str | type = 'markdown') dict
Evaluate multiple candidate datasets with all evaluators.
- Parameters:
candidates (Union[str, List[Dialog], List[str], List[List[Dialog]], Dict[str, str], Dict[str, List[Dialog]]]) – Collection of datasets (lists/paths/dicts of Dialog objects).
digits (int) – Decimal precision for tabular output.
output (Union[str, type]) – Output format: ‘dict’, ‘markdown’, or ‘table’.
- Returns:
Results mapping (dataset -> metric -> value) if output=’dict’; otherwise prints a table.
- Return type:
Optional[dict]
- Raises:
ValueError – If candidates empty or output format unsupported.
- plot(show: bool = True, save_folder_path: str = None)
Call plot() on each evaluator that supports it.
- Parameters:
show (bool) – Whether to display plots.
save_folder_path (Optional[str]) – Directory to save plots (one file per evaluator).
- Returns:
None
- Return type:
None
- sdialog.evaluation.Comparator
alias of
DatasetComparator
sdialog.evaluation.base
Base and abstract evaluation components.
Provides abstract interfaces and utilities to:
Embed a dialog (BaseDialogEmbedder)
Score a single dialog (BaseDialogScore / BaseDialogFlowScore)
Aggregate dialog scores across datasets (BaseDatasetScoreEvaluator)
Aggregate dialog embeddings-based scores across datasets (BaseDatasetEmbeddingEvaluator)
Judge dialogs with an LLM (BaseLLMJudge)
These abstractions standardize evaluation pipelines for synthetic dialog generation.
- class sdialog.evaluation.base.LLMJudgeYesNoOutput(*, reason: str | List[str] | None = None, positive: bool | List[bool])
Bases:
BaseModelStructured output used by yes/no LLM judgments.
- Parameters:
positive (Union[bool, List[bool]]) – Boolean (or list of booleans) indicating classification outcome(s).
reason (Optional[Union[str, List[str]]]) – Optional explanatory reason (string or list).
- reason: str | List[str] | None
- positive: bool | List[bool]
- class sdialog.evaluation.base.LLMJudgeScoreOutput(*, reason: str | None = None, score: int | float = None)
Bases:
BaseModelStructured output used by numeric score LLM judgments.
- Parameters:
score (Union[int, float]) – Numeric score (int or float).
reason (Optional[str]) – Optional explanatory reason.
- reason: str | None
- score: int | float
- class sdialog.evaluation.base.BaseDialogEmbedder(name: str | None = None)
Bases:
ABCBase class for dialog embedding models.
Subclasses must implement the abstract method:
embed(dialog: Dialog) -> np.ndarrayExample:
from sdialog.evaluation.base import BaseDialogEmbedder import numpy as np # Custom embedder to embed dialogues as random N-d embeddings class RndEmbedder(BaseDialogEmbedder): def __init__(self, n=256): self.n = n def embed(self, dialog): return np.random.rand(self.n) # Create a new embedder for 128-d embeddings and embed some dialogues rnd_embedder = RndEmbedder(n=128) for d in dialogues: emb = rnd_embedder(d) print(emb.shape) # (128,)
- Parameters:
name (Optional[str]) – Optional name identifier for the embedder.
- class sdialog.evaluation.base.BaseDialogScore(name: str | None = None, ai_speaker: str = None)
Bases:
ABCBase class for computing a scalar score for a single dialog.
Subclasses must implement the abstract method:
score(dialog: Dialog) -> floatExample:
from sdialog.evaluation.base import BaseDialogScore from sdialog import Dialog, Turn # Custom score class to count the number of turns in a dialogue class TurnCountScore(BaseDialogScore): def score(self, dialog): return len(dialog.turns) # Create a new instance of our score turn_counter = TurnCountScore() d = Dialog(turns=[Turn(speaker="u", text="Hi"), Turn(speaker="s", text="Hello")]) print(turn_counter(d)) # Outputs: 2
- Parameters:
name (Optional[str]) – Name of the score (used in reporting).
ai_speaker (Optional[str]) – If provided, restrict scoring to turns spoken by this AI speaker (case-insensitive).
- class sdialog.evaluation.base.BaseDialogFlowScore(reference_dialogues: str | List[Dialog], ai_speaker: str = None, k_neighbors: int = 64, use_softmax: bool = True, use_only_ai_speaker: bool = False, use_closest_as_centroid_emb: bool = False, graph=None, nodes=None, name: str = None, verbose: bool = False, **d2f_kwargs)
Bases:
BaseDialogScoreBase class for flow-based dialog scores using a reference dialog graph.
Builds (or reuses) a flow graph from reference dialogs, encodes turns, retrieves nearest nodes, and derives transition probabilities. Serves as the foundation for perplexity / likelihood style scores (e.g., DialogFlowPPL, DialogFlowScore).
This is an abstract class (extends
BaseDialogScore) and cannot be instantiated directly. Subclasses must implement the abstract method:score(dialog: Dialog) -> float- Parameters:
reference_dialogues – List of Dialog objects or path to a serialized dialog file.
ai_speaker – If provided, only system/AI speaker turns are considered in scoring.
k_neighbors – Number of neighbors for softmax aggregation.
use_softmax – If True, weight neighbor probabilities via softmax, else pick top-1.
use_only_ai_speaker – If True, only AI turns are used to build the graph and compute the scores.
use_closest_as_centroid_emb – If True, use closest utterance embeddings as cluster centroids.
graph – Optional precomputed graph object to reuse (bypasses construction).
nodes – Optional precomputed node metadata dictionary.
name – Optional score name override (auto if None).
verbose – Verbosity flag forwarded to graph construction.
d2f_kwargs – Additional dialog2graph customization parameters.
- Raises:
ValueError – If reference_dialogues is invalid.
- get_node_sequence(dialog: Dialog, probs: bool = False) List[str]
Map each turn to its nearest node and optionally return transition probabilities.
- Parameters:
dialog (Dialog) – Dialog to map.
probs (bool) – If True, also return per-transition probability estimates.
- Returns:
List of node IDs or (node_sequence, probability_sequence) if probs=True.
- Return type:
Union[List[str], Tuple[List[str], List[Optional[float]]]]
- Raises:
ValueError – If a dialog speaker is not found in graph metadata.
- compute_dialog_log_likelihood(dialog: Dialog) Tuple[float, int]
Compute cumulative log-probability statistics for a dialog using Laplace smoothing.
Laplace smoothing approach (add-one smoothing): - For each transition: P_laplace(dest|src) = (count(src→dest) + 1) / (sum_outbound_counts + V) - Known edges: Use edge frequency count + 1 - Unknown edges: Use count of 1 (equivalent to adding a pseudo-count) - Both are normalized by (total_outbound_count + V) where V is vocabulary size
This provides a principled probability distribution that: 1. Smooths all transitions (known and unknown) consistently 2. Avoids zero probabilities for unseen transitions 3. Doesn’t modify the graph structure (scoring-time smoothing)
- Returns four values:
sum_log_p_known: Sum of log probabilities only over known edges. n_turns_known: Count of contributing turns with known edges (includes initial offset). sum_log_p: Sum over all considered turns (with Laplace smoothing). n_turns: Total counted turns (includes initial offset; respects ai_speaker filtering).
- Parameters:
dialog (Dialog) – Dialog to evaluate.
- Returns:
Tuple (sum_log_p_known, n_turns_known, sum_log_p, n_turns).
- Return type:
Tuple[float, int, float, int]
- Raises:
ValueError – If a speaker is missing from metadata.
- class sdialog.evaluation.base.BaseDatasetEvaluator
Bases:
ABCBase class for dataset evaluators.
Dataset evaluators take a set of dialogs and return an evaluation. Typically, Dataset evaluator subclasses will take a dialogue score (BaseDialogScore object) when created and will return an aggregate of the per-dialog scores.
Subclasses must implement the abstract method:
__call__(dialogues, dataset_name: Optional[str] = None, **kwargs) -> Union[dict, float]Example:
from sdialog.evaluation.base import BaseDatasetEvaluator from sdialog import Dialog, Turn class CountDialogsEvaluator(BaseDatasetEvaluator): def __call__(self, dialogues, dataset_name=None): return len(dialogues) dialog_counter = CountDialogsEvaluator() dialogs = [Dialog(turns=[Turn(speaker="u", text="Hi")]) for _ in range(3)] print(dialog_counter(dialogs)) # Outputs: 3
- abstractmethod __call__(dialogues: str | List[Dialog], dataset_name: str = None, **kwargs) dict | float
Evaluate a collection of dialogues.
- Parameters:
dialogues (Union[str, List[Dialog]]) – List of Dialog objects or a path to a serialized file.
dataset_name (Optional[str]) – Optional label for the dataset.
kwargs (dict) – Additional evaluator-specific parameters.
- Returns:
Evaluation results (scalar or dict).
- Return type:
Union[float, dict]
- Raises:
NotImplementedError – If not implemented in subclass.
- class sdialog.evaluation.base.BaseDatasetScoreEvaluator(dialog_score: BaseDialogScore, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
BaseDatasetEvaluatorBase class for dataset-level aggregation of per-dialog scores.
Dataset score evaluators take a dialog score (BaseDialogScore object) when created and given a collection of dialogs, aggregate their individual scores to return a single value for the collection.
Subclasses must implement the abstract methods:
__eval__(dialog_scores: List[Union[float, int]]) -> Union[dict, float](Optional)
__plot__(dialog_scores: Dict[str, np.ndarray], plot: Optional[plt.Axes] = None) -> None
Example:
import numpy as np from sdialog.evaluation import QuestionRateScore from sdialog.evaluation.base import BaseDatasetScoreEvaluator # Let's create our average score evaluator # We need to implement only the __eval__ method (and optionally the __plot__ method) # In practice, don't use this example class for average, but better use the built-in `MeanEvaluator`! class AverageEvaluator(BaseDatasetScoreEvaluator): def __plot__(self, dialog_scores, plot=None): pass # no-op def __eval__(self, dialog_scores): return np.mean(dialog_scores) avg_evaluator = AverageEvaluator( dialog_score=QuestionRateScore(speaker="System") ) print(avg_evaluator(dialogs)) # Outputs average system's question rate across dialogues
- Parameters:
dialog_score (BaseDialogScore) – Dialog-level scoring component.
name (Optional[str]) – Optional evaluator name (auto-derived if None).
enable_plotting (bool) – Whether to keep per-dataset scores for plotting.
verbose (bool) – Whether to keep tqdm bars visible.
- __call__(dialogues: str | List[Dialog], dataset_name: str = None, return_scores: bool = False) dict | float
Compute per-dialog scores then aggregate.
- Parameters:
dialogues (Union[str, List[Dialog]]) – Iterable of Dialog objects or path.
dataset_name (Optional[str]) – Label for the dataset (default ‘candidate’).
return_scores (bool) – If True also return raw score array(s).
- Returns:
Aggregated results or (results, raw_scores) if return_scores=True.
- Return type:
Union[dict, float, Tuple[Union[dict, float], np.ndarray]]
- Raises:
KeyboardInterrupt – If user interrupts (partial results saved).
- clear()
Clear stored per-dataset raw scores history.
- plot(show: bool = True, save_path: str = None, **kwargs)
Generate plots for stored dataset scores.
- Parameters:
show (bool) – Whether to display the plot(s).
save_path (Optional[str]) – If provided, save figure(s) to this path (metric name appended when multi-metric).
kwargs (dict) – Additional keyword arguments for plotting.
- Returns:
None
- Return type:
None
- class sdialog.evaluation.base.BaseDatasetEmbeddingEvaluator(dialog_embedder: BaseDialogEmbedder, name: str = None, enable_plotting: bool = True, verbose: bool = False, plot_title: str = None, plot_xlabel: str = None, plot_ylabel: str = None)
Bases:
BaseDatasetEvaluatorBase class for dataset-level evaluation using dialog embeddings.
It takes a dialog embedder (BaseDialogEmbedder object) when created and given a collection of dialogs, computes their embeddings and returns a single value for the collection.
- Subclasses must implement:
__eval__(dialog_embs: List[np.ndarray]) -> Union[dict, float]` and
__plot__(dialog_embs: Dict[str, np.ndarray], tsne_model: TSNE, plot: Optional[plt.Axes]) -> None
Example:
from sdialog import SentenceTransformerDialogEmbedder from sdialog.evaluation.base import BaseDatasetEmbeddingEvaluator # Evaluator that computes the average centroid cosine distance to a reference centroid # We need to implement only the __eval__ method (and optionally the __plot__ method) # (in practice, use the built-in ReferenceCentroidEmbeddingEvaluator!) class ReferenceCentroidEmbeddingEvaluator(BaseDatasetEmbeddingEvaluator): def __plot__(self, dialog_embs): pass # no-op def __init__(self, dialog_embedder, reference_dialogs): self.reference_centroid = np.mean( [dialog_embedder(dialog) for dialog in reference_dialogs], axis=0 ) def __eval__(self, dialog_embs): centroid = np.mean(dialog_embs, axis=0) return cosine(centroid, self.reference_centroid) dialog_embedder = SentenceTransformerDialogEmbedder(model_name="sentence-transformers/LaBSE") centroid_evaluator = ReferenceCentroidEmbeddingEvaluator(dialog_embedder=dialog_embedder, reference_dialogs=reference_dialogs) print(centroid_evaluator(dialogs)) # distance between candidate and reference dialogues # (cosine distance between their centroids)
- Parameters:
dialog_embedder (BaseDialogEmbedder) – Dialog embedding component.
name (Optional[str]) – Optional evaluator name (auto-derived if None).
enable_plotting (bool) – Whether to store embeddings for plotting.
verbose (bool) – Verbosity flag for progress bars.
- __call__(dialogues: str | List[Dialog], dataset_name: str = None, return_embs: bool = False) dict | float
Embed dialogs and aggregate their embeddings to a single score.
- Parameters:
dialogues (Union[str, List[Dialog]]) – Iterable of Dialogs or path.
dataset_name (Optional[str]) – Dataset label (default ‘candidate’).
return_embs (bool) – If True return (results, embeddings_array).
- Returns:
Aggregated evaluation or (results, embeddings) if return_embs.
- Return type:
Union[dict, float, Tuple[Union[dict, float], np.ndarray]]
- clear()
Clear stored per-dataset embeddings.
- plot(show: bool = True, save_path: str = None)
Plot embeddings (e.g., via subclass t-SNE projection) for stored datasets.
- Parameters:
show (bool) – Whether to display the plot.
save_path (Optional[str]) – If provided, save plot to this path.
- Returns:
None
- Return type:
None
- Subclasses must implement:
- class sdialog.evaluation.base.BaseLLMJudge(model: langchain_core.language_models.base.BaseLanguageModel | str = None, prompt_template: str = '', output_format: dict | BaseModel = None, **llm_kwargs)
Bases:
ABCBase class for all LLM-based evaluation judges that render a prompt and return an output. This is the base class of built-in base judges like
LLMJudgeYesNoorLLMJudgeScore.Subclasses must implement the abstract method:
judge(dialogs: Union[Dialog, List[Dialog]]) -> dictExample:
from sdialog.evaluation.base import BaseLLMJudge from sdialog import Dialog, Turn
import os
- class MagicJudge(BaseLLMJudge):
- def judge(self, dialog):
# Render prompt (dialog -> text) prompt = self.prompt_template.render(dialog=dialog) raw = self(prompt) # call underlying LLM return raw # normally you’d parse into structured output
magic_judge = MagicJudge(prompt_template=”How magical is the following dialogue? Dialogue:n{{ dialog }}”) print(magic_judge.judge(dialog)) # Outputs raw LLM response
- param model:
Model instance or model name (falls back to config if None).
- type model:
Union[BaseLanguageModel, str]
- param prompt_template:
Jinja2 template string used to build the human prompt.
- type prompt_template:
str
- param output_format:
Optional Pydantic schema or JSON schema dict for structured output.
- type output_format:
Union[dict, BaseModel]
- param llm_kwargs:
Additional model instantiation parameters overriding config.
- type llm_kwargs:
dict
- __call__(prompt: str) dict | BaseModel
Invoke the underlying LLM with the given rendered prompt.
- Parameters:
prompt (str) – Fully rendered human prompt content.
- Returns:
Raw model response or structured output (depending on output_format).
- Return type:
Union[dict, BaseModel]
- prompt(system: bool = False) str
Return the current system or human prompt text.
- Parameters:
system (bool) – If True return system prompt; else return last human prompt.
- Returns:
Prompt text.
- Return type:
str
sdialog.datasets
This module provides utilities for loading, parsing, and describing dialogue datasets, including the STAR dataset. It supports extracting scenarios, flowcharts, personas, and constructing dataset-specific Agent objects for simulation.
- class sdialog.datasets.STAR
Bases:
objectUtility class for interacting with the STAR dialogue dataset.
Github: https://github.com/RasaHQ/STAR
Provides methods for loading dialogues, extracting scenarios, flowcharts, responses, and constructing Agent objects for simulation and evaluation.
- static set_path(path)
Sets the root path for the STAR dataset.
- Parameters:
path (str) – Path to the STAR dataset root directory.
- Returns:
None
- Return type:
None
- static read_graph(task_name, as_dot: bool = True)
Read the action graph for a given task.
- Parameters:
task_name (Union[str, dict]) – Name of the task (folder name under tasks/), or a pre-loaded JSON dict (as returned by the task’s JSON file) to skip disk I/O.
as_dot (bool) – If True, return a DOT string; else return the raw graph dict.
- Returns:
Graph in DOT format or raw dictionary mapping edges.
- Return type:
Union[str, dict]
- static read_graph_responses(task_name, as_dict: bool = False)
Read example responses associated with each node/action in a task graph.
Placeholders of the form {variable[:format]} are uppercased for visibility.
- Parameters:
task_name (Union[str, dict]) – Name of the task, or a pre-loaded responses dict to skip disk I/O.
as_dict (bool) – If True, return a dict; otherwise a JSON-formatted string.
- Returns:
Mapping node -> example response, or JSON dump.
- Return type:
Union[dict, str]
- static get_task_names()
List all available task names (directory names under tasks/).
- Returns:
List of task names.
- Return type:
List[str]
- static get_dialog(id)
Load a dialogue by numeric ID.
- Parameters:
id (int) – Dialogue ID (filename without extension).
- Returns:
Dialog object with turns and events populated.
- Return type:
- static get_dialogs(domain: str = None, task_name: str = None, happy: bool = None, multitask: bool = None)
Load all dialogues matching optional filter criteria.
- Parameters:
domain (Optional[str]) – Domain filter (must appear in Scenario[‘Domains’]).
task_name (Optional[str]) – Task name filter (must appear in WizardCapabilities).
happy (Optional[bool]) – Filter by ‘happy path’ flag.
multitask (Optional[bool]) – Filter by multitask flag.
- Returns:
List of Dialog objects matching filters.
- Return type:
List[Dialog]
- static get_dialog_scenario(id)
Load scenario metadata for a given dialogue.
- Parameters:
id (int) – Dialogue ID.
- Returns:
Scenario dictionary.
- Return type:
dict
- static get_dialog_first_turn(id, speaker: str = None)
Get the first turn for a given dialogue (optionally constrained to a speaker).
- Parameters:
id (int) – Dialogue ID.
speaker (Optional[str]) – Speaker name filter (e.g., ‘User’ or ‘Wizard’); if None, first participant turn is returned.
- Returns:
First matching turn or None if not found.
- Return type:
Optional[Turn]
- static get_dialog_task_names(id)
Get all task names (WizardCapabilities -> Task) for a dialogue.
- Parameters:
id (int) – Dialogue ID.
- Returns:
List of task names.
- Return type:
List[str]
- static get_dialog_responses(id)
Get example response dictionaries for each task in a dialogue.
- Parameters:
id (int) – Dialogue ID.
- Returns:
List of response dicts (one per task).
- Return type:
List[dict]
- static get_dialog_graphs(id)
Get raw action graphs (dict form) for all tasks in a dialogue.
- Parameters:
id (int) – Dialogue ID.
- Returns:
List of graph dicts.
- Return type:
List[dict]
- static get_dialog_events(id)
Get all events for a dialogue.
- Parameters:
id (int) – Dialogue ID.
- Returns:
List of event dictionaries from the JSON file.
- Return type:
List[dict]
- static get_dialog_user_instructions(id)
Get user guide instructions mapped to the (user) turn index where each applies.
- Parameters:
id (int) – Dialogue ID.
- Returns:
Mapping user_turn_index -> instruction text.
- Return type:
Dict[int, str]
- static get_dialog_graphs_and_responses(id)
Convenience loader returning both graphs and responses for all tasks.
- Parameters:
id (int) – Dialogue ID.
- Returns:
Tuple (graphs list, responses list).
- Return type:
Tuple[List[dict], List[dict]]
- static get_scenario_description(scenario)
Build a natural language description of a scenario including embedded DOT graphs and example responses.
- Parameters:
scenario (dict) – Scenario dictionary (as returned by get_dialog_scenario).
- Returns:
Multi-section textual description.
- Return type:
str
- static get_dialog_scenario_description(id)
Retrieve scenario metadata and its natural language description.
- Parameters:
id (int) – Dialogue ID.
- Returns:
Tuple (scenario dict, description string).
- Return type:
Tuple[dict, str]
- static get_user_persona_for_scenario(scenario)
Construct a Persona object representing the user under the given scenario.
- Parameters:
scenario (dict) – Scenario metadata.
- Returns:
User persona object.
- Return type:
- static get_flowchart_description_for_scenario(scenario)
Build a markdown-like description with DOT graphs and example responses for each task.
- Parameters:
scenario (dict) – Scenario metadata.
- Returns:
Combined flowchart description text.
- Return type:
str
- static get_system_persona_for_scenario(scenario)
Construct a Persona object representing the system/assistant for the scenario.
- Parameters:
scenario (dict) – Scenario metadata.
- Returns:
System persona object.
- Return type:
- static get_agents_for_scenario(scenario, model_name: str = None)
Create (system, user) Agent objects for a scenario (personas only; no orchestration).
- static get_agents_from_dialogue(id, model_name: str = None, set_first_utterance: bool = False)
Create (system, user) Agent objects derived from a dialogue’s scenario.
Optionally set an initial first system utterance (heuristic).
- static get_agents_from_dialogue_with_orchestration(id, model_name: str = None, set_first_utterance: bool = False)
Create (system, user) Agent objects with attached orchestrators for responses/instructions.
- Parameters:
id (int) – Dialogue ID.
model_name (Optional[str]) – Optional model name / identifier for agent LLM configuration.
set_first_utterance (bool) – If True, assign a first system utterance.
- Returns:
Tuple (system_agent_with_orchestrator, user_agent_with_orchestrator).
- Return type:
sdialog.config
This module loads and processes the configuration for the sdialog package.
- sdialog.config.llm(llm_name, **llm_kwargs)
Update the LLM model setting in the config.
- Parameters:
llm_name (str) – The name of the LLM model to set.
- sdialog.config.llm_params(**params)
Update the LLM hyperparameters in the config.
- Parameters:
params (dict) – Dictionary of hyperparameter names and values.
- sdialog.config.cache(enable)
Enable or disable caching.
- Parameters:
enable (bool) – Whether to enable caching or not.
- sdialog.config.cache_path(path)
Set the path for the cache directory.
- Parameters:
path (str) – The new path for the cache directory.
- sdialog.config.set_cache(path, enable=True)
Set the cache path and enable/disable caching.
- Parameters:
path (str) – The path to the cache directory.
enable (bool) – Whether to enable caching or not.
- sdialog.config.clear_cache()
Clear the cache by deleting all files in the cache directory.
- sdialog.config.set_persona_dialog_generator_prompt(path)
Set the path for the persona_dialog_generator prompt.
- Parameters:
path (str) – The new path for the prompt file.
- sdialog.config.set_persona_generator_prompt(path)
Set the path for the persona_generator prompt.
- Parameters:
path (str) – The new path for the prompt file.
- sdialog.config.set_dialog_generator_prompt(path)
Set the path for the dialog_generator prompt.
- Parameters:
path (str) – The new path for the prompt file.
- sdialog.config.set_persona_agent_prompt(path)
Set the path for the persona_agent prompt.
- Parameters:
path (str) – The new path for the prompt file.
sdialog.util
Utility Functions for sdialog
This module provides helper functions for the sdialog package, including serialization utilities to ensure objects can be safely converted to JSON for storage or transmission.
- class sdialog.util.CacheDialogScore
Bases:
objectStatic class for caching utility for dialog scoring functions keyed by: (score class name, score object JSON-serializable attributes, dialog._path).
Provides static methods to initialize, enable/disable, persist, and clear cache.
- static cache(func)
Decorator adding disk-backed caching for dialog scoring functions.
Cache key includes:
score object class name
JSON-serializable attributes of score object
dialog._path (must exist)
- Parameters:
func (callable) – Target scoring function
(score_obj, dialog, *args, **kwargs).- Returns:
Wrapped function with caching logic.
- Return type:
callable
- static clear()
Clear in-memory cache and persist empty structure.
- Returns:
None
- Return type:
None
- static get_cache()
Get internal cache dictionary.
- Returns:
Current in-memory cache mapping.
- Return type:
dict
- static get_cache_path() str
Get absolute path to cache JSON file.
- Returns:
Path to cache file.
- Return type:
str
- Raises:
ValueError – If init() not called first.
- static init(path, enable_cache=True)
Initialize cache system (load existing cache file if present).
- Parameters:
path (str) – Directory path where cache file resides / will reside.
enable_cache (bool) – Whether to enable caching immediately.
- Returns:
None
- Return type:
None
- static is_cache_enabled() bool
Check if caching is enabled.
- Returns:
True if enabled.
- Return type:
bool
- static save()
Persist cache dictionary to JSON file (if enabled).
- Returns:
None
- Return type:
None
- Raises:
ValueError – If not initialized.
- static set_cache_path(path: str)
Set cache file path (creates directory if needed).
- Parameters:
path (str) – Base directory for cache file.
- Returns:
None
- Return type:
None
- static set_enable_cache(enable: bool)
Enable or disable the cache.
- Parameters:
enable (bool) – True to enable caching, False to disable.
- Returns:
None
- Return type:
None
- class sdialog.util.KNNModel(items, k=3)
Bases:
objectThin wrapper around sklearn NearestNeighbors for cosine similarity retrieval.
- Parameters:
items (Iterable[Tuple[Any, Sequence[float]]]) – Iterable of (item_id, embedding_vector) pairs.
k (int) – Default number of neighbors to retrieve.
- __call__(target_emb, k=None)
Retrieve k nearest neighbors by cosine distance.
- Parameters:
target_emb (Sequence[float]) – Query embedding vector.
k (int) – Override number of neighbors (defaults to self.k).
- Returns:
List of (item_id, distance) pairs ordered by proximity.
- Return type:
List[Tuple[Any, float]]
- neighbors(target_emb, k=None)
Retrieve k nearest neighbors by cosine distance.
- Parameters:
target_emb (Sequence[float]) – Query embedding vector.
k (int) – Override number of neighbors (defaults to self.k).
- Returns:
List of (item_id, distance) pairs ordered by proximity.
- Return type:
List[Tuple[Any, float]]
- class sdialog.util.SentencePairTransformer(model_name: str = 'roberta-base', device: str = None, verbose: bool = True)
Bases:
objectTransformer wrapper producing CLS embeddings for paired sentences (similar to NLI encoding).
- Parameters:
model_name (str) – Hugging Face model name.
device (str) – Explicit device (
"cpu"/"cuda:*"); auto-detected if None.verbose (bool) – Enable verbose progress display.
- encode(sent1: str | List[str], sent2: str | List[str], batch_size: int = 128, show_progress_bar: bool = True, progress_bar_desc: str = 'Computing embeddings') numpy.ndarray
Encode aligned sentence pairs into CLS embeddings.
- Parameters:
sent1 (Union[str, List[str]]) – First sentence or list of first sentences.
sent2 (Union[str, List[str]]) – Second sentence or list of second sentences.
batch_size (int) – Batch size for encoding.
show_progress_bar (bool) – Whether to show progress bar.
progress_bar_desc (str) – Description label for progress bar.
- Returns:
Array of shape (N, hidden_size) containing CLS embeddings.
- Return type:
np.ndarray
- sdialog.util.camel_or_snake_to_words(varname: str) str
Convert camelCase or snake_case identifier into normalized spaced words.
- Parameters:
varname (str) – Identifier string.
- Returns:
Human-readable spaced words.
- Return type:
str
- sdialog.util.check_valid_model_name(func)
Decorator ensuring first argument (model_name) is a string; otherwise short-circuits to False.
- Parameters:
func (callable) – The predicate function to wrap.
- Returns:
Wrapped function enforcing a str model_name.
- Return type:
callable
- sdialog.util.dialogs_to_utt_pairs(dialogs: List[BaseModel], ai_speaker: str = None) Tuple[List[str], List[str]]
Extract utterance -> next utterance (adjacent turn) pairs from dialogs.
- Two modes:
Sliding window mode (ai_speaker is None): pairs every turn with its successor.
QA mode (ai_speaker given): pairs each human turn immediately preceding an AI turn (speaker match).
- Parameters:
dialogs (List[BaseModel]) – List of dialog-like Pydantic objects each having a .turns list with .text and .speaker.
ai_speaker (str) – Optional name (case-insensitive) of the AI speaker to filter answer turns.
- Returns:
Tuple (utterances, next_utterances) of equal length.
- Return type:
Tuple[List[str], List[str]]
- Raises:
ValueError – If no turns found, or lengths mismatch, or ai_speaker filtering yields nothing.
- sdialog.util.dict_to_table(data: dict, sort_by: str = None, sort_ascending: bool = True, markdown: bool = False, format: str = '.2f', show: bool = True) str
Render a dict-of-dicts as a table (Markdown or fancy grid).
- Parameters:
data (dict) – Mapping where each value is itself a mapping of column -> value.
sort_by (str) – Column name to sort by.
sort_ascending (bool) – Sort ascending if True.
markdown (bool) – If True produce GitHub-flavored Markdown table.
format (str) – Float formatting specifier passed to pandas.
show (bool) – If True print table to stdout.
- Returns:
Table as string.
- Return type:
str
- sdialog.util.get_llm_default_params(model_name: str, llm_params: dict, retry: bool = True) float
Get the default parameters for the model if not already specified, and merges them into llm_params.
- Parameters:
model_name (str) – LLM model name.
llm_params (dict) – Existing LLM parameter dictionary to update in-place.
- Returns:
Updated llm_params with defaults filled.
- Return type:
dict
- sdialog.util.get_llm_model(model_name: str, output_format: dict | BaseModel = None, return_model_params: bool = False, think: bool = False, tools: List = None, **llm_kwargs)
Instantiate a LangChain chat model (OpenAI, AWS, Google, Ollama, Hugging Face).
Applies backend-specific adjustments (e.g., removing unsupported params). Optionally wraps model for structured output if output_format provided and supported.
- Parameters:
model_name (Union[str, Any]) – Model name or instance.
output_format (Union[dict, BaseModel, type[BaseModel]]) – Pydantic model class or JSON schema dict for structured output.
return_model_params (bool) – If True, return (model, llm_kwargs) tuple instead of just model.
think (bool) – If True, enables “thinking” segments in responses.
tools (List[langchain_core.tools.structured.StructuredTool]) – Optional list of tool functions to enable.
llm_kwargs (dict) – Additional backend-specific model kwargs.
- Returns:
Configured LangChain model (possibly wrapped for structured output).
- Return type:
Any
- Raises:
ValueError – If model_name is invalid type.
- sdialog.util.get_timestamp() str
Return current UTC timestamp in ISO 8601 format (seconds precision, trailing ‘Z’).
- Returns:
ISO 8601 UTC timestamp.
- Return type:
str
- sdialog.util.get_universal_id() str
Generates a unique identifier (UUID4) for sdialog objects.
- Returns:
A unique identifier as a string.
- Return type:
str
- sdialog.util.is_amazon_model_name(model_name, *args, **kwargs)
- sdialog.util.is_anthropic_model_name(model_name, *args, **kwargs)
- sdialog.util.is_azure_openai_model_name(model_name, *args, **kwargs)
- sdialog.util.is_google_genai_model_name(model_name, *args, **kwargs)
- sdialog.util.is_huggingface_model_name(model_name, *args, **kwargs)
- sdialog.util.is_ollama_model_name(model_name, *args, **kwargs)
- sdialog.util.is_openai_model_name(model_name, *args, **kwargs)
- sdialog.util.make_serializable(data: dict) dict
Convert non-JSON-serializable values in a dictionary to strings (in-place mutation).
- Parameters:
data (dict) – Dictionary to sanitize.
- Returns:
The mutated dictionary with serializable values.
- Return type:
dict
- Raises:
TypeError – If input is not a dict.
- sdialog.util.ollama_check_and_pull_model(model_name: str) bool
Ensure an Ollama model is available locally (pull if missing).
- Parameters:
model_name (str) – Model name (may include ‘ollama:’ prefix).
- Returns:
True if available or successfully pulled; False otherwise.
- Return type:
bool
- sdialog.util.remove_audio_tags(text: str) str
Remove tags of the form <…>. (Despite the summary mentioning {}, (), [], only angle brackets are removed.)
- Parameters:
text (str) – Input text possibly containing markup tags.
- Returns:
Text with angle-bracket tags removed.
- Return type:
str
- sdialog.util.remove_newlines(s: str) str
Replace all whitespace (including newlines) with single spaces and collapse repeats.
- Parameters:
s (Any) – Input value (non-str inputs are returned unchanged).
- Returns:
Normalized single-line string or original object if not str.
- Return type:
str
- sdialog.util.set_generator_seed(generator, seed)
Attempt to set a deterministic seed on the underlying LLM (if supported); fallback to torch.manual_seed.
Also applies a workaround for certain Ollama caching issues by forcing an initial trivial generation.
- Parameters:
generator (Any) – Object containing .llm and (optionally) .messages or .memory.
seed (int) – Desired seed; if None a random 32-bit value is generated.
- Returns:
The seed actually used (or None if unsupported).
- Return type:
int
- sdialog.util.softmax(values, temperature=0.05, as_list=True)
Compute softmax over a 1D iterable of numeric values.
- Parameters:
values (Iterable[float]) – Sequence of numeric scores.
temperature (float) – Temperature divisor (lower = sharper distribution).
as_list (bool) – If True return a Python list; otherwise a torch tensor.
- Returns:
Softmax probability distribution.
- Return type:
Union[List[float], torch.Tensor]
- sdialog.util.upper_camel_to_dash(name: str) str
Convert UpperCamelCase to dash-case (preserving acronym groups).
- Parameters:
name (str) – Class or identifier name.
- Returns:
dash-case form.
- Return type:
str
sdialog.server
OpenAI-compatible RESTful API server for SDialog agents.
This module provides a Server class that serves SDialog agents as OpenAI-compatible chat completion endpoints, enabling integration with OpenAI-compatible clients like Open WebUI.
- class sdialog.server.Server
Bases:
objectStatic server class for serving SDialog agents as OpenAI-compatible API.
This server provides OpenAI-compatible chat completion endpoints that can be used with clients like Open WebUI. The server handles agent memory internally, only processing new messages while maintaining conversation context.
- classmethod add_agent(agent: Agent, model_name: str = None) None
Add an agent to the server without starting it.
- Parameters:
agent (Agent) – The SDialog agent to add.
model_name (str) – Model name to use for the agent.
- classmethod list_agents() List[str]
List all registered agent model names.
- Returns:
List of model names.
- Return type:
List[str]
- classmethod remove_agent(model_name: str) None
Remove an agent from the server.
- Parameters:
model_name (str) – Model name of the agent to remove.
- classmethod reset_agent(model_name: str, seed: int | None = None) None
Reset an agent’s memory and state.
- Parameters:
model_name (str) – Model name of the agent to reset.
seed (Optional[int]) – Optional seed for the reset.
- classmethod serve(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None
Serve SDialog agents as an OpenAI-compatible RESTful API.
This method automatically detects the environment and chooses the appropriate server startup method. In standard environments (command line, scripts), it uses uvicorn.run(). In Jupyter notebooks or other environments with existing event loops, it automatically falls back to a threaded server.
- Parameters:
agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.
host (str) – Host address to bind the server to.
port (int) – Port number to bind the server to.
stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.
model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).
log_level (str) – Logging level for the server.
Example:
from sdialog import Persona from sdialog.agents import Agent from sdialog.server import Server # Create two agents user = Agent(persona=Persona(name="Dr. Nebula", role="Astrobotanist seeking alien spores"), name="Scientist") bot = Agent(persona=Persona(name="StationCore", role="Sarcastic habitat control AI"), name="Bot") # Serve them as an OpenAI-compatible API Server.serve([user, bot], port=1333) # Output: # Starting server for agents on localhost:1333 # > 2 registered agents: Scientist:latest, Bot:latest
- async classmethod serve_async(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None
Serve SDialog agents as an OpenAI-compatible RESTful API (async version).
This method is designed for use in environments with existing event loops, such as Jupyter notebooks, where uvicorn.run() would fail.
- Parameters:
agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.
host (str) – Host address to bind the server to.
port (int) – Port number to bind the server to.
stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.
model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).
log_level (str) – Logging level for the server.
- classmethod serve_in_thread(agents: Agent | List[Agent], host: str = '0.0.0.0', port: int = 1333, stateless: bool = True, model_names: str | List[str] | None = None, log_level: str = 'info') None
Serve SDialog agents in a separate thread (alternative for Jupyter).
This method runs the server in a separate thread, allowing it to coexist with Jupyter’s event loop without conflicts. It’s automatically used as a fallback by the main serve() method when an event loop conflict is detected.
- Parameters:
agents (Union[Agent, List[Agent]]) – The SDialog agent or a list of agents to serve.
host (str) – Host address to bind the server to.
port (int) – Port number to bind the server to.
stateless (bool) – If True, the agent will not maintain memory between requests and the full context must be provided with each request.
model_names (Optional[Union[str, List[str]]]) – Model names to expose in the API (defaults to agent’s name).
log_level (str) – Logging level for the server.
- Returns:
The thread object running the server.
- Return type:
threading.Thread