Configuration Objects

DELM keeps runtime settings in dataclasses with strict validation. Review each configuration type below to understand default values, validation rules, and serialization behaviour.

delm.config.LLMExtractionConfig dataclass

Bases: BaseConfig

Configuration for the LLM extraction process.

get_provider_string

get_provider_string() -> str

Return the combined provider string for Instructor.

Returns:
  • str

    Provider string in the form "<provider>/<model>".

validate

validate()

Validate all LLM extraction fields.

Raises:
  • ValueError

    If any field has an invalid value.
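The two methods above can be illustrated with a minimal, self-contained sketch. It does not import DELM: the class name LLMSettingsSketch, its fields (provider, name, temperature), their defaults, and the specific checks are assumptions made for illustration only. What it does take from the reference above is the "<provider>/<model>" format and the use of ValueError for invalid values.

```python
from dataclasses import dataclass


@dataclass
class LLMSettingsSketch:
    """Hypothetical stand-in; field names and defaults are assumptions, not DELM's."""

    provider: str = "openai"
    name: str = "gpt-4o-mini"
    temperature: float = 0.0

    def get_provider_string(self) -> str:
        # Join provider and model name in the "<provider>/<model>" form.
        return f"{self.provider}/{self.name}"

    def validate(self) -> None:
        # Illustrative checks only; the real validation rules may differ.
        if not self.provider or not self.name:
            raise ValueError("provider and name must be non-empty strings")
        if self.temperature < 0:
            raise ValueError("temperature must be non-negative")


settings = LLMSettingsSketch()
settings.validate()
provider_string = settings.get_provider_string()
```

Calling validate() before building the provider string means a malformed setting fails early rather than surfacing later as an opaque provider error.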

delm.config.SplittingConfig dataclass

Bases: BaseConfig

Configuration for text splitting strategy.

from_dict classmethod

from_dict(data: Dict[str, Any]) -> SplittingConfig

Construct a SplittingConfig from a mapping.

Parameters:
  • data (Dict[str, Any]) –

    Mapping with a type key and optional parameters.

Returns:
  • SplittingConfig

    A SplittingConfig instance built from data.

to_dict

to_dict() -> dict

Serialize the strategy configuration to a dictionary.

Returns:
  • dict

    A dictionary with the strategy configuration, or {"type": "None"} when no strategy is configured.
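As a rough illustration of the from_dict / to_dict pairing, the sketch below round-trips a mapping keyed by "type". It is not DELM code: SplittingConfigSketch and ParagraphSplitSketch are hypothetical names, the min_chars parameter is invented, and the lookup logic is an assumption about how a type key might be dispatched. Only the "type" key and the {"type": "None"} sentinel come from the reference above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ParagraphSplitSketch:
    """Hypothetical strategy; not one of DELM's SplitStrategy classes."""

    min_chars: int = 0

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "ParagraphSplitSketch", "min_chars": self.min_chars}


@dataclass
class SplittingConfigSketch:
    strategy: Optional[ParagraphSplitSketch] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SplittingConfigSketch":
        # The "type" key selects the strategy; remaining keys are its parameters.
        params = dict(data)
        kind = params.pop("type", "None")
        if kind in (None, "None"):
            return cls(strategy=None)
        if kind == "ParagraphSplitSketch":
            return cls(strategy=ParagraphSplitSketch(**params))
        raise ValueError(f"Unknown split strategy type: {kind!r}")

    def to_dict(self) -> Dict[str, Any]:
        # With no strategy configured, emit the {"type": "None"} sentinel.
        if self.strategy is None:
            return {"type": "None"}
        return self.strategy.to_dict()


empty = SplittingConfigSketch.from_dict({"type": "None"})
configured = SplittingConfigSketch.from_dict(
    {"type": "ParagraphSplitSketch", "min_chars": 50}
)
```

Keeping the sentinel as a string rather than a null value means the serialized mapping always carries a "type" key, so a reader of the saved file can tell "no strategy" apart from "key omitted".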

validate

validate()

Validate the configured split strategy.

Raises:
  • ValueError

    If strategy is provided but not a SplitStrategy.

delm.config.ScoringConfig dataclass

Bases: BaseConfig

Configuration for relevance scoring strategy.

from_dict classmethod

from_dict(data: Dict[str, Any]) -> ScoringConfig

Construct a ScoringConfig from a mapping.

Parameters:
  • data (Dict[str, Any]) –

    Mapping with a type key and optional parameters.

Returns:
  • ScoringConfig

    A ScoringConfig instance built from data.

to_dict

to_dict() -> dict

Serialize the scoring configuration to a dictionary.

validate

validate()

Validate the configured scorer.

Raises:
  • ValueError

    If scorer is provided but not a RelevanceScorer.

delm.config.DataPreprocessingConfig dataclass

Bases: BaseConfig

Configuration for the data preprocessing pipeline.

from_dict classmethod

from_dict(data: Dict[str, Any]) -> DataPreprocessingConfig

Construct a DataPreprocessingConfig from a mapping.

Tracks which fields were explicitly set to detect conflicts when preprocessed_data_path is used.

Parameters:
  • data (Dict[str, Any]) –

    Mapping of preprocessing options.

Returns:
  • DataPreprocessingConfig

    A DataPreprocessingConfig instance built from data.

to_dict

to_dict() -> dict

Serialize preprocessing configuration.

Returns:
  • dict

    A dictionary representation suitable for YAML serialization.

validate

validate()

Validate the preprocessing configuration.

Raises:
  • ValueError

    If any field is invalid or conflicts are found when preprocessed_data_path is provided.
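One way to picture the explicit-field tracking described above is the sketch below. It is a guess at the general pattern, not DELM's implementation: PreprocessingSketch, the target_column and drop_target_column fields, and the rule that any other explicitly set option conflicts with preprocessed_data_path are all assumptions. Only the preprocessed_data_path name and the ValueError on conflict come from the reference.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Set


@dataclass
class PreprocessingSketch:
    target_column: str = "text"        # hypothetical option
    drop_target_column: bool = False   # hypothetical option
    preprocessed_data_path: Optional[str] = None
    # Names of the keys the caller actually supplied (not constructor input).
    _explicit: Set[str] = field(
        default_factory=set, init=False, repr=False, compare=False
    )

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "PreprocessingSketch":
        known = {"target_column", "drop_target_column", "preprocessed_data_path"}
        unknown = set(data) - known
        if unknown:
            raise ValueError(f"Unknown preprocessing options: {sorted(unknown)}")
        cfg = cls(**data)
        # Remember which keys were given, even if they equal the defaults.
        cfg._explicit = set(data)
        return cfg

    def validate(self) -> None:
        if self.preprocessed_data_path is None:
            return
        # Assumed rule: preprocessed data makes other explicit options moot.
        conflicts = sorted(self._explicit - {"preprocessed_data_path"})
        if conflicts:
            raise ValueError(
                f"Options {conflicts} conflict with preprocessed_data_path"
            )


ok = PreprocessingSketch.from_dict({"preprocessed_data_path": "data.parquet"})
ok.validate()
```

Tracking the supplied keys, rather than comparing values against defaults, lets the check flag an option that was written out explicitly even when its value happens to match the default.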

delm.config.SchemaConfig dataclass

Bases: BaseConfig

Configuration for extraction schema reference and settings.

This config contains:
  • Path to the schema specification file (schema_spec.yaml)
  • Schema‑specific settings (prompts)

The actual schema definition (including container_name) is stored in the separate schema_spec.yaml file.

from_dict classmethod

from_dict(data: Dict[str, Any]) -> SchemaConfig

Construct a SchemaConfig from a mapping.

to_dict

to_dict() -> dict

Serialize schema configuration to a dictionary.

validate

validate()

Validate schema configuration.

Raises:
  • ValueError

    If the spec path does not exist or fields are malformed.

delm.config.SemanticCacheConfig dataclass

Bases: BaseConfig

Persistent semantic‑cache settings.

from_dict classmethod

from_dict(data: Dict[str, Any]) -> SemanticCacheConfig

Construct a SemanticCacheConfig from a mapping.

resolve_path

resolve_path() -> Path

Resolve and return the cache path.

to_dict

to_dict() -> dict

Serialize semantic cache configuration.

validate

validate()

Validate semantic cache configuration.

Raises:
  • ValueError

    If backend or parameters are invalid.

delm.config.DELMConfig dataclass

Bases: BaseConfig

Complete DELM configuration including pipeline and schema reference.

Contains:
  • Pipeline configuration (LLM settings, data preprocessing, etc.)
  • Reference to a separate schema specification file

The configuration can be loaded from:
  • A single pipeline config file (config.yaml) that references a schema file
  • Separate pipeline config and schema spec files

from_any staticmethod

from_any(
    config_like: DELMConfig | dict[str, Any] | str | Path,
) -> DELMConfig

Create DELMConfig from various input types.

Parameters:
  • config_like (DELMConfig | dict[str, Any] | str | Path) –

    Instance of DELMConfig, dict, or path to YAML file.

Returns:
  • DELMConfig

    A configured DELMConfig instance.

Raises:
  • ValueError

    If the input type is unsupported.
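The dispatch that from_any performs can be sketched as a chain of isinstance checks. The code below is a stand-alone illustration under stated assumptions: ConfigSketch is a hypothetical class, returning an existing instance unchanged is assumed rather than documented, and YAML parsing is deliberately left out so the example has no third-party dependencies.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Union


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for a top-level config object."""

    values: Dict[str, Any]

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ConfigSketch":
        return cls(values=dict(data))

    @classmethod
    def from_yaml(cls, path: Path) -> "ConfigSketch":
        if not path.exists():
            raise FileNotFoundError(f"Config file not found: {path}")
        # Parsing is omitted here; a real loader would read the YAML file
        # and pass the resulting mapping to from_dict.
        raise NotImplementedError("YAML parsing is omitted in this sketch")

    @staticmethod
    def from_any(
        config_like: Union["ConfigSketch", Dict[str, Any], str, Path],
    ) -> "ConfigSketch":
        if isinstance(config_like, ConfigSketch):
            return config_like  # assumed: instances pass through unchanged
        if isinstance(config_like, dict):
            return ConfigSketch.from_dict(config_like)
        if isinstance(config_like, (str, Path)):
            return ConfigSketch.from_yaml(Path(config_like))
        raise ValueError(
            f"Unsupported config type: {type(config_like).__name__}"
        )


from_mapping = ConfigSketch.from_any({"batch_size": 8})
same_object = ConfigSketch.from_any(from_mapping)
```

Accepting all four input types in one entry point lets calling code take "a config" without caring whether the user supplied an object, a mapping, or a file path.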

from_dict classmethod

from_dict(data: Dict[str, Any]) -> DELMConfig

Create DELMConfig from a mapping.

from_yaml classmethod

from_yaml(path: Path) -> DELMConfig

Create DELMConfig from a pipeline config YAML file.

Parameters:
  • path (Path) –

    Path to the YAML configuration.

Returns:
  • DELMConfig

    A configured DELMConfig instance.

Raises:
  • FileNotFoundError

    If the file does not exist.

to_dict

to_dict() -> dict

Alias for to_serialized_config_dict for backward compatibility.

to_serialized_config_dict

to_serialized_config_dict() -> dict

Return a dictionary suitable for saving as pipeline config YAML.

to_serialized_schema_spec_dict

to_serialized_schema_spec_dict() -> dict

Load and return the schema spec as a dictionary (schema_spec.yaml).

validate

validate()

Validate all sub‑configurations.
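A composite validate of this kind is commonly written as a loop that delegates to each nested config. The sketch below shows that pattern with hypothetical classes (LLMPartSketch, CachePartSketch, PipelineSketch) and invented fields and allowed values; it is an assumption about the general shape, not a copy of DELMConfig.validate.

```python
from dataclasses import dataclass, field


@dataclass
class LLMPartSketch:
    max_retries: int = 3  # hypothetical field

    def validate(self) -> None:
        if self.max_retries < 0:
            raise ValueError("max_retries must be non-negative")


@dataclass
class CachePartSketch:
    backend: str = "memory"  # hypothetical field and values

    def validate(self) -> None:
        if self.backend not in {"memory", "disk"}:
            raise ValueError(f"Unknown cache backend: {self.backend!r}")


@dataclass
class PipelineSketch:
    llm: LLMPartSketch = field(default_factory=LLMPartSketch)
    cache: CachePartSketch = field(default_factory=CachePartSketch)

    def validate(self) -> None:
        # Delegate to every sub-configuration; the first failure propagates.
        for part in (self.llm, self.cache):
            part.validate()


pipeline = PipelineSketch()
pipeline.validate()
```

Because each part raises its own ValueError, the message that reaches the caller names the specific field that failed rather than a generic top-level error.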