Getting Started
Install DELM and run your first extraction pipeline in minutes.
Installation
Install from PyPI:
pip install delm
Or install with optional dependencies (PDF, Excel, alternative caching, etc.). The quotes keep shells such as zsh from interpreting the square brackets:
pip install "delm[extras]"
Environment Variables
DELM requires an API key for each LLM provider you use. Set the corresponding environment variables before running DELM.
For a complete list of supported providers and their required environment variable names, see the Instructor documentation.
Quick example: for OpenAI, set:
export OPENAI_API_KEY="sk-..."
Optional: If you prefer using .env files with python-dotenv:
from dotenv import load_dotenv
load_dotenv()
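To fail early when a key is missing, a small check can run before the pipeline is created. This is a minimal sketch using only the standard library; `missing_api_keys` is a hypothetical helper, not part of DELM, and the variable name is the OpenAI one from the example above.

```python
import os


def missing_api_keys(required=("OPENAI_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty.

    Hypothetical helper, not part of DELM. `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_api_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

If you load keys from a .env file, run the check after `load_dotenv()` so that those values are visible in `os.environ`.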
Define Your Schema
Import the necessary classes and define what you want to extract:
from delm import DELM, Schema, ExtractionVariable
# Define extraction schema
schema = Schema.nested(
    ExtractionVariable(
        name="commodity_type",
        description="Type of commodity mentioned",
        data_type="string",
        required=True,
    ),
    ExtractionVariable(
        name="price_value",
        description="Price value mentioned",
        data_type="number",
        required=False,
    ),
    # Keyword arguments must come after positional ones in Python
    container_name="commodities",
)
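To make the intent of the schema concrete, the sketch below mirrors the two variables in plain Python and checks a single extracted item against them. It is not DELM's validation logic: `FIELDS` and `check_item` are hypothetical names, and the reading of `required` and `data_type` (a required field must be present, a "number" is an int or float) is an assumption.

```python
from typing import Any

# Hypothetical plain-Python mirror of the schema above:
# (field name, accepted Python types, required flag).
FIELDS = [
    ("commodity_type", (str,), True),
    ("price_value", (int, float), False),
]


def check_item(item: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one extracted item (empty if none)."""
    problems = []
    for name, types, required in FIELDS:
        value = item.get(name)
        if value is None:
            # Assumption: optional fields may be absent or null.
            if required:
                problems.append(f"missing required field: {name}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, types):
            problems.append(f"unexpected type for {name}: {type(value).__name__}")
    return problems


print(check_item({"commodity_type": "oil", "price_value": 75.0}))
print(check_item({"price_value": 75.0}))
```

A check like this can be handy for spot-checking extracted items, since LLM output may not always match the shape you asked for.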
Run Extraction
Create a DELM pipeline and extract structured data from your text:
import pandas as pd
# Initialize pipeline
delm = DELM(
    schema=schema,
    provider="openai",
    model="gpt-4o-mini",
    temperature=0.0,
)
# Prepare input data
data = pd.DataFrame({
    "text": [
        "Oil prices rose to $75 per barrel while gold fell to $1,850 per ounce.",
    ]
})
# Run extraction
results = delm.extract(data)
print(results)
Understanding Results
The results DataFrame contains your original data plus the extracted information. For the example above, DELM should extract one entry per commodity mentioned:
Input text: "Oil prices rose to $75 per barrel while gold fell to $1,850 per ounce."
Extracted data:
{
  "commodities": [
    {
      "commodity_type": "oil",
      "price_value": 75.0
    },
    {
      "commodity_type": "gold",
      "price_value": 1850.0
    }
  ]
}
The results DataFrame includes all your original columns plus extraction results:
| text | delm_record_id | delm_chunk_id | delm_extracted_data_json |
|---|---|---|---|
| Oil prices rose to $75 per barrel... | 0 | 0 | {"commodities": [{"commodity_type": "oil", "price_value": 75.0}, ...]} |
| ... | ... | ... | ... |
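Because each row stores its extraction as nested JSON, it is often convenient to flatten it to one row per commodity. The sketch below does this with the standard library only. It assumes the `delm_extracted_data_json` column holds a JSON string (a parsed dict is accepted too), and it works on a list of row dicts such as the one `results.to_dict("records")` would produce. `flatten_commodities` is a hypothetical helper, not part of DELM.

```python
import json

# Stand-in for results.to_dict("records"); values copied from the example above.
records = [
    {
        "text": "Oil prices rose to $75 per barrel while gold fell to $1,850 per ounce.",
        "delm_record_id": 0,
        "delm_chunk_id": 0,
        "delm_extracted_data_json": (
            '{"commodities": ['
            '{"commodity_type": "oil", "price_value": 75.0}, '
            '{"commodity_type": "gold", "price_value": 1850.0}]}'
        ),
    }
]


def flatten_commodities(records, container="commodities"):
    """Return one flat dict per extracted item, keeping the DELM id columns."""
    rows = []
    for record in records:
        payload = record["delm_extracted_data_json"]
        # Assumption: the column holds a JSON string; accept a parsed dict as well.
        if isinstance(payload, str):
            payload = json.loads(payload)
        for item in (payload or {}).get(container, []):
            rows.append({
                "delm_record_id": record["delm_record_id"],
                "delm_chunk_id": record["delm_chunk_id"],
                **item,
            })
    return rows


flat = flatten_commodities(records)
for row in flat:
    print(row)
```

Passing the returned list to `pd.DataFrame(...)` gives a flat table that can be joined back to the original data on `delm_record_id`.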