L2P: Language-to-Plan | ICAPS-26 Tutorial

The Framework

Large Language Models (LLMs) have demonstrated strong capabilities in structured code generation, yet their use in automated planning remains underdeveloped.

In planning, correctness is non-negotiable: a model must be syntactically valid and semantically consistent, and the plans derived from it must be executable. This tutorial introduces Language-to-Plan (L2P), a principled framework for generating, validating, and iteratively refining PDDL domain and problem models from natural language descriptions.

Generative: NL-to-PDDL pipelines
Validation: Iterative refinement loops
Integration: Python-based toolkit
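
The generate, validate, refine cycle described above can be sketched in a few lines of plain Python. This is only an illustration of the control flow, not L2P's implementation: `fake_generate` is a hypothetical stand-in for an LLM call, and `check_parens` is a deliberately trivial stand-in for a real PDDL validator.

```python
from typing import Callable, Optional

def check_parens(pddl: str) -> Optional[str]:
    """Return an error message if parentheses are unbalanced, else None."""
    depth = 0
    for ch in pddl:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return "unexpected ')'"
    return None if depth == 0 else f"{depth} unclosed '('"

def refine_loop(generate: Callable[[str, Optional[str]], str],
                description: str, max_rounds: int = 3) -> tuple:
    """Call generate until the validator accepts the output or rounds run out."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(description, feedback)
        feedback = check_parens(candidate)  # validator output becomes the next prompt's feedback
        if feedback is None:
            return candidate, round_no
    raise ValueError(f"no valid model after {max_rounds} rounds: {feedback}")

def fake_generate(description: str, feedback: Optional[str]) -> str:
    # Hypothetical LLM stand-in: the first draft is broken, the second is repaired.
    if feedback is None:
        return "(define (domain blocksworld)"
    return "(define (domain blocksworld))"

model, rounds = refine_loop(fake_generate, "a blocksworld domain")
print(model, rounds)
```

The point of the sketch is that the validator's message, not a human, drives the next generation attempt, and that the loop is bounded so a model that never converges fails loudly.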

Requirements before the Tutorial

Ensure you have the following tools and environment ready before the hands-on session.

LLM Setup: Local (Ollama) or API Key

Option 1: Local model via Ollama (Recommended for the tutorial)

Follow these steps to set up a local LLM using Ollama:

Step 1: Install Ollama

curl -fsSL https://ollama.com/install.sh | sh

Step 2: Pull a model

We recommend llama2:7b or mistral:7b for this tutorial:

ollama pull llama2:7b

Step 3: Verify the model works

ollama run llama2:7b "Hello, how are you?"

Step 4: Use the model with L2P

from l2p.llm.unified import UnifiedLLM

llm = UnifiedLLM(
    provider="ollama",
    model="llama2:7b"
)

response = llm.query("Hello!")
print(response)

Refer to the Ollama setup in the L2P README for additional configuration options including tokenizer settings and generation parameters.
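
Independently of L2P, it can be useful to confirm that the local Ollama server answers HTTP requests. The sketch below builds a request for Ollama's REST endpoint `/api/generate` on its default port 11434. The helper name `build_ollama_request` is ours (hypothetical), and the request is only constructed here; sending it is left commented out because it needs a running server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_ollama_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_ollama_request("llama2:7b", "Hello!")
print(req.get_method(), req.full_url)

# To actually send it (requires the Ollama server to be running):
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(json.loads(resp.read())["response"])
```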

Option 2: Cloud LLM via API key

If you cannot run a local model, obtain an API key from a provider and set it as an environment variable:

# Set your API key (macOS/Linux)
export OPENAI_API_KEY="sk-your-key-here"

# Or set it in Python
import os
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"

Supported providers: OpenAI, Anthropic, DeepSeek, Mistral, Gemini, GLM, Ollama-Cloud.
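
A missing or empty key is a common cause of confusing errors later in the session. The following sketch fails early with a clear message and masks the key before printing it. Both helper names (`require_env`, `mask`) and the variable `L2P_TUTORIAL_DEMO_KEY` are hypothetical and used only for this demonstration; in practice you would pass the variable name of your provider, such as `OPENAI_API_KEY`.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the tutorial code")
    return value

def mask(secret: str, visible: int = 4) -> str:
    """Show only the last few characters so a key is safe to print."""
    visible = max(0, min(visible, len(secret)))
    return "*" * (len(secret) - visible) + secret[len(secret) - visible:]

# Placeholder value for the demo only; never hard-code a real key.
os.environ.setdefault("L2P_TUTORIAL_DEMO_KEY", "sk-your-key-here")
print(mask(require_env("L2P_TUTORIAL_DEMO_KEY")))
```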

Python Environment (3.10+) & Dependencies

Ensure you have Python 3.10 or newer installed. Download Python if needed.
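
A quick way to confirm that the interpreter meets the 3.10 minimum is to compare version tuples. This is a small sketch; the function name `version_ok` is ours.

```python
import sys

MIN_VERSION = (3, 10)  # minimum stated for this tutorial

def version_ok(version=None) -> bool:
    """Compare (major, minor) as integers; string comparison would rank '3.9' above '3.10'."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_VERSION

current = ".".join(map(str, sys.version_info[:3]))
print("Python", current, "ok" if version_ok() else "too old")
```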

Install the required packages:

pip install l2p llm openai

The l2p package provides the core framework; llm is a general-purpose LLM CLI tool and Python library, used by the UnifiedLLM backend; and openai is the OpenAI Python SDK, used by the OPENAI backend.

Text Editor & Virtual Environment

Recommended: Visual Studio Code

Download from code.visualstudio.com. Install the Python extension by Microsoft for syntax highlighting, IntelliSense, and debugging.

Setting up a Virtual Environment

Create and activate a virtual environment to isolate your tutorial dependencies:

# Create virtual environment
python3 -m venv l2p-tutorial-env

# Activate (macOS/Linux)
source l2p-tutorial-env/bin/activate

# Activate (Windows)
l2p-tutorial-env\Scripts\activate

# Install dependencies inside the environment
pip install l2p llm openai

In VS Code, press Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows/Linux), search for "Python: Select Interpreter", and point it to the virtual environment you just created.
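
To check that the selected interpreter really is the one inside the virtual environment, you can compare `sys.prefix` with `sys.base_prefix`; the two differ only inside an active venv. The helper name `in_virtualenv` below is ours.

```python
import sys

def in_virtualenv(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """A venv changes sys.prefix, while sys.base_prefix still points at the base install."""
    return prefix != base_prefix

print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints False, the packages installed with pip went into the system interpreter rather than into l2p-tutorial-env.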

LLM Configuration Files

L2P uses YAML configuration files to connect to LLM backends. The UnifiedLLM and OPENAI classes both derive from the BaseLLM abstract class, which defines the interface for querying language models. The YAML config specifies provider, model, endpoint, and generation parameters. Click the buttons below to download placeholder YAML files - replace the values with your actual configuration.

OpenAI SDK (OPENAI)

Provider configurations for the OpenAI SDK-based LLM backend:

openai, deepseek, mistral, gemini, anthropic, glm, ollama-cloud

llm (UnifiedLLM)

Provider configurations for the UnifiedLLM multi-provider backend:

openai, deepseek, mistral, gemini, anthropic, ollama
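
The abstract-base-class pattern described above can be illustrated without any network access. The sketch below is not L2P's actual `BaseLLM`: `SketchBaseLLM` and `EchoLLM` are hypothetical classes, and the `config` dictionary only mirrors the kind of fields (provider, model, generation parameters) the YAML files are described as holding.

```python
from abc import ABC, abstractmethod

class SketchBaseLLM(ABC):
    """Hypothetical stand-in for an abstract LLM interface (not L2P's BaseLLM)."""

    def __init__(self, provider: str, model: str, **generation_params):
        self.provider = provider
        self.model = model
        self.generation_params = generation_params

    @abstractmethod
    def query(self, prompt: str) -> str:
        """Send a prompt and return the model's text reply."""

class EchoLLM(SketchBaseLLM):
    """Offline backend that echoes the prompt; handy for testing pipelines."""

    def query(self, prompt: str) -> str:
        return f"[{self.provider}/{self.model}] {prompt}"

# Assumed field names, chosen for illustration only.
config = {"provider": "ollama", "model": "llama2:7b", "temperature": 0.0}
llm = EchoLLM(**config)
print(llm.query("Hello!"))
```

Because every backend exposes the same `query` method, pipeline code can swap a cloud provider for a local model, or for an offline stub like this one, without other changes.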

Tutorial Schedule

A three-part tutorial including a hands-on interactive session.

Please bring your own LLM API keys or a machine capable of running local models to participate.

01

Foundations

Why LLMs Should Not Replace Planners

LLMs excel at text generation but lack the soundness guarantees required for planning. A sound planner guarantees that every plan it returns is valid with respect to the given model; an LLM produces text that is only heuristically plausible. Treating LLMs as planners leads to hallucinated actions, invalid state transitions, and plans that cannot be executed.

ACL Survey Paper
LLM-Modulo Framework
LLMs Still Can't Plan; Can LRMs?

Empirical Reasoning Limitations

Current LLMs struggle with causal reasoning, long-horizon dependencies, and maintaining consistent world state across multiple steps. Empirical studies show that even state-of-the-art models fail on simple planning benchmarks, revealing fundamental gaps in their reasoning capabilities.

Chain of Thoughtlessness? An Analysis of CoT in Planning
An Analysis of Iterative Prompting for Reasoning Problems
Can Large Language Models Really Improve by Self-critiquing Their Own Plans?

Separation of Modelling vs. Solving

A key insight of the L2P framework is that writing a PDDL model (the domain and problem) is fundamentally different from solving it. LLMs are well-suited to assist with the creative task of modelling a domain from natural language, while classical planners should handle the search for valid plans. This separation leverages the strengths of both AI paradigms.

On the Limit of Language Models as Planning Formalizers
Language Model Planners do not Scale, but do Formalizers?
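
The separation of modelling from solving can be made concrete with a toy example. In the sketch below, the model is plain data (the part an LLM might help draft), and a generic breadth-first search does the solving. This is an assumption-laden illustration: the ground actions are hand-written for two blocks, and blind BFS is not how planners such as Fast Downward work internally.

```python
from collections import deque
from typing import NamedTuple, Optional

class Action(NamedTuple):
    name: str
    pre: frozenset      # atoms that must hold before the action
    add: frozenset      # atoms made true
    delete: frozenset   # atoms made false

def fs(*atoms: str) -> frozenset:
    return frozenset(atoms)

def bfs_plan(init: frozenset, goal: frozenset, actions: list) -> Optional[list]:
    """Breadth-first search over states; returns a shortest plan or None."""
    frontier = deque([(init, [])])
    seen = {init}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for a in actions:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [a.name]))
    return None

# The model: a hand-written, ground, two-block blocksworld fragment.
actions = [
    Action("unstack-a-b", fs("on a b", "clear a", "arm-empty"),
           fs("holding a", "clear b"), fs("on a b", "clear a", "arm-empty")),
    Action("putdown-a", fs("holding a"),
           fs("on-table a", "clear a", "arm-empty"), fs("holding a")),
    Action("pickup-b", fs("on-table b", "clear b", "arm-empty"),
           fs("holding b"), fs("on-table b", "clear b", "arm-empty")),
    Action("stack-b-a", fs("holding b", "clear a"),
           fs("on b a", "clear b", "arm-empty"), fs("holding b", "clear a")),
]
init = fs("on a b", "on-table b", "clear a", "arm-empty")
goal = fs("on b a")

print(bfs_plan(init, goal, actions))
# ['unstack-a-b', 'putdown-a', 'pickup-b', 'stack-b-a']
```

Every step in the returned plan is applicable by construction, because the search only follows actions whose preconditions hold. That guarantee comes from the solver, regardless of who or what wrote the model.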

PDDL Recap & Failure Modes

A quick refresher on PDDL syntax, types, predicates, actions, and problem definitions - followed by common failure modes when LLMs attempt PDDL generation: type mismatches, undeclared predicates, inconsistent state transitions, and malformed action effects.

planning.wiki PDDL Guide
Understanding the Capabilities of Large Language Models for Automated Planning
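
One of the failure modes listed above, undeclared predicates, is easy to detect mechanically. The following sketch scans a PDDL expression for predicate names that were never declared. It is a rough regex-based check under simplifying assumptions (no numeric fluents, a fixed keyword list), not a PDDL parser, and the function name `undeclared_predicates` is ours.

```python
import re

# A name that directly follows an opening parenthesis.
ATOM = re.compile(r"\(\s*([A-Za-z][\w-]*)")
# Logical connectives that are not predicates (a partial list, assumed sufficient here).
KEYWORDS = {"and", "or", "not", "when", "forall", "exists", "imply"}

def undeclared_predicates(declared: set, expression: str) -> set:
    """Return predicate names used in a PDDL expression but not declared."""
    used = {m.group(1).lower() for m in ATOM.finditer(expression)}
    return used - KEYWORDS - {p.lower() for p in declared}

declared = {"on", "on-table", "holding", "clear", "arm-empty"}
# An effect with a typical LLM slip: "ontable" instead of "on-table".
effect = "(and (holding ?x) (not (ontable ?x)) (not (arm-empty)))"
print(undeclared_predicates(declared, effect))
# {'ontable'}
```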

02

Live L2P Tutorial

A live walkthrough of the L2P toolkit from three perspectives: the end-user, the CLI, and programmatic/agent usage.

Interactive User Workflow

Use l2p init to configure an LLM provider, then l2p generate domain and l2p generate problem for step-by-step, LLM-guided interactive PDDL generation. The l2p chat REPL session supports natural-language-driven PDDL editing with live validation.

CLI Feature Deep-Dive

Stateless commands for building PDDL without an LLM: l2p set to inject individual components, l2p build to assemble full domain/problem files, l2p validate for semantic checking, and l2p plan to run planning backends such as Fast Downward and Unified Planning.

LLM Agent Integration

How LLM agents can use the CLI's JSON-based stateless commands in tool-calling loops: l2p schema --examples to discover expected schemas, l2p build --data to generate full PDDL in one call, and l2p validate for verification - all pipeable for chained agent workflows.

GitHub Repo
Documentation
CLI Reference

03

Hands-On Session

The core interactive component. Use your own local LLM or API keys to build pipelines.

Connect Local LLMs (Ollama) or LLM API

Option 1: Local model via Ollama

L2P supports local models via Ollama through the UnifiedLLM class:

# Shell: install Ollama and pull a model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull llama2:7b

# Python: use UnifiedLLM with the Ollama provider
from l2p.llm.unified import UnifiedLLM

llm = UnifiedLLM(
    provider="ollama",
    model="llama2:7b",
    config_path="l2p/llm/utils/llm.yaml"
)

response = llm.query("Hello, world!")
print(response)

Refer to the Ollama setup in the README for model configuration options including tokenizer settings and generation parameters.

Option 2: Cloud LLM via API key

L2P also supports cloud-based LLMs via API keys. Here is an example using OpenAI:

# Set your API key
import os
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"

# Use UnifiedLLM with OpenAI provider
from l2p.llm.unified import UnifiedLLM

llm = UnifiedLLM(
    provider="openai",
    model="gpt-5-nano",
    api_key=os.getenv("OPENAI_API_KEY"),
    config_path="l2p/llm/utils/llm.yaml"
)

response = llm.query("Hello, world!")
print(response)

This pattern works for any provider supported by the UnifiedLLM backend, including Anthropic, DeepSeek, and Mistral. See the API key setup in the README for details.

Generate Entire Domains/Problems

Generating a Domain with DomainDetails

Use DomainBuilder and DomainDetails to generate a complete PDDL domain from natural language:

import os
from l2p import DomainBuilder, UnifiedLLM
from l2p.utils.pddl_types import DomainDetails

llm = UnifiedLLM(
    provider="openai",
    model="gpt-5-nano",
    api_key=os.getenv("OPENAI_API_KEY")
)
db = DomainBuilder()

results, _ = db.formalize_component(
    model=llm,
    component_class=DomainDetails,
    description="I want you to model a standard blocksworld domain.",
)

domain = results[DomainDetails][0]
print(db.generate_domain(domain))

Generating a Problem with ProblemDetails

Use ProblemBuilder and ProblemDetails to generate a PDDL problem:

import os

from l2p import ProblemBuilder, UnifiedLLM
from l2p.utils.pddl_types import ProblemDetails, PDDLType, Predicate

llm = UnifiedLLM(provider="openai", model="gpt-5-nano", api_key=os.getenv("OPENAI_API_KEY"))
pb = ProblemBuilder()

types = [PDDLType(name="block", parent="object")]
predicates = [
    Predicate(name="on", params=[{"variable": "?x", "type": "block"}, {"variable": "?y", "type": "block"}]),
    Predicate(name="on-table", params=[{"variable": "?x", "type": "block"}]),
    Predicate(name="holding", params=[{"variable": "?x", "type": "block"}]),
    Predicate(name="clear", params=[{"variable": "?x", "type": "block"}]),
    Predicate(name="arm-empty", params=[])]

results, _ = pb.formalize_component(
    model=llm, component_class=ProblemDetails,
    description="3 blocks. b2 on b3, b3 on b1, b1 on table. Goal: stack b2 on b3.",
    types=types, predicates=predicates)

problem = results[ProblemDetails][0]
print(pb.generate_problem(problem))

Check the README Quickstart and Getting Started docs for full examples of generating predicates, actions, problems, and using interactive generation via l2p generate domain.
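
To make the target artifact concrete, the sketch below renders a PDDL problem for the description used above from plain Python data. It is hand-written for illustration and is not L2P output: `render_problem` is a hypothetical helper, and the problem name `bw-3` and the `clear` and `arm-empty` initial facts are our reading of the description.

```python
def render_problem(name: str, domain: str, objects: dict, init: list, goal: list) -> str:
    """Render a minimal typed PDDL problem from plain Python data."""
    objs = " ".join(f"{obj} - {typ}" for obj, typ in objects.items())
    init_s = "\n    ".join(f"({atom})" for atom in init)
    goal_s = " ".join(f"({atom})" for atom in goal)
    return (
        f"(define (problem {name})\n"
        f"  (:domain {domain})\n"
        f"  (:objects {objs})\n"
        f"  (:init\n    {init_s})\n"
        f"  (:goal (and {goal_s})))"
    )

problem_text = render_problem(
    name="bw-3",  # hypothetical problem name
    domain="blocksworld",
    objects={"b1": "block", "b2": "block", "b3": "block"},
    init=["on b2 b3", "on b3 b1", "on-table b1", "clear b2", "arm-empty"],
    goal=["on b2 b3"],
)
print(problem_text)
```

Comparing a hand-written reference like this with the generated problem is a simple way to spot missing initial facts or misnamed objects.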

Create Custom Generation Pipelines

Chain CLI commands together for automated, stateless pipelines - ideal for scripts and LLM agents:

# 1. Output schema for LLM reference
l2p schema domain --examples

# 2. Build domain from JSON
l2p build domain --data '{
  "name": "blocksworld",
  "types": [{"name":"block","parent":"object"}],
  "predicates": [...],
  "actions": [...]
}' -o domain.pddl

# 3. Validate
l2p validate domain domain.pddl

# 4. Plan
l2p plan --domain @domain.pddl --problem @problem.pddl --json

See the Agentic CLI section in the README and the CLI Agentic Workflow docs for end-to-end examples.
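
When a script or agent drives these commands, quoting JSON inside a shell string is a common source of errors. The sketch below builds the argument list for the `l2p build domain --data ... -o ...` call shown above, so the JSON is passed as a single argument without shell quoting. The helper name `build_domain_command` is ours, the predicate and action lists are left empty rather than guessed, and the command is only constructed here, not executed.

```python
import json
import shlex

def build_domain_command(domain: dict, out_path: str = "domain.pddl") -> list:
    """Return argv for `l2p build domain --data <json> -o <file>` (not executed here)."""
    return ["l2p", "build", "domain", "--data", json.dumps(domain), "-o", out_path]

domain = {
    "name": "blocksworld",
    "types": [{"name": "block", "parent": "object"}],
    "predicates": [],  # left empty here; see `l2p schema domain --examples` for the expected shape
    "actions": [],
}
argv = build_domain_command(domain)
print(shlex.join(argv))  # a copy-pasteable, correctly quoted shell command

# To run it for real (requires the l2p CLI on PATH):
# import subprocess
# subprocess.run(argv, check=True)
```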

Visualizing the Pipeline

Watch how the L2P toolkit transforms unstructured natural language into executable PDDL models via a streamlined CLI experience.


bw_generate_predicates.py

import os
from l2p import UnifiedLLM
from l2p.domain_builder import DomainBuilder
from l2p.utils.pddl_types import Predicate, PDDLType
from l2p.utils.pddl_format import format_predicates

# set up LLM
api_key = os.getenv("OPENAI_API_KEY")
llm = UnifiedLLM(provider="openai", model="gpt-5-nano", api_key=api_key)

db = DomainBuilder() # instantiate DomainBuilder class

# context
types = [PDDLType(name="block", parent="object")]
desc = "I want you to model predicates from a standard PDDL blocksworld domain."

# generate predicates
results, raw_output = db.formalize_component(
    model=llm,
    component_class=Predicate, # component to generate
    description=desc,
    types=types                # pass in kwargs context
)

# parse out predicates list from dictionary
predicates = results[Predicate]
predicates_str = format_predicates(predicates) # format nicely

print(predicates_str)

Terminal Output

### OUTPUT
- (clear ?x - block) ; true if block ?x has no other blocks on top of it
- (arm-empty ) ; true if the robotic arm is currently not holding any block
- (holding ?x - block) ; true if the robotic arm is holding block ?x
- (on ?x - block ?y - block) ; true if block ?x is stacked directly on top of block ?y
- (on-table ?x - block) ; true if block ?x is resting directly on the table

Example of automated PDDL predicate generation using the L2P Python API.
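
If you want to post-process printed output like the lines above, for example to feed the predicates into a later step, a small parser is enough. The sketch below turns one formatted line into a dictionary using the same `variable`/`type` parameter shape seen in the Predicate examples earlier. The function `parse_predicate` and the `comment` key are our own additions, not part of L2P.

```python
import re

# "- (name params...) ; optional comment"
LINE = re.compile(r"^-\s*\(\s*([\w-]+)([^)]*)\)\s*(?:;\s*(.*))?$")
# "?var - type" pairs inside the parameter list
PARAM = re.compile(r"(\?[\w-]+)\s*-\s*([\w-]+)")

def parse_predicate(line: str) -> dict:
    """Parse one formatted predicate line into name, typed params, and comment."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a predicate line: {line!r}")
    name, params, comment = m.groups()
    return {
        "name": name,
        "params": [{"variable": v, "type": t} for v, t in PARAM.findall(params)],
        "comment": comment or "",
    }

parsed = parse_predicate(
    "- (on ?x - block ?y - block) ; true if block ?x is stacked directly on top of block ?y"
)
print(parsed["name"], parsed["params"])
```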

Language-to-Plan (L2P)

Presenter

Marcus Tantakoun