Customize, control, and enhance LLM generation with logits processors, featuring visualization capabilities to inspect and understand state transitions.
```bash
pip install litelines
```

The only dependency is `outlines-core`. The examples below use the `transformers` integration.
- Download a model and its tokenizer:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("cuda")  # "cuda", "mps", or "cpu"

model_id = "Qwen/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```
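If you are unsure which accelerator is available, the device can also be picked at runtime with standard PyTorch checks (a convenience sketch, not part of litelines):

```python
# Pick the best available device: CUDA, then Apple Metal (MPS), then CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
```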
- Prepare the inputs to the LLM:

```python
user_input = "Are you sentient?"
messages = [{"role": "user", "content": user_input}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
```
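To inspect the exact prompt the chat template produced, you can decode it back to text (plain `transformers`, nothing litelines-specific):

```python
# Show the rendered prompt, including special chat tokens
print(tokenizer.decode(inputs["input_ids"][0]))
```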
- Define a logits processor through a Pydantic schema or a regular expression and visualize it:

```python
from litelines.transformers import SchemaProcessor

processor = SchemaProcessor(response_format=r"Yes\.|No\.", tokenizer=tokenizer)
processor.show_graph()
```
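Any pattern accepted by outlines-core works the same way; for instance (an illustrative pattern, not from the library's docs):

```python
# Hypothetical variant: constrain the reply to a star rating such as "3 stars"
rating_processor = SchemaProcessor(
    response_format=r"[1-5] stars?", tokenizer=tokenizer
)
```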
- Generate a structured response:

```python
generated = model.generate(**inputs, logits_processor=[processor])
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:-1]))
# No.
```
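Because the processor masks every token that would leave the pattern, the decoded answer should always match the regular expression; a quick standard-library check makes that explicit:

```python
import re

answer = tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:-1])
assert re.fullmatch(r"Yes\.|No\.", answer) is not None
```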
- Visualize the selected path:

```python
processor.show_graph()
```

- Define a Pydantic schema describing the required JSON or provide the JSON schema as a string:
```python
from typing import Literal

from pydantic import BaseModel, Field


class Sentiment(BaseModel):
    """Correctly inferred `Sentiment` with all the required parameters with correct types."""

    label: Literal["positive", "negative"] = Field(
        ..., description="Sentiment of the text"
    )
```

Alternatively, provide the JSON schema as a string:

```python
Sentiment = """{
    "description": "Correctly inferred `Sentiment` with all the required parameters with correct types.",
    "properties": {
        "label": {
            "description": "Sentiment of the text",
            "enum": ["positive", "negative"],
            "title": "Label",
            "type": "string"
        }
    },
    "required": ["label"],
    "title": "Sentiment",
    "type": "object"
}"""
```
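The string above mirrors what Pydantic derives itself; if you prefer to generate it rather than write it by hand, Pydantic v2 exposes the schema directly:

```python
import json

# Derive the JSON schema from the Pydantic model (Pydantic v2 API)
print(json.dumps(Sentiment.model_json_schema(), indent=2))
```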
- Prepare the inputs to the LLM:

```python
user_input = "What is the sentiment of the following text: Awesome!"
messages = [{"role": "user", "content": user_input}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
```
- Define the processor and visualize it:

```python
from litelines.transformers import SchemaProcessor

processor = SchemaProcessor(response_format=Sentiment, tokenizer=tokenizer)
processor.show_graph()
```
- Generate a structured answer:

```python
generated = model.generate(**inputs, logits_processor=[processor])
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:-1]))
# {"label": "positive"}
```
- Visualize the selected path:

```python
processor.show_graph()
```

- Define a Pydantic schema describing the tool:
```python
from typing import Literal

from pydantic import BaseModel, Field


class Sentiment(BaseModel):
    """Correctly inferred `Sentiment` with all the required parameters with correct types."""

    label: Literal["positive", "negative"] = Field(
        ..., description="Sentiment of the text"
    )
```
- Prepare the inputs to the LLM:

```python
from openai import pydantic_function_tool

user_input = "What is the sentiment of the following text: Awesome!"
messages = [{"role": "user", "content": user_input}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tools=[pydantic_function_tool(Sentiment)],
    return_tensors="pt",
    return_dict=True,
).to(model.device)
```

- Define the processor, add the parameter `include_tool_call=True`, and visualize it:
```python
from litelines.transformers import SchemaProcessor

processor = SchemaProcessor(
    response_format=Sentiment, tokenizer=tokenizer, include_tool_call=True
)
processor.show_graph()
```
- Generate a structured response:

```python
generated = model.generate(**inputs, logits_processor=[processor])
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:]))
# <tool_call>
# {"name": "Sentiment", "arguments": {"label": "positive"}}
# </tool_call>
```
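To act on the call, the `<tool_call>` block can be parsed with the standard library; a minimal sketch, assuming the decoded text is stored in `response`:

```python
import json
import re

response = tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:])
match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", response, re.DOTALL)
if match:
    call = json.loads(match.group(1))
    arguments = Sentiment(**call["arguments"])
    print(call["name"], arguments.label)  # Sentiment positive
```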
- Visualize the selected path:

```python
processor.show_graph()
```

- Define a Pydantic schema describing the required JSON or provide the JSON schema as a string:
```python
from typing import Literal

from pydantic import BaseModel, Field


class Sentiment(BaseModel):
    """Correctly inferred `Sentiment` with all the required parameters with correct types."""

    label: Literal["positive", "negative"] = Field(
        ..., description="Sentiment of the text"
    )
```
- Prepare the inputs to the LLM:

```python
user_input = "What is the sentiment of the following text: Awesome!"
messages = [{"role": "user", "content": user_input}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
```

- Define the processor, add the parameter `allow_preamble=True`, and visualize it:
```python
from litelines.transformers import SchemaProcessor

processor = SchemaProcessor(
    response_format=Sentiment, tokenizer=tokenizer, allow_preamble=True
)
processor.show_graph()
```
- Generate a structured response:

```python
generated = model.generate(**inputs, logits_processor=[processor])
print(tokenizer.decode(generated[0][inputs["input_ids"].shape[-1]:]))
# The sentiment of the text "Awesome!" is positive.
# {"label": "positive"}
```
- Visualize the selected path:

```python
processor.show_graph()
```




