Trajectory¶
The Trajectory, Run, message, and score classes define the data model for captured agent executions.
Trajectory¶
Bases: BaseModel
A single agent execution trace on an Inspect sample.
Trajectories are grouped into Runs, which supply the task and model context; each trajectory records the execution of exactly one sample.
Source code in lunette/models/trajectory.py
messages
instance-attribute
¶
Sequence of messages (System, User, Assistant, Tool) in this execution.
metadata = Field(default_factory=dict)
class-attribute
instance-attribute
¶
Additional metadata about this trajectory execution.
sample
instance-attribute
¶
Inspect sample ID identifying the sample this trajectory belongs to.
sandbox_id = None
class-attribute
instance-attribute
¶
Optional sandbox container ID if this trajectory ran in a sandbox.
score
property
¶
Return the unique score value for the trajectory if it exists and None otherwise.
scores = None
class-attribute
instance-attribute
¶
Multi-metric scores for this trajectory, if available.
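The relationship between `scores` and the `score` property can be sketched as follows. This is a minimal illustration under stated assumptions, not the lunette implementation: it assumes `scores` maps a metric name to a score object and that "unique" means exactly one metric was recorded. `ScalarScoreSketch` and `TrajectorySketch` are hypothetical stand-ins built on stdlib dataclasses rather than the library's pydantic models.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ScalarScoreSketch:
    """Hypothetical stand-in for ScalarScore (same field names as documented)."""
    value: Any
    answer: Optional[str] = None
    explanation: Optional[str] = None
    metadata: Optional[dict] = None


@dataclass
class TrajectorySketch:
    """Hypothetical stand-in holding only the fields this example needs."""
    sample: Any
    # Assumption: scores maps a metric name to a scalar score.
    scores: Optional[dict] = None

    @property
    def score(self) -> Any:
        # Assumption: a "unique" score exists only when exactly one
        # metric was recorded; otherwise there is nothing unambiguous to return.
        if self.scores is not None and len(self.scores) == 1:
            (only,) = self.scores.values()
            return only.value
        return None


single = TrajectorySketch(sample=1, scores={"accuracy": ScalarScoreSketch(value=1.0)})
multi = TrajectorySketch(
    sample=2,
    scores={"accuracy": ScalarScoreSketch(1.0), "f1": ScalarScoreSketch(0.5)},
)
unscored = TrajectorySketch(sample=3)
```

Under these assumptions, `single.score` yields the lone metric's value, while `multi.score` and `unscored.score` are both `None`.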
solution = None
class-attribute
instance-attribute
¶
Optional solution or patch produced by the agent.
from_inspect(sample)
classmethod
¶
Convert an Inspect AI EvalSample to a Trajectory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample | EvalSample | The Inspect AI sample to convert | required |
Returns:
| Type | Description |
|---|---|
| Trajectory | Trajectory object containing the sample's execution trace |
Source code in lunette/models/trajectory.py
Run¶
Bases: BaseModel
A collection of trajectories from a single evaluation run.
This is the primary unit for uploading evaluation results. A run represents
a single execution of inspect eval that produces multiple trajectory samples.
All trajectories in a run share the same task and model.
Source code in lunette/models/run.py
id = None
class-attribute
instance-attribute
¶
Optional run ID. If None, the server generates a UUID; if provided, the uploaded trajectories are appended to the existing run with that ID.
model
instance-attribute
¶
Model identifier used for this run (e.g., 'claude-sonnet-4', 'gpt-4').
task
instance-attribute
¶
Task name for this run (e.g., 'math-eval', 'swe-bench').
trajectories
instance-attribute
¶
List of trajectory samples produced during this evaluation run.
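The `id` behaviour described for Run (generate a UUID when absent, append when present) can be illustrated with a toy in-memory store. This is only a sketch: `RunSketch` and `InMemoryRunStore` are hypothetical names, the real server logic is not shown in this reference, and creating a run when an unknown `id` is supplied is an assumption made here for simplicity.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunSketch:
    """Hypothetical stand-in for Run with the documented field names."""
    task: str
    model: str
    trajectories: list = field(default_factory=list)
    id: Optional[str] = None


class InMemoryRunStore:
    """Hypothetical stand-in for the server; illustrates id handling only."""

    def __init__(self) -> None:
        self.runs: dict = {}

    def upload(self, run: RunSketch) -> str:
        # No id supplied: mint a fresh UUID for a new run.
        run_id = run.id if run.id is not None else str(uuid.uuid4())
        # Id supplied: append to whatever is already stored under it.
        # Assumption: an unknown id simply starts a new entry.
        self.runs.setdefault(run_id, []).extend(run.trajectories)
        return run_id


store = InMemoryRunStore()
first_id = store.upload(
    RunSketch(task="math-eval", model="gpt-4", trajectories=["t1", "t2"])
)
second_id = store.upload(
    RunSketch(task="math-eval", model="gpt-4", trajectories=["t3"], id=first_id)
)
```

Here the second upload reuses `first_id`, so the store ends up with a single run holding all three placeholder trajectories.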
Messages¶
SystemMessage¶
Bases: BaseMessage
System message.
Source code in lunette/models/messages.py
from_inspect(position, message)
classmethod
¶
Convert an Inspect AI ChatMessageSystem to SystemMessage.
Source code in lunette/models/messages.py
UserMessage¶
Bases: BaseMessage
User message.
Source code in lunette/models/messages.py
from_inspect(position, message)
classmethod
¶
Convert an Inspect AI ChatMessageUser to UserMessage.
AssistantMessage¶
Bases: BaseMessage
Assistant message.
Source code in lunette/models/messages.py
from_inspect(position, message)
classmethod
¶
Convert an Inspect AI ChatMessageAssistant to AssistantMessage.
Source code in lunette/models/messages.py
ToolMessage¶
Bases: BaseMessage
Tool message.
The content field contains the result of the tool call.
Source code in lunette/models/messages.py
arguments
property
¶
Get the arguments of this tool call.
function
property
¶
Get the function name of this tool call.
result
property
¶
Get the result of this tool call.
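One plausible reading of the three properties above is that `function` and `arguments` delegate to the referenced tool call, while `result` returns the message content. The sketch below shows that shape; it is an assumption about the design, not the lunette source, and `ToolCallSketch` and `ToolMessageSketch` are hypothetical stdlib stand-ins.

```python
from dataclasses import dataclass


@dataclass
class ToolCallSketch:
    """Hypothetical stand-in for ToolCall; field names are assumptions."""
    id: str
    function: str
    arguments: dict


@dataclass
class ToolMessageSketch:
    """Hypothetical stand-in for ToolMessage."""
    position: int
    content: str
    tool_call: ToolCallSketch

    @property
    def function(self) -> str:
        # Assumption: the function name lives on the referenced tool call.
        return self.tool_call.function

    @property
    def arguments(self) -> dict:
        # Assumption: the arguments also come from the referenced tool call.
        return self.tool_call.arguments

    @property
    def result(self) -> str:
        # The content field contains the result of the tool call.
        return self.content


msg = ToolMessageSketch(
    position=3,
    content="4",
    tool_call=ToolCallSketch(id="call_1", function="add", arguments={"a": 2, "b": 2}),
)
```

With this layout a consumer can read `msg.function`, `msg.arguments`, and `msg.result` from one object without looking up the originating assistant message.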
from_inspect(position, message, tool_call)
classmethod
¶
Convert an Inspect AI ChatMessageTool to ToolMessage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| position | int | Position in the trajectory | required |
| message | ChatMessageTool | The Inspect ChatMessageTool | required |
| tool_call | ToolCall | The matching ToolCall (found by the caller) | required |
Returns:
| Type | Description |
|---|---|
| ToolMessage | ToolMessage with proper tool_call reference |
Source code in lunette/models/messages.py
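Because the caller is responsible for finding the matching ToolCall, a conversion loop has to remember the calls issued by earlier assistant messages. The following sketch shows one way that lookup might work. `Msg`, `Call`, and `pair_tool_results` are hypothetical stand-ins (not Inspect AI or lunette classes), matching on a `tool_call_id` field is an assumption, and so are the zero-based positions and the error raised for an unmatched result.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Call:
    """Hypothetical stand-in for a tool call issued by the assistant."""
    id: str
    function: str
    arguments: dict


@dataclass
class Msg:
    """Hypothetical stand-in for a chat message."""
    role: str
    content: str = ""
    tool_calls: list = field(default_factory=list)  # used on assistant messages
    tool_call_id: Optional[str] = None              # used on tool messages


def pair_tool_results(messages):
    """Return (position, tool_message, matching_call) triples."""
    calls_by_id = {}
    pairs = []
    for position, message in enumerate(messages):
        if message.role == "assistant":
            # Remember each call so a later tool message can find it.
            for call in message.tool_calls:
                calls_by_id[call.id] = call
        elif message.role == "tool":
            call = calls_by_id.get(message.tool_call_id)
            if call is None:
                # Assumption: an unmatched tool result is treated as an error.
                raise ValueError(f"no tool call with id {message.tool_call_id!r}")
            pairs.append((position, message, call))
    return pairs


conversation = [
    Msg("system", "You are a calculator."),
    Msg("user", "What is 2 + 2?"),
    Msg("assistant", tool_calls=[Call("call_1", "add", {"a": 2, "b": 2})]),
    Msg("tool", content="4", tool_call_id="call_1"),
]
pairs = pair_tool_results(conversation)
```

Each triple carries the three arguments that `from_inspect(position, message, tool_call)` expects, so the loop body could hand them straight to the converter.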
ToolCall¶
Bases: BaseModel
A tool call.
Does not include the result of the tool call, as it is not available until a later ToolMessage is received.
Source code in lunette/models/messages.py
from_inspect(tool_call)
classmethod
¶
Convert an Inspect AI ToolCall to our ToolCall model.
Source code in lunette/models/messages.py
Scores¶
ScalarScore
¶
Bases: BaseModel
A scalar score for a trajectory.
Source code in lunette/models/trajectory.py
answer = None
class-attribute
instance-attribute
¶
Answer extracted from model output, if available.
explanation = None
class-attribute
instance-attribute
¶
Explanation of the score, if available.
metadata = None
class-attribute
instance-attribute
¶
Additional metadata about the score.
value
instance-attribute
¶
The value of the score.