Agent Module
The agent module is the interface between navigation models and the evaluation framework: it receives observations and produces navigation actions. It supports local models, remote services, and pre-trained models.
Agent Types
LocalAgent
Local model agent that loads model files directly.
Configuration
agent:
  agent_type: "local"
  model_settings:
    checkpoint_path: "/path/to/model.pth"
    device: null  # null = auto-detect
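How `device: null` is resolved is not specified here. As a rough sketch, assuming a PyTorch checkpoint and a fallback to CPU when no GPU (or no PyTorch install) is found, auto-detection could look like the following. `resolve_device` is a hypothetical helper, not part of navarena_bench.

```python
from typing import Optional


def resolve_device(device: Optional[str] = None) -> str:
    """Return an explicit device string, auto-detecting when `device` is None.

    Hypothetical helper: the real LocalAgent may detect devices differently.
    """
    if device is not None:
        # Honor an explicit setting such as "cuda:0" or "cpu".
        return device
    try:
        import torch  # optional dependency, only needed for GPU detection
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("cuda:0"))  # explicit value is passed through
print(resolve_device(None))      # result depends on the local machine
```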
Usage Example
from navarena_bench.agent import Agent
from navarena_bench.configs.agent_config import AgentCfg

config = AgentCfg(
    agent_type="local",
    model_settings={"checkpoint_path": "/path/to/model.pth"},
)
agent = Agent.init(config)
RemoteAgent
Remote service agent that invokes a remote model via HTTP API.
Configuration
agent:
  agent_type: "remote"
  model_settings:
    remote_url: "http://localhost:8000/api/v1/navigate"
    remote_timeout: 30.0
    remote_retries: 3
API Interface Format
Request:
{
  "observation": {
    "rgb": {
      "face": "base64_encoded_image",
      "left": "base64_encoded_image",
      "right": "base64_encoded_image"
    },
    "position": [0.0, 0.0, 0.0],
    "rotation": [1.0, 0.0, 0.0, 0.0]
  },
  "goal": {
    "position": [5.0, 0.0, 0.0]
  }
}
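A request body of this shape can be assembled with the standard library alone. The sketch below assumes each camera image is already encoded as bytes (for example the contents of a PNG or JPEG file) before base64 encoding; the image format the service expects is not stated here, and `build_request` is a hypothetical helper.

```python
import base64
import json
from typing import Dict, Sequence


def build_request(
    images: Dict[str, bytes],
    position: Sequence[float],
    rotation: Sequence[float],
    goal_position: Sequence[float],
) -> str:
    """Serialize one observation and goal into a JSON request body."""
    rgb = {
        # base64 output is ASCII-safe, so it can be embedded in a JSON string.
        name: base64.b64encode(data).decode("ascii")
        for name, data in images.items()
    }
    payload = {
        "observation": {
            "rgb": rgb,
            "position": list(position),
            "rotation": list(rotation),
        },
        "goal": {"position": list(goal_position)},
    }
    return json.dumps(payload)


# Placeholder bytes stand in for real encoded image data.
body = build_request(
    images={"face": b"face-bytes", "left": b"left-bytes", "right": b"right-bytes"},
    position=[0.0, 0.0, 0.0],
    rotation=[1.0, 0.0, 0.0, 0.0],
    goal_position=[5.0, 0.0, 0.0],
)
decoded = json.loads(body)
print(sorted(decoded["observation"]["rgb"]))
```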
Response:
Usage Example
from navarena_bench.agent import Agent
from navarena_bench.configs.agent_config import AgentCfg

config = AgentCfg(
    agent_type="remote",
    model_settings={
        "remote_url": "http://localhost:8000/api/v1/navigate",
        "remote_timeout": 30.0,
        "remote_retries": 3,
    },
)
agent = Agent.init(config)
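The `remote_timeout` and `remote_retries` settings suggest a timeout per request and a bounded number of attempts. A minimal sketch of that pattern follows, assuming `remote_retries` counts total attempts and that failures are retried immediately without backoff. `post_json` and `call_with_retries` are hypothetical helpers, not the RemoteAgent implementation.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


def post_json(url: str, payload: Dict[str, Any], timeout: float) -> Dict[str, Any]:
    """POST a JSON payload and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def call_with_retries(send: Callable[[], Dict[str, Any]], retries: int) -> Dict[str, Any]:
    """Call `send` up to `retries` times and re-raise the last failure."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _attempt in range(retries):
        try:
            return send()
        except OSError as error:
            # URLError, TimeoutError and socket timeouts all derive from OSError.
            last_error = error
    raise last_error


# Simulated service that times out twice before answering.
attempts = []


def flaky_send() -> Dict[str, Any]:
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return {"x": 0.5, "y": 0.0, "yaw": 0.1}


action = call_with_retries(flaky_send, retries=3)
print(action, len(attempts))
```

Against a running service, the same wrapper would be called as `call_with_retries(lambda: post_json(url, payload, timeout=30.0), retries=3)`.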
ViNTAgent
ViNT (Visual Navigation Transformer) model agent.
Configuration
agent:
  agent_type: "vint"
  model_settings:
    checkpoint_path: "/path/to/checkpoint.pth"
    config_path: "/path/to/config.yaml"  # optional
    device: "cuda:0"  # optional
Usage Example
from navarena_bench.agent import Agent
from navarena_bench.configs.agent_config import AgentCfg

config = AgentCfg(
    agent_type="vint",
    model_settings={
        "checkpoint_path": "/path/to/checkpoint.pth",
    },
)
agent = Agent.init(config)
Dependencies
The ViNT agent requires the visualnav-transformer project.
GNMAgent
GNM (General Navigation Model) agent.
Configuration
NoMaDAgent
NoMaD (Goal Masking Diffusion Policies for Navigation and Exploration) agent.
Configuration
MultiModalNavAgent
Multi-modal navigation agent supporting language, image, and object goal types.
Configuration
agent:
  agent_type: "multimodal_nav"
  model_settings:
    checkpoint_path: "/path/to/model.pth"
    input_modalities: ["rgb", "depth"]  # optional
    fusion_method: "concat"  # optional
LanguageNavAgent
Language navigation agent for VLN tasks. It uses a Voronoi graph for path planning and exploration.
Configuration
agent:
  agent_type: "language_nav"
  model_settings:
    waypoint_tolerance: 0.3
    max_v: 0.5
    max_w: 1.0
    voronoi_closeness: 0.5
    min_voronoi_distance: 0.3
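The meaning of these settings is not spelled out here. Reading the names at face value, `waypoint_tolerance` looks like a distance within which a waypoint counts as reached, and `max_v` / `max_w` look like linear and angular speed limits. The sketch below shows how such limits could bound a simple go-to-waypoint controller; the units (meters, m/s, rad/s), the control law, and the `velocity_command` helper are all assumptions, and the Voronoi settings are not modeled.

```python
import math
from typing import Tuple


def clamp(value: float, limit: float) -> float:
    """Restrict value to the range [-limit, limit]."""
    return max(-limit, min(limit, value))


def velocity_command(
    pose: Tuple[float, float, float],
    waypoint: Tuple[float, float],
    waypoint_tolerance: float = 0.3,
    max_v: float = 0.5,
    max_w: float = 1.0,
) -> Tuple[float, float, bool]:
    """Return (v, w, reached) for a pose given as (x, y, heading)."""
    x, y, heading = pose
    dx, dy = waypoint[0] - x, waypoint[1] - y
    distance = math.hypot(dx, dy)
    if distance <= waypoint_tolerance:
        return 0.0, 0.0, True  # close enough: stop and report success
    # Heading error wrapped into [-pi, pi].
    error = math.atan2(dy, dx) - heading
    error = math.atan2(math.sin(error), math.cos(error))
    w = clamp(error, max_w)
    # Scale forward speed down when the waypoint is far off the current heading.
    v = clamp(distance, max_v) * max(0.0, math.cos(error))
    return v, w, False


print(velocity_command((0.0, 0.0, 0.0), (5.0, 0.0)))
print(velocity_command((0.0, 0.0, 0.0), (0.2, 0.0)))
```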
Agent Interface
All agents implement:
reset()
Reset agent state.
act()
Generate action from observation.
observation = {
    "rgb": {
        "face": np.array(...),  # RGB image
        "left": np.array(...),
        "right": np.array(...),
    },
    "position": [0.0, 0.0, 0.0],
    "rotation": [1.0, 0.0, 0.0, 0.0],
}
action = agent.act(observation)
# {
#     "x": 0.5,   # Forward distance (meters)
#     "y": 0.0,   # Lateral distance (meters)
#     "yaw": 0.1  # Rotation angle (radians)
# }
close()
Close agent and release resources.
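The three methods imply a simple lifecycle: reset once per episode, call act once per step, and close when finished. The sketch below illustrates that order with a stand-in agent; `ConstantAgent` and `run_episode` are hypothetical and do not reflect how the evaluation framework drives agents internally.

```python
from typing import Any, Dict, List


class ConstantAgent:
    """Stand-in agent that always drives straight ahead."""

    def __init__(self) -> None:
        self.steps = 0
        self.closed = False

    def reset(self, episode: Any = None) -> None:
        self.steps = 0  # clear per-episode state

    def act(self, observation: Dict[str, Any]) -> Dict[str, float]:
        self.steps += 1
        return {"x": 0.5, "y": 0.0, "yaw": 0.0}

    def close(self) -> None:
        self.closed = True


def run_episode(agent: ConstantAgent, observations: List[Dict[str, Any]]) -> List[Dict[str, float]]:
    """Reset the agent, then collect one action per observation."""
    agent.reset()
    return [agent.act(observation) for observation in observations]


agent = ConstantAgent()
try:
    observation = {"rgb": {}, "position": [0.0, 0.0, 0.0], "rotation": [1.0, 0.0, 0.0, 0.0]}
    actions = run_episode(agent, [observation] * 3)
finally:
    agent.close()  # release resources even if a step raised

print(len(actions), agent.closed)
```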
Action Format
All agents return actions in this format:
{
    "x": float,   # Forward/backward (meters), positive=forward
    "y": float,   # Lateral (meters), positive=right
    "yaw": float  # Rotation (radians), positive=CCW
}
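A small check can catch malformed actions before they reach the environment. This is a sketch only: `validate_action` is a hypothetical helper, and rejecting non-finite values is an added assumption rather than a documented rule.

```python
import math
import numbers
from typing import Any, Dict

REQUIRED_KEYS = ("x", "y", "yaw")


def validate_action(action: Any) -> Dict[str, float]:
    """Return the action as plain floats, or raise if it is malformed."""
    if not isinstance(action, dict):
        raise TypeError("action must be a dict")
    missing = [key for key in REQUIRED_KEYS if key not in action]
    if missing:
        raise KeyError(f"action is missing fields: {missing}")
    validated = {}
    for key in REQUIRED_KEYS:
        value = action[key]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, numbers.Real):
            raise TypeError(f"action[{key!r}] must be a real number")
        if not math.isfinite(value):
            raise ValueError(f"action[{key!r}] must be finite")
        validated[key] = float(value)
    return validated


print(validate_action({"x": 0.5, "y": 0, "yaw": 0.1}))
```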
Observation Format
Observations received by agents:
{
    "rgb": {
        "camera_name": np.ndarray  # RGB, shape (H, W, 3)
    },
    "depth": {  # optional
        "camera_name": np.ndarray  # Depth, shape (H, W)
    },
    "position": [x, y, z],
    "rotation": [w, x, y, z]  # quaternion
}
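The shapes listed above can be checked mechanically, which helps when diagnosing a mismatch between an environment and an agent. The following is a sketch with a hypothetical `check_observation` helper; it assumes `rgb` is required and non-empty, which the format implies but does not state.

```python
from typing import Any, Dict, List

import numpy as np


def check_observation(observation: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shapes look valid."""
    problems = []
    rgb = observation.get("rgb")
    if not isinstance(rgb, dict) or not rgb:
        problems.append("rgb must be a non-empty dict of camera name -> image")
    else:
        for name, image in rgb.items():
            if not isinstance(image, np.ndarray) or image.ndim != 3 or image.shape[2] != 3:
                problems.append(f"rgb[{name!r}] must have shape (H, W, 3)")
    # depth is optional, so a missing or None entry is accepted.
    for name, image in (observation.get("depth") or {}).items():
        if not isinstance(image, np.ndarray) or image.ndim != 2:
            problems.append(f"depth[{name!r}] must have shape (H, W)")
    for key, size in (("position", 3), ("rotation", 4)):
        value = observation.get(key)
        if value is None or len(value) != size:
            problems.append(f"{key} must have {size} elements")
    return problems


good = {
    "rgb": {"face": np.zeros((4, 6, 3), dtype=np.uint8)},
    "depth": {"face": np.zeros((4, 6), dtype=np.float32)},
    "position": [0.0, 0.0, 0.0],
    "rotation": [1.0, 0.0, 0.0, 0.0],
}
bad = {"rgb": {"face": np.zeros((4, 6))}, "position": [0.0, 0.0]}
print(check_observation(good))
print(check_observation(bad))
```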
Agent Comparison
| Agent Type | Use Case | Pros | Cons |
|---|---|---|---|
| LocalAgent | Local models | Fast, no network latency | Requires model file |
| RemoteAgent | Remote services | Flexible, easy to deploy | Network latency |
| ViNTAgent | Image goal nav | Pre-trained model | Extra dependencies |
| GNMAgent | General nav | Pre-trained model | Extra dependencies |
| NoMaDAgent | General nav | Pre-trained model | Extra dependencies |
| MultiModalNavAgent | Multi-modal input | Language/image/object goals | Complex config |
| LanguageNavAgent | VLN | Voronoi path planning | Requires language model |
Custom Agents
Implement Custom Agent
from navarena_bench.agent.base import Agent
from navarena_bench.configs.agent_config import AgentCfg


@Agent.register("my_agent")
class MyAgent(Agent):
    def __init__(self, config: AgentCfg):
        super().__init__(config)
        # Initialize model

    def reset(self, episode=None):
        """Reset agent state"""
        # Reset logic
        pass

    def act(self, observation):
        """Generate action"""
        # Action generation logic
        return {
            "x": 0.5,
            "y": 0.0,
            "yaw": 0.1,
        }

    def close(self):
        """Release resources"""
        # Cleanup logic
        pass
Use Custom Agent
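Presumably the name passed to `Agent.register` is the value that `agent_type` selects, so a registered agent would be created through `Agent.init` like the built-in ones. The self-contained sketch below shows that registry pattern with stand-in `Agent` and `AgentCfg` classes; they are simplified assumptions for illustration, not the navarena_bench implementations.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Type


@dataclass
class AgentCfg:
    """Stand-in config holding only the fields used in this sketch."""

    agent_type: str
    model_settings: Dict[str, Any] = field(default_factory=dict)


class Agent:
    """Stand-in base class with a name -> class registry."""

    _registry: Dict[str, Type["Agent"]] = {}

    def __init__(self, config: AgentCfg) -> None:
        self.config = config

    @classmethod
    def register(cls, name: str) -> Callable[[Type["Agent"]], Type["Agent"]]:
        """Class decorator that stores the decorated class under `name`."""

        def decorator(agent_cls: Type["Agent"]) -> Type["Agent"]:
            cls._registry[name] = agent_cls
            return agent_cls

        return decorator

    @classmethod
    def init(cls, config: AgentCfg) -> "Agent":
        """Look up config.agent_type in the registry and instantiate it."""
        try:
            agent_cls = cls._registry[config.agent_type]
        except KeyError:
            raise ValueError(f"unknown agent_type: {config.agent_type!r}") from None
        return agent_cls(config)


@Agent.register("my_agent")
class MyAgent(Agent):
    def act(self, observation: Dict[str, Any]) -> Dict[str, float]:
        return {"x": 0.5, "y": 0.0, "yaw": 0.1}


agent = Agent.init(AgentCfg(agent_type="my_agent"))
print(type(agent).__name__)
```

With a lookup of this kind, the module that defines the custom class has to be imported before `Agent.init` runs, so that the decorator has a chance to register it.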
FAQ
Model load failed
Check the model path, and make sure the file exists and is in the expected format.
Remote service timeout
Increase remote_timeout or check the network connection.
Action format error
Make sure the action dict includes the x, y, and yaw fields.
Observation format mismatch
Check that the environment observations match the agent’s expected format.
See also: Evaluator Module · Replay Module · Extending