For engineers · May 2026

Technical reference.

The depth the main page leaves out — pipeline, atomic vocabulary, schema, runtime state and profiles. Tables over prose.

01 Pipeline

Seven stages, end-to-end, in under a second.

| # | Stage | Role | Lib |
|---|---|---|---|
| 01 | mic | USB → 16 kHz mono, VAD trims silence | sounddevice |
| 02 | whisper | On-device STT (base.en), sub-second | openai-whisper |
| 03 | intent | Fast parser, LLM fallback → typed atomic Plan | qwen · pydantic |
| 04 | camera | Overhead + oblique side cam, 1920×1080 | opencv-python |
| 05 | sam3 | Remote masks + centroids | socket.io |
| 06 | homography | 4 anchors → robot base frame | numpy · cv2 |
| 07 | urscript | Atomic action → URScript over TCP:30002 | ur_rtde |
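Stage 06 in miniature: four pixel-to-robot anchor pairs determine a 3×3 homography, which then maps any SAM3 centroid into the robot base frame. The sketch below is dependency-free Python for illustration only; the pipeline lists numpy and cv2, where `cv2.getPerspectiveTransform` would compute the same matrix. The anchor values, the function names and the choice of metres are assumptions, not the project's calibration.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate anchors (three or more collinear?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_homography(pixels, robot_xy):
    """Return H (3x3, h33 fixed to 1) mapping 4 pixel points to 4 robot points."""
    if len(pixels) != 4 or len(robot_xy) != 4:
        raise ValueError("exactly four anchor pairs are required")
    a, b = [], []
    for (u, v), (x, y) in zip(pixels, robot_xy):
        # Two linear equations per anchor, from x = (h11 u + h12 v + h13) / w
        # and y = (h21 u + h22 v + h23) / w with w = h31 u + h32 v + 1.
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def pixel_to_robot(h, u, v):
    """Apply H to one pixel and divide out the projective scale."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    x = (h[0][0] * u + h[0][1] * v + h[0][2]) / w
    y = (h[1][0] * u + h[1][1] * v + h[1][2]) / w
    return x, y


# Hypothetical anchors: pixel corners in a 1920x1080 frame, robot XY in metres.
ANCHOR_PIXELS = [(212, 148), (1705, 163), (1688, 942), (230, 955)]
ANCHOR_ROBOT_M = [(-0.30, -0.60), (0.30, -0.60), (0.30, -0.25), (-0.30, -0.25)]

H = fit_homography(ANCHOR_PIXELS, ANCHOR_ROBOT_M)
print(pixel_to_robot(H, 960, 540))  # a centroid near the image centre
```

Four anchors give exactly eight equations for the eight unknowns, so there is no least-squares step; with more anchors a fit such as `cv2.findHomography` would be the natural replacement.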

02 Atomic actions

Three tiers. The LLM works in tier 1, the primitives below, most of the time.

| Primitive | What it does |
|---|---|
| scan_scene() | Capture frame, run SAM3, refresh scene state. |
| pick(obj, modifier?) | Move above → descend → grasp → ascend. Modifier disambiguates ("leftmost"). |
| place_at(zone) | Release at a named zone (config or runtime). |
| place_on(support) | Stack on top, using the support's cached height. |
| place_at_original_of(obj) | Return to the XY where the named object was picked. |
| assign_zone(name, where) | Declare a temporary zone: free_spot, corner. |
| measure_height(obj) | Pure sonar measurement, cached. |
| measure_all_heights(set) | Eager bulk measure before motion. |
| home() | Joint-space move to safe pose. |

Tier 2 (rare): raw motion — move_above, descend_to, ascend_to, grip, release, orient_to, gripper_set. Used for calibration and error recovery only.

Tier 3 (meta): dry_run(), abort(), await_user(prompt). Plan validation, e-stop, operator-in-the-loop.

Cardinal rule

The LLM never produces coordinates and never produces motor commands. It names objects, zones and intentions; the runtime resolves geometry, measures depth and drives motors.
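What the rule looks like from the runtime side: the plan carries only a label and an optional modifier, and geometry is looked up in scene state. A minimal sketch, assuming a flat list of scanned objects; `SceneObject`, `resolve`, the modifier table and the convention that "leftmost" means smallest X are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    obj_id: str  # stable ID from the last scan
    label: str   # what the LLM is allowed to say, e.g. "cube"
    x: float     # robot base frame, metres (assumed units)
    y: float


# Hypothetical modifier table: each entry picks one candidate geometrically.
MODIFIERS = {
    "leftmost": lambda objs: min(objs, key=lambda o: o.x),
    "rightmost": lambda objs: max(objs, key=lambda o: o.x),
    "nearest": lambda objs: min(objs, key=lambda o: o.x ** 2 + o.y ** 2),
}


def resolve(scene, label, modifier=None):
    """Turn a name from the plan into one concrete scene object."""
    candidates = [o for o in scene if o.label == label]
    if not candidates:
        raise LookupError(f"unknown object: {label!r}")
    if modifier is None:
        if len(candidates) > 1:
            raise LookupError(f"{label!r} is ambiguous, a modifier is required")
        return candidates[0]
    if modifier not in MODIFIERS:
        raise LookupError(f"unknown modifier: {modifier!r}")
    return MODIFIERS[modifier](candidates)


scene = [
    SceneObject("cube_1", "cube", -0.12, -0.40),
    SceneObject("cube_2", "cube", 0.18, -0.42),
    SceneObject("bowl_1", "bowl", 0.05, -0.55),
]

target = resolve(scene, "cube", "leftmost")
print(target.obj_id, target.x, target.y)
```

The coordinates only exist on the resolved object; nothing the LLM emits ever contains them.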

03 Schema

Every plan is validated before any motion: a Pydantic discriminated union over the atomic action types, followed by a cross-plan validator that rejects inconsistent grip state, missing object heights, unknown objects or zones and over-long plans. Invalid plans come back to the LLM as structured errors it corrects on the next turn.
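A sketch of the second pass only, the cross-plan validator. It uses plain dicts with an "action" key in place of the Pydantic models so it runs without dependencies; the field names, error codes and `MAX_PLAN_LEN` value are assumptions chosen for illustration.

```python
MAX_PLAN_LEN = 12  # assumed limit; the real bound is not stated here


def validate_plan(plan, known_objects, known_zones, height_cache):
    """Walk the plan once; return structured errors (empty list = plan may run)."""
    errors = []

    def err(step, code, detail):
        errors.append({"step": step, "code": code, "detail": detail})

    if len(plan) > MAX_PLAN_LEN:
        err(None, "plan_too_long", f"{len(plan)} steps, limit {MAX_PLAN_LEN}")

    holding = None               # simulated grip state
    zones = set(known_zones)     # grows with assign_zone
    heights = set(height_cache)  # grows with measure_height
    for i, step in enumerate(plan):
        action = step["action"]
        if action == "pick":
            if step["obj"] not in known_objects:
                err(i, "unknown_object", step["obj"])
            if holding is not None:
                err(i, "grip_state", f"already holding {holding}")
            holding = step["obj"]
        elif action in ("place_at", "place_on", "place_at_original_of"):
            if holding is None:
                err(i, "grip_state", "nothing in the gripper")
            if action == "place_at" and step["zone"] not in zones:
                err(i, "unknown_zone", step["zone"])
            if action == "place_on":
                support = step["support"]
                if support not in known_objects:
                    err(i, "unknown_object", support)
                elif support not in heights:
                    err(i, "missing_height", support)
            holding = None
        elif action == "assign_zone":
            zones.add(step["name"])
        elif action == "measure_height":
            if step["obj"] not in known_objects:
                err(i, "unknown_object", step["obj"])
            else:
                heights.add(step["obj"])
        else:
            err(i, "unknown_action", action)
    return errors


objects = {"red_cube", "blue_cube"}
ok_plan = [
    {"action": "measure_height", "obj": "blue_cube"},
    {"action": "pick", "obj": "red_cube"},
    {"action": "place_on", "support": "blue_cube"},
]
bad_plan = [
    {"action": "pick", "obj": "red_cube"},
    {"action": "pick", "obj": "green_cube"},
    {"action": "place_on", "support": "blue_cube"},
    {"action": "place_at", "zone": "shelf"},
]
print(validate_plan(ok_plan, objects, {"bin"}, {}))
for e in validate_plan(bad_plan, objects, {"bin"}, {}):
    print(e)
```

Collecting every error instead of stopping at the first gives the LLM the full list to correct in one turn.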

04 Runtime state

One state object, mutated by the executor. The LLM never sees it directly.

| Field | Role |
|---|---|
| held_object | What's in the gripper now. |
| scene | Last scan, keyed by stable IDs. |
| height_cache | Sonar-measured heights, per session. |
| zones | Config DROP_ZONES + runtime-declared. |
| original_xy | Snapshot at pick(); makes swap trivial. |
| plan_log | JSONL trace, one line per executed action. |
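One way the state object could look, and why the original_xy snapshot makes a swap trivial. This is a sketch under assumptions: the field names follow the table, but the types, the method names and the shortcut of updating scene directly on release (rather than through a fresh scan) are illustrative.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RuntimeState:
    held_object: Optional[str] = None
    scene: dict = field(default_factory=dict)         # stable ID -> (x, y)
    height_cache: dict = field(default_factory=dict)  # stable ID -> height
    zones: dict = field(default_factory=dict)         # zone name -> (x, y)
    original_xy: dict = field(default_factory=dict)   # stable ID -> (x, y) at pick
    plan_log: list = field(default_factory=list)      # one JSON string per action

    def _log(self, action, **args):
        self.plan_log.append(json.dumps({"action": action, **args}))

    def _release_at(self, xy):
        if self.held_object is None:
            raise RuntimeError("nothing in the gripper")
        self.scene[self.held_object] = xy  # simplification: no rescan
        self.held_object = None

    def pick(self, obj):
        if self.held_object is not None:
            raise RuntimeError(f"already holding {self.held_object}")
        self.original_xy[obj] = self.scene[obj]  # snapshot before moving
        self.held_object = obj
        self._log("pick", obj=obj)

    def place_at(self, zone):
        self._release_at(self.zones[zone])
        self._log("place_at", zone=zone)

    def place_at_original_of(self, obj):
        self._release_at(self.original_xy[obj])
        self._log("place_at_original_of", obj=obj)


state = RuntimeState(
    scene={"A": (0.1, -0.4), "B": (0.3, -0.4)},
    zones={"free_spot": (0.0, -0.5)},
)

# Swap A and B using a temporary zone; no coordinates appear in the plan.
state.pick("A")
state.place_at("free_spot")
state.pick("B")
state.place_at_original_of("A")
state.pick("A")
state.place_at_original_of("B")

print(state.scene)
print("\n".join(state.plan_log))  # the JSONL trace
```

The second pick of A overwrites A's snapshot with the free spot, which is harmless here because only B's snapshot is still needed at that point.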

05 Profiles

Same atomic schema, different end-effector executor. Built around the UR5e; the same plans run on the lab's other arms unchanged.

| Profile | Arm | End-effector | Payload | Reach | Role |
|---|---|---|---|---|---|
| ur5e_vacuum | UR5e | Robotiq EPick vacuum | 5 kg | 850 mm | primary · demo |
| ur3e_2finger | UR3e | Robotiq 2-finger | 3 kg | 500 mm | also runs |
| ur10e_3finger | UR10e | Robotiq 3-finger | 12.5 kg | 1300 mm | also runs |

Profiles live in robopick/profiles.py. Activating one overrides config — robot IP, end-effector, workspace bounds, homography — without touching another profile's calibration.
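A sketch of what activation might do: copy the base config, overlay the chosen profile, and hand back a fresh dict so no profile's calibration can be mutated through it. The contents of robopick/profiles.py are not shown here, so every key, IP address, bound and matrix below is a placeholder.

```python
import copy

# Placeholder base config; values are illustrative only.
BASE_CONFIG = {
    "whisper_model": "base.en",
    "robot_ip": None,
    "end_effector": None,
    "workspace_bounds_m": None,
    "homography": None,
}

# Placeholder profiles; each owns its own calibration.
PROFILES = {
    "ur5e_vacuum": {
        "robot_ip": "192.168.1.10",
        "end_effector": "vacuum",
        "workspace_bounds_m": {"x": (-0.30, 0.30), "y": (-0.60, -0.25)},
        "homography": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    },
    "ur3e_2finger": {
        "robot_ip": "192.168.1.11",
        "end_effector": "2finger",
        "workspace_bounds_m": {"x": (-0.20, 0.20), "y": (-0.40, -0.15)},
        "homography": [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]],
    },
}


def activate(name, base=BASE_CONFIG):
    """Return a new config: base settings overridden by the named profile."""
    if name not in PROFILES:
        raise KeyError(f"unknown profile: {name!r}")
    config = copy.deepcopy(base)
    # Deep copy so later edits to the active config never reach PROFILES.
    config.update(copy.deepcopy(PROFILES[name]))
    return config


config = activate("ur3e_2finger")
config["homography"][0][0] = 99.0  # recalibrating the active config...
print(PROFILES["ur3e_2finger"]["homography"][0][0])  # ...leaves the stored profile intact
print(config["robot_ip"], config["whisper_model"])
```

Settings outside the profile, such as the speech model, pass through from the base config unchanged.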