STN: Speech-to-Note
STN is a local-first assistant for Pardus desktops that turns voice or typed requests into structured, stateful actions. Instead of switching between note apps, email, meeting tools, weather pages, alarms, and launchers, users describe what they need in one conversation while STN completes missing details, asks for approval when needed, and stores each result as a traceable local artifact.
Product
Task execution from a single desktop conversation
STN solves the daily context-switching problem: users often move between mail clients, meeting tools, weather pages, alarm utilities, note apps, and launchers for small tasks. STN keeps this work inside one chat-based desktop interface while preserving local data ownership.
The end product accepts speech or text, extracts the user's intent and required parameters, asks only for missing details, waits for approval when an action can affect external systems, and then executes the task through isolated tool integrations.
Voice and text input
Microphone capture, offline STT, and typed chat share the same orchestration pipeline.
Structured notes
Conversations and generated notes are stored locally with session context.
Meetings and email
Provider-backed meeting creation and SMTP email flows use explicit previews and approvals.
Weather
Natural language city and date requests are normalized before provider execution.
Alarms
Exact-time and relative reminders are mapped to local scheduling tools on supported systems.
App launching
Installed desktop applications can be opened safely through desktop-entry discovery.
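As an illustration of desktop-entry discovery, the sketch below parses the text of a .desktop file and produces an argument list that can be started without a shell. It is a minimal sketch under stated assumptions, not STN's actual launcher: the names DesktopApp, parse_desktop_entry, and find_app are hypothetical, and only a small part of the Desktop Entry format is handled.

```python
import configparser
import shlex
from dataclasses import dataclass
from typing import Optional

# Field codes such as %U are placeholders from the Desktop Entry format.
# They are dropped here because this sketch passes no file or URL.
FIELD_CODES = {"%f", "%F", "%u", "%U", "%i", "%c", "%k"}


@dataclass
class DesktopApp:  # hypothetical container, not an STN type
    name: str
    argv: list[str]


def parse_desktop_entry(text: str) -> Optional[DesktopApp]:
    """Return a launchable app, or None if the entry is unusable or hidden."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.optionxform = str  # keep key case, e.g. "Name" and "Exec"
    try:
        parser.read_string(text)
    except configparser.Error:
        return None
    if not parser.has_section("Desktop Entry"):
        return None
    entry = parser["Desktop Entry"]
    if entry.get("Type", fallback="Application") != "Application":
        return None
    if entry.get("NoDisplay", fallback="false").lower() == "true":
        return None
    name = entry.get("Name")
    exec_line = entry.get("Exec")
    if not name or not exec_line:
        return None
    try:
        parts = shlex.split(exec_line)
    except ValueError:  # unbalanced quotes in Exec
        return None
    argv = [part for part in parts if part not in FIELD_CODES]
    return DesktopApp(name=name, argv=argv) if argv else None


def find_app(apps: list[DesktopApp], query: str) -> Optional[DesktopApp]:
    """Case-insensitive exact match on the display name."""
    wanted = query.strip().casefold()
    for app in apps:
        if app.name.casefold() == wanted:
            return app
    return None


SAMPLE = """[Desktop Entry]
Type=Application
Name=Text Editor
Exec=gedit %U
"""

app = parse_desktop_entry(SAMPLE)
```

Because the result is an argument list rather than a command string, it could be handed to subprocess.Popen without shell=True, which is one way to keep user-supplied application names away from a shell.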
Workflow
From natural language to verified action
STN is designed around a stateful execution loop. It does not discard incomplete commands; it tracks the active task and continues collecting only the missing fields.
Capture request
The Qt UI receives typed text or a transcript produced from microphone audio.
Extract intent
The orchestrator resolves the task type and converts natural language slots into strict JSON.
Complete parameters
If a field is missing, STN asks for that field and stores the pending state per chat.
Validate with agents
Deterministic agents normalize dates, addresses, providers, durations, and application names.
Execute through MCP
MCP tools validate inputs, call the local system or provider, log results, and return safe outputs.
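The parameter-completion step above can be sketched as a small per-chat state store. This is only an illustration of the idea: REQUIRED_FIELDS, PendingTask, PendingStore, and next_step are hypothetical names, and the field lists are assumptions rather than STN's real task schemas.

```python
from dataclasses import dataclass, field

# Assumed required fields per task type; the real schemas are not given here.
REQUIRED_FIELDS = {
    "email": ["recipient", "subject", "body"],
    "weather": ["city", "date"],
}


@dataclass
class PendingTask:
    task_type: str
    slots: dict = field(default_factory=dict)

    def missing(self) -> list:
        """Names of required fields that still have no value."""
        return [n for n in REQUIRED_FIELDS[self.task_type] if not self.slots.get(n)]


class PendingStore:
    """Keeps at most one unfinished task per chat."""

    def __init__(self):
        self._by_chat = {}

    def update(self, chat_id: str, task_type: str, new_slots: dict) -> PendingTask:
        task = self._by_chat.get(chat_id)
        if task is None or task.task_type != task_type:
            task = PendingTask(task_type)  # a new intent replaces the old one
            self._by_chat[chat_id] = task
        # Merge only non-empty values so earlier answers are not erased.
        task.slots.update({k: v for k, v in new_slots.items() if v})
        return task

    def clear(self, chat_id: str) -> None:
        self._by_chat.pop(chat_id, None)


def next_step(store: PendingStore, chat_id: str, task_type: str, new_slots: dict) -> dict:
    """Ask for the first missing field, or hand the completed task onward."""
    task = store.update(chat_id, task_type, new_slots)
    missing = task.missing()
    if missing:
        return {"action": "ask", "field": missing[0]}
    store.clear(chat_id)
    return {"action": "execute", "task": task.task_type, "slots": dict(task.slots)}


store = PendingStore()
first = next_step(store, "chat-1", "weather", {"city": "Ankara"})
second = next_step(store, "chat-1", "weather", {"date": "2025-01-01"})
```

In this sketch the first call returns a request for the date, and the second call completes the task using the city remembered from the earlier turn.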
Architecture
A modular stack built for local-first task automation
The implementation separates presentation, language understanding, task logic, tool execution, and persistence. This keeps the assistant extensible without letting one provider or one action dominate the whole system.
Presentation: voice, typed chat, quick actions
Language understanding: speech-to-text, intent detection, parameter completion
Task logic: email, meeting, notes, weather, alarm, app launch
Tool execution: local OS actions, provider calls, safe logging
Persistence: conversation memory, artifacts, user feedback
Orchestrator
Interprets the request, chooses the right task flow, tracks missing information, and handles approvals.
Agents
Apply deterministic task rules for email, meetings, notes, weather, alarms, and application launch.
MCP tool layer
Isolates real system and provider actions behind validated inputs, logs, timing, and safe error handling.
Local persistence
Keeps conversations, pending states, notes, meeting artifacts, email drafts, and tool activity on device.
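One way to picture the tool layer's responsibilities (validated inputs, logs, timing, safe error handling) is a single wrapper that every tool call passes through. The sketch below is plain Python and does not use any MCP SDK. The names run_tool, validate_alarm, and set_alarm, and the logger name, are hypothetical.

```python
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("stn.tools")  # assumed logger name


def run_tool(
    name: str,
    handler: Callable[..., Any],
    validate: Callable[[dict], dict],
    args: dict,
) -> dict:
    """Validate, execute, time, and log one tool call; never raise to the caller."""
    started = time.perf_counter()
    try:
        clean = validate(args)  # raises ValueError on bad input
        status = {"ok": True, "result": handler(**clean)}
    except ValueError as exc:
        status = {"ok": False, "error": f"invalid input: {exc}"}
    except Exception:
        # Details go to the log; the caller sees only a generic message.
        logger.exception("tool %s failed", name)
        status = {"ok": False, "error": "tool execution failed"}
    status["elapsed_ms"] = round((time.perf_counter() - started) * 1000, 2)
    logger.info("tool=%s ok=%s elapsed_ms=%s", name, status["ok"], status["elapsed_ms"])
    return status


def validate_alarm(args: dict) -> dict:
    """Example validator for a relative reminder."""
    minutes = args.get("minutes")
    if isinstance(minutes, bool) or not isinstance(minutes, int) or minutes <= 0:
        raise ValueError("minutes must be a positive integer")
    return {"minutes": minutes, "label": str(args.get("label", "Reminder"))}


def set_alarm(minutes: int, label: str) -> str:
    """Stand-in handler; a real tool would call a local scheduler."""
    return f"{label} in {minutes} min"


good = run_tool("alarm", set_alarm, validate_alarm, {"minutes": 10})
bad = run_tool("alarm", set_alarm, validate_alarm, {"minutes": -1})
```

Keeping validation and error handling in the wrapper means an individual tool only has to implement its action, and a failure in one tool comes back as a structured result instead of an exception in the chat flow.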
Tech Stack
The implementation stack behind STN
STN combines a native Linux desktop interface, speech processing, language orchestration, modular tool execution, and local persistence into one maintainable product pipeline.
Python
Main application logic, agent flows, integration code, and provider adapters.
PySide / Pardus
Native Qt desktop UI designed and tested for Pardus GNU/Linux environments.
Speech-to-Text
Microphone input is transcribed and passed into the same task pipeline as typed chat.
LLM Orchestration
Natural language requests are converted into intents, slots, follow-up questions, and approvals.
MCP Tools
System and provider actions are isolated behind validated tool interfaces and structured outputs.
Local Storage
Conversation history, artifacts, pending tasks, and user feedback stay on the device.
Email / Calendar / Weather
External service calls use previews, explicit confirmation where needed, and replaceable adapters.
Notes & Summaries
Requests, tool results, and meeting activity are turned into reusable notes and concise local summaries.
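The combination of previews, explicit confirmation, and replaceable adapters can be sketched with a small interface. This is an assumption-laden illustration, not STN's email code: EmailDraft, EmailSender, RecordingSender, preview, and send_if_approved are hypothetical names, and the recording adapter stands in for a real SMTP-backed one.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class EmailDraft:
    recipient: str
    subject: str
    body: str


class EmailSender(Protocol):
    """Any adapter with this method can be swapped in."""

    def send(self, draft: EmailDraft) -> str: ...


class RecordingSender:
    """Stand-in adapter that records drafts instead of sending them."""

    def __init__(self) -> None:
        self.sent: list = []

    def send(self, draft: EmailDraft) -> str:
        self.sent.append(draft)
        return f"queued:{len(self.sent)}"


def preview(draft: EmailDraft) -> str:
    """Text shown to the user before anything leaves the device."""
    return f"To: {draft.recipient}\nSubject: {draft.subject}\n\n{draft.body}"


def send_if_approved(draft: EmailDraft, sender: EmailSender, approved: bool) -> dict:
    """Call the adapter only after an explicit approval."""
    if not approved:
        return {"status": "cancelled", "preview": preview(draft)}
    return {"status": "sent", "receipt": sender.send(draft)}


sender = RecordingSender()
draft = EmailDraft("a@example.com", "Hello", "See you at 10.")
declined = send_if_approved(draft, sender, approved=False)
accepted = send_if_approved(draft, sender, approved=True)
```

Since send_if_approved depends only on the EmailSender protocol, a different provider could be introduced by writing another class with the same send method.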
Demo
A two-minute path through the main product features
The demo focuses on visible end-product behavior: natural input, parameter completion, approval, and real execution through modular tools.
Screenshots
Product screens
Video
Product demo video
Engineering
Built as a real desktop product, not just a chatbot demo
Local-first runtime
Offline-capable speech processing, local SQLite state, and optional external providers.
Approval boundaries
Email and meeting actions present structured previews before execution.
Provider flexibility
Meeting, email, weather, STT, and LLM components are isolated behind replaceable interfaces.
Regression coverage
Unit, integration, voice E2E, provider-validation, and packaging checks cover the demo paths.
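To make the local SQLite state concrete, the sketch below stores one pending task per chat using Python's built-in sqlite3 module. The table name, columns, and function names are assumptions for illustration. STN's actual schema is not described here.

```python
import json
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local database; ':memory:' keeps the demo self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_tasks ("
        " chat_id TEXT PRIMARY KEY,"
        " task_type TEXT NOT NULL,"
        " slots_json TEXT NOT NULL)"
    )
    return conn


def save_pending(conn: sqlite3.Connection, chat_id: str, task_type: str, slots: dict) -> None:
    """Insert or overwrite the single pending task for a chat."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO pending_tasks (chat_id, task_type, slots_json)"
            " VALUES (?, ?, ?)",
            (chat_id, task_type, json.dumps(slots)),
        )


def load_pending(conn: sqlite3.Connection, chat_id: str) -> Optional[tuple]:
    """Return (task_type, slots) for a chat, or None if nothing is pending."""
    row = conn.execute(
        "SELECT task_type, slots_json FROM pending_tasks WHERE chat_id = ?",
        (chat_id,),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])


conn = open_store()
save_pending(conn, "chat-1", "weather", {"city": "Ankara"})
save_pending(conn, "chat-1", "weather", {"city": "Ankara", "date": "tomorrow"})
restored = load_pending(conn, "chat-1")
```

Passing a file path instead of ":memory:" would let an unfinished task survive an application restart, which is the property the pending-state design relies on.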
Team
Project members and contact
Güven Yurtseven
Ersin Sert
Onur Ceylan
Shahin Aghayev