Adjutant - Building an AI-Powered Work Tracker

I have about 40 active projects. Not a typo. I have ADHD and I'm a software developer, which means I start things with terrifying enthusiasm and struggle to finish them. So I built a system to watch over all of it.

This is a building-in-public post. Adjutant is not finished. Parts of it work well, parts of it are held together with rsync and optimism, and I change the architecture every time I learn something new about how I actually work. But I've been using it daily for four months and it's changed how I relate to my own project sprawl, so it feels worth documenting where things stand right now.

The Problem

Here's a sample of what's on my plate at any given time: a game engine in C++, a humanoid robot, several Discord bots, a static blog (this one), infrastructure automation, ML experiments, game jam entries, a roguelike tutorial revision, and whatever I thought of in the shower this morning.

I use Claude Code (Anthropic's CLI tool) for almost all of my development. It's an AI coding assistant that runs in my terminal and can read, write, and reason about code. I have a Pro subscription and my north star metric is: use 99% of my weekly allocation before it resets.

The problem is that without a system, I'd spend my entire subscription on whichever project I happened to be excited about on Monday, then stare at the ceiling on Thursday wondering why nothing else moved forward. Even worse, I work across multiple machines -- a desktop, a laptop, and a headless server -- and the context from a session on one machine is completely invisible on another. I'd start debugging something on the desktop, pick it up on the laptop the next day, and waste twenty minutes reconstructing what I'd already figured out.

What Adjutant Does

Adjutant watches my work automatically. Every Claude Code session generates events -- prompts, tool usage, file changes, completions. A tool called ccmonitor captures these events into JSONL hook files. Adjutant's daemon reads those hooks and builds a picture of what I've been doing across every project, on every machine.
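As a rough sketch of this layer, here's how append-only JSONL hook events might be parsed -- the event field names below are my illustration, not ccmonitor's actual schema:

```python
import json

def parse_hook_events(lines):
    """Parse JSONL hook lines, skipping any partially-written last line."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # A sync mid-write can leave a truncated trailing line;
            # it will be complete on the next pass.
            continue
    return events

raw = [
    '{"event": "prompt", "session": "abc123", "cwd": "/home/me/mcrogueface"}',
    '{"event": "tool_use", "session": "abc123", "tool": "Edit"}',
    '{"event": "stop", "ses',  # writer was interrupted mid-line
]
events = parse_hook_events(raw)  # keeps the two complete events
```

Tolerating a truncated final line matters because rsync can copy a file while ccmonitor is still appending to it.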

The daemon is a FastAPI service (port 8420) with endpoints for sessions, projects, machines, dispatch, and reviews. All the session data gets stored in JMFTS, my local document store -- each session becomes a tree of documents (session, turns, tool calls) with knowledge graph triples linking sessions to projects and to each other. This means the brain can search across everything with full-text, vector, or graph queries. Here's what a simple query looks like:

# What did I work on today?
curl -s http://localhost:8420/sessions/today

# All projects, ranked by recent activity
curl -s http://localhost:8420/projects

# Send a follow-up prompt to a specific project
curl -s -X POST http://localhost:8420/dispatch \
  -H 'Content-Type: application/json' \
  -d '{"machine":"arecibo","cwd":"/path/to/project","prompt":"..."}'

The dispatch endpoint is the interesting one. I can write a detailed prompt -- with file paths, context from prior sessions, specific acceptance criteria -- and fire it at a machine. The machine launches a Claude Code session headlessly, runs the prompt, and the results get picked up by ccmonitor on the next sync.

In practice, this is where things get both powerful and janky. The dispatch works, but the headless sessions sometimes fail in ways that are hard to debug remotely. A session might hang waiting for user input on a permission prompt, or it might burn through its context window on a rabbit hole I didn't anticipate. I'm still figuring out the right guardrails.

There's also an MCP server that exposes all of this to Claude agents as tools. A brain session can search sessions, read reviews, check project status, and fire off dispatches without ever touching curl. This is the layer that makes the "Claude writing prompts for Claude" loop actually work smoothly.
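Conceptually, each MCP tool is just a named function the agent can call. Here's a toy registry with illustrative tool names and stubbed data -- the real tools talk to the daemon's endpoints instead:

```python
# Minimal sketch of the tool layer. TOOLS maps names to callables,
# which is essentially what an MCP server exposes to an agent.
TOOLS = {}

def tool(fn):
    """Register a function as an agent-callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def search_sessions(query: str) -> list:
    # In Adjutant this would hit the daemon's search endpoint;
    # stubbed with fake data so the sketch is self-contained.
    fake_index = [
        {"session": "abc123", "summary": "refactored input handler"},
        {"session": "def456", "summary": "fixed blog deploy script"},
    ]
    return [s for s in fake_index if query in s["summary"]]

@tool
def dispatch(machine: str, cwd: str, prompt: str) -> dict:
    # Would POST to /dispatch; returns an acknowledgement here.
    return {"machine": machine, "cwd": cwd, "queued": True}
```

The payoff is that a brain session can compose these calls itself instead of me shelling out to curl.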

The Architecture: An Organ Metaphor

I think about Adjutant as a body. This sounds grandiose, but it's actually just a useful way to name the parts so I can talk about them without getting lost in implementation details:

  • Connective tissue: ccmonitor hooks, rsync, JMFTS, and the data pipeline that moves session logs between machines and indexes them. This is the boring part that has to work perfectly, and it mostly does. Rsync runs on a timer inside the daemon, and ccmonitor hooks fire on every Claude Code event. The JSONL files are append-only, which makes syncing reliable. A file watcher picks up changes incrementally and ingests them into JMFTS with full turn-level detail.
  • Brain: A Claude Code session in a dedicated planning directory that reads all the data, reviews sessions, and writes dispatch prompts. This is the part I use every day -- I sit down, run a brain session, and it tells me what happened since I last looked and what needs attention.
  • Senses: The interfaces for seeing what's going on. I built a Textual TUI that shows a live dashboard of sessions -- sortable, filterable by machine and status, with detail screens for individual sessions and a dispatch panel. It's functional but still rough around the edges; I keep finding new things I want it to show. I also still use curl and jq for quick queries. Eventually I want to hook up voice commands too -- I already record voice transcripts on my phone that the brain can read, so closing the loop from "voice in" to "action out" is the natural next step.
  • Heart: A local LLM persona layer (future work) that adds personality and proactive engagement. This is the most ambitious and least-defined part, and I've already tried and failed at it once.

The metaphor breaks down if you push on it, but it helps me think about which subsystem I'm working on without conflating them.
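The append-only property is what makes incremental ingestion simple. Here's a minimal sketch of offset-based tailing, assuming the watcher persists one byte offset per file (the actual ingester's internals aren't shown in this post):

```python
import json
import os
import tempfile

def ingest_incremental(path, offsets):
    """Read only the bytes appended since the last pass.

    `offsets` maps file path -> byte position already ingested. Because
    the JSONL logs are append-only, resuming from the saved offset never
    re-reads or misses an event.
    """
    new_events = []
    pos = offsets.get(path, 0)
    with open(path, "rb") as f:
        f.seek(pos)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial trailing line; pick it up next pass
            new_events.append(json.loads(raw))
            pos += len(raw)
    offsets[path] = pos
    return new_events

# Demo: two complete events plus a truncated tail.
demo = os.path.join(tempfile.mkdtemp(), "hooks.jsonl")
with open(demo, "wb") as f:
    f.write(b'{"event": "prompt"}\n{"event": "stop"}\n{"ev')
offsets = {}
first_pass = ingest_incremental(demo, offsets)   # two events
with open(demo, "ab") as f:
    f.write(b'ent": "tool_use"}\n')               # writer finishes the line
second_pass = ingest_incremental(demo, offsets)  # one new event
```

Tracking offsets instead of re-reading whole files is why a 2GB log directory stays cheap to watch.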

The Nudge Process

This is the core loop:

  1. Brain reads recent sessions and their reviews
  2. Brain reads recent voice transcripts for context on what I'm thinking about
  3. Brain picks a project that needs attention and writes a detailed dispatch prompt
  4. Dispatch sends the prompt to the target machine via SSH
  5. Claude Code runs the prompt headlessly
  6. Results are synced back and reviewed for the next brain cycle
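Condensed into code, one brain cycle looks roughly like this -- every function and field name below is hypothetical, since the real loop runs inside a Claude Code planning session, not a single script:

```python
def brain_cycle(sessions, transcripts):
    """Pick one project that needs attention and draft a dispatch for it."""
    # Steps 1-2: gather context -- reviewed sessions plus voice-note themes.
    reviewed = [s for s in sessions if s.get("review")]
    themes = {word for t in transcripts for word in t.split()}

    # Step 3: prefer projects I've been talking about, oldest work first.
    candidates = sorted(
        reviewed,
        key=lambda s: (s["project"] in themes, s["age_days"]),
        reverse=True,
    )
    target = candidates[0]

    # Steps 4-5: a dispatch prompt bundles prior context and
    # acceptance criteria so the headless session can't drift.
    return {
        "machine": target["machine"],
        "cwd": target["cwd"],
        "prompt": (
            f"Continue work on {target['project']}. "
            f"Last review: {target['review']} "
            "Acceptance: tests pass and changes are committed."
        ),
    }

sessions = [
    {"project": "mcrogueface", "machine": "arecibo",
     "cwd": "/home/me/mcrogueface", "review": "refactor half done.",
     "age_days": 2},
    {"project": "blog", "machine": "laptop",
     "cwd": "/home/me/blog", "review": "deploy fixed.", "age_days": 5},
]
transcripts = ["thinking about mcrogueface input handling today"]
draft = brain_cycle(sessions, transcripts)
```

The word-overlap scoring is deliberately naive here; the actual brain does this reasoning in natural language over JMFTS queries.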

The reviews are worth calling out. When a session finishes, Adjutant can generate a structured review using claude --print -- what changed, what's left to do, what broke. These get cached as children of the session in JMFTS, so the brain works with concise summaries instead of reading raw session logs every time. This is what makes the loop scale beyond a handful of projects.
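A review call can be as simple as a one-shot `claude --print` invocation from the daemon; the prompt template below is my illustration, not Adjutant's actual wording:

```python
import subprocess

REVIEW_PROMPT = (
    "Summarize this Claude Code session in one paragraph: "
    "what changed, what's left to do, what broke.\n\n{log}"
)

def build_review_command(session_log: str) -> list:
    """Build the one-shot claude invocation for a session review."""
    return ["claude", "--print", REVIEW_PROMPT.format(log=session_log)]

def generate_review(session_log: str) -> str:
    # --print makes claude answer once on stdout and exit, which is
    # exactly what a fire-and-forget review pipeline needs.
    result = subprocess.run(
        build_review_command(session_log),
        capture_output=True, text=True, timeout=300,
    )
    return result.stdout.strip()

cmd = build_review_command("turn 1: edited input.py ...")
```

Caching the resulting paragraph as a child document of the session means the expensive call happens once, not on every brain cycle.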

The key insight is that prompt writing is the bottleneck. I can't type fast enough to keep multiple Claude Code sessions busy. But Claude can write prompts for Claude, and the prompts it writes are better than mine -- they include file paths, prior session context, and specific acceptance criteria that I would forget or be too lazy to include.

I described this to a friend as "five Claudes writing prompts for five other Claudes," which sounds insane and is kind of insane. But the reality is more like one Claude carefully reviewing what happened and writing one or two targeted prompts. It's less of a swarm and more of a thoughtful relay.

What Actually Works

Session tracking is solid. I can query what I worked on across any time range, see which projects are active vs. dormant, and get summaries of what each session accomplished. The cross-machine sync means I can start work on my desktop, check the summary on my phone, and pick it up on my laptop without losing context. That alone justified building the thing.

The brain sessions are genuinely useful. When I sit down in the morning and run a planning session, it reads everything that happened overnight (I sometimes dispatch work to run on the server while I sleep) and gives me a coherent picture. It catches things I would miss -- "you started refactoring the input handler in McRogueFace two days ago and never committed the changes" -- because it's reading the actual session logs, not just my recollection.

The TUI is nice to have. I can glance at it in the morning and see which machines have active sessions, what finished overnight, and which projects have gone quiet. It's not pretty, but it's mine and it does the job.

Review generation is solid. The cached reviews mean the brain doesn't have to re-read thousands of lines of session logs -- it gets a paragraph per session and can focus on deciding what to do next.

Dispatch works but requires babysitting. About 70% of headless sessions complete successfully. The other 30% hit permission prompts, context limits, or go down rabbit holes. I'm getting better at writing dispatch prompts that avoid these failure modes, but it's still more art than science.

What Doesn't Work Yet

The heart layer -- the AI persona -- was a bust. I wanted something that felt like a curious collaborator: asking good questions, noticing patterns in my work, maybe having a bit of personality. What I got was an LLM that hedged every statement, disclaimed everything, and was about as engaging as a corporate FAQ. I shut it down in frustration after a few weeks.

I think the problem is that personality needs to emerge from genuine capability, not be painted on top. Once Adjutant is good enough at its core job -- tracking, reviewing, dispatching -- then maybe a persona layer will have enough real context to say interesting things. Bolting personality onto a system that's still figuring out its own fundamentals was premature.

Automatic project detection is flaky. Adjutant infers project boundaries from working directories, but some projects span multiple repos, some repos contain multiple projects, and my naming conventions are inconsistent enough that the heuristics break regularly.
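To make the failure mode concrete, here's roughly the kind of heuristic involved -- the roots and rule are hypothetical, but this is the shape of logic that breaks on multi-repo projects and inconsistent naming:

```python
from pathlib import Path

# Hypothetical projects roots; the first directory under a root is
# treated as the project name.
ROOTS = [Path("/home/me/projects"), Path("/home/me/src")]

def infer_project(cwd: str) -> str:
    """Guess a project name from a session's working directory."""
    p = Path(cwd)
    for root in ROOTS:
        try:
            rel = p.relative_to(root)
        except ValueError:
            continue  # cwd is not under this root
        if rel.parts:
            return rel.parts[0]
    return p.name  # fallback: last path component

project = infer_project("/home/me/projects/mcrogueface/src")
```

A project split across `/home/me/projects/blog` and `/home/me/src/blog-theme` defeats this immediately, which is exactly the flakiness described above.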

The Numbers

As of today, Adjutant has tracked over 1,500 Claude Code sessions across 60+ projects on 3 machines over about 4 months. The busiest project (McRogueFace) has over 200 sessions. The system that tracks all of this (Adjutant itself) has nearly 800 sessions. Yes, the meta-irony of spending that many sessions on a productivity tracker is not lost on me.

The session data is around 2GB of JSONL at this point. Queries are fast enough because most of them only look at recent data, but I'll need to think about archiving eventually.

What's Next

Immediate priorities:

  • Better dispatch reliability: Writing prompt templates that avoid common failure modes, adding timeout and retry logic, maybe a simple health check before dispatching to a machine. The 70% success rate needs to be 90%+ before I trust it to run unsupervised overnight.
  • Project summaries: The daemon tracks projects by working directory, but I want richer per-project briefings -- what's the current state, what's blocking, what should happen next. The data is there in JMFTS; it's a matter of building the right queries and presentation.
  • Initiative tracking: Goals that span multiple sessions and machines. Right now the brain sees individual sessions, but it can't reason about higher-level objectives like "ship the blog redesign" that might involve work across three repos on two machines over a week.
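The reliability work lends itself to a small, testable guardrail. Here's a minimal sketch of health-check-then-retry, with the dispatch and health-check functions injected so it can be exercised without real machines (nothing here is Adjutant's current code):

```python
import time

def dispatch_with_retry(send, machine_up, attempt_args,
                        retries=2, backoff_s=30.0):
    """Health-check a machine, then retry a dispatch a bounded number of times.

    `send` fires the dispatch and returns True on success; `machine_up`
    is the pre-flight health check.
    """
    if not machine_up(attempt_args["machine"]):
        return {"ok": False, "reason": "machine unreachable"}
    for attempt in range(retries + 1):
        if send(attempt_args):
            return {"ok": True, "attempts": attempt + 1}
        if attempt < retries:
            time.sleep(backoff_s)  # back off before retrying
    return {"ok": False, "reason": "all attempts failed"}

# Demo: a dispatch that fails once, then succeeds.
calls = {"n": 0}
def flaky_send(args):
    calls["n"] += 1
    return calls["n"] >= 2

result = dispatch_with_retry(flaky_send, lambda m: True,
                             {"machine": "arecibo"}, backoff_s=0.0)
```

Bounded retries matter for overnight runs: an unbounded loop against a wedged session would burn the weekly allocation while I sleep.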

Longer term, I still want to revisit the heart layer. But I'm going to wait until the system is boring and reliable before trying to make it interesting.

For now, Adjutant makes sure nothing falls through the cracks for too long. When I inevitably hyperfocus on one project for a week, it's there to remind me what else exists. And when I sit down with no idea what to work on, it can tell me exactly where I left off on everything.

The meta-lesson is that building a productivity system for ADHD is itself an ADHD project -- I have to resist the urge to keep adding features when the fundamentals need hardening. But four months in, I'm using it every day, and the sprawl feels less like chaos and more like a garden I'm actually tending.

I'll write a follow-up when the initiative tracking lands. For now, this is where things stand.


This article was scaffolded with backblog.
