Work. Open projects, tools, and research artifacts.

2026 · agent infrastructure

persistent scheduler for agent runtimes

A local CLI for schedules that should outlive a runtime process: reminders, lifecycle commands, logs, and fire events.

Built because hidden timers are hard to trust.

try here

2026 · multi-agent harness

swarm harness

An executable contract harness for agent coordination: message semantics, task ownership, routing, runtime prompts, and safety gates.

Fixture checks and runtime verifiers make the protocol reviewable.

try here

2026 · research

attackable problems

A shortlist of open problems worth giving to frontier models: each one has a clear verifier, a first-week attack plan, and a reason to stop.

Selection rules, first-week attack plans, and stop rules make each item actionable.

public repo

2025 · research

archaeologist report on emergent abilities of LLMs

A report tracing the emergent-abilities debate through three phases: the initial discovery, the 2023 mirage critique, and the refinements that followed.

Built around the influential paper "Are Emergent Abilities of Large Language Models a Mirage?"

public repo / [pdf]

2026 · interactive piece

wellspring

A single-page interactive piece where old poems behave like a water surface: ambient idle waves, ripples that react to scroll.

An alternative interface for blog.diangao.space.

live

2026 · design harness

artifact style kit

Utilities for turning a visual reference into an inspectable style loop: asset collection, contact sheets, prompt notes, and candidate review.

The taste check lives in visible source files and review artifacts.

try here

2025 · personal tools

SPARK

A small proactive accountability agent in Telegram that turns self-reminders into visible, recurring nudges.

A personal-tools artifact from the same local-first agent-tool thread.

try here

2025 · agent workspace

think OS

A long-running agent workspace for exploring versatile coding agents and their general-purpose use.

Mostly boring on purpose.

try here