The Decision That Shapes Everything
When we started building CodePulse, the first architectural decision was the most important one: where does the data live?
CodePulse monitors everything your AI coding agent does. It sees every file edit, every shell command, every commit diff, every permission request, and every token count. That is sensitive data. It includes your source code, your project structure, your development patterns, and potentially secrets that pass through the terminal.
We had three options: a cloud database, a local database, or flat files. We chose flat files. Specifically, JSONL (JSON Lines) stored on your local disk, with no cloud component, no synchronization service, and no third-party storage.
This article explains why.
Why Not a Cloud Database?
The obvious choice for a modern developer tool is a cloud-hosted database: Supabase, PlanetScale, Firebase, or a managed PostgreSQL instance. You get real-time sync, multi-device access, backups, and a dashboard with very little work.
But for CodePulse, a cloud database would have been the wrong call for three reasons.
Your code diffs are not our business
CodePulse intercepts git commits and sends their diffs to a fast AI model for review. Those diffs contain your actual source code. Routing that data through our servers, even encrypted, creates a trust problem that no privacy policy can solve. The only way to guarantee we never see your code is to never let it touch infrastructure we operate.
Latency kills the experience
CodePulse sits in the critical path between Claude Code and your Telegram bot. When Claude asks for permission to run a command, you want that notification instantly. A round trip to a cloud database adds 50-200ms of latency per event. For a tool that processes hundreds of events per session, that adds up: 300 events at 100ms each is 30 seconds of accumulated waiting. A local append typically completes in well under a millisecond.
Offline must work
Developers work on trains, planes, and in coffee shops with unreliable WiFi. CodePulse must continue logging and processing events even when the network is down. With local storage, the tool never stops working because a database connection failed.
Why Not SQLite?
SQLite is the usual answer for "local but structured." It is embedded, fast, and battle-tested. Many local-first tools use it, and for good reason.
We considered SQLite and decided against it for CodePulse. Here is why.
Append-only workload
CodePulse's primary data operation is appending events. A session generates a stream of events: tool calls, permission requests, notifications, token counts. These events are written once and read occasionally, typically when generating a Morning Briefing or replaying a session.
For append-only workloads, JSONL is simpler than SQLite and typically faster. Appending a line to a file is a single fs.appendFile call: no schema, no indexes to update, no write-ahead log, no page allocation. The operating system handles buffering and flushing.
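The append path described above can be sketched in a few lines. This is a minimal illustration, not CodePulse's actual code: the `appendEvent` helper and the demo file location are hypothetical, and it uses the synchronous `fs.appendFileSync` only to keep the example short.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: append one event as a single JSON line.
// JSON.stringify without an indent argument never emits raw newlines,
// so each event occupies exactly one line of the file.
function appendEvent(filePath, event) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.appendFileSync(filePath, JSON.stringify(event) + "\n");
}

// Usage: write two events to a throwaway file and read them back.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "codepulse-demo-"));
const demoFile = path.join(demoDir, "demo.jsonl");

appendEvent(demoFile, {
  timestamp: new Date().toISOString(),
  type: "tool_use",
  session_id: "abc123",
  data: { tool_name: "Bash", command: "npm test" },
});
appendEvent(demoFile, {
  timestamp: new Date().toISOString(),
  type: "notification",
  session_id: "abc123",
  data: {},
});

const lines = fs.readFileSync(demoFile, "utf8").trimEnd().split("\n");
console.log(lines.length, "events written");
```

A long-running process would more likely use the asynchronous `fs.appendFile` (or `fs.promises.appendFile`) so that writes do not block the event loop.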
Schema evolution
CodePulse is under active development. The shape of events changes frequently. New fields get added, old fields get deprecated, and entirely new event types appear. With JSONL, schema evolution is trivial: you add fields to new events and ignore them in old ones. Every line is a self-contained JSON object. There is no migration step, no ALTER TABLE, and no version tracking.
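As a rough sketch of what "ignore them in old ones" can look like on the reading side, the hypothetical `normalizeEvent` function below fills in a default for a field that older events never wrote. Treating `duration_ms` as a later addition is an assumption made purely for illustration.

```javascript
// Hypothetical reader-side helper: accept events written by any version.
// Fields that older versions never wrote get a default; fields this code
// does not know about pass through untouched.
function normalizeEvent(raw) {
  const data = raw.data ?? {};
  return {
    ...raw,
    data: {
      ...data,
      // Assumed (for illustration) to have been added in a later release.
      duration_ms: data.duration_ms ?? null,
    },
  };
}

// An "old" line without duration_ms and a "new" line with it.
const oldLine =
  '{"timestamp":"2026-02-24T09:00:00.000Z","type":"tool_use","session_id":"ghi789","data":{"tool_name":"Bash","command":"npm test"}}';
const newLine =
  '{"timestamp":"2026-02-25T14:30:00.000Z","type":"tool_use","session_id":"abc123","data":{"tool_name":"Bash","command":"npm test","duration_ms":3200,"exit_code":0}}';

const [oldEvent, newEvent] = [oldLine, newLine].map((line) =>
  normalizeEvent(JSON.parse(line))
);
console.log(oldEvent.data.duration_ms, newEvent.data.duration_ms);
```

Both lines live happily in the same file; the only "migration" is a default applied at read time.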
With SQLite, adding a column requires a migration. Changing a column type requires copying the table. If a user updates CodePulse and the schema has changed, the tool needs migration logic that handles every possible previous version. For a tool that is iterated on daily, that migration burden is significant.
Human readability
JSONL files are plain text. You can open them in any editor, pipe them through jq, grep for specific events, or tail them in real time. This matters for a developer tool. When something goes wrong, you want to inspect the data directly, not write SQL queries against a binary file.
# Find all commit events from today
cat ~/.codepulse/sessions/2026-02-25/*.jsonl | jq 'select(.type == "commit")'
# Sum token usage across all sessions (assumes the count is stored at .data.tokens)
cat ~/.codepulse/sessions/*/*.jsonl | jq -s 'map(select(.type == "token_usage") | .data.tokens) | add'
# Tail events in real time
tail -f ~/.codepulse/sessions/current.jsonl | jq .
Try doing that with SQLite.
No binary dependencies
Using SQLite from Node.js means adding either a native binding such as better-sqlite3 or a WebAssembly build such as sql.js. Native bindings mean platform-specific compilation, potential issues with Node.js version mismatches, and extra installation steps. sql.js avoids native compilation by using WebAssembly, but it is slower and uses more memory.
JSONL requires fs.appendFile and readline. Both are built into Node.js. Zero dependencies. The install step is npm install and nothing else.
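For the reading side, a minimal sketch with the built-in `readline` module might look like the following. The `parseLine` and `readEvents` names are hypothetical, and skipping malformed lines (for example, a final line cut short by a crash mid-write) is one possible policy rather than a description of what CodePulse does.

```javascript
const fs = require("node:fs");
const readline = require("node:readline");

// Parse one line; return null for blank or malformed lines.
function parseLine(line) {
  if (line.trim() === "") return null;
  try {
    return JSON.parse(line);
  } catch {
    return null;
  }
}

// Stream a session file line by line instead of loading it all at once.
async function readEvents(filePath) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath, { encoding: "utf8" }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const events = [];
  for await (const line of rl) {
    const event = parseLine(line);
    if (event !== null) events.push(event);
  }
  return events;
}
```

A caller would use it as `readEvents(somePath).then((events) => console.log(events.length))`, where `somePath` points at any session file.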
The JSONL Architecture
Here is how CodePulse structures its local data.
Session files
Each Claude Code session gets its own JSONL file. The filename includes the session ID and timestamp. Every event during that session is appended as a single JSON line.
~/.codepulse/
  sessions/
    2026-02-25/
      session-abc123-1740000000.jsonl
      session-def456-1740003600.jsonl
    2026-02-24/
      session-ghi789-1739913600.jsonl
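One way to derive such a path is sketched below. The `sessionFilePath` helper is hypothetical, and it assumes two things the article does not state: that the date folder is the UTC date of the session start, and that the filename suffix is the start time in Unix seconds.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: build the JSONL path for a session.
// Assumptions: UTC date folder, Unix-seconds suffix in the filename.
function sessionFilePath(
  sessionId,
  startedAt,
  root = path.join(os.homedir(), ".codepulse")
) {
  const dateDir = startedAt.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const epochSeconds = Math.floor(startedAt.getTime() / 1000);
  return path.join(
    root,
    "sessions",
    dateDir,
    `session-${sessionId}-${epochSeconds}.jsonl`
  );
}

// Usage with an explicit root so the example does not touch ~/.codepulse.
const examplePath = sessionFilePath(
  "abc123",
  new Date("2026-02-25T14:30:00.000Z"),
  "/tmp/codepulse-demo"
);
console.log(examplePath);
```

Deriving both the folder and the suffix from the same start time keeps the two in agreement, which makes it easy to locate a session from either piece of information.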
Event structure
Every event follows the same top-level structure:
{
  "timestamp": "2026-02-25T14:30:00.000Z",
  "type": "tool_use",
  "session_id": "abc123",
  "data": {
    "tool_name": "Bash",
    "command": "npm test",
    "duration_ms": 3200,
    "exit_code": 0
  }
}
The type field determines the shape of data. Common types include tool_use, permission_request, permission_response, commit, token_usage, notification, and user_message.
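To make the shared envelope concrete, here is a small sketch that checks the four top-level fields and reports anything unexpected. The `checkEnvelope` function is hypothetical; because the shape of `data` varies by type, it only confirms that `data` is an object and leaves per-type checks to the caller.

```javascript
// Event types named in the article; anything else is reported as unknown.
const KNOWN_TYPES = new Set([
  "tool_use",
  "permission_request",
  "permission_response",
  "commit",
  "token_usage",
  "notification",
  "user_message",
]);

// Return a list of problems with the envelope (empty list means it looks fine).
function checkEnvelope(event) {
  if (typeof event !== "object" || event === null) return ["not an object"];
  const problems = [];
  if (Number.isNaN(Date.parse(event.timestamp))) problems.push("bad timestamp");
  if (!KNOWN_TYPES.has(event.type)) problems.push(`unknown type: ${event.type}`);
  if (typeof event.session_id !== "string") problems.push("missing session_id");
  if (typeof event.data !== "object" || event.data === null) {
    problems.push("missing data");
  }
  return problems;
}

const sample = {
  timestamp: "2026-02-25T14:30:00.000Z",
  type: "tool_use",
  session_id: "abc123",
  data: { tool_name: "Bash", command: "npm test", duration_ms: 3200, exit_code: 0 },
};
console.log(checkEnvelope(sample)); // no problems for the sample event
console.log(checkEnvelope({ type: "mystery" }));
```

Since new event types appear over time, a reader would probably log an unknown type and keep going rather than reject the line.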
Rotation and cleanup
Session files are organized by date. CodePulse does not delete old files automatically. You control your data retention. If you want to keep six months of history, keep the files. If you want to purge last week, delete those date directories. No database vacuum, no orphaned records, no referential integrity concerns.
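If you prefer a script to manual deletion, a retention check can be as small as the sketch below. The `expiredDateDirs` function is a hypothetical user-side helper, not part of CodePulse; it only selects folders and leaves the actual removal (for example with `fs.rmSync(dir, { recursive: true })`) to you.

```javascript
// Given folder names under sessions/, return the date-named folders that
// fall outside the retention window. Non-date names are ignored.
function expiredDateDirs(dirNames, today, keepDays) {
  const cutoff = new Date(today);
  cutoff.setUTCDate(cutoff.getUTCDate() - keepDays);
  const cutoffName = cutoff.toISOString().slice(0, 10);
  // ISO dates sort lexicographically, so plain string comparison works.
  return dirNames.filter(
    (name) => /^\d{4}-\d{2}-\d{2}$/.test(name) && name < cutoffName
  );
}

// Usage: keep the last 7 days relative to 2026-02-25.
const expired = expiredDateDirs(
  ["2026-02-10", "2026-02-24", "2026-02-25", "notes"],
  new Date("2026-02-25T00:00:00Z"),
  7
);
console.log(expired);
```

Separating selection from deletion makes it easy to print the list first and confirm it before anything is removed.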
For users who generate large volumes of data, the Premium tier includes automatic rotation policies that compress old sessions and archive them.
Security by Architecture
Local-first is not just a performance choice. It is a security architecture.
No server-side attack surface
CodePulse has no server, no API endpoint, no authentication system, and no user accounts. There is no cloud infrastructure to breach, no database to dump, no server-side API keys to leak. The attack surface is your local filesystem, which is already protected by your operating system's access controls.
No data in transit
Apart from commit review, described next, the only network traffic CodePulse generates is between your machine and the Telegram API. That traffic contains notification messages and your responses, not your source code or session data. The Telegram Bot API uses HTTPS with TLS 1.2+, so even those messages are encrypted in transit.
Commit diffs sent to the AI review model (Haiku) are processed through Anthropic's API with the same security guarantees as any Claude API call. The diffs are not stored by Anthropic and are not used for model training.
Auditable by design
Every piece of data CodePulse handles is in a plain text JSONL file on your disk. You can audit exactly what is stored, when it was stored, and what it contains. There is no opaque database format, no encrypted blob storage, and no proprietary data format. If you want to verify that CodePulse is not storing something it should not be, open the file and look.
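Beyond opening a file by hand, a short script can summarize what is actually being stored. The sketch below lists the `data` fields seen for each event type; `storedFieldsByType` is a hypothetical helper, and the `tokens` field on `token_usage` events is an assumption used for the sample input.

```javascript
// Summarize which `data` fields appear for each event type.
function storedFieldsByType(events) {
  const byType = new Map();
  for (const event of events) {
    if (!byType.has(event.type)) byType.set(event.type, new Set());
    for (const key of Object.keys(event.data ?? {})) {
      byType.get(event.type).add(key);
    }
  }
  // Convert sets to sorted arrays for stable, printable output.
  return Object.fromEntries(
    [...byType].map(([type, fields]) => [type, [...fields].sort()])
  );
}

const auditSample = [
  { type: "tool_use", session_id: "abc123", data: { tool_name: "Bash", command: "npm test" } },
  { type: "tool_use", session_id: "abc123", data: { tool_name: "Bash", exit_code: 0 } },
  { type: "token_usage", session_id: "abc123", data: { tokens: 1200 } }, // assumed field
];
const fieldSummary = storedFieldsByType(auditSample);
console.log(fieldSummary);
```

An unexpected field name in the summary is a cue to look at the raw lines for that event type.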
Trade-offs We Accept
Local-first is not free. There are real trade-offs.
No multi-device sync
Your session data lives on one machine. If you use Claude Code on your desktop and your laptop, each machine has its own CodePulse data. We considered adding optional sync through user-provided storage (like a personal S3 bucket or a Syncthing folder), but decided against the complexity for now. The Telegram integration provides the cross-device visibility that matters most: you can see what any machine is doing from your phone.
No web dashboard
There is no hosted dashboard where you can browse your session history in a web browser. The Morning Briefing and Replay features deliver summaries through Telegram, which is sufficient for most use cases. For deeper analysis, the JSONL files are available for any tool you prefer.
Limited query capability
You cannot run a SQL query like "show me all sessions where token usage exceeded 100k tokens in the last month." You can write a shell script that does the same thing with jq, but it is not as convenient. For users who need advanced analytics, exporting JSONL to a local SQLite database or a Jupyter notebook is straightforward.
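The example query can also be answered in a few lines of Node.js once the events are parsed. This is a sketch under assumptions: `sessionsOverTokenBudget` is a hypothetical helper, and it assumes each `token_usage` event carries its count at `data.tokens`.

```javascript
// Return the IDs of sessions whose summed token usage exceeds the threshold.
function sessionsOverTokenBudget(events, threshold) {
  const totals = new Map();
  for (const event of events) {
    if (event.type !== "token_usage") continue;
    // Assumed field location; missing or non-numeric values count as 0.
    const tokens = Number(event.data?.tokens) || 0;
    totals.set(event.session_id, (totals.get(event.session_id) ?? 0) + tokens);
  }
  return [...totals]
    .filter(([, total]) => total > threshold)
    .map(([sessionId]) => sessionId);
}

const usageEvents = [
  { type: "token_usage", session_id: "abc123", data: { tokens: 60000 } },
  { type: "tool_use", session_id: "abc123", data: { tool_name: "Bash" } },
  { type: "token_usage", session_id: "abc123", data: { tokens: 50000 } },
  { type: "token_usage", session_id: "def456", data: { tokens: 40000 } },
];
const heavySessions = sessionsOverTokenBudget(usageEvents, 100000);
console.log(heavySessions);
```

Restricting the query to the last month is then a matter of which date folders you read events from, since the directory names already encode the date.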
Why This Matters for Developer Tools
The developer tools industry has a growing trust problem. SaaS tools that touch source code, from code review platforms to CI/CD pipelines, represent a concentrated target. A breach at any one of these services exposes the codebases of every customer.
Local-first architecture removes that risk category for stored data. Your session history is never uploaded to a service we run. There is no central repository of customer data to breach, no admin panel an attacker can access, and no third-party storage provider holding your code.
For CodePulse, local-first is not a feature. It is the foundation. Every other design decision flows from it: the JSONL format, the Telegram integration, and the hook-based architecture. They all serve the same principle: your data, your machine, your control.
See It In Action
CodePulse's full feature set, from Morning Briefings to AI-powered commit review, keeps its data on your local machine. Check out the Features page to see how local-first architecture enables every capability without compromising your privacy.