ocloc: My First Rust Project (And It's 23x Faster Than cloc)
So I finally did it. I learned Rust.
Well, "learned" might be generous. I built something in Rust that actually works, ships to real package managers, and doesn't segfault. Close enough, right?
Meet ocloc (pronounced "oh-clock"). It's a lines-of-code counter that runs at about 2.3 million lines per second on my M2 MacBook. The entire Elasticsearch codebase (31,000 files, 5.5 million lines) analyzed in 2.35 seconds.
For comparison, cloc takes 56 seconds on the same repo. That's a 23x speedup.
Why build yet another LOC counter?
Look, cloc is a fantastic tool. I've used it for years. But every time I ran it on a large monorepo, I'd go make coffee while waiting for it to finish.
I wanted something that finished before I could even think about getting up.
Also, I really wanted to learn Rust. And what better way to learn a systems language than to build something that actually needs to be fast?
The stack that makes it fast
A few crates do the heavy lifting:
- Rayon for dead-simple parallelism. Rust's fearless concurrency is real. I just slap .par_iter() on things and it spreads work across all CPU cores. No race conditions, no mutex nightmares.
- memmap2 for memory-mapped file I/O on large files. Instead of copying each file into a buffer through repeated read calls, the kernel maps it straight into the process's address space. Massive syscall savings.
- memchr for finding newlines. This uses SIMD instructions under the hood, so we get vectorized searching that processes multiple bytes in a single CPU instruction.
- ignore crate (from the ripgrep authors) for respecting .gitignore. Because nobody wants their node_modules counted.
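To make the "split the files across cores, count newlines in each" idea concrete, here's a rough sketch. It is not ocloc's actual code: to keep it dependency-free I've swapped Rayon's .par_iter() for std::thread::scope and memchr for a plain byte iterator, and the function names (count_newlines, count_lines_parallel) are made up for illustration.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;

/// Count b'\n' bytes in a buffer. A plain iterator stands in for the
/// SIMD-accelerated search that the memchr crate would provide.
fn count_newlines(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| b == b'\n').count()
}

/// Sum newline counts over all files, splitting the file list across
/// `workers` scoped threads (a std-only stand-in for Rayon's `par_iter()`).
fn count_lines_parallel(paths: &[PathBuf], workers: usize) -> io::Result<usize> {
    if paths.is_empty() {
        return Ok(0);
    }
    let workers = workers.max(1);
    // Ceiling division so every path lands in exactly one chunk.
    let chunk_size = (paths.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Spawn one thread per chunk; each returns its partial total.
        let handles: Vec<_> = paths
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || -> io::Result<usize> {
                    let mut total = 0;
                    for path in chunk {
                        total += count_newlines(&fs::read(path)?);
                    }
                    Ok(total)
                })
            })
            .collect();

        // Join the workers and add up the partial totals.
        let mut total = 0;
        for handle in handles {
            total += handle.join().expect("worker thread panicked")?;
        }
        Ok(total)
    })
}
```

The static chunking here is the crude part. Rayon uses work stealing, so one huge file doesn't leave the other cores idle, which is a big reason to reach for it instead of hand-rolled threads.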
The analyzer itself is pretty straightforward: detect language by extension/filename/shebang, then count lines while tracking whether we're inside a block comment. Most of the speed comes from avoiding the obvious mistakes, like reading files synchronously one at a time.
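Here's roughly what that detect-then-count loop looks like in miniature. Treat it as a sketch under assumptions, not the real analyzer: the Syntax table, the detect and count functions, and the four languages are all invented for this example, and it deliberately ignores harder cases like comment markers inside string literals, nested block comments, and code that shares a line with the start of a block comment.

```rust
use std::path::Path;

/// Comment syntax for one language (illustrative table, not ocloc's own).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Syntax {
    name: &'static str,
    line_comment: &'static str,
    block: Option<(&'static str, &'static str)>,
}

const RUST: Syntax = Syntax { name: "Rust", line_comment: "//", block: Some(("/*", "*/")) };
const PYTHON: Syntax = Syntax { name: "Python", line_comment: "#", block: None };
const SHELL: Syntax = Syntax { name: "Shell", line_comment: "#", block: None };
const MAKE: Syntax = Syntax { name: "Make", line_comment: "#", block: None };

/// Try the extension first, then the bare filename, then the shebang line.
fn detect(path: &Path, first_line: &str) -> Option<Syntax> {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("rs") => return Some(RUST),
        Some("py") => return Some(PYTHON),
        Some("sh") => return Some(SHELL),
        _ => {}
    }
    if path.file_name().and_then(|name| name.to_str()) == Some("Makefile") {
        return Some(MAKE);
    }
    if first_line.starts_with("#!") {
        if first_line.contains("python") {
            return Some(PYTHON);
        }
        if first_line.contains("sh") {
            return Some(SHELL);
        }
    }
    None
}

#[derive(Debug, Default, PartialEq)]
struct Counts {
    code: usize,
    comment: usize,
    blank: usize,
}

/// Classify each line, carrying `in_block` across lines so that the
/// middle of a multi-line block comment is still counted as comment.
fn count(source: &str, syntax: Syntax) -> Counts {
    let mut counts = Counts::default();
    let mut in_block = false;
    for line in source.lines() {
        let trimmed = line.trim();
        if in_block {
            // Simplification: blank lines inside a block count as comment.
            counts.comment += 1;
            if let Some((_, close)) = syntax.block {
                if trimmed.contains(close) {
                    in_block = false;
                }
            }
        } else if trimmed.is_empty() {
            counts.blank += 1;
        } else if trimmed.starts_with(syntax.line_comment) {
            counts.comment += 1;
        } else {
            match syntax.block {
                Some((open, close)) if trimmed.starts_with(open) => {
                    counts.comment += 1;
                    // Stay in the block unless it also closes on this line.
                    in_block = !trimmed[open.len()..].contains(close);
                }
                _ => counts.code += 1,
            }
        }
    }
    counts
}
```

The only state that survives from one line to the next is a single bool, which is why this kind of counter parallelizes so easily: every file can be processed independently.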
The AI assist
I'm not going to pretend I wrote every line myself. This was very much a "vibe coding" session with AI assistance.
I used Claude Code and Claude Sonnet 4 heavily for the implementation. When I got stuck on Rust's borrow checker (which was... often), Claude would explain what I was doing wrong and suggest fixes that actually made sense. It's like pair programming with someone who has infinite patience for your "why won't this compile" questions.
I also used OpenAI's o3 for some of the trickier algorithmic bits, especially around the block comment parsing logic. Different models have different strengths, and I found that mixing them worked well for a project like this.
The result is clean, idiomatic Rust rather than the C-with-classes mess I would've written on my own.
Getting it into Homebrew and Cargo
Publishing to Cargo was surprisingly easy:
cargo login
cargo publish
Two commands, and your crate is live on crates.io, where anyone can install it with cargo install ocloc.
Homebrew was more involved. I had to:
- Create a separate tap repository (homebrew-ocloc)
- Write a formula that handles both Apple Silicon and Intel Macs
- Set up GitHub Actions to automatically build binaries for macOS, Linux, and Windows
- Auto-update the Homebrew formula when I push a new tag
The CI pipeline builds binaries for four targets:
- aarch64-apple-darwin (Apple Silicon)
- x86_64-apple-darwin (Intel Mac)
- x86_64-unknown-linux-gnu (Linux)
- x86_64-pc-windows-msvc (Windows)
Now when I push a tag like v0.5.0, GitHub automatically builds everything, creates a release with checksums, and updates the Homebrew formula. It's pretty satisfying to watch.
What can it do?
The basics:
# Count lines in current directory
ocloc .
# Only Rust and Python files
ocloc . --ext rs,py
# Output as JSON for piping to other tools
ocloc . --json > stats.json
# Show progress bar for large repos
ocloc /giant/monorepo --progress
But the killer feature is diff mode, which is perfect for CI pipelines:
# See what changed between commits
ocloc diff --base HEAD~1 --head HEAD
# Output markdown for GitHub PR summaries
ocloc diff --merge-base origin/main --markdown
# Fail CI if too many lines added
ocloc diff --base origin/main --max-code-added 2500 --fail-on-threshold
You can literally add this to your GitHub Actions and have it comment on PRs with a breakdown of lines added/removed per language. Or gate merges if someone tries to dump 5000 lines of code in a single PR.
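The gating idea itself is simple enough to sketch. This is just the shape of the check, not how ocloc implements it: LangDelta, total_code_added, net_change, and gate_exit_code are hypothetical names I'm using for illustration.

```rust
/// Hypothetical per-language diff summary (not ocloc's real data model).
struct LangDelta {
    language: String,
    code_added: usize,
    code_removed: usize,
}

/// Total code lines added across every language in the diff.
fn total_code_added(deltas: &[LangDelta]) -> usize {
    deltas.iter().map(|d| d.code_added).sum()
}

/// Net change in code lines; negative when the diff shrinks the codebase.
fn net_change(deltas: &[LangDelta]) -> i64 {
    deltas
        .iter()
        .map(|d| d.code_added as i64 - d.code_removed as i64)
        .sum()
}

/// Exit code a CI gate might return: 0 when within budget, 1 when over.
fn gate_exit_code(deltas: &[LangDelta], max_code_added: usize) -> i32 {
    if total_code_added(deltas) > max_code_added {
        1
    } else {
        0
    }
}
```

CI systems treat any non-zero exit code as a failed step, so returning 1 is all it takes to block the merge.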
The benchmark numbers
I ran this on real repos, not synthetic tests:
| Repository | Files | Lines | cloc | ocloc | Speedup |
|---|---|---|---|---|---|
| elasticgpt-agents (small) | 302 | 53K | 0.45s | 0.07s | 6.4x |
| elasticsearch (large) | 31K | 5.5M | 56s | 2.35s | 23.8x |
The speedup gets more dramatic as repos get larger. That's the parallelism paying off. More files means more work to distribute across cores.
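At 2.3 million lines per second, you could analyze the Linux kernel (~30M lines) in about 13 seconds. Small repos don't reach that peak rate because fixed startup cost dominates, but they still finish in well under a tenth of a second: the 53K-line repo above took 70 milliseconds.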
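If you want to check the arithmetic behind those numbers, it's three divisions. The function names below are mine; the inputs in the note after the code come straight from the table.

```rust
/// Throughput in lines per second.
fn lines_per_second(lines: f64, seconds: f64) -> f64 {
    lines / seconds
}

/// How many times faster the new tool is than the baseline.
fn speedup(baseline_seconds: f64, new_seconds: f64) -> f64 {
    baseline_seconds / new_seconds
}

/// Time to process `lines` at a fixed rate, assuming linear scaling.
fn projected_seconds(lines: f64, rate: f64) -> f64 {
    lines / rate
}
```

Plugging in the Elasticsearch row: 5.5M lines / 2.35 s is about 2.34M lines per second, and 56 s / 2.35 s is about 23.8x. The small repo works out to 53K / 0.07 s, roughly 757K lines per second, which is the startup overhead showing. The Linux kernel figure is a linear projection (30M / 2.3M is about 13 s), not a measurement.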
What I learned about Rust
The borrow checker fought me hard for the first week or so. Then something clicked: once you start thinking in terms of ownership, most of the errors stop being mysterious and start pointing at real problems in your code.
The ecosystem is genuinely strong. There was a well-maintained crate for everything I reached for: Rayon for parallelism, memmap2 for file I/O, memchr for byte searching, and git2 for Git operations, all at a high quality bar.
The compile times are the one rough edge, though the runtime performance more than makes up for it.
The tooling is where Rust really won me over. cargo clippy caught plenty of bugs before they shipped, cargo fmt ended any debate about style, and cargo test was painless to set up. It's the best developer experience I've had with any compiled language.
Try it
Install via Cargo:
cargo install ocloc
Or Homebrew on macOS:
brew tap adhishthite/ocloc
brew install ocloc
Then run it on your codebase.
The code is all on GitHub. Issues and PRs welcome, especially if you want to add support for more languages.
This was a fun project. There's something satisfying about writing fast code in a language designed for fast code, and Rust's reputation is well-earned. I'm definitely building more stuff with it.
If you've been putting off learning Rust because it seems intimidating, a small, performance-sensitive project like this is a good way in. You spend a lot of time arguing with the compiler, but by the time your code compiles, you've usually fixed the bugs that would have bitten you later anyway.
Check out ocloc on GitHub
