Thoughts from Building a Disk Scanner

Published Feb 16, 2023

Building a Disk Scanner, While Learning What I Was Actually Building

To start with, this was not a big plan.

My Mac ran out of storage, and I wanted a clear answer to a very basic question: where is the space actually going?

System settings gave me categories, but not enough detail to make decisions. I could see that storage was full, but not what was worth cleaning.

So I started building a scanner.

Why I built one instead of buying one

There were paid tools. There were free tools. I still wanted to build my own.

Part of it was curiosity. Part of it was learning Rust. I also wanted a desktop UI, so Tauri felt like a practical shell around Rust scanning logic.

I thought the problem would be straightforward.

The first model felt complete, until it didn’t

The first mental model was simple:

Read directory entries, sum file sizes, recurse into subdirectories.

That model is enough to get a scanner running quickly.

One thing I handled early was symlinks. Following them blindly can create loops or drift outside the intended root. So I skipped symlinks for size accounting.
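A minimal sketch of that first model, assuming only the Rust standard library. The function name `naive_dir_size` is made up for illustration and is not taken from the actual app. It reads entries, recurses into directories, adds file lengths, and skips symlinks entirely.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sum the logical sizes of all regular files under `root`.
/// Hypothetical helper for illustration; it counts by path only.
pub fn naive_dir_size(root: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        // symlink_metadata describes the entry itself and never follows a link.
        let meta = fs::symlink_metadata(&path)?;
        let file_type = meta.file_type();
        if file_type.is_symlink() {
            // Skipped: following links can loop or leave the intended root.
            continue;
        } else if file_type.is_dir() {
            total += naive_dir_size(&path)?;
        } else if file_type.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}
```

Calling `naive_dir_size(Path::new("."))` returns the summed byte count for the current directory. This version stops at the first I/O error, which a real scanner would probably want to log and skip instead.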

The first version worked. It scanned folders, reported totals, and looked usable.

Then I compared results with macOS and saw totals that were too high to ignore.

The bug was not in recursion, it was in identity

At first I thought I had a math bug or a traversal bug.

The real issue was identity.

Hard links can create multiple directory entries for the same underlying file object. If you sum by path alone, the same physical data can be counted multiple times.

That was the moment the scanner changed from “walk paths and add sizes” to “track filesystem objects and count once.”

Counting objects instead of paths

The fix was simple in concept: keep a set of seen identities, where each identity is the pair of device ID and inode number.

If a file identity appears for the first time, add its size.

If it appears again through another path, skip recounting.

I applied the same idea to directories for safer traversal behavior.
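Here is a rough sketch of the counting-once idea, assuming a Unix-like system such as macOS, where `std::os::unix::fs::MetadataExt` exposes `dev()` and `ino()`. The name `unique_dir_size` and the choice to pass the set in as a parameter are illustrative assumptions, not the app's real code.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt; // Unix-only: provides dev() and ino()
use std::path::Path;

/// Sum file sizes under `root`, counting each (device, inode) pair once.
/// Hypothetical helper for illustration.
pub fn unique_dir_size(root: &Path, seen: &mut HashSet<(u64, u64)>) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        let meta = fs::symlink_metadata(&path)?;
        if meta.file_type().is_symlink() {
            continue;
        }
        // insert returns false when this identity was already recorded,
        // which means another path already led to the same object.
        if !seen.insert((meta.dev(), meta.ino())) {
            continue;
        }
        if meta.is_dir() {
            total += unique_dir_size(&path, seen)?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}
```

A caller would create one `HashSet` per scan and pass it down, so that a file reached through two hard-linked paths contributes its size only the first time. The same check applies to directories before recursing, which matches the safer traversal behavior described above.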

No dramatic rewrite. Just a better model.

Totals moved much closer to what the system reports, and they were stable enough to trust.

Turning scanner logic into a real app

Once the scanner behavior felt reliable, I moved fully into an app flow with Tauri.

Rust handled scanning and emitted progress events. The UI consumed those updates and rendered a live state, then final results.
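The shape of that flow can be sketched without Tauri at all. The example below is an assumption-heavy stand-in: it uses a standard library channel instead of Tauri's event API, a made-up `ScanEvent` type, and a list of file sizes in place of a real filesystem walk. In the actual app, the receiving side would forward each event to the UI.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical event type: periodic progress, then one final result.
#[derive(Debug, Clone, PartialEq)]
pub enum ScanEvent {
    Progress { files_seen: u64, bytes_seen: u64 },
    Finished { total_bytes: u64 },
}

/// Run a stand-in "scan" on a background thread and report through a channel.
/// `file_sizes` replaces a real directory walk; `report_every` throttles updates.
pub fn spawn_scan(file_sizes: Vec<u64>, report_every: u64) -> Receiver<ScanEvent> {
    let (tx, rx) = mpsc::channel();
    let step = report_every.max(1); // avoid a modulo by zero
    thread::spawn(move || {
        let mut files_seen = 0u64;
        let mut bytes_seen = 0u64;
        for size in file_sizes {
            files_seen += 1;
            bytes_seen += size;
            if files_seen % step == 0 {
                let event = ScanEvent::Progress { files_seen, bytes_seen };
                // A send error means the receiver is gone, so stop scanning.
                if tx.send(event).is_err() {
                    return;
                }
            }
        }
        let _ = tx.send(ScanEvent::Finished { total_bytes: bytes_seen });
    });
    rx
}
```

The consumer can loop over the returned receiver, update live state on each `Progress`, and switch to the results view on `Finished`. Throttling with `report_every` is one way to keep the UI from being flooded when a scan touches many small files.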

That gave me room to add the parts that make a tool usable, not just correct: visual size hotspots, drill-down by folder, and scan history.

What building this changed for me

Before this project, “disk scanner” sounded like a solved and almost boring category.

After building one, it felt like a reminder of something bigger: many tools look simple from outside because someone already absorbed the hard parts inside.

The hard part here was not writing recursion. It was choosing the right abstraction for what should be counted.

That shift took me longer than expected, but it is the part I remember most.

Sitting with it

StorageLens started as a quick fix for one machine.

It became a project where I learned Rust more deeply, understood filesystem behavior better, and got clearer about the difference between something that works and something you can trust.

I still like projects that start from a small personal problem and slowly force better thinking.