Once a Maintainer: Benji Nguyen
After Benji Nguyen challenged himself to build something great on a six-hour flight, erdtree was born, forgotten, and reinvented.
Welcome to Once a Maintainer, where each week we interview an open source maintainer and tell their story.
This week we’re talking to Benji Nguyen, creator of erdtree, a multi-threaded filesystem and disk-usage analysis tool written in Rust. Benji is a former medical student and current Director of Engineering at Wefunder, based in San Francisco.
How did you get into programming?
It was autumn of 2018, and I was starting my first week in medical school. I had a lot of reservations about going down that path because of a lot of things, really, but the straw that broke the camel's back was when I had to turn away a baby.
I worked at a pediatrics office. This family had come to America, they signed up for Covered California, they did all that stuff they were supposed to, only to find out that they weren't in network with our doctor. They really wanted to see this doctor because he was Taiwanese and this family was Taiwanese. Vaccines for their baby would have cost them an obscene amount of money out of pocket. And I had to turn them away, and it was really heartbreaking. I think insurance really ruined the magic of medicine for me. That was it.
So I dropped out and I was kind of lost about what I wanted to do, and through very fortunate happenstance, I met this girl on Tinder. She invited me out to a bar, and I met up with her and a friend of hers, and that friend was a software engineer at Zillow. And I was like, oh, I don't really know what you guys do. He told me about it, and he invited me to a coffee shop the following week. He sent me home with a book called Learn Python the Hard Way.
And then I just self-studied. I discovered that it was one of the most fun things in the world and I got really into it. And then six months later, in May of 2019, I landed my first internship at the company that I'm still currently with. So yeah, that's how I got into programming. It was just a very lucky happenstance and it happened to be a craft that I just really clicked with. And it is one of my biggest regrets that I didn't discover this hobby way sooner. Now I get to do my hobby as a job.
That's amazing. It's so interesting to hear different people's stories because I’d say the standard thing is “My dad was into computers,” you know, and it's so great to hear a different path. And med school is a very different path.
Yeah, I kind of wish I had that role model or that figure in my life who was savvy with computers. I'm not that regretful, I got to explore a lot of different other facets of life that I probably wouldn't have gotten into had I discovered programming so soon, because it is very addictive. It's honestly kind of like a video game. You get a dopamine hit when you create something that works. And there's a very good chance that I might not have picked up other hobbies, hobbies that I'm very passionate about.
So on the one hand, it's regretful because I could have had a lot more years of fun with programming and I could have maybe been at a farther place in my knowledge. But you know, on a more human level, I'm glad I have other hobbies as well.
What led to writing erdtree?
The primary impetus was boredom. I was on an airplane flight. It was six hours. And I was like, I want to build something that is sufficiently challenging in this six-hour time frame. And I wanted to be able to build it without the internet because, you know, I didn't have internet on the airplane. And erdtree was simple enough. So I got to building it on the airplane. I mean, it wasn't finished within six hours. What started with six hours ended up becoming what it is today. But that was the primary impetus, boredom.
The second impetus was the Rust community. What I mean is Rust is kind of like a modern alternative to C and C++ and our operating systems are built on C and C++. All of the programs that ship with every Unix-like operating system that we know and love and take for granted every single day, they're very old programs. And the Rust community wanted to make saner and faster alternatives to all these programs, and I wanted to do something similar. But by this time, most of the old programs had already been rewritten, except for one, which was called tree. Tree is unique because it doesn't come from Unix, it actually comes from Microsoft. I got a lot of mileage from this very old school ’80s tree program and I know a lot of command line nerds enjoy it as well.
So I figured, oh, I can rewrite that one! So I did and I also figured, rather than just rewriting tree, I want to bring my own flavor to it. I want it to do a bit more. I want it to be a very general tool to not only work with your file system, but also work with your disk and all that stuff. So I built it, shelved it for a year, and one day it got like 100 stars on GitHub. One of my friends ended up sharing it on Slack without knowing that I wrote it. And I was kind of embarrassed because I was like, that was a year ago when I wasn't very good at Rust. I had just picked it up. And I decided now that I'm older and wiser, let's give it a proper effort. And then it kind of picked up a lot of attention. And now I have two jobs, one I get paid for and one where I don't get paid.
So you wrote it, shelved it for a year, and then this friend shared it on Slack. How did they come across it? Do you know?
Kind of, yeah. We were dealing with a very specific problem at work where the cloud service provider that we use, Heroku, they impose a limit on the size of the program that you ultimately ship to their machines after it has built. We were getting very close to this limit. So we needed to do some disk usage analysis and someone was looking for, I guess more modern, saner alternatives to the very old school du command, which tells you the sizes of all your files in a directory. But it's old and it's not very pretty to look at. And you know, the thing about these old school programs too is that the documentation is very dense and you have to scroll through these man pages and they use very old conventions and they're almost speaking a different language. So anyway, this guy happened to find one of my programs. He Slacked it and then I was like, oh crap, that's me.
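For readers who have not used du, the idea behind this kind of disk-usage analysis is to walk a directory tree and add up file sizes per directory. The Python sketch below only illustrates that idea. It is not how du or erdtree is implemented, and the function name disk_usage is hypothetical. It also sums apparent file sizes (st_size), whereas du by default reports allocated disk blocks, so the numbers can differ.

```python
import os


def disk_usage(root):
    """Return {directory path: total bytes of regular files at or below it}.

    Hypothetical helper for illustration; uses apparent sizes (st_size).
    """
    totals = {}
    # Walk bottom-up so each child's total exists before its parent needs it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        size = 0
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link's target is not counted.
            if not os.path.islink(path):
                size += os.lstat(path).st_size
        for name in dirnames:
            # Symlinked directories are never walked, so they contribute 0.
            size += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = size
    return totals
```

Calling disk_usage(".") and sorting the result by size gives a quick view of which directories take up the most space.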
The time passing during this is really interesting too, right? It almost feels like being a founder of a startup. You put something out there and it’s human nature to think okay, I hope this resonates with people immediately, but it often takes a little bit of time.
Yeah, that's one aspect of it for sure. The other aspect that I think is a bit more difficult to contend with is, I work with founders. My company by nature works with founders, and I work very closely with Wefunder’s founders. And you know, people come in and everyone has a lot of different opinions on the direction the product should go based on their own experience. And I think that was the thing that was a little difficult for me at first.
I had a very focused idea for what erdtree was supposed to be. I had all these people come and tell me, “It'd be better if you did this.” Or, “You should probably do things this way instead.” And some people have good points and some people make suggestions that I feel like, you know, it's nice, but it just doesn't make sense given the nature of the tool. And at first I was just trying to please everybody. And what ended up happening was I ended up with a Frankenstein of a project, which is different from how I originally envisioned it.
But then I gained some confidence, found my grounding, and right now I'm working on version 2, which is going to come with a lot of breaking changes. And it's going to be kind of what I originally envisioned. I have stronger conviction now and I'm going to just say, sorry, if you preferred it this way, I'm going to keep the old version up. You're free to use it. But I won't be actively developing that anymore. So yeah, there's a lot of noise coming from all over the place. But ultimately, as the founder, you’ve got to have some conviction about your product.
So you wrote erdtree in Rust and I know that Wefunder is a Ruby shop and we're also a Ruby shop so I want to ask this question kind of openly, but what are some other open source programs or Ruby gems that you think are interesting?
Yeah, so in the Ruby world I think my favorite packages, which I haven't used extensively, are the concurrent-ruby gem as well as this gem called falcon, which is a web server for Ruby that uses more modern socket APIs. People usually reach for puma or unicorn, which are these two very monolithic web servers and they're very traditional in how they deal with massive amounts of socket i/o. Ruby, unlike Node, doesn't ship with this thing that allows you to handle i/o very seamlessly. Node ships with a runtime, this thing that just figures out how to take all of these people who are trying to request resources from you or are trying to talk to your server. It will figure out how to concurrently handle all of those at the same time.
With Ruby, Ruby is often made fun of because people say it doesn't scale, it can't do concurrency. But I think this is largely because Ruby doesn't ship with a runtime like Node does. It's left to the ecosystem, and not a lot of people are aware of that. But you have these library dudes who came together who made this really fantastic runtime that allows Ruby to do very modern concurrency rather than just spinning up threads. And yeah, I kind of wish that concurrent Ruby was more, I guess, in the zeitgeist in the Ruby community.
And then falcon is very cool because it is a web server that uses modern APIs to handle i/o. It doesn't use this thread pool model. It does have a thread pool, but the way it handles all these sockets relies on a very evented model which I enjoy and I think scales very well. So those are my favorite Ruby packages.
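As a rough illustration of the evented model described here, the sketch below uses Python's asyncio (not Ruby, and not falcon's actual implementation) to handle many simulated requests on a single thread. The names handle_request, serve, and run are hypothetical, and asyncio.sleep stands in for waiting on a socket.

```python
import asyncio
import time


async def handle_request(request_id, io_delay=0.05):
    # Stand-in for waiting on a socket; awaiting hands control back to the event loop.
    await asyncio.sleep(io_delay)
    return f"response {request_id}"


async def serve(n_requests):
    # All handlers share one thread; the loop resumes each one as its "i/o" completes.
    return await asyncio.gather(*(handle_request(i) for i in range(n_requests)))


def run(n_requests=100):
    # Returns the responses plus the wall-clock time the whole batch took.
    start = time.perf_counter()
    responses = asyncio.run(serve(n_requests))
    return responses, time.perf_counter() - start
```

Because the waits overlap, the batch should finish in roughly the time of one simulated wait rather than the sum of all of them, without one thread per request.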
How can we get more people into open source?
It’s difficult. I think largely the domain of open source is programmers making tools for other programmers instead of making tools for customers. Mastodon is open sourced, built on Rails. But most companies are closed source.
So you have to be the type of person who has empathy for other programmers and wants to make yourself more productive. I was able to get one friend into it by just showing him the command line and showing him how you should learn how to use all these tools that come with your machine to be more productive. So he became like a productivity junkie. And that sent him into the rabbit hole.
I think educating people on how to engage with the open source community is huge. When I first joined I had no idea what the etiquette was. I had no idea if it was nice or respectful to open up an issue. I had no idea how to do releases. And engaging with the open source community is very daunting and learning the etiquette was very tricky for me in the beginning. Even learning how to talk to people and knowing what to say, it reminds me of when I first went to Vegas and I was about to sit down at a blackjack table. But I had no idea what the etiquette was. It's like, do I just sit down? Do I ask them to sit down? How do I buy in?
So I think educating people on the social aspect, about how to engage the open source community, would be huge because you know, every programmer knows how to use GitHub and git and all this stuff. It's more how do I actually engage with these people? How do I ask to contribute? And if you want to start your own project, you need to learn about semantic versioning, and these aren't things that they necessarily teach you in school. So I think education would definitely be a big one and then just let people organize themselves afterwards.
I was talking to someone the other day who did a boot camp and then they started their first job and they were talking about how a big difference was learning how to write code for other people to read. Not just necessarily writing code that works, right?
I'm glad you mentioned that readability part. That part is interesting because definitely I think most programmers nowadays learn computer science with the intention of eventually getting a job. So you have to write code generally that's readable. But all of these older programs before open source was even a thing, they didn't write code so that it was readable. They wrote it so that it was hyper efficient and they wrote it so that it just worked. And there are still programmers out there like that, the hacker types who just want to get something ugly to work so that they can just get their thing done. And there's still a lot of programmers that do that today, but definitely there's an etiquette in the open source world to write the sort of code that's heavily documented.
Not everyone does this. And those projects would be a nightmare to contribute to. I've contributed to a few of those. But yeah, some people don't really think about readability. I don't blame them because some of these projects were just people writing a personal project and then people came and just started getting interested in it.
Who’s someone you think is doing really interesting work?
There's this one guy on GitHub that I follow extensively, his profile is called BurntSushi. He's written a lot of programs that a lot of people use. He writes a lot of Rust programs. He wrote this very popular alternative to grep called ripgrep, and all the underlying libraries that people have used and taken for granted, like the regular expressions library, also came from him. The amount of output he’s had given the young-ness of the language is insane.
Oh, there's this one guy (Jens Axboe), he works at Meta or Facebook. He's working on this new library that's supposed to be the one to rule them all in terms of doing evented i/o across all different types of i/o, whether it's socket i/o or file system i/o. He's working on this program called io_uring. Right now it’s still very obscure. But I'm confident that thing will be the next big thing that, you know, all of the Node people and Ruby people, Rust, C, whatever, the nginx people too, everyone's going to look at this library and be like, okay, this is the new way to do evented i/o. But it's still under active development and it's only available on Linux systems I believe, which is kind of holding it back. But yeah, I'd say that's the project I'm most looking forward to in the coming future.
Once a Maintainer is written by the team at Infield, an app that helps engineering teams keep their open source dependencies up-to-date.
To suggest a maintainer doing awesome work in the community, find us on Twitter at @infieldai.