Almost by default, open-source developers get very little insight into who uses their projects. In part, that’s the beauty of open source, but for developers who want to monetize their projects, it’s also a bit of a curse because they get very little data back from these projects. While you usually know who bought your proprietary software — and those tools often send back some telemetry, too — that’s not something that holds true for open-source code. Scarf is trying to change that.
In Scarf’s earliest incarnation, founder Avi Press tried to go the telemetry route for getting this kind of data. He had written a few successful developer tools, and as they got more popular, he realized that he was spending more and more of his time supporting his users.
“This project was now really sapping my time and energy, but also clearly providing value to big companies,” he said. “And that’s really what got me thinking that there’s probably an opportunity to maybe provide support or build features just for these companies, or do something to try to make some money from that, or really just better support those commercial users.” But he also quickly realized that he had virtually no data about how the project was being used beyond what people told him directly and download stats from GitHub and other places. So as he tried to monetize the project, he had very little data to inform his decisions and he had no way of knowing which companies to target directly that were already quietly using his code.
“If you were working at any old company, pushing code out to an app or a website, if you pushed out code without any observability, that would be reckless. You would get fired over something like that. Or maybe not, but it’s a really poor decision to make. And this is the norm for every domain of software, except open source.”
That led to the first version of Scarf: a package manager that would provide usage analytics and make it easy to sell different versions of a project. But that wasn’t quite something the community was ready to accept — and a lot of people questioned the open-source nature of the project.
“What really came out of those conversations, even chatting with people who were really, really against this kind of approach — everyone agrees that the package registries already have all of this data. So NPM and Docker and all these companies that have this data — there are many, many requests of developers for this data,” Press said, and noted that there is obviously a lot of value in this data.
So the new Scarf takes a more sophisticated approach. While it still offers an NPM library that phones home, as well as pixel tracking for documentation, its focus is now on registries. What the company is launching this week is essentially a middle layer between the code and the registry: developers can, for example, point users of their containers to the Scarf registry first, and Scarf then sits in front of Docker Hub or the GitHub Container Registry.
“You tell us, where are your containers located? And then your users pull the image through Scarf and Scarf just redirects the traffic to wherever it needs to go. But then all the traffic that flows through Scarf, we can expose that to the maintainers. What company did that pull come from? Was it on a laptop or on CI? What cloud provider was it on? What container runtime was it using? What version of the software did they pull down? And all of these things that are actually pretty trivial to answer from this traffic — and the registries could have been doing this whole time but unfortunately have not done so.”
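The redirect flow Press describes can be illustrated with a small sketch. This is not Scarf's implementation. It only shows the general idea of a gateway that records metadata about each pull and then redirects the client to the upstream registry. The `Gateway` and `PullEvent` names, the `registry.example.com` host, and the assumption that image names have exactly two components (`namespace/image`) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PullEvent:
    """One recorded pull; a real service might derive company or cloud from the IP."""
    package: str
    path: str
    client_ip: str
    user_agent: str


@dataclass
class Gateway:
    # Hypothetical mapping: package name -> upstream registry host chosen by the maintainer.
    upstreams: Dict[str, str]
    events: List[PullEvent] = field(default_factory=list)

    def handle(self, path: str, client_ip: str, user_agent: str) -> Tuple[int, Dict[str, str]]:
        """Return (status, headers) for a registry-style request path."""
        # Assumed path shape: /v2/<namespace>/<image>/... (two-component names only).
        parts = path.strip("/").split("/")
        if len(parts) < 3 or parts[0] != "v2":
            return 404, {}
        package = "/".join(parts[1:3])
        upstream = self.upstreams.get(package)
        if upstream is None:
            return 404, {}
        # Record what the request itself reveals before handing the client off.
        self.events.append(PullEvent(package, path, client_ip, user_agent))
        location = "https://" + upstream + "/" + "/".join(parts)
        return 307, {"Location": location}


gateway = Gateway(upstreams={"acme/tool": "registry.example.com"})
status, headers = gateway.handle(
    "/v2/acme/tool/manifests/1.2.0", "203.0.113.7", "docker/24.0"
)
```

In this sketch the gateway never serves the image itself. It only sees the request, logs the version, client address and user agent, and points the client at the registry the maintainer registered.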
To fund its efforts, Scarf recently raised a $2 million seed funding round led by Wave Capital, with participation from 468 Capital and a number of angel investors.