Software And Data Are Disrupting Venture Capital Firms

Editor’s note: Aaron Holiday is co-founder of 645 Ventures and Managing Entrepreneurial Officer at Cornell Tech.

Over the last decade, the estimated number of new tech startups formed in the U.S. each year ranged from 16,000 to 20,000, and the total amount of venture funding per year for software startups increased from ~$5 billion to $19 billion.

Most recently, over the past five years, terabytes of structured data about these startups have been proliferating on the web (see CrunchBase profile growth and App Annie traffic volume below).

App Annie Data Volume

Crunchbase Data Volume

In addition to reflecting a more active venture industry, these numbers are indicative of a rapidly expanding demographic of founders and emerging geographic hubs of innovation. Despite these changes in the startup formation landscape, internal processes used by venture capitalists to source and create value for founders have not kept pace with modern-day software innovations and the proliferation of data on seed-stage startups.

Demographic Shifts of Early-Stage Startup Founders

There is significant student demand for entrepreneurship in both undergraduate and graduate programs, which is partly a result of an imbalance between an increasing number of college students – many of whom aspire to be mid- to executive-level professionals – and the inherently small number of leadership roles at large companies.

Effectively, the career path for most people in large companies flattens out at the junior and mid-levels. Consequently, there are many recent graduates and experienced professionals who are willing to forfeit their dreams of being a big corporate executive in return for startup equity and the experience of forming or joining early-stage companies.

The difference between this generation and previous tech generations, however, is that this trend is spreading from traditional startup hubs – communities that are surrounded by top research institutions and publicly traded tech companies (e.g. Silicon Valley and Boston) – to new business capitals and urban centers, specifically cities that have access to investors, engineering talent, domain experts, and antiquated industries that are in search of technology innovation.

Emerging tech centers such as New York are embracing the rise of startup activity, with New York specifically beginning to track all of its startup formation activity online at destination sites like Digital.NYC.

The rapid growth of VC deals in NY Metro, Midwest, and LA compared to stable growth in New England.

Data and Volume of Startups Formed Are Overwhelming Traditional VC Operations

At the same time that startup activity is expanding across geography and demographics, the volume of data online that tracks seed-stage startups is growing exponentially. The massive quantity of constantly updated data online about startups’ founders, product traction and competitors has only existed for a few years. While much of this data is still fragmented, a large amount of it is easily accessible through APIs, and can be used in real time to detect signals of high-growth startup activity.
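As a rough illustration of what detecting such a signal could look like, the sketch below computes a compound weekly growth rate from a series of traction snapshots and flags companies above a cutoff. Everything in it is an assumption made for illustration: the Snapshot schema, the sample companies and numbers, and the 7 percent weekly threshold are hypothetical and are not taken from any particular data provider or firm.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One observation of a public traction metric (hypothetical schema)."""
    week: int      # week index of the observation
    value: float   # e.g. downloads or active users (assumed metric)


def weekly_growth_rate(snapshots: list[Snapshot]) -> float:
    """Compound weekly growth rate between the earliest and latest snapshot."""
    if len(snapshots) < 2:
        return 0.0
    ordered = sorted(snapshots, key=lambda s: s.week)
    first, last = ordered[0], ordered[-1]
    weeks = last.week - first.week
    if weeks <= 0 or first.value <= 0:
        return 0.0
    return (last.value / first.value) ** (1 / weeks) - 1


def flag_high_growth(companies: dict[str, list[Snapshot]],
                     threshold: float = 0.07) -> list[str]:
    """Return company names at or above the threshold, fastest growers first.

    The 0.07 default is an illustrative cutoff, not an industry standard.
    """
    rates = {name: weekly_growth_rate(snaps) for name, snaps in companies.items()}
    flagged = [name for name, rate in rates.items() if rate >= threshold]
    return sorted(flagged, key=lambda name: rates[name], reverse=True)


# Hypothetical sample data: two made-up companies observed four weeks apart.
SAMPLE = {
    "alpha": [Snapshot(week=0, value=1000), Snapshot(week=4, value=2000)],
    "beta": [Snapshot(week=0, value=1000), Snapshot(week=4, value=1100)],
}
```

Calling flag_high_growth(SAMPLE) ranks only the companies whose metric compounds faster than the cutoff. In practice the snapshots would come from whichever data sources a firm trusts, and a single metric like this would be one input among many rather than a decision rule.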

This has given rise to tools such as DataFox, MatterMark and CB Insights, which are all aiding startup investors in quickly assessing public information on private companies. But these tools are not being used as core, end-to-end solutions that drive ongoing venture investment decisions and value creation for venture firms’ portfolio companies.

Although there has been substantial change in the tech community over the past decade, early-stage venture capital operations and processes are for the most part the same as they were twenty years ago. Conventional venture capital deal sourcing stems from personal relationships that provide access to exclusive and proprietary deals.

This information flow plus thoughtful investment theses, due diligence and sharp character judgments are the primary basis of top investors’ investment decisions – methods that have historically generated alpha for limited partners. Portfolio value creation has been derived from general partners’ personal networks (including existing portfolio companies), community managers, business development functions, and some VCs’ operating experiences/know-how.

These are tried-and-true methods that have always been employed by top VCs, and will continue to be used for years to come. But these approaches overlook new-age founders because there is a growing number of founders who do not inhabit traditional venture capital networks, and the rate at which companies are formed today overwhelms traditional manual methods of deal sourcing and vetting.

Algorithmically Searching the “Gulf of Startup Experimentation” for Winners

Because coding has become less complex, software has become cheaper to build, and initial funding has become easier to obtain, new founders are taking advantage of the ease of starting a company. These changes are partly responsible for the significant growth of startup formation in the tech sector, as well as the changing demographics of founders and new methodologies such as Lean Startup, which encourage rapid experimentation.

This growth has created an entirely new asset class that is adjacent to the traditional seed to series A funnel (see diagram below). We refer to the extension of the seed market as a massive “Gulf of Startup Experimentation.”


Within the gulf, there is a small group of very talented technologists, product managers, designers, and domain experts who are capable of transforming their experiments into high-growth startups that are difficult to replicate. Several of these startup experiments in the gulf are being financed by angels, accelerators and seed investors, and the founders of these startups are making themselves and their companies known online through platforms like Product Hunt, AngelList and CrunchBase.

In order to efficiently discover the companies within this gulf that are capable of raising competitive Series A rounds, top venture capitalists must become more sophisticated at filtering and partnering with the best founders in the gulf – a process that is imperfect if it relies only on human intelligence and personal networks as sources of information. Traditional methods of deal sourcing and vetting simply cannot scale to sufficiently evaluate the rapid experimentation that is occurring; they need to be supplemented by a technology-based approach.

The future deal flow for top-tier seed to early-stage investors will be complemented by artificially intelligent algorithms that help sharpen investors’ view into the gulf of startup experimentation, specifically through intelligent sourcing and tracking. Investors will also use software to identify opportunities to influence outcomes (i.e. value creation) for startup management teams. These algorithms will analyze general partners’ relationships, understand a firm’s investment strategies, and proactively discover founders that the VC firm is uniquely able to support.

We are in the very early days of the adoption of software and algorithms as a core part of venture capital firms’ operational DNA. Several VCs have started to experiment with complementary software and statistical models to aid investment decisions; however, very few firms have retrofitted their entire day-to-day operations (i.e. from sourcing to portfolio company management) to be supported by a fully integrated software system and intelligent algorithms that contribute to the VC’s ability to generate alpha for limited partners.

The change in the industry requires innovative and emerging VC firms to discover the role that software plays in venture capital and to share knowledge with the ecosystem.