Leverage public data to improve content marketing outcomes

Publishers prefer pitches that demonstrate accuracy and authoritativeness

Recently I’ve seen people mention how difficult it is to create content that garners massive attention and links. They suggest it may be better to focus on more modest content that earns just a few links, but does so more consistently and at higher volume.

In some cases, this can be good advice. But I’d like to argue that it is very possible to create content that can consistently generate high volumes of high-authority links. I’ve found in practice there is one truly scalable way to build high-authority links, and it’s predicated on two tactics coming together:

  1. Creating newsworthy content that’s of interest to major online publishers (newspapers, major blogs or large niche publishers).
  2. Pitching publishers in a way that breaks through the noise of their inbox so that they see your content.

How can you use new techniques to generate consistent and predictable content marketing wins?

The key is data.

Techniques for generating press with data-focused stories

It’s my strong opinion that there’s no shortcut to earning press mentions and that only truly new, newsworthy and interesting content can be successful. Hands down, the simplest way to predictably achieve this is through a data journalism approach.

One of the best ways you can create press-earning, data-focused content is by using existing data sets to tell a story.

There are tens of thousands — perhaps hundreds of thousands — of existing public datasets that anyone can leverage for telling new and impactful data-focused stories that can easily garner massive press and high levels of authoritative links.

The last five years or so have seen huge transparency initiatives from the government, NGOs and public companies making their data more available and accessible.

Additionally, Freedom of Information Act (FOIA) requests are commonplace, freeing even more data and making it publicly available for journalistic investigation and storytelling.

Because this data usually comes from the government or another authoritative source, pitching these stories to publishers is often easier: you don’t face the same hurdles in proving accuracy and authoritativeness.

Potential roadblocks

The accessibility of data, especially data provided by the government, can vary. There are few, if any, data standards in place, and federal and local government offices have varying resources for making their data easy for outside parties to consume.

The result is that each dataset often has its own issues and complexities. Some are very straightforward and available in clean and well-documented CSVs or other standard formats.

Unfortunately, others are difficult to decode, clean, validate or even download. Some are trapped inside difficult-to-parse PDFs, fragmented reports or antiquated search tools that spit out awkward tables.

A deeper knowledge of web scraping and of programmatic data cleaning and reformatting is often required to accurately acquire and use many datasets.
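As a rough illustration of the programmatic cleaning described above, here is a minimal Python sketch that uses only the standard library. The sample text, the column names and the helper names `to_number` and `clean_rows` are all hypothetical, invented for this example. Real government exports will need their own rules.

```python
import csv
import io

# Hypothetical export: padded headers, a blank line, a placeholder value
# and a trailing source note, all common in hand-built tables.
RAW = """Place Name , Median Income ,Year
Springfield,"$54,200",2017

Shelbyville,N/A,2017
Source: hypothetical agency export
"""


def to_number(text):
    """Turn strings like '$54,200' or '12.5%' into floats; None if not numeric."""
    stripped = text.replace("$", "").replace(",", "").replace("%", "").strip()
    try:
        return float(stripped)
    except ValueError:
        return None


def clean_rows(raw_text, numeric_columns=()):
    """Parse messy CSV text into dicts with normalized keys."""
    reader = csv.reader(io.StringIO(raw_text))
    rows = [[cell.strip() for cell in row] for row in reader]
    rows = [row for row in rows if any(row)]  # drop blank lines
    if not rows:
        return []
    header = [name.lower().replace(" ", "_") for name in rows[0]]
    cleaned = []
    for row in rows[1:]:
        if len(row) != len(header):
            continue  # skip footnotes and other fragments
        record = dict(zip(header, row))
        for column in numeric_columns:
            record[column] = to_number(record[column])
        cleaned.append(record)
    return cleaned
```

Calling `clean_rows(RAW, numeric_columns=("median_income",))` on the sample keeps two records, turns "$54,200" into 54200.0 and leaves the "N/A" placeholder as None rather than guessing a value.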

Tools to use

  • Google dataset search — Google provides perhaps the most comprehensive way to quickly find datasets with their recently released dataset search.
  • inurl:gov “dataset” — This search query will surface many of the large and important federal and state government datasets, but it’s by no means comprehensive. Still, I find myself using this search string frequently, adding keywords to narrow my topical focus.
  • Reddit.com/r/datasets — This is one of the largest dataset communities online, and it’s very active, with users posting datasets of all types including public datasets, rare finds and custom scraped data, as well as tools, tricks and tips for finding great data elsewhere online.
  • Data.world — This site aggregates datasets and provides tools to make them more accessible. Additionally, it has a great email listserv that surfaces new and interesting datasets as they become available.
  • Data is plural — This is a listserv run by Jeremy Singer-Vine, the data editor at BuzzFeed. It’s a curated list of new and fascinating datasets.
  • Kaggle datasets — Kaggle is the largest data-science and machine learning competition platform online. Its massive community has published thousands of datasets, many of them pre-cleaned and in easy-to-consume formats.
  • Data journalism GitHub repositories — FiveThirtyEight and the NYTimes Upshot are just two examples of major outlets that make some or all of the data they use in data journalism available on GitHub. This trend is likely to continue, with more and more news publishers creating transparency with their data journalism by making the raw data publicly available.

Tip: Often the juiciest angles in existing datasets are found after becoming very familiar with what the dataset contains and how the data can be mixed/matched to find unique correlations and answer the unanswered questions.
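To make the mix-and-match idea concrete, here is a small Python sketch that joins two datasets on a shared key and measures how closely they move together. The region codes and figures are invented for illustration, and `join_on_key` and `pearson` are hypothetical helper names. The correlation is the standard Pearson coefficient, written out by hand so the example depends on nothing outside the standard library.

```python
import math

# Hypothetical figures keyed by a shared region code (not real data).
MEDIAN_RENT = {"R1": 900, "R2": 1200, "R3": 1500, "R4": 2100}
COMMUTE_MINUTES = {"R1": 20, "R2": 25, "R3": 31, "R5": 40}


def join_on_key(left, right):
    """Inner join: keep only keys present in both mappings."""
    keys = sorted(set(left) & set(right))
    return keys, [left[k] for k in keys], [right[k] for k in keys]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


keys, rents, commutes = join_on_key(MEDIAN_RENT, COMMUTE_MINUTES)
print(keys, round(pearson(rents, commutes), 3))
```

Regions that appear in only one dataset (R4 and R5 here) drop out of the join, which is worth checking in real data because a small overlap can make a correlation look stronger than it is. A strong coefficient is a lead to investigate, not yet a story.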

Winning examples of content executed using free data

Some datasets are so large and so comprehensive that there are literally hundreds or thousands of unique and highly newsworthy stories that can be told using their data.

Here are two examples of how you can use these types of datasets to make engaging content that garners high-quality links.

U.S. Census information

Census estimates are released every year on the census.gov website. When creating content for Porch.com, we decided to rank neighborhoods with the highest incomes and home values in each state and see what trends appeared in the names of those neighborhoods.

We did this analysis by gathering data from U.S. Census information for all Census Designated Places (CDPs). A CDP is a concentration of population used for statistical purposes only and is not legally incorporated. We then used QGIS to extract the data from the census and analyzed the data by ZIP code.
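The ranking and name-trend steps can be sketched in a few lines of Python. This is not the workflow we actually used (that was done in QGIS). It is a simplified stand-in that assumes the data has already been reduced to one record per place, and it ranks on income alone. The place names, state codes, incomes and the function names `top_places_by_state` and `common_name_words` are all made up for the example.

```python
from collections import Counter, defaultdict

# Invented records standing in for cleaned census rows.
PLACES = [
    {"name": "Oak Hills", "state": "AA", "median_income": 120000},
    {"name": "Mill Town", "state": "AA", "median_income": 45000},
    {"name": "Cedar Hills", "state": "BB", "median_income": 110000},
    {"name": "River Flats", "state": "BB", "median_income": 52000},
]


def top_places_by_state(places, n=1):
    """Return the n highest-income places for each state."""
    by_state = defaultdict(list)
    for place in places:
        by_state[place["state"]].append(place)
    return {
        state: sorted(rows, key=lambda r: r["median_income"], reverse=True)[:n]
        for state, rows in by_state.items()
    }


def common_name_words(top_by_state):
    """Count how often each word appears in the top-ranked place names."""
    counts = Counter()
    for rows in top_by_state.values():
        for row in rows:
            counts.update(word.lower() for word in row["name"].split())
    return counts


top = top_places_by_state(PLACES, n=1)
print(common_name_words(top).most_common(3))
```

With the invented sample, "hills" is the only word that appears in more than one top-ranked name, which is the kind of pattern the analysis was looking for at a much larger scale.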

With this information, we created and pitched a project called Neighborhood Names.

While massive datasets can seem dense and difficult to use, if you focus on particular data with a thesis in mind, you can create a straightforward, simple and popular campaign like this one that earns coverage on publications like CNBC and Realtor.com.

Takeaway: Consider questions you have that large datasets could potentially answer, and then explore those specific angles. Having a narrower focus can really help keep you and your readers from being overwhelmed by information.

U.S. Department of Labor information

For a different project, we used readily available and free information from the U.S. Department of State’s Bureau of Consular Affairs and the U.S. Department of Labor to formulate a guide to applying for, securing and making the most of an H-1B visa.

We also used government data to examine how H-1B visa holders work in the U.S., how much money they earn and which companies they work for.

The objective was to create a resource by organizing government information in a digestible and interesting way through the creation of maps and by outlining employment opportunities.

The resulting project was featured on Patch.com and other local news sites.

Although this particular set of data applies to content within a very niche vertical, you can combine these government statistics with newsworthy and timely tidbits to attract national and local press.

Takeaway: While this data is already publicly available, consider who can benefit from getting a distilled, clear, digestible version of that data that applies directly to them. Most people aren’t rifling through extensive databases of information, so if you’re able to point out the crucial elements to your audience, you’re providing value they’ll be grateful for.

Stand out with data-focused content marketing

As the total volume of content produced online continues to grow exponentially, brands will be increasingly fighting for attention.

Fortunately, the same growth that increases competition also accelerates the availability of datasets and of new tools for telling compelling stories with that data. Leveling up your content will require more than rehashing information that already exists; you will need to add something genuinely new to the world.

The good news? Those who see the opportunity in data-focused content will reap disproportionate benefits.