About Data Boutique
Data Boutique is a marketplace for web-scraped data. Unlike general-purpose marketplaces, it addresses specific, well-defined needs in a transparent, interest-aligned space.
Generic data marketplaces are broken
Four years have passed since Amazon launched AWS Data Exchange, and neither AWS nor any other player has proved to be the accelerating agent for the segment’s growth that everyone hoped for.
As of today, the coverage of datasets is minimal:
AWS Data Exchange lists 4,109 data products
Snowflake Marketplace lists 2,427 data products
Databricks Marketplace has 143 providers (which include solutions providers as well, not only data providers).
That is a far cry from a market that’s supposed to be “the new oil”.
Marketplaces fail to capture a significant part of the exchanges. According to Grand View Research, the global Data Marketplace market size in 2023 was 1.2 billion USD. The Alternative Data market, a sub-niche of data, was 6 times bigger (7.2 billion USD), and Web Scraping was 5 times bigger (6 billion USD).
Cross-referencing this research, marketplaces capture less than 2% of the exchanged volumes.
What’s the problem here?
Large and well-funded corporations have tried to address this, so why hasn’t it worked out yet?
As my all-time most-quoted essay reads, the role of marketplaces is to lower transaction costs.
Marketplaces are only as powerful as the value they bring to the table. And that value has not been enough yet.
The marketplace is pretextual to platforms: interests are misaligned
Data is platform-agnostic. But those who built today’s major data marketplaces are not: their intent is to push their technologies, not data commerce.
In fact:
AWS Data Exchange wants you to use the AWS stack
Snowflake Data Marketplace works as long as you are a Snowflake user
Databricks, Tableau, Qlik, SAP, Informatica, and others all exist to support a specific technology
They earn money when a user uses the platform, and it makes no difference to them whether a buyer purchases external data products or uses their own. They have zero incentive to optimize for a sale. They won’t invest in it more than the bare minimum, which explains the following point.
Search is not the problem.
Generic data marketplaces are just a collection of providers with little more than search features.
But if we were to break down the transaction costs for data (search, evaluation, negotiation, price, and ingestion), the search component turns out to be just a minor element.
On top of a data contract worth 5k USD (often 10X this much), as a buyer you’d still need to visit the vendor’s website, call their sales team, get a quote, negotiate some sort of trial, and onboard the data, each time with a different vendor.
Finding the vendor is hardly the hard part.
There is simply too much work to be done off-platform, talking and negotiating with the vendor. In fact, there is so much work that the role played by the marketplace (search) is not that different from that of a search engine with SEO.
The real problem is understanding the price, understanding what data I am buying, and comparing that data with alternatives.
What needs to change
Marketplaces are trading all data like a gigantic blob inside a chaotic bazaar where sellers scream their best offers. This doesn’t instill trust.
1. Align interests
Marketplaces that have skin in the game will work for the benefit of buyers and sellers. When interests are aligned, all parties involved will profit only when the exchanges on the platform take off.
A “take rate” business model is the one that provides the best alignment of interests with everyone, buyers included. When the platform has stakes in a deal, you can be sure they’ll optimize for it.
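To make the incentive difference concrete, here is a minimal sketch comparing the two revenue models. Every figure in it (the 20% take rate, the 50,000 USD usage fee, the sales volumes) is a made-up assumption for illustration, not the actual terms of Data Boutique or of any platform named above.

```python
# Illustrative only: all numbers and function names are hypothetical.

def usage_fee_revenue(platform_usage_usd: float, data_sales_usd: float) -> float:
    """Technology-first platform: paid for usage, whatever data gets sold."""
    # data_sales_usd is deliberately ignored: sales do not move revenue.
    return platform_usage_usd


def take_rate_revenue(data_sales_usd: float, take_rate: float = 0.20) -> float:
    """Take-rate marketplace: earns a share of every completed sale."""
    return data_sales_usd * take_rate


if __name__ == "__main__":
    for sales in (0, 10_000, 100_000):
        print(
            f"data sales {sales:>7} USD | "
            f"usage-fee platform earns {usage_fee_revenue(50_000, sales):>9.2f} | "
            f"take-rate platform earns {take_rate_revenue(sales):>9.2f}"
        )
```

In this toy model the usage-fee platform earns the same whether or not any data is sold, while the take-rate platform earns nothing until a sale happens. That is the alignment argument in numbers.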
2. Make pricing clear
The bull must be taken by the horns: Price is the most critical element in B2B negotiations. Today’s data market is a closed-door negotiation place, making the market illiquid.
This is the opposite of a marketplace, where a more extensive, prosperous economy flourishes because it is liquid, and goods are bought and sold rapidly.
The price for an exact service needs to be visible and comparable. “Free plan available” or “Subscription options available” won’t work.
Price needs to be a transparent, outspoken element, allowing for comparison among like-for-like products.
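As a rough illustration of what like-for-like comparison could look like once prices are explicit, here is a small sketch. The `Listing` fields, the price-per-1,000-rows metric, and the seller names are all assumptions made up for this example; they do not describe how any real marketplace structures its catalog.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Listing:
    """A hypothetical data product with an explicit, published price."""
    seller: str
    website: str      # the site the data was scraped from
    rows: int
    price_usd: float

    @property
    def price_per_1k_rows(self) -> float:
        # One possible normalization; other units may suit other datasets.
        return self.price_usd / self.rows * 1000


def cheapest_like_for_like(listings: Iterable[Listing], website: str) -> Optional[Listing]:
    """Compare only listings covering the same website; None if there are none."""
    comparable = [item for item in listings if item.website == website]
    return min(comparable, key=lambda item: item.price_per_1k_rows, default=None)


if __name__ == "__main__":
    catalog = [
        Listing("seller_a", "example.com", rows=10_000, price_usd=50.0),
        Listing("seller_b", "example.com", rows=20_000, price_usd=80.0),
        Listing("seller_c", "other.example", rows=1_000, price_usd=1.0),
    ]
    best = cheapest_like_for_like(catalog, "example.com")
    print(best.seller if best else "no comparable listing")
```

None of this is possible when the listing says only “Subscription options available”: without a number attached to a defined product, there is nothing to normalize or compare.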
3. Address the transaction pain points
A platform needs to drive faster and more frequent (liquid) transactions. To achieve this, we believe a vertical specialization is needed. We picked web scraping because it’s a market we understand very well, and we optimized Data Boutique for it.
The advantage of being vertically specialized is that it lets one get into the specifics of the purchase journey, from discovery to evaluation, to purchase, to use, to after-purchase support.
Let’s take web scraping: Data Quality is a big headache for the buyer. Since we only handle this kind of data, we can run a fixed set of quality checks before a data product is listed.
This way prices become comparable, we can have a like-for-like benchmark, and the purchase decision can be made faster.
This is what buyers and sellers look for and the ultimate end goal of a marketplace: More sales.
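To give a feel for the kind of pre-listing quality checks mentioned above, here is a minimal sketch. The specific checks (minimum row count, required fields present, no duplicate records) and the function name are assumptions chosen for illustration; they are not a description of the checks Data Boutique actually runs.

```python
from typing import Any, Dict, List, Sequence


def quality_report(
    rows: Sequence[Dict[str, Any]],
    required_fields: Sequence[str],
    min_rows: int = 1,
) -> List[str]:
    """Return a list of issues found; an empty list means every check passed."""
    issues: List[str] = []

    # Check 1: the dataset is not suspiciously small.
    if len(rows) < min_rows:
        issues.append(f"too few rows: {len(rows)} < {min_rows}")

    # Check 2: every required field is filled in on every row.
    for field in required_fields:
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        if missing:
            issues.append(f"{missing} row(s) missing '{field}'")

    # Check 3: no record repeats (judged on the required fields only).
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        seen.add(key)
    if duplicates:
        issues.append(f"{duplicates} duplicate row(s)")

    return issues


if __name__ == "__main__":
    sample = [
        {"url": "a", "price": 1.0},
        {"url": "a", "price": 1.0},   # duplicate of the first row
        {"url": "b", "price": None},  # missing price
    ]
    for issue in quality_report(sample, ["url", "price"]):
        print(issue)
```

Because every listing passes through the same checks, two datasets covering the same website can be compared on price without the buyer first auditing each one by hand.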
Annex - screenshots I took so you know I didn’t make this up
About the Project
Data Boutique is a community for sustainable, ethical, high-quality web data exchanges. You can browse the current catalog and add your request if a website is not listed. Saving datasets to your interest list will allow sellers to correctly size the demand for datasets and join the platform.
More on this project can be found on our Discord channels.
Thanks for reading and sharing this.