About Data Boutique
Data Boutique is a web-scraped data marketplace.
If you’re looking for web data, there is a high chance someone is already collecting it. Data Boutique makes it easier to buy web data from them.
Join our platform to learn about and discuss this project.
Customer’s Value
Our mission is to make every web-data project economically viable.
We have often discussed the costs of web-data and how it can quickly become expensive. But the reason a client is willing to use web-data does not lie in its costs but in its value, what they are getting out of it.
This represents the real return and needs to be compared with the costs asked to achieve it. This is what Return On Investment (ROI) is all about.
This framework has helped us throughout our years of building and selling web-data in various industries.
Understanding Web-Data ROI
Return on Investment (ROI) is a simple and popular performance metric for a project or an investment. As its name suggests, it’s the ratio of Returns (benefits generated by the investment) over Investments (costs linked to it), expressed as a percentage or as a multiple (such as 10X).
The upside of ROI lies in it being straightforward and easy to understand. The downside is it tends to oversimplify things, but it still provides an excellent first glance at the overall economic viability of a project.
Returns represent the share of profits that is attributable to or enabled by web-data. Investments should include the project's total cost of ownership (TCO), not just the cost of external data.
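As a minimal sketch of the definition above, ROI is simply returns divided by investments, where investments means the full TCO. The function name and the sample figures are illustrative assumptions, not data from a real project.

```python
def roi(returns: float, investments: float) -> float:
    """ROI as defined in this article: returns divided by investments (TCO)."""
    if investments <= 0:
        raise ValueError("investments must be positive")
    return returns / investments

# Hypothetical figures: 300k$ of returns on a 100k$ total cost of ownership.
ratio = roi(300_000, 100_000)
print(f"ROI: {ratio:.1f}X ({ratio:.0%})")  # prints "ROI: 3.0X (300%)"
```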
While ROI is interesting, it is also valuable to look at the formula the other way: Determine the cost range needed to achieve a given result (benefits obtained with web data) with a desired or target ROI in mind.
This is valuable to determine if a given data expense makes sense for our desired use.
Spoiler: The room for data-related costs is smaller than is commonly assumed.
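Reading the formula the other way can be sketched as follows: given the returns we expect and the ROI multiple we want, the maximum affordable TCO is returns divided by target ROI. The function name and the figures are hypothetical, chosen only to illustrate the arithmetic.

```python
def max_tco(expected_returns: float, target_roi: float) -> float:
    """Largest total cost of ownership that still meets the target ROI multiple."""
    if target_roi <= 0:
        raise ValueError("target_roi must be positive")
    return expected_returns / target_roi

# Hypothetical figures: 200k$ of expected returns and a 5X target ROI.
budget = max_tco(200_000, 5)
print(f"Maximum TCO: {budget:,.0f}$")  # prints "Maximum TCO: 40,000$"
```

Any project cost above this ceiling pushes the ROI below the target, which is why the target multiple matters as much as the expected return.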
Understanding Web Data Investments
The cost of web data operations can be broadly split into three major categories:
Data acquisition Costs
Internal FTEs
Infrastructure Costs
But:
The share between these elements changes too much, case by case
We have no cost benchmarks by these categories
So we need to simplify this further, splitting between:
External Spending
Internal Spending
We use the Opimas research on web scraping, which estimates that, as a global average (2022 data), approximately 75% of the costs are internal and 25% are external (pure data costs, in a broad sense).
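The split can be applied to any TCO figure, as in this small sketch. The 25% default comes from the estimate cited above; the function name and the 100k$ input are assumptions for illustration, and the real share will vary case by case.

```python
EXTERNAL_SHARE = 0.25  # global average estimate cited above; adjust per project

def split_tco(tco: float, external_share: float = EXTERNAL_SHARE) -> tuple[float, float]:
    """Split a total cost of ownership into (external, internal) spending."""
    if not 0 <= external_share <= 1:
        raise ValueError("external_share must be between 0 and 1")
    external = tco * external_share
    return external, tco - external

# Hypothetical 100k$ TCO.
external, internal = split_tco(100_000)
print(f"External: {external:,.0f}$  Internal: {internal:,.0f}$")
```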
Understanding Web Data Returns
Let’s be frank: No one can scientifically predict what will happen just because you use web data in your company.
But there is one thing we can do: We can clearly define our expectations for the project based on similar activities we have done in the past, or others have done.
Using this top-down approach, we identify the range of returns we can reasonably expect to achieve. And we do this by asking this simple question: What’s in it for me?
“How much should I reasonably expect dynamic pricing to help improve the bottom line?” “1%? 10%? 20%?” Is this figure compatible with previous activities on pricing? Is this compatible with what we see in the market?
A fair answer will set a solid baseline for a proper ROI evaluation. And the answer depends on where in the process you are using web-data.
From our experience, there are three main use cases:
Web-Data to build new services.
When data is used to create new services that deliver clearly identifiable new revenue streams.
Examples can be ChatGPT (which needs web data to exist), one of the many price comparison tools (PriceShape, Wieser, OmniaRetail, PriceGrabber, Price2Spy, Netrivals, just to name a few), market intelligence tools (Edited, Competitoor, HQrevenue, OTAinsight), consultancy firms adding new services to their portfolio to benchmark clients' performance, and more that I did not mention (feel free to add your use case in the comments).
Building new services has the highest returns referable to or enabled by web-data. It wouldn’t be unfair for many businesses to consider the entire revenue stream of the new product as enabled by web-data.
Example: For a 10M$ revenue web-data SaaS company with 1M$ in profits, returns enabled by web-data can be up to the entire 1M$ profit pool (10% of revenue).
Seeking a 10X ROI, only a 100k$ TCO in web-data investment can fit, which implies (with external costs at 25% of TCO) approximately 25k$ for data only.
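The arithmetic of this example can be reproduced in a few lines. This is a sketch under the article's assumptions (the whole profit pool attributed to web-data, a 10X target, 25% of TCO spent externally); the function and parameter names are hypothetical.

```python
def data_budget(attributable_returns: float, target_roi: float,
                external_share: float = 0.25) -> tuple[float, float]:
    """Return (max TCO, data-only budget) for a target ROI multiple.

    external_share defaults to the 25% external-spending estimate used here.
    """
    tco = attributable_returns / target_roi
    return tco, tco * external_share

# SaaS example: the entire 1M$ profit pool is treated as enabled by web-data.
tco, data_only = data_budget(attributable_returns=1_000_000, target_roi=10)
print(f"TCO: {tco:,.0f}$  data only: {data_only:,.0f}$")
```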
Web-Data to improve what already exists.
This happens when we already have a product or service and want to enhance its performance using web data.
We may want to price our products better, faster, or discount them only when needed, drive better trade-marketing negotiations, or have a faster product lifecycle development.
Examples typically include B2C brands and retailers in the consumer goods industry, travel, and hospitality. This also includes B2B marketing, advertising, and investment service providers.
Improving product performance can be harder to trace to the bottom line ex-ante (meaning before you do the project). A cautious approach suggests using a single-digit percentage of current profits as a proxy for expected returns.
Example: For a 1B$ brand with 10% EBIT (100M$), a 5% profit uplift due to web-data adoption for price optimization translates into a 5M$ expected return (0.5% of revenue). If they seek a 10X ROI, that leaves room for a 500k$ TCO in the project, of which 125k$ is pure data expenses.
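Because the uplift percentage is the least certain input, it can help to see how the budget moves across the single-digit range. The sketch below reuses the example's 100M$ EBIT, 10X target, and 25% external share; the uplift values other than 5% are hypothetical, and the function name is an assumption.

```python
def budget_for_uplift(ebit: float, uplift: float, target_roi: float = 10,
                      external_share: float = 0.25) -> tuple[float, float, float]:
    """Return (expected return, max TCO, data-only budget) for a profit uplift."""
    expected_return = ebit * uplift
    tco = expected_return / target_roi
    return expected_return, tco, tco * external_share

EBIT = 100_000_000  # 10% EBIT on 1B$ revenue, as in the example

# Only the 5% case comes from the example; the others span the single-digit range.
for uplift in (0.01, 0.03, 0.05, 0.09):
    ret, tco, data_only = budget_for_uplift(EBIT, uplift)
    print(f"uplift {uplift:.0%}: return {ret:,.0f}$, TCO {tco:,.0f}$, data {data_only:,.0f}$")
```

The data budget scales linearly with the assumed uplift, so an optimistic uplift estimate inflates the acceptable data spend by the same factor.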
Web-Data to monitor the strategic alignment of operations.
These are typically non-recurring projects for strategic planning or investor relations communications. They have a weaker direct relation with end results and should be treated as such.
Estimating returns for strategic monitoring is a stretch. There are intrinsic challenges in attributing returns directly to general administration activity. In our models, we use attribution percentages below the single digits (under 1%).
Example: A publicly listed online retailer with 1B$ in revenue and 5% EBIT (50M$) can rarely allocate more than 0.1% of its results as a return for the use of web-data in its strategic planning, or a 50k$ return (0.005% of revenue). With a 3X ROI (let’s give it a lower ROI for this activity), only a 16.7k$ TCO fits, implying a theoretical expense of roughly 4.2k$ in data.
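The three use cases can be put side by side with one small calculation. This is a sketch, not a pricing model: the scenario labels and the function name are assumptions, while the profit bases, attribution shares, ROI targets, and the 25% external share are the ones used in the examples above.

```python
EXTERNAL_SHARE = 0.25  # external-spending share used throughout this article

# (use case, profit base in $, share attributed to web-data, target ROI multiple)
SCENARIOS = [
    ("New service", 1_000_000, 1.0, 10),
    ("Improve existing", 100_000_000, 0.05, 10),
    ("Strategic monitoring", 50_000_000, 0.001, 3),
]

def summarize(scenarios, external_share=EXTERNAL_SHARE):
    """Return (name, expected return, max TCO, data-only budget) per scenario."""
    rows = []
    for name, profit, attribution, target_roi in scenarios:
        expected_return = profit * attribution
        tco = expected_return / target_roi
        rows.append((name, expected_return, tco, tco * external_share))
    return rows

for name, ret, tco, data_only in summarize(SCENARIOS):
    print(f"{name:<22} return {ret:>12,.0f}$  TCO {tco:>10,.0f}$  data {data_only:>9,.0f}$")
```

Laid out this way, the data-only budget spans almost two orders of magnitude across use cases, even when the companies involved are of similar size.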
Conclusions
Although the framework for web-data ROI contains many variables that have very wide ranges and need to be adjusted by company size, we have found in our own experience that it provides a pretty good ballpark for where the transactions for data will land.
Since the expected returns of a project are largely outside our control, attention should go to reducing data costs, which is what we are promoting at Data Boutique.
Join the Project
Data Boutique is a community for sustainable, ethical, high-quality web data exchanges. You can browse the current catalog and add your request if a website is not listed. Saving datasets to your interest list helps sellers correctly size the demand for datasets and join the platform.
More on this project can be found on our Discord channels.
Thanks for reading and sharing this.