Original source: Issam Hijazi
This video from Issam Hijazi covered a lot of ground. Eight segments stood out as worth your time. Everything below links directly to its timestamp in the original video.
The economic rules governing data are fundamentally different from those governing physical or financial assets — and most organisations are still managing data as if those rules don't exist.
Schmarzo's Data Theorem Identifies Three Economic Effects of Treating Data as a Non-Depreciating Asset
Research conducted at the University of San Francisco produced what Schmarzo calls a fundamental reorientation in how organisations should value data: unlike physical assets, data never wears out, never depletes, and can be reused across an unlimited number of use cases at near-zero marginal cost. That structural property generates three compounding effects — flattening marginal costs over time, accelerating time-to-value as reusable analytic modules accumulate, and a third 'Tesla effect' in which any improvement to a shared analytic module ripples backward to every prior use case that deployed it, at no additional cost.
What this exposes is that most organisations are still applying an accounting framework — depreciation, replacement cost, finite utility — to an asset whose economic behaviour is categorically different. Google's decision to open-source TensorFlow, the core engine of its own search and advertising business, illustrates the logic: wider adoption makes the tool better for everyone, and Google, as the most sophisticated user, captures the greatest share of those gains.
"Data never wears out, data never deletes, and the same data set can be used across an infinite number of use cases at zero marginal cost."
Tesla's Fleet-Wide Learning Model Reframes Competitive Advantage in Autonomous Vehicles
Elon Musk's claim that a Tesla appreciates rather than depreciates in value the more it is used is not a marketing flourish about collector cars — it describes a structural learning architecture in which each of the roughly 600,000 vehicles on the road continuously generates training data, uploads those learnings to a central cloud at the end of every day, and receives a consolidated, improved model in return. The result is that a single avoided accident by one vehicle becomes, within twenty-four hours, knowledge shared across the entire fleet.
The structural issue here is that competitors still framing Tesla as an electric-vehicle company are contesting a market Musk has already moved beyond; the real competition is in autonomous-learning infrastructure, and the lead Tesla has accumulated through continuous fleet-scale training is compounding daily.
"When people think about competing with Tesla, they're thinking about building electric cars — Elon Musk is already fighting the next game. He's thinking about autonomous vehicles and how he's going to use that autonomous vehicle engine he's building."
Data Silos Destroy the Core Economic Value of Data, Schmarzo Argues in His Case Against Data Mesh Architecture
The economic value of data rests on a single structural condition: the ability to share the same dataset across an unlimited number of use cases. Any architecture that fragments data into domain-specific silos — however operationally convenient — directly undermines that condition, because the marginal cost advantage collapses the moment reuse is restricted. Schmarzo also draws a distinction between data sources and use cases as organising principles, arguing that only a use-case lens can reliably separate signal from noise within a given dataset, since the same data element can be informative in one context and irrelevant in another.
The real question is not where data physically resides but whether the governance model permits or inhibits cross-domain reuse — and architectures that prioritise domain autonomy over shared access are, by that measure, trading long-term economic value for short-term organisational convenience.
"Data silos are the killer of the economic value of data, because the economic value of data is based on being able to share the same data set across an unlimited number of use cases."
Analytics Projects Fail on Politics, Not Technology, Schmarzo's Prioritisation Framework Argues
The first step in Schmarzo's value-creation framework is a prioritisation matrix — a quadrant exercise in which diverse business stakeholders place use cases on axes of business value and implementation feasibility, each represented by a post-it note on a flip chart. The mechanism forces every participant to justify their rankings publicly, creating a documented rationale for sequencing decisions that survives beyond the workshop. Schmarzo's central contention is that large analytics initiatives collapse not because the technology fails but because passive-aggressive behaviour — the unheard voice that withholds adoption — goes unaddressed at the outset.
The structural issue is one of accountability architecture: by compelling business stakeholders to own the value axis and technologists to own the feasibility axis, the matrix distributes responsibility in a way that makes subsequent foot-dragging visible and attributable.
"These kinds of big data and advanced analytics projects don't fail mostly because of the technology — they fail because of passive-aggressive behaviour, because somebody's voice didn't get a chance to get heard."
Explainable AI Enables CIOs to Pinpoint Which Data Elements Are Worth Investing In
Schmarzo's theorem assigns value to data not intrinsically but instrumentally: a dataset's worth is determined by its measurable contribution to a specific use case. By working backward from a use case's economic outcome — a two-percent improvement in cross-sell effectiveness worth $45 million, for instance — and apportioning that value across the datasets and analytic modules that produced it, organisations can construct an evidence-based case for where data quality investment will yield the greatest return. Explainable AI makes this attribution tractable at the variable level, identifying which individual data elements most powerfully drive a model's predictions.
Beyond investment prioritisation, this approach directly addresses regulatory requirements — GDPR, the California Consumer Privacy Act, and fair credit reporting rules — that already mandate transparency about how individual variables influence automated decisions.
"From a CIO or chief data officer perspective, it tells me which of my data elements are most valuable — and if I'm going to spend money trying to improve the quality of my data, then I'd better know which data sets can benefit the most from that investment."
Economies of Learning Outweigh Economies of Scale in Knowledge-Based Industries, Schmarzo Contends
The organisations most likely to succeed in digital transformation, Schmarzo argues, are not those that simply deploy the most sophisticated AI but those that simultaneously build empowered human teams operating at the front lines of customer and operational engagement. The compounding effect arises from a specific pairing: AI systems that improve continuously through reinforcement learning, coupled with human teams that are curious enough to push those systems into territory the algorithms have not previously encountered. Each side expands the other's reach in ways neither could achieve alone.
What this exposes is a tension with prevailing institutional structures — educational systems that systematically suppress curiosity, and management hierarchies that concentrate decision-making far from the operational edge. Schmarzo's broader claim is that AI will not diminish the distinctively human capacity for creativity and curiosity but will instead force organisations to rediscover and deploy it.
"AI and machine learning are going to force humans to become more human — to learn to embrace the natural curiosity that resides in us, that drives us from curiosity to create something, to be creative, and ultimately to drive innovation at scale."
Individual-Level Behavioural Scoring Replaces Segment-Based Thinking in Schmarzo's Analytics Framework
The second step in Schmarzo's framework builds a granular behavioural profile for each entity an organisation cares about — whether a customer, a student, a patient, or an industrial device such as a turbine — generating a set of predictive scores analogous to a credit score but applied across every operationally relevant dimension. Propensity to respond to a promotion, likelihood to churn, probability of recommending the organisation to others: each becomes a quantified, individual-level attribute rather than a segment average, enabling decisions calibrated to a specific person or asset rather than a statistical group.
The structural shift here is from descriptive segmentation, which collapses variation, to predictive individuation, which preserves it — a distinction with material consequences for healthcare triage, customer retention economics, and industrial maintenance scheduling alike.
"We don't look at segments of customers, we don't look at segments of students, we don't look at segments of wind turbines and jet engines — we look at them individually and we make individual decisions, very informed granular decisions, to how to better serve that particular entity."
Knowing When a Model Is 'Good Enough' Requires Business Stakeholders to Price the Cost of Being Wrong
Software developers define success requirements before writing a line of code; data scientists, by contrast, discover what success looks like by burrowing through data — a fundamentally different epistemology that demands curiosity rather than specification. Schmarzo argues that this distinction is frequently misunderstood, with organisations attempting to manage data science projects using software development governance frameworks that are structurally incompatible with iterative, discovery-driven work. The more consequential and underappreciated problem, however, is the question of model sufficiency: since no predictive model achieves one hundred percent accuracy, the threshold for 'good enough' can only be established by business stakeholders who understand the relative cost of false positives versus false negatives — a judgement the data scientist cannot make alone.
The real question is not how accurate the model is in the abstract but what the organisation is actually risking when it errs in each direction — a calculation that requires domain knowledge, cost accounting, and imagination about failure modes that lie outside the data scientist's remit.
"You only know if a model is good enough if you know the costs associated with false positives and false negatives — and that's not a data scientist's job."
Summarised from Issam Hijazi · 1:09:40. All credit belongs to the original creators. Streamed.News summarises publicly available video content.