Methodology
How estimates work
StackStats turns public Substack signals into transparent gross MRR ranges, confidence labels, and growth history. It is built for learning from what appears to work, not for pretending that public data reveals private subscriber counts.
Estimate Ranges
StackStats estimates gross monthly recurring revenue from public paid-audience buckets and visible subscription prices.
The range is intentionally wide when the public signal is vague. A publication with only a bucket such as "thousands of paid subscribers" should not be presented as if its exact paid subscriber count were known.
- Low estimate: conservative paid-subscriber and price interpretation.
- Base estimate: the finance-v1 midpoint used for ranking.
- High estimate: optimistic interpretation of the same public signals.
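As a rough illustration of how a bucket and a price band could become a low/base/high range, here is a minimal sketch. The function name, the subscriber bounds chosen for a bucket, and the use of a plain arithmetic midpoint for the base estimate are all assumptions made for this example; the actual finance-v1 midpoint may be defined differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MrrRange:
    low: float
    base: float
    high: float


def estimate_range(subs_low: int, subs_high: int,
                   price_low: float, price_high: float) -> MrrRange:
    """Turn a paid-audience bucket and a monthly price band into a gross MRR range.

    subs_low/subs_high are hypothetical bounds for a public bucket;
    price_low/price_high are monthly-equivalent prices.
    """
    if subs_low > subs_high or price_low > price_high:
        raise ValueError("lower bounds must not exceed upper bounds")
    low = subs_low * price_low      # conservative subscribers and price
    high = subs_high * price_high   # optimistic reading of the same signals
    base = (low + high) / 2         # assumption: simple midpoint as a stand-in
    return MrrRange(low=low, base=base, high=high)


# Hypothetical reading of a "thousands of paid subscribers" bucket as 1,000 to 9,999.
example = estimate_range(1_000, 9_999, 5.0, 8.0)
```

With these hypothetical inputs the low estimate is 5,000 and the high estimate is 79,992, which shows how wide the range stays when the only signal is a bucket.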
Gross MRR
Gross MRR means estimated subscription revenue before platform fees, payment processing, taxes, refunds, coupons, churn, or comped subscriptions.
The model uses regular public web plan prices first. Annual plan prices are converted into a monthly equivalent by dividing by 12, then blended with monthly plan prices using an explicit monthly/yearly mix assumption.
- No discount or coupon adjustment.
- No Substack platform fee or payment fee assumption.
- No tax, refund, churn, or comped subscriber assumption.
- No net revenue claim.
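The annual-to-monthly conversion and the mix assumption described above can be sketched as follows. The function names and the `annual_share` parameter are hypothetical, and the 50/50 mix in the example is an arbitrary illustration rather than a value StackStats is known to use.

```python
def blended_monthly_price(monthly_price: float, annual_price: float,
                          annual_share: float) -> float:
    """Blend monthly and annual plan prices into one monthly-equivalent price.

    annual_share is the assumed fraction of paid subscribers on the annual plan.
    """
    if not 0.0 <= annual_share <= 1.0:
        raise ValueError("annual_share must be between 0 and 1")
    annual_as_monthly = annual_price / 12  # annual plan converted to monthly revenue
    return (1.0 - annual_share) * monthly_price + annual_share * annual_as_monthly


def gross_mrr(paid_subscribers: int, monthly_price: float,
              annual_price: float, annual_share: float) -> float:
    """Gross MRR: no fees, taxes, refunds, discounts, churn, or comps applied."""
    return paid_subscribers * blended_monthly_price(
        monthly_price, annual_price, annual_share
    )


# Hypothetical publication: $8/month or $72/year, half of subscribers on annual.
example_mrr = gross_mrr(1_000, monthly_price=8.0, annual_price=72.0, annual_share=0.5)
```

Keeping the mix as an explicit argument makes the assumption visible and easy to vary when producing the low and high ends of a range.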
Confidence
Confidence describes the quality of the observable public inputs, not how successful a publication is.
A high-confidence estimate has a public paid-audience bucket, visible regular pricing, a free subscriber signal, recent source records, and enough post history to understand cadence.
- High: paid audience, pricing, free subscriber signal, recency, and post history are all present.
- Medium: paid audience and at least one public price are present, but some supporting signals are missing.
- Low: paid audience is missing, pricing is ambiguous, sources are stale, or the publication appears unusual.
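One way to read the three labels above as a decision rule is sketched below. The `Signals` fields and the `confidence_label` function are hypothetical names, and treating a publication with no visible price as Low is an assumption, since the list above does not cover that case explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    has_paid_audience: bool
    has_price: bool
    has_free_subscriber_signal: bool
    is_recent: bool
    has_post_history: bool
    pricing_ambiguous: bool = False
    appears_unusual: bool = False


def confidence_label(s: Signals) -> str:
    """Label the quality of the public inputs, not the publication's success."""
    # Low: any disqualifying condition (missing price treated as Low by assumption).
    if (not s.has_paid_audience or not s.has_price or s.pricing_ambiguous
            or not s.is_recent or s.appears_unusual):
        return "low"
    # High: every supporting signal is also present.
    if s.has_free_subscriber_signal and s.has_post_history:
        return "high"
    # Medium: paid audience and a price exist, but a supporting signal is missing.
    return "medium"
```

For example, a publication with a paid-audience bucket, a visible price, and recent records but no free subscriber signal would be labeled "medium" under this sketch.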
Source Records And Signals
Every external response is stored as a source record before normalized fields are derived from it. This keeps provenance available when parsers improve or the model changes.
Normalized signals include identity, public leaderboard rank, paid-audience wording, visible tier prices, public post metadata, and recommendation edges.
- Raw source records preserve where a signal came from.
- Normalized tables keep the product fast and queryable.
- Daily snapshots freeze estimates, ranks, and graph-ready metrics for historical charts.
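The provenance idea above, storing the raw response first and then deriving fields that point back to it, might look roughly like this. The class names, the in-memory `dict` store, and the hash-based record ID are illustrative assumptions and do not describe the actual StackStats schema.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceRecord:
    record_id: str
    url: str
    fetched_at: datetime
    body: str  # raw external response, kept unmodified


@dataclass(frozen=True)
class NormalizedSignal:
    publication: str
    field: str
    value: str
    source_record_id: str  # provenance: which raw record this came from


def store_source_record(store: dict, url: str, body: str,
                        fetched_at: datetime) -> SourceRecord:
    """Persist the raw response before anything is parsed from it."""
    digest = hashlib.sha256(f"{url}|{fetched_at.isoformat()}|{body}".encode("utf-8"))
    record = SourceRecord(digest.hexdigest()[:16], url, fetched_at, body)
    store[record.record_id] = record
    return record


def derive_signal(record: SourceRecord, publication: str,
                  field: str, value: str) -> NormalizedSignal:
    """Create a normalized field that keeps a pointer to its source record."""
    return NormalizedSignal(publication, field, value, record.record_id)


# Hypothetical usage with a made-up URL and response body.
store: dict = {}
rec = store_source_record(store, "https://example.com/pub", "thousands of paid subscribers",
                          datetime(2024, 1, 1, tzinfo=timezone.utc))
signal = derive_signal(rec, "example-pub", "paid_audience_wording", rec.body)
```

Because the raw body is retained, an improved parser could re-derive signals from the same record without refetching the source.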
Connected Publications
Exact paid subscriber counts and exact tier distribution are not public data. StackStats should not imply that V1 knows those numbers.
A future Connected Publication flow can let a creator authorize accurate private metrics for their own publication. Those verified numbers should be displayed separately from public estimates.
- Public estimate: inferred from public buckets and prices.
- Verified revenue: future creator-authorized data.
- Exact paid-tier distribution requires creator permission.
Graph Signals
Recommendation and public relationship signals are useful for discovery, adjacency, and growth insight. They are not complete subscriber lists.
V1 stores graph edges in Postgres and uses them to show adjacent publications. A graph database can wait until traversal becomes central to the product.
- Recommendations can suggest who a publication wants to be associated with.
- Visible public subscription/read edges, if collected later, will still be incomplete and privacy-biased, because they cover only readers who make their activity public.
- Graph signals should support inspiration and discovery, not exact revenue claims.
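Showing adjacent publications needs only a one-hop lookup over stored edges, which is part of why a dedicated graph database can wait. The sketch below uses an in-memory list of (source, target) pairs as a stand-in for the Postgres edge table; the function name and edge shape are assumptions for illustration.

```python
from typing import Iterable, List, Tuple


def adjacent_publications(edges: Iterable[Tuple[str, str]],
                          publication: str) -> List[str]:
    """Return publications one recommendation edge away, in either direction."""
    neighbors = set()
    for source, target in edges:
        if source == publication and target != publication:
            neighbors.add(target)   # publication recommends target
        elif target == publication and source != publication:
            neighbors.add(source)   # source recommends publication
    return sorted(neighbors)


# Hypothetical recommendation edges between made-up publications.
edges = [("alpha", "beta"), ("gamma", "alpha"), ("beta", "gamma")]
adjacent = adjacent_publications(edges, "alpha")
```

The result lists neighbors for discovery only; it says nothing about how many subscribers the publications share.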