Live docket intelligence: a regulatory knowledge graph that compounds
Raymond Xu
April 21, 2026 · 2 min read
A docket is the official case file for a regulatory proceeding. Every filing, comment, and order is posted to a public website, indexed by docket number. The dockets that matter most for data centers live at FERC eLibrary (Federal Energy Regulatory Commission, the federal grid regulator), PJM (the grid operator covering the mid-Atlantic), ERCOT (the Texas grid operator), and the air-permit pages of state environmental agencies. All of this was public a decade ago. What was not possible a decade ago was reading every new filing the day it lands, pulling out the material facts, cross-referencing prior cases, classifying each order by outcome, and doing that for thousands of filings a year at a cost a startup can afford.
That is the point of Cliff’s live docket intelligence. Every filing in the corpus is read by Claude and distilled into structured fields: the material facts, the parties involved (utility, developer, state staff, intervenors, meaning third parties admitted to the case), the upcoming deadlines, the rule citations, and a judgment about relevance to Cliff customers. Orders and guidance documents get a second pass that classifies the outcome (approved, denied, settled, remanded), extracts the conditions the agency imposed, and records the agency’s stated reasons. Every row is a queryable data point. Every row is also a seed for the next.
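As a rough illustration of what one extracted row could look like, here is a minimal sketch in Python. The class and field names (FilingRecord, OrderPass, relevant_to_customers, and so on) are assumptions made for this example, not Cliff's actual schema, and the sample values are placeholders rather than data from a real docket.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Outcome(Enum):
    """Outcome classes named in the text for orders and guidance documents."""
    APPROVED = "approved"
    DENIED = "denied"
    SETTLED = "settled"
    REMANDED = "remanded"


@dataclass
class Party:
    name: str
    role: str  # e.g. "utility", "developer", "state staff", "intervenor"


@dataclass
class OrderPass:
    """Second-pass fields, extracted only for orders and guidance documents."""
    outcome: Outcome
    conditions: list[str] = field(default_factory=list)
    stated_reasons: list[str] = field(default_factory=list)


@dataclass
class FilingRecord:
    """First-pass fields, extracted for every filing (hypothetical layout)."""
    docket_number: str
    material_facts: list[str]
    parties: list[Party]
    deadlines: list[date]
    rule_citations: list[str]
    relevant_to_customers: bool
    order: Optional[OrderPass] = None  # stays None for ordinary filings


# Placeholder values only; none of this comes from a real proceeding.
example = FilingRecord(
    docket_number="EXAMPLE-0001",
    material_facts=["Placeholder fact about a large-load interconnection request."],
    parties=[Party("Example Utility", "utility"), Party("Example Developer", "developer")],
    deadlines=[date(2026, 6, 1)],
    rule_citations=["Placeholder rule citation"],
    relevant_to_customers=True,
    order=OrderPass(
        outcome=Outcome.APPROVED,
        conditions=["Placeholder condition imposed by the agency."],
        stated_reasons=["Placeholder reason given by the agency."],
    ),
)
```

Keeping the second pass in its own optional object mirrors the two-stage reading described above: every filing gets the first-pass fields, and only orders and guidance carry an outcome, conditions, and reasons.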
The compounding is the moat. One extracted filing is worth something. Ten thousand extracted filings, linked by shared dockets and shared parties and tagged by outcome, is the kind of corpus that used to sit inside LexisNexis or inside a $1,500/hr boutique regulatory law firm. The pre-2023 cost structure (lawyers and paralegals reading PDFs by hand) would not let a non-incumbent build this from zero. The post-2023 cost structure (LLMs reading the same PDFs) does, for the first time, and the window to do it before horizontal AI generalists arrive is narrow.
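To make the linking concrete, here is a minimal sketch of how filings might be connected by shared dockets and shared parties. The function name link_filings, the dictionary layout, and the sample filings are all hypothetical; this is one simple way to build such edges, not a description of how Cliff's graph is actually stored.

```python
from collections import defaultdict
from itertools import combinations


def link_filings(filings):
    """Return {(id_a, id_b): {reasons}} for filings sharing a docket or a party.

    `filings` maps a filing id to {"docket": str, "parties": set of str}.
    """
    by_docket = defaultdict(set)
    by_party = defaultdict(set)
    for filing_id, filing in filings.items():
        by_docket[filing["docket"]].add(filing_id)
        for party in filing["parties"]:
            by_party[party].add(filing_id)

    edges = defaultdict(set)
    for label, index in (("docket", by_docket), ("party", by_party)):
        for key, ids in index.items():
            # Every pair of filings under the same key gets an edge,
            # annotated with the reason the two are linked.
            for a, b in combinations(sorted(ids), 2):
                edges[(a, b)].add(f"{label}:{key}")
    return dict(edges)


# Placeholder filings; the ids, dockets, and party names are invented.
filings = {
    "f1": {"docket": "D-1", "parties": {"Utility A", "Developer B"}},
    "f2": {"docket": "D-1", "parties": {"Utility A"}},
    "f3": {"docket": "D-2", "parties": {"Developer B"}},
}
edges = link_filings(filings)
```

In this toy corpus, f1 and f2 are linked twice (same docket, same utility), f1 and f3 are linked once through the shared developer, and f2 and f3 are not linked at all. Each new filing adds edges to everything already extracted, which is the sense in which the corpus compounds.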
Internal today, external next. The operator console is live at /dockets, and the seed corpus is the three regulatory cases that matter most right now for AI data center buildout: ERCOT Batch Zero / PCLR (the Texas grid operator’s rewrite of how very large loads get permission to interconnect), FERC EL25-49 (a federal complaint case on whether data centers can sit behind the meter at existing power plants in the PJM region without paying transmission charges), and VA DEQ APG-578 (Virginia’s air-quality rules on how many hours per year backup diesel generators can run, which sets the practical capacity ceiling on most Virginia data center campuses). The public-facing product is next.