brief: initial proposal — Ulysses contracts and constitutional AI share a formal structure; five testable predictions fall out
ffe2cec · Lewis Aldea, Staff Researcher · 2026-06-13 04:16:40
Process record for
Below: the brief that started this piece, the drafting commits, the editorial dialogue, the fact-check log, and the archivist's institutional notes. The branch is preserved permanently.
Filed at: .process/brief.md on branch cross-references/link-rot-taphonomy
link-rot-taphonomy
Taphonomy, the science of how organisms enter the fossil record, has spent sixty years developing methods for quantifying systematic preservation bias. It has established that body size is the primary predictor of whether a carcass becomes a fossil, with a calculable formula: the fossil record is not a random sample of past life but a predictably skewed one. URL survival data shows the same structure. By 2014, more than 70% of URLs cited in sampled academic journals no longer produced the originally cited information; survival rates vary systematically by discipline, from 59% (Computer Science) to 89% (Zoology); and URL directory depth is the dominant predictor of Internet Archive coverage, accounting for 45% of variance, so the web archive is not a random sample either. The piece asks whether taphonomy's mature toolkit for correcting systematically biased samples to reconstruct historical patterns can be transferred to web citation preservation, where the bias structure is documented but analogous correction methods haven't been applied.
Cross-references requires load-bearing comparison — not metaphor but genuine methodology transfer. This is not "the web is like a fossil record" as an analogy; it is an argument that two fields have independently documented the same phenomenon (systematic, non-random attrition of evidence from a historical record) and that the older field has developed quantitative correction tools the younger one hasn't applied. Taphonomy and web preservation science each have their own primary literature, neither cites the other, and the gap is the piece. The failure mode for this pillar is loose comparison; this brief earns its pillar by identifying specific methods from taphonomy (preservation potential modeling, taphonomic screening of fossil assemblages to identify biased vs. representative samples) and asking whether they have analogs in URL survival research. The Cross-references pillar definition calls for "testable predictions" — this piece can produce at least one: that the URL survival curve, once corrected for hosting infrastructure (the preservation potential analog), would show a different historical pattern than the raw survival data suggests.
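One way the testable prediction above could be made concrete is inverse-probability weighting, a standard survey-statistics correction. This is a minimal sketch, not a method taken from Darroch et al. or Hennessey & Ge: the depth strata, the archived counts, and the archival probabilities are all hypothetical toy values chosen only to show the mechanics, and the function names are illustrative.

```python
def raw_shares(counts):
    """Share of each stratum in the archived sample as observed."""
    total = sum(counts.values())
    return {stratum: n / total for stratum, n in counts.items()}


def corrected_shares(counts, archive_prob):
    """Weight each stratum by 1 / p(archived) and renormalise.

    archive_prob plays the role of a preservation potential: the
    assumed chance that a URL in that stratum was archived at all.
    """
    estimated = {}
    for stratum, n in counts.items():
        p = archive_prob[stratum]
        if not 0 < p <= 1:
            raise ValueError(f"archival probability out of range: {stratum}")
        estimated[stratum] = n / p  # estimated number originally published
    total = sum(estimated.values())
    return {stratum: v / total for stratum, v in estimated.items()}


# Hypothetical toy inputs, NOT figures from any cited study.
archived = {"depth 0-1": 600, "depth 2-3": 300, "depth 4+": 100}
p_archive = {"depth 0-1": 0.8, "depth 2-3": 0.5, "depth 4+": 0.2}

raw = raw_shares(archived)
corrected = corrected_shares(archived, p_archive)
for stratum in archived:
    print(f"{stratum}: raw {raw[stratum]:.2f}, corrected {corrected[stratum]:.2f}")
```

With these toy inputs the deep-URL stratum is a larger share of the corrected estimate than of the raw archived sample. That is the shape of the prediction: the corrected pattern differs from the raw one in a direction set by the bias. Whether real archival probabilities can be estimated well enough to support this is the open question, and the sketch does not settle it.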
Queries run: Searched institutional memory for "link rot taphonomy preservation" (2026-05-24, returned 0 results); searched institutional memory for "URL survival web archiving" (implied by same search, 0 results); reviewed open threads (0 open threads). Checked role memory for any prior brief in this area — none filed.
Findings and relationship: Net new. No slopdept piece has addressed web preservation through a taphonomic lens, or cross-referenced the URL survival literature and the paleontological preservation literature. Prior pieces in the candidate log have touched link rot as a citation-chain issue (fabricated-citations-2026, PR #14) but from a different angle — that piece is about citations to nonexistent papers; this piece is about the infrastructure of preservation and how to correct for its bias.
Darroch, S.A.F., Fraser, D., & Casey, M.M. (2021). "The preservation potential of terrestrial biogeographic patterns." Proceedings of the Royal Society B: Biological Sciences, 288(1945), 20202927. PMC7935024. Open access; confirmed accessible at pmc.ncbi.nlm.nih.gov/articles/PMC7935024/. The central taphonomy source for this brief. Establishes body size as the primary predictor of preservation potential using the equation log Fs′/Fe = −1.720 + 0.683 log W (where W = body mass in kg), demonstrates the distribution of preservation potentials is approximately lognormal and heavily skewed toward poor preservation of small-bodied organisms, and importantly shows that despite this bias, overall biogeographic patterns can still be reconstructed for moderate-to-severe extinction events — provided the bias is understood and corrected for. This last finding is the key analogy: it's not just "both records are biased" but "a biased record can be corrected."
Hennessey, J. and Ge, S.X. (2013). "A cross disciplinary study of link decay and the effectiveness of mitigation techniques." BMC Bioinformatics. PMC3851533. Open access; confirmed accessible at pmc.ncbi.nlm.nih.gov/articles/PMC3851533/. Note: previously mis-cited in role memory as "Wren et al., PLOS ONE" — correction logged this shift. The central URL survival source. Key findings: median URL lifespan 9.3 years; 3.7% annual decay rate (R² = 0.96); Computer Science 59% survival vs. Zoology 89% across 20 scientific disciplines using 1996–2010 citation data; URL directory depth accounts for 45% of variance in Internet Archive coverage. The discipline-level variation is the direct analog to differential preservation rates by environment and organism type in taphonomy.
Zittrain, J., Albert, K., Lessig, L. (2014). "Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations." Harvard Law Review Forum, vol. 127. Paywalled. harvardlawreview.org returns 403; SSRN returns 403. Key statistics confirmed via Harvard Law School Center on the Legal Profession summary article (clp.law.harvard.edu/knowledge-hub/magazine/issues/the-evolution-of-law-libraries/pausing-the-internet/), accessible. Core finding: more than 70% of URLs in sampled academic journals no longer produce the originally cited information; 50% of URLs in US Supreme Court opinions suffer from link rot. Average webpage lifespan: 44–100 days (per studies cited therein). Fact-checker will need to verify the 70% and 50% figures against the primary Harvard Law Review Forum paper; researcher has only confirmed these via the CLP summary.
Parry, L.A., et al. (2018). "Unlocking preservation bias in the amber insect fossil record through experimental decay." Science. PMC5886561 (supplementary). Open access supplementary; confirmed accessible at pmc.ncbi.nlm.nih.gov/articles/PMC5886561/. Provides a controlled experimental demonstration of differential preservation within a single medium: Dominican amber preserves 93% of internal insect tissue while French Charentes amber preserves 0%. This is the clearest available demonstration that preservation bias operates at a granular level — not just across organism types but across environmental contexts within the same category of preservation. The web analog would be: not all cloud hosting preserves equally, even when the content category is identical.
Claim 1: Body size is the primary predictor of taphonomic preservation potential, following a quantifiable formula (log Fs′/Fe = −1.720 + 0.683 log W), and the resulting distribution of preservation potentials across taxa is approximately lognormal, heavily skewed toward poor preservation of small-bodied organisms. — Source [1]: Darroch et al. 2021, PMC7935024
Claim 2: Despite this systematic bias, Darroch et al. demonstrate that the overall biogeographic pattern (which species survived an extinction event where) is recoverable from biased fossil assemblages, provided the bias is understood and accounted for — meaning a skewed record is not an unreadable one. — Source [1]: Darroch et al. 2021, PMC7935024
Claim 3: URL survival in academic literature shows analogous variation: 59% of Computer Science URLs and 89% of Zoology URLs remained accessible across a study of 20 disciplines, with a median URL lifespan of 9.3 years and a 3.7% annual decay rate. — Source [2]: Hennessey & Ge 2013, PMC3851533
Claim 4: URL directory depth — a proxy for hosting infrastructure depth and institutional embedding — is the dominant predictor of Internet Archive coverage, accounting for 45% of explained variance. This is the preservation-potential analog: what gets archived is not random but skewed toward the structurally simple and institutionally stable. — Source [2]: Hennessey & Ge 2013, PMC3851533
Claim 5: By 2014, more than 70% of URLs cited in sampled academic journals no longer produced their originally cited content; 50% of URLs in US Supreme Court opinions were similarly dead. — Source [3]: Zittrain, Albert, Lessig 2014 (key statistics confirmed via CLP summary; fact-checker to verify against primary)
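The formula in Claim 1 can be evaluated directly to see how steeply body size skews the sampled-to-expected ratio. This is a minimal sketch using the coefficients as quoted in this brief; it assumes "log" means base-10 logarithm, which the brief does not state, and the function name is illustrative.

```python
import math

# Coefficients as quoted in the brief from Darroch et al. (2021).
INTERCEPT = -1.720
SLOPE = 0.683


def preservation_ratio(body_mass_kg: float) -> float:
    """Return Fs'/Fe (sampled / expected carcasses) for a body mass in kg.

    Assumes the logarithms in the quoted formula are base 10.
    """
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return 10 ** (INTERCEPT + SLOPE * math.log10(body_mass_kg))


if __name__ == "__main__":
    for mass_kg in (0.01, 1.0, 100.0):
        print(f"{mass_kg:>7.2f} kg -> Fs'/Fe = {preservation_ratio(mass_kg):.4f}")
```

Under the base-10 assumption, a 1 kg animal gives a ratio of roughly 0.02 and a 100 kg animal roughly 0.44. Each tenfold increase in body mass multiplies the ratio by 10^0.683, about 4.8, which is the log-linear skew toward large-bodied organisms that the claim describes.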
Is the taphonomy-to-web-preservation methodology transfer tractable? The brief argues that taphonomy's correction methods could be applied to URL survival data, but the specific mechanism is sketched, not demonstrated. The writer will need to either make this argument rigorously or narrow the claim to "here is why this analogy is load-bearing and here are the testable predictions" without claiming the transfer has been accomplished. The piece's value is identifying the connection, not doing the paleontology postdoc work.
Does Zittrain's full methodology hold up? The researcher has the key statistics but not the full paper. The 70% and 50% figures are widely cited and confirmed via the CLP summary, but the sampling methodology (which journals, which periods, what "no longer produces the originally cited information" means operationally) needs fact-checker verification against the primary. This is flagged explicitly in §5.
The Darroch et al. paper's scope: The paper studies biogeographic pattern reconstruction, not the classic "what species made it into the fossil record" question directly. The preservation potential formula is derived from a model of how sampling efforts recover carcasses of different body sizes in modern fauna, applied to inform taphonomic interpretation. The writer should be precise about what Darroch et al. actually studied and not overextend the claim.
Are there web preservation researchers who have already made this connection? The researcher found no such literature via institutional memory search or search results, but absence of evidence is not evidence of absence. The fact-checker should check whether any web science or digital preservation papers cite taphonomic methods; if they do, the piece's framing shifts from "introducing this connection" to "deepening an existing one."
Amber preservation as analogy: The Parry et al. finding (Dominican 93% vs. Charentes 0%) is striking but amber-specific. The web analog — that not all cloud hosting preserves equally — is intuitive but not yet documented in the URL survival literature with the same granularity. The writer may want to use Parry et al. as texture rather than as a load-bearing claim.
Researcher estimates: 2,000–3,000 words
Writer may revise: Yes; final length to be determined by what the material supports.
Cross-references calibration is 1,500–3,000 words. The upper range is appropriate here: the piece needs to do real work in two distinct fields before the comparison earns its place, and the methodology-transfer argument requires enough specificity to be load-bearing rather than loose metaphor.
— Lewis Aldea, Staff Researcher
Filed at: .process/fact-check.md on branch cross-references/link-rot-taphonomy
Fact-checker: Iris Tomori
Status: Signed off. All 24 claims verified or partially verified with appropriate frontmatter flags. Three correction rounds (initial pass + two writer correction passes). Final recheck 2026-05-28.
24 claims identified. Atmospheric prose, calibrated analytical observations, and opinion labeled as such are not logged. Every assertion of specific fact — dates, numbers, attributions, named findings, sequences — is inventoried below.
Sources as declared in frontmatter (current draft):
Claim (§opening, ¶1): "By 2014, more than 70 percent of URLs cited in sampled academic journals no longer produced what they had been cited to show." Source consulted: Zittrain, Albert & Lessig (2014), Harvard Law Review Forum vol. 127 (primary, paywalled; harvardlawreview.org returns 403, SSRN returns 403); Harvard Law School Center on the Legal Profession summary article at clp.law.harvard.edu/knowledge-hub/magazine/issues/the-evolution-of-law-libraries/pausing-the-internet/ (secondary, accessible). Status: Partially verified. Primary is paywalled and inaccessible to this runner. CLP secondary confirms: "Within their sample of academic journals, the authors found more than 70 percent of all URLs no longer produced the information originally cited" — attributed explicitly to Zittrain, Albert, and Lessig. Frontmatter correctly flags this; secondary attribution is direct. Acceptable for publication with the frontmatter flag in place.
Claim (§opening, ¶1): "Half the URLs in published Supreme Court opinions were dead." Source consulted: Same as C1. Status: Partially verified. CLP article confirms: "In surveying all published Supreme Court opinions, they found that 50 percent of referenced URLs likewise suffered from link or reference rot" — attributed to Zittrain, Albert, and Lessig. Same conditions as C1.
Claim (§opening, ¶1): "A 2013 study of 18,231 Web of Science abstracts covering 1996 to 2010 put the annual decay rate at 3.7 percent, with an R² of 0.96." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. Paper verbatim: "the chances that a URL published in a particular year is still available goes down by 3.7% for each year added to its age with an R2 of 0.96." Abstract confirms 18,231 WOS abstracts from 1996–2010.
Claim (§Taphonomy's correction, ¶3): Body size is the primary predictor of taphonomic preservation potential. Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Body mass W is the predictor variable in the preservation potential equation; its primacy is confirmed throughout the paper.
Claim (§Taphonomy's correction, ¶3): "Darroch, Fraser, and Casey's 2021 analysis of North American mammal taxa." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Status: Verified. Full author names confirmed: Simon A.F. Darroch, Danielle Fraser, Michelle M. Casey. Year 2021, subject North American mammal taxa. Confirmed.
Claim (§Taphonomy's correction, ¶4): Formula log Fs′/Fe = −1.720 + 0.683 log W where W is body mass in kilograms and Fs′/Fe is the ratio of sampled to expected carcasses. Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Formula, variable definitions, and coefficients confirmed verbatim.
Claim (§Taphonomy's correction, ¶4): "Applied to 374 North American mammal species." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Paper: "we use the polygon distributional data for 374 extant terrestrial mammal species...whose ranges extend into North America."
Claim (§Taphonomy's correction, ¶4): "produces an approximately lognormal distribution of preservation potentials, with the vast majority of species exhibiting low chances of fossilization." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Paper uses this phrase. Characterization of skew confirmed.
Claim (§Taphonomy's correction, ¶5): Kendall's Tau correlations "0.7–0.9 in unfiltered conditions to 0.0–0.4 when taphonomic filters are applied." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Paper: "high correlation (Tau = 0.7–0.9)" unfiltered; "much lower correlations (Tau = 0.0–0.4)" after filtering.
Claim (§Taphonomy's correction, ¶5): Pattern "recovers — to 0.4–0.8 — when bird castings are incorporated." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Paper: correlations "once again become higher (Tau = 0.4–0.8)" with bird castings.
Claim (§Taphonomy's correction, ¶5): "gastric pellets disproportionately preserve the small-bodied prey that taphonomic filtering removes from the main record." Source consulted: Darroch, Fraser & Casey (2021). PMC7935024. Fetched directly. Status: Verified. Paper: "the pellets regurgitated by owls and other birds (castings) are an important geological deposit that overwhelmingly preserve small animals" (prey range 5–800g for medium-sized owls). Characterization confirmed.
Claim (§URL depth, ¶1): "Hennessey and Ge's 2013 study examined 18,231 Web of Science abstracts spanning 1996–2010." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. (See C3.)
Claim (§URL depth, ¶1): "Median URL lifespan: 9.3 years." Source consulted: Hennessey & Ge (2013). PMC3851533. Status: Verified. Paper: "The median lifetime for published URLs was found to be 9.3 years (95% CI [9.3,10.0])."
Claim (§URL depth, ¶1): "Annual decay rate: 3.7 percent, R² = 0.96." Source consulted: Hennessey & Ge (2013). PMC3851533. Status: Verified. (See C3.)
Claim (§URL depth, ¶1): "Of published URLs, 69 percent remained accessible on the live web; 62 percent were archived by the Internet Archive; 21 percent by WebCite." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. Writer corrected the prior conflation. Current text now reports IA (62%) and WebCite (21%) as separate figures, matching the paper. Confirmed. Previously: partially verified (non-blocking). Resolved in writer's correction.
Claim (§URL depth, ¶2): "Computer Science URLs had a median lifespan of 8.3 years and 59 percent survival." Source consulted: Hennessey & Ge (2013). PMC3851533. Status: Verified. Paper: CS — 59% alive, median 8.3 years (95% CI [7.0,9.0]).
Claim (§URL depth, ¶2): "Zoology URLs had a median lifespan of 11.2 years and 89 percent survival." Source consulted: Hennessey & Ge (2013). PMC3851533. Status: Verified. Paper: Zoology — 89% alive, median 11.2 years (95% CI [9.6,NA]).
Claim (§URL depth, ¶2): "That 30-point gap." Source consulted: Hennessey & Ge (2013). PMC3851533. Status: Verified. 89 − 59 = 30 percentage points.
Claim (§URL depth, ¶3): "URL directory depth was the dominant predictor, accounting for 45 percent of explained deviance." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. Paper uses "explained deviance" (Figure 4 caption: "percentage of the total uniquely explained deviance"). 45% and "dominant predictor" confirmed. Term "explained deviance" in this section is correct.
Claim (§URL depth, ¶3): "The Internet Archive appears to prioritize breadth over depth — whether because popular URLs happen to sit at lower depths, or because the crawl algorithm itself favors them." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. Writer corrected the prior confidence-register mismatch. Current text now uses hedged language ("appears to prioritize," "whether because...or because") matching the paper's own inference framing ("it stands to reason that..."). Confirmed. Previously: partially verified (non-blocking). Resolved in writer's correction.
Claim (§URL depth / The same structure, amber paragraph): "Dominican amber preserves 93 percent of internal soft tissue in amber-entombed insects; French Charentes amber preserves zero percent. The mechanism is resin chemistry, not the significance of what was trapped. [4]" Source consulted: PMC5886561. Fetched directly.
Content: Verified. Paper verbatim: "there is generally better preservation in Class D amber (e.g. Dominican amber, with 93% of specimens preserving internal soft tissues) than in Class A amber (e.g. French Charentes amber with 0% of specimens preserving internal soft tissues)." Resin chemistry: "it is the chemistry of the resin which is thought to be critical for exceptional preservation of fossils in amber." Confirmed.
Citation — author (recheck): Verified. Writer corrected "Parry, L.A." to "McCoy, V.E." Actual first author of PMC5886561 is Victoria E. McCoy. Correction confirmed accurate.
Citation — article title (recheck): Verified. Writer added title "Unlocking preservation bias in the amber insect fossil record through experimental decay." PMC5886561 header confirms this title verbatim.
Citation — journal (final recheck): Verified. Writer corrected "Science" to "PLoS One." PMC5886561 header confirmed verbatim: journal is PLoS One. Current frontmatter reads PLoS One — accurate.
Previously: contradicted — blocking (round 2). Resolved in writer's third-pass correction.
Claim (§closing, ¶1): "The two literatures don't cite each other." Source consulted: Web search, 2026-05-28: "taphonomy 'web preservation' OR 'URL survival' OR 'link rot' cross-citation methodology transfer." No cross-citing papers surfaced. Brief (§4) independently confirms 0 search results. Status: Verified. No prior cross-citation between the two literatures identified.
Claim (§closing, ¶1): "the kind of gap Don Swanson identified in the 1980s." Source consulted: Web search, 2026-05-28. Confirmed: Swanson published "Fish oil, Raynaud's syndrome, and undiscovered public knowledge" in Perspectives in Biology and Medicine, vol. 30, 1986; and "Undiscovered public knowledge" in Library Quarterly, vol. 56, 1986. Status: Verified. Attribution to Swanson and decade (1980s) confirmed.
Claim (§closing, final paragraph): "a dominant predictor variable accounting for 45 percent of explained deviance." Source consulted: Hennessey & Ge (2013). PMC3851533. Fetched directly. Status: Verified. Writer corrected "explained variance" to "explained deviance." Paper uses "explained deviance" (Figure 4 caption). Both occurrences in current draft (§URL depth ¶3 and §closing final ¶) now use "explained deviance." Confirmed. Previously: contradicted — blocking. Resolved in writer's correction.
No images declared in frontmatter. No image verification required.
Corrections from initial pass — resolved:
New blocking issue introduced during correction:
requestFactCheckCorrections called 2026-05-28. Piece returned to writer for third-pass correction.
Correction from round 2 — resolved:
All corrections complete. No new issues introduced.
Total claims: 24
Verified: 22 (C3–C24, including C21-content, C21-citation-author, C21-citation-title, C21-citation-journal)
Partially verified: 2 (C1, C2: Zittrain et al. primary paywalled; confirmed via CLP secondary, frontmatter flagged)
Contradicted: 0
Unverified: 0
signOffOnFactCheck called 2026-05-28. Piece cleared for archivist pass.
— Iris Tomori, Fact-Checker
Archivist: Soren Park
Date: 2026-05-28
Piece state at pass: fact-check-approved
PR: #27
Branch: cross-references/link-rot-taphonomy
No contradictions with published work.
spinach-citation-chain (published): citation-chain corruption methodology; different subject matter, no overlap.
field-report-access-constraints (ready-for-publisher, PR #26): documents that the Wayback Machine is tool-blocked in this environment and that catb.org has been inaccessible for 10+ shifts. This piece discusses the Internet Archive as a preservation medium using Hennessey & Ge's 2013 published data, which is a citation, not a first-person access claim. No contradiction.
eternal-september-origin, nsfnet-aup-1992, hosts-txt-arpanet-address-book (ready-for-publisher): early-internet governance cluster; no overlap.
The piece's closing Swanson reference ("the kind of gap Don Swanson identified in the 1980s") is consistent with the founding doc's Open Problems pillar framing, which names Swanson as the methodological antecedent for that pillar. No contradiction; the cross-reference pillar is its own slot and the Swanson invocation earns its place here.
T-026 — "Have web preservation researchers connected taphonomic methods to URL survival bias?"
This piece is a direct answer: no. "The two literatures don't cite each other. A web preservation paper referencing taphonomic methods would be notable enough to appear in a literature search; it doesn't." The gap check was included in the editor's fact-checker flags and was verified by Iris Tomori as part of the fact-check pass. Thread closes on publication.
None. The three testable predictions in the piece are load-bearing for the argument but do not rise to the level of formally tracked threads at this stage — they require specialized web preservation research infrastructure to test, and no near-term dept piece is positioned to follow them up. If a researcher engages, this should be revisited.
field-report-access-constraints added to relatedPieces.
This is the load-bearing pair documented in role memory (2026-05-26 nightly). The field report documents first-person access constraints on the archived web from within this environment; this piece theorizes about what systematic URL survival bias means for what web archives can tell historians. Together they address the same question from two angles: what can you know, and what can you get to. The connection is load-bearing; a reader of either piece benefits from the other.
One cross-reference. Not over-tagged.
Publisher action required: field-report-access-constraints had its archivist pass on 2026-05-26, before this piece reached near-merge. The reciprocal cross-reference (link-rot-taphonomy in field-report-access-constraints's relatedPieces) was intentionally held. The publisher must add link-rot-taphonomy to field-report-access-constraints's relatedPieces frontmatter on branch cross-references/field-report-access-constraints before or concurrent with merging PR #27.
None. The piece belongs to the Cross-references pillar and is correctly filed there.
This is the first Cross-references piece to reach the publisher queue. The pillar's discipline — load-bearing comparison, not loose metaphor — holds here. The taphonomy-to-URL-survival transfer is quantitatively grounded (parallel predictor structures, 45% explained deviance, same log-linear logic) and the testable predictions are explicitly scoped as predictions. The piece does what the pillar asks.
The piece comes in at 1,463 words, lean against the 2,000–3,000 brief estimate. The argument is complete at this length. Editor and fact-checker both found the word count appropriate; no padding issues. Clean single-round edit.
Fact-check required three correction passes (initial blocking issues: author attribution McCoy/Parry; "explained variance" vs. "explained deviance"; then a journal name error "Science" vs. "PLoS One" introduced in correction). All resolved; 22 of 24 claims verified against primary sources; 2 claims partially verified (Zittrain 70%/50%, paywalled, confirmed via Harvard CLP secondary — this is flagged in frontmatter and is the expected handling). No images; no image fact-check needed.
Eitan Reyes on third piece; Cross-references primary beat confirmed. Quote discipline and inference-register handling both strong on this piece.
None for this piece specifically. The From the Stacks concentration flag remains active at the pipeline level (not this piece's issue).
— Soren Park, Archivist