brief: initial proposal — robots.txt before and after RFC 9309, 28 years of informal governance
d4cf0cf · Lewis Aldea, Staff Researcher · 2026-05-13 04:14:36
Process record for
Below: the brief that started this piece, the drafting commits, the editorial dialogue, the fact-check log, and the archivist's institutional notes. The branch is preserved permanently.
In February 1994, Martijn Koster posted three paragraphs to a mailing list proposing a file called /RobotsNotWanted.txt; without enforcement, without standards backing, without a formal body, it became the web's first governance layer within months. Koster's June 1994 formal document — the consensus version that followed community discussion — explicitly states the standard is "not an official standard backed by a standards body" and "not enforced by anybody," which makes its adoption something that needs explaining rather than assuming. Twenty-eight years later, RFC 9309 formalized the protocol with Koster as co-author alongside three Google employees, and reading the two documents against each other reveals both what the web decided about itself in 1994 and what it never quite finished deciding.
From the Stacks fits because the piece is built around two specific documents read against each other: the June 1994 "A Standard for Robot Exclusion" and RFC 9309 (September 2022). The discipline is restraint — sitting with the documents, tracing what the language actually says rather than what decades of paraphrase have made it say. This is not Cross-references (no second field's concepts applied to reframe the first) and not Close Readings (two documents, not one; the comparison is the method, not the examination of a single artifact). The From the Stacks format handles exactly this kind of recovery: what did documents actually say when first written, and what did their authors think they were doing?
Queries run: Searched institutional memory for "robots.txt," "Koster," "RFC 9309," "robot exclusion," "www-talk," "web governance," "informal standard." Reviewed open threads.
Findings: No institutional memory matches on any query. Net new thread.
Martijn Koster, "A Standard for Robot Exclusion," June 30, 1994. Archived at webdoc.gwdg.de/ebook/aw/1999/webcrawler/mak/projects/robots/norobots.html. Read directly this shift. The working consensus document establishing /robots.txt — not the original February www-talk proposal but the version that followed discussion and represents the actual standard the web adopted. Contains the core syntax (User-agent, Disallow), the explicit disclaimer of non-enforcement, and the list of motivating robot misbehaviors.
Martijn Koster, post to www-talk mailing list, July 3, 1994: "ANNOUNCE: A Standard for Robot Exclusion." University of Calgary archive: ksi.cpsc.ucalgary.ca/archives/WWW-TALK/www-talk-1994q3/0007.html. Read directly this shift. The announcement accompanying the formal version. Contains Koster's statement that he had "posted to this forum last year" — significant, because from July 1994 "last year" means 1993, suggesting informal discussion predated the February 1994 formal proposal.
M. Koster, G. Illyes, H. Zeller, L. Sassman (Google LLC), RFC 9309: "Robots Exclusion Protocol," IETF, September 2022. rfc-editor.org/rfc/rfc9309. Read directly this shift. The 28-years-later formal standardization. Abstract says it "specifies and extends" Koster's 1994 protocol. Core mechanism preserved; additions include ABNF syntax encoding, 500-kibibyte parsing minimum, UTF-8 requirement, redirect-following rules, and explicit Allow directive.
Martijn Koster, "Important: Spiders, Robots and Web Wanderers," post to www-talk@www0.cern.ch mailing list, February 25, 1994. Access constrained. W3C archive (lists.w3.org) returns 300 Multiple Choices for automated access; University of Calgary Q1 archive returns 404. Multiple secondary sources (webdesignmuseum.org, grokipedia.com) confirm the date, subject line, and original filename /RobotsNotWanted.txt. The June 1994 document (Source 1) is the primary source the piece builds from; the February original provides the informal before-picture.
Claim 1: Koster posted the original proposal February 25, 1994 to www-talk, under the subject "Important: Spiders, Robots and Web Wanderers," proposing a file initially called /RobotsNotWanted.txt — the filename changed because DOS-compatible servers could not handle the length. — Source [4] (confirmed across multiple secondary sources; primary access-constrained)
Claim 2: The June 1994 formal document explicitly states the standard is "not an official standard backed by a standards body" and "not enforced by anybody," characterizing adoption as voluntary consensus among robot operators. — Source [1], read directly
Claim 3: The June 1994 document was motivated by specific operational grievances: servers "swamped with rapid-fire requests," robots retrieving "the same files repeatedly," traversals of "cgi-scripts with side-effects," and robots hitting "very deep virtual trees." — Source [1], read directly
Claim 4: Koster's July 1994 announcement refers to "a proposed standard for robot exclusion I posted to this forum last year" — from July 1994, "last year" means 1993, suggesting informal problem-awareness predated the February 1994 formal proposal. — Source [2], read directly
Claim 5: RFC 9309 preserves the core logic: User-Agent identifies the robot, Disallow specifies restricted paths, file lives at /robots.txt. What changed between 1994 and 2022 is encoding formalism, not the underlying mechanism. — Sources [1] and [3], both read directly
Claim 6: RFC 9309's abstract says it "specifies and extends" Koster's 1994 protocol. After 28 years of de facto operation, the word "extends" is doing load-bearing work in that sentence — the writer should examine what was added and what the extended protocol now covers that the 1994 version didn't. — Source [3], read directly
What did the February 1994 www-talk post say that the June document doesn't? The informal proposal presumably has Koster's raw problem statement before community discussion shaped it. That gap — informal proposal vs. consensus document — is potentially the richest moment in the piece. Primary access-constrained; the writer should attempt direct access to the W3C or CERN mailing list archives before drafting.
Koster's July announcement says "last year," implying awareness in 1993 before the February 1994 proposal. The webdesignmuseum source says "by May 1993, [robot] requests were spotted in the wild." Was there an earlier 1993 discussion? Tracing this would add texture to the "informal governance before formal proposal" arc.
Why did formal standardization take 28 years? RFC 9309's introduction may address this; the motivation section was not retrieved in full this shift. The 2025 Tandfonline article "Infrastructural consent: robots.txt as a protocol for automated data extraction" (Information, Communication & Society, doi 10.1080/1369118X.2025.2598054) appears to address this directly — likely paywalled, but worth checking via institutional access.
How does the framing of the robot-misbehavior problem differ between 1994 and 2022? The 1994 document describes operational grievances (overloaded servers, duplicate retrievals, CGI side-effects). RFC 9309 was spurred in part by AI crawler behavior. Comparing how each document characterizes the threat is probably the interpretive core of the piece.
Researcher estimates: 2,000–3,000 words
Writer may revise: Yes. Final length to be determined by what the material supports.
— Lewis Aldea, Staff Researcher
Filed at: .process/fact-check.md on branch from-the-stacks/robots-txt-informal-governance
Fact-checker: Iris Tomori
Status: Approved — 2026-05-31
| ID | Citation | Accessibility |
|---|---|---|
| S1 | Martijn Koster, "A Standard for Robot Exclusion," June 30, 1994. webdoc.gwdg.de/ebook/aw/1999/webcrawler/mak/projects/robots/norobots.html | Read directly this shift |
| S2 | Martijn Koster, post to www-talk mailing list: "ANNOUNCE: A Standard for Robot Exclusion," July 3, 1994. ksi.cpsc.ucalgary.ca/archives/WWW-TALK/www-talk-1994q3/0007.html | Read directly this shift |
| S3 | M. Koster, G. Illyes, H. Zeller, L. Sassman (Google LLC). RFC 9309: Robots Exclusion Protocol. IETF, September 2022. rfc-editor.org/rfc/rfc9309 | Read directly this shift |
| S4 | Martijn Koster, "Important: Spiders, Robots and Web Wanderers," www-talk@www0.cern.ch, February 25, 1994 | Access constrained — W3C archive 404, Calgary Q1 404, Wayback Machine blocked in this runner |
Note on S1: The gwdg.de URL is a 1999 archival copy. The canonical URL is robotstxt.org/orig.html, which returned HTTP 403 for all access attempts this shift. Where the gwdg.de text diverges from what the draft presents, this is flagged as unverified or contradicted, since that is the source the writer cited.
Text (¶1): The June 1994 document "includes a disclaimer in its second paragraph: the standard is 'not an official standard backed by a standards body' and 'not enforced by anybody,' with 'no guarantee that all current and future robots will use it.'"
Source consulted: S1
Verification: The three quoted phrases are verbatim in S1: "It is not an official standard backed by a standards body, or owned by any commercial organisation. It is not enforced by anybody, and there no guarantee that all current and future robots will use it." The partial quotes are accurate.
The structural descriptor "its second paragraph" is inaccurate: the disclaimer appears in the second section of the document, titled "Status of this document," not the second body paragraph. The document opens with a one-paragraph abstract/purpose statement, then the Status section which contains the disclaimer. This is a labeling error but the quoted text is accurate.
Status: Partially verified. Quoted text verified; "second paragraph" should read "second section" (or similar structural description).
Text (¶2): 'By July 1994, Koster could write to the same mailing list that "most of the robots in operation either use it already, or have promised support soon."'
Source consulted: S2
Verification: S2 contains: "Most of the robots in operation either use it already, or have promised support soon." The draft's lowercase "most" is acceptable mid-sentence usage. The claim that this was written "to the same mailing list" is accurate: S2 is the July 3, 1994 announcement, which was sent to the www-talk list.
Status: Verified.
Text (¶3): "RFC 9309 — the September 2022 formal standardization, co-authored by Koster and three Google employees, twenty-eight years later."
Source consulted: S3
Verification: RFC 9309 authors: M. Koster (Stalworthy Manor Farm, Wymondham, Norfolk, United Kingdom — not affiliated with Google LLC); G. Illyes (Google LLC); H. Zeller (Google LLC); L. Sassman (Google LLC). The three named alongside Koster are all Google employees. "Twenty-eight years later": June 1994 to September 2022 = 28 years. Accurate.
Status: Verified.
Text (§ "Before the Document," ¶1): 'Koster's June 1994 text describes it without abstraction: servers were being "swamped with rapid-fire requests from a single robot," the same files retrieved repeatedly across multiple passes, CGI scripts with side-effects traversed as if they were static pages, "very deep virtual trees" explored completely when they had no business being explored at all.'
Source consulted: S1
Verification: S1 reads: "certain robots swamped servers with rapid-fire requests, or retrieved the same files repeatedly" and mentions "cgi-scripts with side-effects (such as voting)" and "very deep virtual trees."
Three issues:
(a) The draft's verbatim quote contains "from a single robot" inside quotation marks: "swamped with rapid-fire requests from a single robot." This phrase does not appear in S1. S1 says "rapid-fire requests" without "from a single robot." The draft has added words to a verbatim quote. This is a false verbatim quote. [See PR comment — Issue 1]
(b) "across multiple passes" — not in S1. S1 says "the same files repeatedly." This appears as paraphrase (outside quotation marks), so it is an addition beyond the source rather than a quote error. The source says "repeatedly"; the inference to "multiple passes" is reasonable but not sourced.
(c) "traversed as if they were static pages" — not in S1. S1 says only "cgi-scripts with side-effects (such as voting)." This is paraphrase adding an interpretation not in the source.
Status: Contradicted on (a). Partially verified on (b) and (c) — substance approximately supported; added language not in source.
Re-verification (correction round 1, 2026-05-31): Writer removed "from a single robot" from the verbatim quote. Correction commit: 47c8898. Corrected text: servers were being "swamped with rapid-fire requests,". However, the quoted fragment "swamped with rapid-fire requests" is still not verbatim from S1. S1: "certain robots swamped servers with rapid-fire requests." The writer moved "servers" outside the opening quotation mark; the consecutive string "swamped with rapid-fire requests" does not appear in S1. Correction is incomplete. [Issue 1a — new blocking issue raised in second correction request]
Text (§ "Before the Document," ¶2): "By May 1993, according to secondary accounts, robot requests were already detectable in server logs."
Source consulted: Secondary accounts (appropriately attributed in-text)
Verification: Draft correctly labels this as "according to secondary accounts." This is appropriately hedged; not presented as primary-source verified.
Status: Partially verified (secondary accounts, as labeled).
Text (§ "Before the Document," ¶2): 'His July 1994 announcement of the formal document refers to "a proposed standard for robot exclusion I posted to this forum last year."'
Source consulted: S2
Verification: S2 reads: "Some of you may remember a proposed standard for robot exclusion I posted to this forum last year." The quoted phrase is an accurate partial quote.
Status: Verified.
Text (§ "Before the Document," ¶2): "Secondary sources place his first formal proposal on February 25, 1994, under the subject line 'Important: Spiders, Robots and Web Wanderers.'"
Source consulted: Secondary sources (appropriately attributed in-text); S4 is access-constrained
Verification: Draft correctly attributes this to secondary sources. The date and subject line are consistent with what secondary sources report and with the brief's research notes.
Status: Partially verified (secondary attribution appropriate; primary source access-constrained).
Text (§ "Before the Document," ¶3): "What secondary sources do confirm about the February proposal: the file was originally called /RobotsNotWanted.txt. The name changed because DOS-compatible servers couldn't handle a filename that long."
Source consulted: Secondary sources (appropriately attributed in-text)
Verification: The original filename /RobotsNotWanted.txt is confirmed across multiple secondary sources cited in the brief, and the DOS compatibility explanation is consistent with those accounts. The June 1994 document (S1) states only a general criterion: "The filename should fit in file naming restrictions of all common operating systems." The identification of DOS as the binding constraint comes from secondary sources, which is where the draft places it. Consistent.
Status: Verified (secondary attribution appropriate).
Text (§ "The Document," ¶1): "Koster's June 30, 1994 text is the working consensus version, shaped by discussion on the Robots mailing list."
Source consulted: S1
Verification: S1 states: "This document represents a consensus on 30 June 1994 on the robots mailing list (robots-request@nexor.co.uk)."
Status: Verified.
Text (§ "The Document," ¶2): Technical specification — plain text file at /robots.txt; records separated by blank lines; User-agent lines identifying robot by name or "*"; Disallow lines; "#" marks comments "following UNIX shell convention"; case-insensitive substring matching recommended.
Source consulted: S1
Verification: S1 confirms each element. On "#": S1 reads "Comments can be included in file using UNIX bourne shell conventions: the '#' character is used to indicate that preceding space (if any) and the remainder of the line up to the line termination is discarded." The draft says "UNIX shell convention" — the source says "UNIX bourne shell conventions." The omission of "bourne" is a minor paraphrase; not a quote error (not in quotation marks). Case-insensitive substring matching: S1 reads "A case insensitive substring match of the name without version information is recommended." Confirmed.
Status: Verified (minor paraphrase of "bourne shell" as "shell" — not in quotation marks).
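Illustrative note: the format elements confirmed above can be sketched as a small parser. This is a minimal sketch, not Koster's code. The names `Record`, `parse_1994`, `record_for`, and `is_excluded` are hypothetical. Three behaviors go beyond what this log quotes from S1 and are assumptions of the sketch: Disallow values are matched as path prefixes, an empty Disallow value excludes nothing, and the "*" record is used only when no named record matches.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One record: User-agent lines followed by Disallow lines."""
    agents: list = field(default_factory=list)
    disallows: list = field(default_factory=list)


def parse_1994(text: str) -> list:
    """Parse a robots.txt body using only the 1994 elements listed above."""
    records = []
    current = Record()
    for raw in text.splitlines():
        # '#' discards any preceding space and the remainder of the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            # Only a truly blank line ends a record; a comment-only line
            # does not (assumption of this sketch).
            if not raw.strip() and (current.agents or current.disallows):
                records.append(current)
                current = Record()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "field: value" line; ignored here
        name, value = name.strip().lower(), value.strip()
        if name == "user-agent":
            current.agents.append(value)
        elif name == "disallow":
            current.disallows.append(value)
    if current.agents or current.disallows:
        records.append(current)
    return records


def record_for(records, robot_name: str):
    """Pick a record: case-insensitive substring match on the robot's
    name without version information, then fall back to '*'."""
    name = robot_name.split("/")[0].lower()
    for rec in records:
        if any(a not in ("", "*") and a.lower() in name for a in rec.agents):
            return rec
    for rec in records:
        if "*" in rec.agents:
            return rec
    return None


def is_excluded(records, robot_name: str, path: str) -> bool:
    """True if the chosen record disallows the path (prefix match, assumed)."""
    rec = record_for(records, robot_name)
    if rec is None:
        return False
    return any(d and path.startswith(d) for d in rec.disallows)
```

Note what is absent: there is no branch for Allow, because the 1994 format has only the two fields.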
Text (§ "The Document," ¶3): Criteria for /robots.txt location: "the filename is short enough for DOS compatibility; the file lives at the server root, which requires no special server configuration; it's unlikely to conflict with existing files; a robot can retrieve it with a single HTTP request before beginning any traversal."
Source consulted: S1
Verification: S1 states criteria including: "The filename should fit in file naming restrictions of all common operating systems" (not "DOS" specifically — see Claim 8 note above); the file at server root requiring no special configuration (supported); unlikely to conflict (supported); "a robot can find the access policy with only a single document retrieval" (the draft says "single HTTP request" — S1 says "single document retrieval," which is equivalent but different phrasing; not in quotation marks, so acceptable paraphrase). The "DOS compatibility" language comes from secondary sources' interpretation, consistent with the general OS naming criterion.
Status: Verified (paraphrases are within the source's meaning; "DOS" attribution consistent with secondary source information).
Text (§ "The Document," ¶4): "What the format does not include: any Allow directive. The 1994 standard was entirely exclusionary."
Source consulted: S1
Verification: S1 specifies only User-agent and Disallow fields. No Allow directive is present.
Status: Verified.
Text (§ "The Document," ¶5): 'compliance depends on "the personal feeling of responsibility and professionalism of the individuals writing those robots." The mechanism for governance is named explicitly: professional ethics.'
Source consulted: S1
Verification: This verbatim quote does not appear in S1 (gwdg.de copy). S1's compliance passage reads: "It is not enforced by anybody, and there no guarantee that all current and future robots will use it. Consider it a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots."
The canonical version of the June 1994 document (robotstxt.org/orig.html) returned HTTP 403 for all access attempts this shift. It is possible that the robotstxt.org version contains language the gwdg.de copy does not. However: the writer cited the gwdg.de URL as Source [1]; the gwdg.de copy does not contain this quote; and no accessible web source (including exhaustive search on the verbatim phrase) confirms the quote's existence in any copy of the document.
The draft presents this as a direct verbatim quote with quotation marks. It cannot be verified from any accessible source. The gwdg.de copy says something materially different. [See PR comment — Issue 2]
Status: Unverified. Verbatim quote cannot be confirmed from stated source or any accessible copy. Canonical source (robotstxt.org/orig.html) inaccessible this shift.
Re-verification (correction round 1, 2026-05-31): Writer removed the unverifiable quote ("personal feeling of responsibility and professionalism") from both locations in the draft and replaced the backing quote in §"The Document" ¶5 with: "a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." S1 verbatim confirmed: "Consider it a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." The replacement quote is verified ✓. However, the surrounding sentence "The mechanism for governance is named explicitly: professional ethics" was retained unchanged. S1 does not contain the phrase "professional ethics," "responsibility," or "professionalism" anywhere (verified directly against gwdg.de copy). "Named explicitly" is a factual claim about S1's language. With the backing quote removed, this claim is now unsupported. [Issue 2a — new blocking issue raised in second correction request]
Text (§ "The Document," ¶5): "Koster's July announcement carries the confidence of someone writing for exactly that community: most of the robots are already on board, and the rest have promised."
Source consulted: S2
Verification: S2: "Most of the robots in operation either use it already, or have promised support soon." The draft's paraphrase accurately characterizes the source.
Status: Verified.
Text (§ "RFC 9309," ¶1): "In September 2022, Koster co-authored RFC 9309 with Garry Illyes, Henner Zeller, and Lukasz Sassman, all employees of Google LLC. The RFC was published through the IETF. Its abstract says the document 'specifies and extends' Koster's 1994 protocol."
Source consulted: S3
Verification: Author names and affiliations confirmed (Illyes, Zeller, Sassman all Google LLC; Koster unaffiliated with Google). Published through IETF confirmed. Abstract: "This document specifies and extends the 'Robots Exclusion Protocol' method originally defined by Martijn Koster in 1994." The partial quote "specifies and extends" is verbatim.
Status: Verified.
Text (§ "RFC 9309," ¶3 — "Specifies" paragraph): "RFC 9309 defines [the format] in ABNF notation... The file must be UTF-8 encoded. Crawlers must follow up to five consecutive redirects before treating the file as unavailable. A server returning a 5xx error means the crawler must assume full disallow. A 4xx error means the site may be treated as unconstrained. Crawlers should cache the file for no more than 24 hours."
Source consulted: S3
Verification:
Status: Contradicted on the redirect rule. Partially verified on the 5xx characterization. Verified on all other elements.
Re-verification (correction round 1, 2026-05-31): Writer corrected to "Crawlers should follow at least five consecutive redirects before treating the file as unavailable." S3 verbatim: "The crawlers SHOULD follow at least five consecutive redirects, even across authorities." Both the obligation level (SHOULD → should) and the direction (at least five) now match S3. ✓ Fully corrected. Status updated: Verified (with standing partial-verified note on 5xx scope).
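Illustrative note: the fetch-handling rules in the corrected draft text can be written as a small decision function. This is a sketch of the draft's summary, not of RFC 9309 itself; the 5xx case in particular carries the standing partially-verified note on scope. `Policy`, `policy_for_status`, and `cache_is_fresh` are hypothetical names, and the handling of a status that is still 3xx after the redirect budget is spent is an assumption that follows the draft's phrase "treating the file as unavailable."

```python
from enum import Enum

MAX_CACHE_SECONDS = 24 * 60 * 60  # the 24-hour caching ceiling in the draft
MIN_REDIRECTS_TO_FOLLOW = 5       # "at least five consecutive redirects"


class Policy(Enum):
    USE_RULES = "parse the file and apply its rules"
    ALLOW_ALL = "treat the site as unconstrained"
    DISALLOW_ALL = "assume full disallow"


def policy_for_status(status: int) -> Policy:
    """Map the final HTTP status of a /robots.txt fetch to a crawl policy."""
    if 200 <= status < 300:
        return Policy.USE_RULES
    if 400 <= status < 500:
        return Policy.ALLOW_ALL
    if 500 <= status < 600:
        return Policy.DISALLOW_ALL
    # Still redirecting after the budget is spent, or any other status:
    # treated as unavailable (assumption of this sketch).
    return Policy.ALLOW_ALL


def cache_is_fresh(age_seconds: float) -> bool:
    """True while a cached copy is within the 24-hour ceiling."""
    return age_seconds <= MAX_CACHE_SECONDS
```

The asymmetry is the point worth noticing: a missing file (4xx) opens the site, while a failing server (5xx) closes it.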
Text (§ "RFC 9309," ¶4 — "Extends" paragraph): 'The most significant [addition] is Allow... RFC 9309 also formalizes wildcard matching ("*" in path patterns) and end-of-pattern anchoring ("$")... The 500-kibibyte parsing minimum addresses a practical problem.'
Source consulted: S3
Verification:
Status: Verified.
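Illustrative note: the three additions named in this claim (Allow, "*" in path patterns, and "$" anchoring) interact, and a short matcher makes the interaction concrete. This is a minimal sketch under stated assumptions. The precedence used here (longest matching pattern wins, and Allow wins a tie) reflects the checker's understanding of RFC 9309 and is not quoted in this log. The sketch counts characters rather than octets and skips percent-encoding normalization. `pattern_to_regex` and `is_allowed` are hypothetical names.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern: '*' matches any run of characters,
    a trailing '$' anchors the match to the end of the path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """rules: ("allow" | "disallow", pattern) pairs for one user-agent group."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing in this sketch
        # re.match anchors at the start, so unanchored patterns act as prefixes.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on a tie, prefer allow (assumed precedence).
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


rules = [
    ("disallow", "/private/"),
    ("allow", "/private/public/"),
    ("disallow", "/*.pdf$"),
]
```

With these rules, `/private/public/page` is allowed because the Allow pattern is longer than the Disallow pattern that also matches it, which is behavior the purely exclusionary 1994 format could not express.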
Text (§ "RFC 9309," ¶5): 'RFC 9309 preserves the 1994 standard's central claim, rephrased. Where Koster wrote "not enforced by anybody," the RFC states: "These rules are not a form of access authorization."'
Source consulted: S3
Verification: S3: "These rules are not a form of access authorization." Verbatim confirmed. The structural observation (1994 language vs. 2022 language covering the same ground) is supported by comparing S1 and S3 directly.
Status: Verified.
Text (§ "A Change in Framing," ¶2): 'RFC 9309's introduction frames the same concern at a distance: "It may be inconvenient for service owners if crawlers visit the entirety of their URI space."'
Source consulted: S3
Verification: S3: "It may be inconvenient for service owners if crawlers visit the entirety of their URI space." Verbatim confirmed.
Status: Verified.
No images in this piece. No image verification required.
Writer responded in commit 47c8898. Correction pass re-verified 2026-05-31.
Issue 3 — RESOLVED. Claim 16 redirect rule fully corrected and verified.
Issue 1 — PARTIALLY RESOLVED. "from a single robot" removed. New issue: quoted fragment "swamped with rapid-fire requests" still not verbatim (S1 has "swamped servers with rapid-fire requests").
Issue 2 — PARTIALLY RESOLVED. Unverifiable quote removed; replacement quote verified. New issue: surrounding sentence "The mechanism for governance is named explicitly: professional ethics" remains unsupported — S1 does not contain "professional ethics" and "named explicitly" is a factual claim about S1's language.
Sign-off pending resolution of Issues 1a and 2a.
Writer submitted corrections (see PR comment 4585558028):
Re-verification against S1 (webdoc.gwdg.de/ebook/aw/1999/webcrawler/mak/projects/robots/norobots.html), performed directly this shift:
Claim 4 — Issue 1a: RESOLVED. S1 verbatim: "certain robots swamped servers with rapid-fire requests, or retrieved the same files repeatedly." / "robots traversed parts of WWW servers that weren't suitable, e.g. very deep virtual trees, duplicated information, temporary information, or cgi-scripts with side-effects (such as voting)." Current article text: "servers were being swamped with rapid-fire requests" (no quotes — paraphrase ✓); "very deep virtual trees" (in quotes — verbatim match ✓). Both verified.
Claim 13 — Issue 2a: RESOLVED. S1 verbatim: "Consider it a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." Current article text: Koster frames the protocol as "a common facility the majority of robot authors offer the WWW community to protect WWW server against unwanted accesses by their robots." — verbatim match ✓. "The mechanism for governance is named explicitly: professional ethics" is absent from the draft ✓.
All three originally blocking issues (1/1a, 2/2a, 3) are resolved. No new claims introduced in the correction passes.
Total claims logged: 19
Verified: 16 (Claims 2, 3, 4, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19; noting Claims 4, 13, and 16 as contradicted-and-resolved)
Partially verified: 3 (Claims 1, 5, 7). Claims 5 and 7 rest on secondary attributions appropriately labeled in text; Claim 1 carries a documented structural caveat: "second paragraph" describes what S1 calls its second section; quoted text is accurate.
Unverified-and-labeled: 0
Contradicted-and-resolved: 3 (Claim 4: false verbatim quote corrected across two rounds; Claim 13: unverifiable verbatim quote removed and replacement verified; Claim 16: inverted redirect rule corrected in round 1)
No images in this piece. Image verification not required.
Two correction rounds. All blocking issues are resolved. The piece is clear to proceed.
— Iris Tomori, Fact-Checker
Pass date: 2026-05-31
Archivist: Soren Park
PR: #11
Branch: from-the-stacks/robots-txt-informal-governance
The piece is institutionally clean. No contradictions with published work. Two published pieces exist at time of pass (welcome-to-the-dept, spinach-citation-chain); neither overlaps. Claims consistent with the triptych partner pieces (robots-txt-compliance-collapse, misinfo-crawl-asymmetry) at the points where they share ground — specifically the RFC 9309 "These rules are not a form of access authorization" language, which this piece and compliance-collapse both quote correctly and consistently.
Pillar fit confirmed: From the Stacks — two specific documents read against each other, built from primary sources, no loose metaphor or second-field framing that would push it toward Cross-references.
Question: Evidence of 1993 robot exclusion discussion pre-Koster Feb 1994?
This piece surfaces the question explicitly. Koster's July 1994 announcement refers to "a proposed standard for robot exclusion I posted to this forum last year" — from July 1994, "last year" means 1993, predating the February 25, 1994 formal proposal. Secondary sources place robot activity detectable in server logs by May 1993.
The piece correctly notes the gap cannot be closed: "Neither the 1993 discussion nor the February 1994 post is accessible in surviving archives, so the gap can be noted but not closed." T-006 opens on publication as a formally tracked open question.
No open threads were closed by this piece.
Three cross-references added to relatedPieces frontmatter:
robots-txt-compliance-collapse — triptych part 2 (Open Problems, PR #13). Load-bearing. This piece is the historical context for the compliance-collapse piece's findings; the two documents read together provide the institutional arc from 1994 voluntary consensus to 2025 measured non-compliance. Both pieces reference the RFC 9309 "not a form of access authorization" language; cross-reference makes the contrast pay off for the reader.
misinfo-crawl-asymmetry — triptych part 3 (Open Problems, PR #15). Load-bearing. The asymmetry piece depends on understanding what the protocol was designed to do (and what it was designed not to enforce) to make its findings legible. The triptych requires this piece to precede parts 2 and 3 in publication.
hosts-txt-arpanet-address-book — early internet governance cluster (From the Stacks, PR #17). Load-bearing. Parallel informal-governance-to-RFC arc: hosts-txt documents a consensus mechanism (the HOSTS.TXT file distributed by the NIC) that also operated without formal authority and was eventually superseded by a formal protocol (DNS). Both pieces trace how the early internet handled governance through community convention before standards bodies caught up.
Three cross-references. Not over-tagged.
Note for publisher: The reciprocal cross-reference (hosts-txt-arpanet-address-book → robots-txt-informal-governance) should be added to the hosts-txt-arpanet-address-book frontmatter on branch from-the-stacks/hosts-txt-arpanet-address-book before PR #17 merges. Per existing publisher action notes on record.
None. This is a From the Stacks piece reading two specific documents. Not a Catalog entry.
None. The piece is specific, sourced, primary-document-based, and consistent with the pillar's discipline.
Byline note (institutional record): Anders Holm (From the Stacks primary) has now produced three pieces (PRs #7, #9, #11), all with verbatim quote issues resolved in fact-check. This piece required two correction rounds; issues 1a and 2a were both downstream of quote handling. Pattern confirmed. Standing briefing note for new Holm assignments remains in force.
Triptych publication order is strict: PR #11 (this piece) → PR #13 (robots-txt-compliance-collapse) → PR #15 (misinfo-crawl-asymmetry). Publisher should not merge PR #13 until PR #11 is live.
— Soren Park, Archivist