Larry Masinter for W3C TAG, 12/17/2011, draft for discussion, intended to become one or more TAG findings.
This document is circulated as part of TAG
History [ACTION-241] Review TAG versioning situation and report back to TAG and HTML
Please discuss this document on firstname.lastname@example.org (archived).
(This document covers a lot of ground, and the above actions envision taking them on a small piece at a time; this represents a new strategy of Write first, modularize later.)
This document discusses evolution of the web, and makes (will make, when completed) recomendations for best practices around managing evolution. In particular, it recommends practices for the use of MIME and MIME types in the Web, the establishment and use of registries, and managing references among technical specifications which require stability in the face of evolution of other components.
The value of the Internet and the Web is global communication among unrelated parties. Different implementations need to agree on the protocols, languages, and protocol elements in their communication for them to interoperate. Unmanaged evolution results in diminishing the interoperability of components, a cost to all which must be weighed against the benefits of evolution.
There are a number of issues that need to be addressed to help achieve the goal of careful evolution and global interoperability in an evolving world. Those recommendations include attention to the way in which standards allow for extensibility by adding values and new meaning to the protocol elements used within them, guidelines for establishing and using registries, and a model for evolution in a way that a standards organization can lead in the managed evolution of the technology available to its implementations.
In a number of cases, terms used with a particular (narrow) definition within this document are used in bold face. Editorial comments are in parentheses and in italic. (Perhaps a subsequent edition will turn bold face into hyperlinks to the definition of this term within this document.)
It is useful to consider evolution of different aspects of how the Web works and evolves, and to be precise in the terminology when discussing evolution. This section primarily establishes a framework for these aspects, through careful use of terminology.
There are relationships between these, for example:
Evolution is a process where all of these aspects change over time, to adapt to new requirements, add new features, fix problems that arise as circumstances, applications, and user needs change.
Managing evolution of standards in a world where there are multiple implementations involves coordinating the evolution of these different aspects:
The process by which specifications become standards involves coordination between multiple parties, and significant review. Various means are used to assist the evolution of implementations while maintaining interoperability, without being unduly held back by the standards process.
Extensibility and evolution can be both positive and negative. [IAB-extension]. While extensibility allows innovation and deployment of new features, there is always a risk of unintended consequences, such as interoperability problems or security vulnerabilities. This risk is especially high if the extension is performed by a different team than the original designers, who may stray outside implicit design constraints or assumptions. Extensions should be done carefully and with a full understanding of the base protocol or language, existing implementations, and current operational practice.
(This section is intended to lay out the design choices for whether to use a registry or a URI or some other means of associating values with symbols/names/codes/values. Is there a better word than 'identifier'?)
One way a standard can facilitate evolution is to allow for extensibility in some of the protocol elements it uses, through the ability to add new values for those protocol elements that were not part of the protocol or language at the time the specification was written. Some protocol elements used in a language or protocol may allow values which act as identifiers: determining the meaning those values requires information which is specified independently. In some cases, an identifier might come from a fixed set of values (e.g., identifiers for the days of the week or months of the year). But in many cases, evolution and extensibility are accomplished by changing the meaning of existing values or adding new values and associating meaning with those values. The Web uses many protocol elements which are identifiers; for example, character entities in HTML, content-types, uri schemes, color names, host names, HTTP headers. (see Appendix A for a more complete list.)
Implementations evolve by changing or extending the behavior of the implementation when identifiers are encountered, most commonly by adding new values.
There are a variety of ways for allocating identifiers and associating meaning with them:
(This section is intended to discuss the considerations when designing a protocol element which uses identifiers and the way those are managed. The basic problem is how to do this so that the specification and implementation and language evolution are kept in sync while not making old conforming implementations non conforming, etc. (still very sketchy). [IAB-extension] has a discussion of costs & benefits, but it tries to separate 'routine' and 'major' extension categories, based on the impact adding a new identifier has on the base protocol/language)
What are the processes for designing and deploying extensions?
Some extensibility points have requirements that are not obvious or well-documented or well-understood, and could affect proper functioning in some way... if so, a process that has some qualification of whether it has passed meaningful review, whether someone other than the inventor of the registry item can update its specification. Lower cost of evolution, Preserve Interoperability, Matching reality, allow for private extensions, give implementers guidance about what is actually needed to be interoperable with other deployed systems, allow discovery of what is meaningful and important, insure the information is timely, doesn't go out of date, disappear, make sure that it is stable and evolvable at the same time. Manage transition from experimental to stable to standard, make sure the process for extending a standard needs to have similar characteristics as the standard itself, in terms of "fair" and "transparent", make sure the registry is as long-lived as the specifications that use it, avoid problems of trademark, spam, denial of service, ..., and Identifier length (Some protocols and languages are sensitive to the space usage and compressibility of long strings used as identifiers; in such cases, identifier length is a consideration)
(this section is intended to give a broad evaluation of the categories against the evaluation criteria; also still an outline of notes ...)
In Specification: Low cost of implementation extension, higher cost of specification update, fairness depends on same standards process as anything else, long lifetime, transition from implementation to specification is painful but that's what standards are about.
Registry: (see Section Registries below for 'best pratices' when this is the right choice). Cost of setting up registry, managing it, expert review, benefit of avoiding interactions, fairness issues, trademark & spam. Allows using numbers and meaningless values to avoid trademark spam, and difficulties of internationalization.
Using URIs: Example: RDF. Meaning is discovered by httpRange14. Low cost (no registration process, might require maintaining URI. Very timely. Transition unnecessary. Lifetime up to lifetime of URI. Very fair. Hard to misuse because no registry. Preferred method, modulo longevity of URIs. Note that URN allows naming a registry as a URI.
Vendor Prefix: (example from CSS. Transition path difficulties outlined in [ref]
URI-named namespace: XML namespaces, RDF (?).
(Some things the TAG might 'find'? These are useless if nobody believes in them.):
Finding.DefineExtensibility: Extensibility and evolution must be planned and provided for in specifications that become standards. Standards that use identifiers should also specify the expected behavior of compliant implementations when confronted with unrecognized identifiers; for example, to distinguish between "must understand" and "must ignore" for unrecognized identifiers. Without also constraining implementation behavior, the fact that the specification might be extensible will not translate into an effective way of allowing implementations to evolve.
Finding.AvoidRegistries: (originally I drafted 'avoid registries'. Certainly non-IANA registries have a problem with the long-term viability and control of the registry over, say, a 20 year period. Avoiding registries are using IANA seem preferable for the long term. In the short-term, using a Wiki seems like it's ok? Whether or not you use a registry with some gatekeeping and review may depend on the cost of extensibility... if it's low, then use a URI or vendor prefix. if it's high, use In Specification. In the middle, use a Registry with review. )
In the case where a registry is used for maintaining a list of identifiers and their meaning or pointers to their specifications, there are some common practices that will enhance the reliability and interoperability of the Web while allowing rapid evolution.
The Internet Assigned Numbers Authority (IANA)[IANADEF] is the primary organization whose charter and purpose is to maintain registries of values needed for Internet protocols and languages as specified in the IETF.[ref BCP from which this was quoted]. IANA administers the registries for many protocol elements in the core of the Web, including URI scheme names, Internet media type identifiers ("MIME types"), HTTP protocol header values, HTTP result codes, names of character sets. (See Appendix A.)
(The idea is to extend the very brief analysis in the previous section to a more complete discussion of what makes a registry work or not work. So far this is just notes, very incomplete.)
Lower cost of evolution, Preserve Interoperability, Matching reality, allow for private extensions, give implementers guidance about what is actually needed to be interoperable with other deployed systems, allow discovery of what is meaningful and important, insure the information is timely, doesn't go out of date, disappear, make sure that it is stable and evolvable at the same time. Manage transition from experimental to stable to standard, make sure the process for extending a standard needs to have similar characteristics as the standard itself, in terms of "fair" and "transparent", make sure the registry is as long-lived as the specifications that use it, avoid problems of trademark, spam, denial of service, ..., and Identifier length (Some protocols and languages are sensitive to the space usage and compressibility of long strings used as identifiers; in such cases, identifier length is a consideration)Update: A registry has a specific update policy. Matching reality: Registries tend to go out of step with reality unless costs of registration or registry update are low and benefits are high to at least one of the parties authorized to make a registration or update. (See "Matching Reality" below). Discovery: Manual discovery is hindered by many alternative places to find a registry, and the possibility of alternative locations (Wikipedia for MIME types, for example.) Timeliness: In particular for registries, there is a tension between establishing a long-lived extensibility point meaning, between cementing the value too soon (before consensus is reached) and too late (after widespread deployment), especially when those overlap (extensions are widely deployed before consensus is reached). The policy needs to move toward "registration before deployment" independent of where in the standards cycle that holds. If the standard differs from early deployment, the registry should be updated to point to not only the standard but also the facts for what one might encounter "in the wild".
Transition: If registries encode status in registered names (as the MIME registry does), transition and grandfathering are issues. Lifetime: The documents pointed to by IANA registry are not as long-lived as the registry itself, and much of the information is obsolete. See "Registry Stability" below.
Fairness: IANA is capable of administering a "fair" process with a reasonable dispute reslution mechanism, if those are specified at the time the registry is established. Wikis and other methods for maintaining a registry have more (or at least different) potential for abuse.
The IETF best current practice specification [BCP 26][RFC 5226] gives guidelines to protocol designers for establishing the registry rules associated with an IANA registry. Note that IANA acts as the operator of each registry, but itself does not evalute registry requests, but merely adminmisters a process by which the organization or individuals authorized to review or approve registry entries are accepted. These guidelines apply to IANA namespaces established or requested by W3C working groups or task forces.
(want to summarize the HappIana problem space)
In some cases, implementations have evolved and the registries have not followed: the registries have not tracked the use of identifiers.. In some cases, the registry process is percieved as a bottleneck. If there is a registry, it is only useful if values are registered. A registry which does not match actual use (as is currently the case with URI schemes, Media Types) is not very useful.
Over time, divergence of meaning of identifiers used in the same protocol element but in different languages or protocols is harmful to interoperability. We need some way of creating a positive force so that, over the long run, divergence is reduced.
(try to address willful violations) Technical specifications that wish to override an existing registry for some values and use it for another should (a) attempt to correct the extensiting registry; in cases where it cannot, (b) a new "override" registry should be established with new values, where the spec points to the new registry. (????)
(see section References below)
Often, a registry will not contain the complete definition needed to understand the meaning of an identifier, and contains a pointer to a specification or specification series. For example, the Internet Media Type registry defining file formats and languages often contains a pointer to a specification. In those cases, the registry entry for an identifier might be updated, or the registry value itself established in a way that notes the possible evolution of the specification indicated.
Forking is a situation where there are multiple specifications or specification series which are associated with the same identifier used in a single protocol element.
An identifier in a registry which identifies a protocol or language may contain a pointer to a specification, but the use of that identifier in an implementation needs to identify the language or protocol intended, even when those have evolved beyond what the specification referenced has described.
(can we make recommendations for references in registries consistent with recommendations for references in specifications and standards?)
Registry values and the specifications they point to typically go through a life-cycle, where a parameter is introduced experimentally, deployed in a limited or vendor-specific context, and then adopted more broadly.
Frequently, groups with registries or registered values attempt to convey status of a registered value in the name chosen within the registry, e.g., using an "x-" prefix for experimental names, "vnd." prefixes in internet media types, etc. In practice, these conventions are failures, counter-productive, because there is no simple deployment path when status changes, e.g., vendor proposed extension become public standards, experiments succeed, etc.
Extensibility requires review against criteria, but some identifiers haven't been reviewed and are unsafe. Registries at least give you the option of noting the review status of the proposed extension as a warning to implementors, if it is an extensibility point that requires careful review.
(this section is left here for the cases where a W3C spec needs a registry and IANA and its processes don't fit the needs... it's intended to be specific to W3C,)
How to encourage W3C staff & working group participants to manage the registration information, update the chair document on establishing and managing registries and extensibility points. Other registrations have their own administrative procedure. A regular "have obligations related to registration been met" check into the W3C document publication/advancement procedure.
The following conclusions reached:
In the web, two important protocol elements whose values are identifiers come from MIME (the Multipurpose Internet Mail Exchange set of specifications): the "Internet Media Type"and the "charset".
The contexts of email and web are sufficiently different that some of the requirements for email registration and web registration, as well as practices in the deployment of implementations of agents that use MIME, have led to some mismatch between desired properties of the Internet Media Type protocol element.
The typical use pattern of email is that the transmission of data is unanticipated, and often between parties where the sender has no knowledge of the capabilities of the recipient. The typical use pattern of the web is that data is requested explicitly, and often much is known about the requirements and expected content for a retrieval.
Often, it is quite possible, with relatively high accuracy, to determine the language of data by examining the data itself; in some cases in the web (retrieval by ftp or file system access), there is no independent channel for communicating content-type: other indicators and sniffing.
file extensions: A common practice in many systems was to use the end of the name of a file in the file system (the "file extension" ) as identifying the type of file. This practice has now extended to other systems.
sniffing: in many contexts, language can be guessed by looking for some unique string, number or pattern, which only appears in files of that language. In circumstances where this was a unique number, it was called a "magic number", although this concept has been extended to other textual patterns. In some cases, sniffing will be employed to override a (syntactically correct) content-type label, because of previous experience with mis-labeled content.
Information about these other ways of determining language were gathered for the Internet Media Type registry; those registering types are encouraged to also describe 'magic numbers', Mac file type, and common file extensions. However, since there was no formal use of that information, the quality of that information in the Internet Media Type registry is haphazard.
In some applications, implementations of some languages and protocols have interpreted identifiers in ways inconsistent with their registry entries, to the point where the specifications of those languages have neeeded to provide for "willful violations" of the registry entries (which cannot change if they are used differently by other languages and protocols which use the same protocol element.)
(this section is intended to summarize the problems with the MIME registry because fixing it and keeping it up to date will be a big job, if that's what we want to do).
Internet Media Types suffered from "poor registration performance":
(this section may belong somewhere else, but it's a use case for finer granularity descriptions of languages and file formats, and possibly some additional information not related to language at all....)
When two parties communicate and share more than one language (or version of a language) they might use for communication, the idea of "content negotiation" involves an exchange protocol where a choice among these methods is made. content negotiation is common: fax machine twirps to each other when initially connecting, negotiating resolution, compression methods and so forth. In Internet mail, "content negotiation" consists of the sender preparing and sending multiple alternative messages, and including them all in the same message.
For example, HTML email also often contains alternatives in plain text, labeled with content-type of "text/html" and "text/plain" respectively. In HTTP, a request might include "Accept" or "Accept-Charset" parameters which allow the responding web server to match the language of the response to the capabilities of the client. The "User-Agent" parameter in a HTTP request is often used for this purpose.
Content negotiation based on "Accept" and content-type has only been successful in limited contexts. Negotiating the content-type (e.g. HTML vs. Word vs. PDF) doesn't really happen: people want to make an explicit choice of downloading an MS Office or PDF depending on the goals they have that moment, instead of letting software pick a format for them. Negotiation of HTML vs. XHTML happens but is rare in the big picture and rarely offers true value to users.
(this section might also belong somewhere else, but it's one of the problems with MIME types and sniffing, that the same content might properly be considered to be in two different languages)
There are some interesting cases where the same content can be viewed as being in multiple languages:
(Additional considerations... MIME assumes the sets of language uses are partitioned. PNG and its use in fireworks. Google re-using JPEG for new Google image format.)
The Web added the notion of being able to address part of a content and not the whole content by adding a 'fragment identifier' to the URL that addressed the data. Of course, this originally made sense for the original Web with just HTML, but how would it apply to other content. The URL spec glibly noted that "the definition of the fragment identifier meaning depends on the Internet Media Type", but unfortunately, few of the Internet Media Type definitions included this information, and practices diverged greatly.
If the interpretation of fragment identifiers depends on the MIME type, though, this really crimps the style of using fragment identifiers differently if content negotiation is wanted.
If the Internet Media Type registry is more explicit about which kinds of content contain what kind of scriptability access, then the specifications for sniffing can reference the Internet Media Type registry to determine what kinds of sniffing constitute a 'privelege upgrade'.
Note that all sniffing can be a priviledge upgrade, if there is a buggy recipient, although bugs can be fixed, but spec violations are a problem.
(This section is still open as events are happening outside of the TAG, this is just an outline still).
(This section is derived from [SpecUpdate]. )
Specifications and registry entries frequently include references to other specifications. Sometimes those references are intended to reference a specific version or edition of the specification; in other cases, the reference is intended to allow for update and evolution.
Specifications evolve as new editions of them are written. Standards evolve as groups agree that a particular edition of a specification represents their common agreement. Specifications may have versions, editions, and forks.
Often, one specification will reference another specification. In some cases the reference is given informally, in others it the reference consists of a URI which identifies the specification.
Discuss status, editions, versions, standards, specifications, in IETF, W3C, other organizations. IETF has internet-draft, RFC. RFC can be Informational, Standards track, Experimental. W3C has editor's draft, working draft, proposed, candidate, rec. Explain W3C spec.
Discuss considerations for what you might want to optimize: evolution, stability.
Discuss need for clarity against goals
(following text from HT and CMSMQ needs review -- do we really want to impose this in MIME registrations? Goes against the registry rules of "put MIME reg in spec and update it there too".)
When citing a W3C specification in the normative references section of another specification or a registry care should be taken to be clear about the status of editions of the referenced specification other than the then-current one.
Left-Handed Sewer Flutes 1.0 (Second edition), P.D.Q. Bach and Peter Schickele, Editors. World Wide Consortium, 29 February 2009. The edition cited (http://www.w3.org/TR/2009/REC-lhsf-20090229/) is the earliest appropriate for use with this specification. Conformant implementations may follow the edition cited and/or any later edition(s). The latest edition of LHSF 1.0 is available at http://www.w3.org/TR/lhsf/. It is implementation-defined which editions of LHSF 1.0 are supported.
The appropriateness of this approach is based on the W3C rules regarding what constitutes an acceptable new edition of an existing W3C Recommendation. [cite]
For references to publications from other standards bodies with similar expectations regarding backwards compatibility, for example IETF or ISO, a similar approach to citation is also called for, along the following lines:
The Extension of MIME Content-Types to a New Medium, N. Borenstein and M. Linimon. Internet Engineering Task Force RFC 1437, 1 April 1993. RFC 1437 was current at the date of publication of this specification, but may be updated or obsoleted by later RFCs. Conformant implementations may follow the RFC cited and/or any later RFCs which update or obsolete it. It is implementation-defined which RFCs are supported. Intelligent transport systems -- Physical characterisation of vehicles and equipment -- International airline seat pitch measurements. Part 1: Measurement architecture. International Standard ISO 314159-1:2009, 29 February 2009. The referenced specification may from time to time be amended, replaced by a new edition, or expanded by the addition of new parts. See http://www.iso.org/iso/home.htm for up-to-date information. Conformant implementations may follow the edition cited and/or any amendments etc. It is implementation-defined which amendments etc. are supported.
Or a general caveat:
Dated references below are to the earliest known or appropriate edition of the referenced work. The referenced works may be subject to revision, and conformant implementations may follow, and are encouraged to investigate the appropriateness of following, some or all more recent editions or replacements of the works cited. It is in each case implementation-defined which editions are supported.
and then simply
Left-Handed Sewer Flutes 1.0 (Second edition), P.D.Q. Bach and Peter Schickele, Editors. World Wide Consortium, 29 February 2009 (http://www.w3.org/TR/2009/REC-lhsf-20090229/). The latest edition of LHSF 1.0 is available at http://www.w3.org/TR/lhsf/.
The Extension of MIME Content-Types to a New Medium, N. Borenstein and M. Linimon. Internet Engineering Task Force RFC 1437, 1 April 1993.
Intelligent transport systems -- Physical characterisation of vehicles and equipment -- International airline seat pitch measurements. Part 1: Measurement architecture. International Standard ISO 314159-1:2009, 29 February 2009. See http://www.iso.org/iso/home.htm for up-to-date information.
(keep this section up to date? )
[BCP26]: Guidelines for Writing an IANA Considerations Section in RFCs, BCP 26, RFC ...
[IABext] Design Considerations for Protocol Extensions work in progress, Internet Draft
[Friendly] Friendly Registries, work in progress, Wiki Page, requirements and a place to gather explicit proposals
[MediaTypeFinding] Internet Media Type registration, consistency of use TAG Finding 3 June 2002 (Revised 4 September 2002)
[MIMEGuidelines] Register an Internet Media Type for a W3C Spec (W3C guidelines on registering types)
[MediaRegUpdate] Media Type Specifications and Registration Procedures, Intenet Draft, work in progress
[NoX] X- parameters harmful (Peter St. Andre)
[SpecUpdate] Best Practice for Referring to Specifications Which May Update [email draft, H. Thompson, C.M. Sperberg-McQueen]
|[HTML5-charset]||Hickson, I., “HTML5: A vocabulary and associated APIs for HTML and XHTML (18.104.22.168 Determining the character encoding).”|
|[RFC1521]||Borenstein, N. and N. Freed, “MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies,” RFC 1521.|
|[RFC1522]||Moore, K., “MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text,” RFC 1522, September 1993.|
Comments welcome. ht  http://www.w3.org/2001/tag/2009/09/23-minutes#item03  http://www.w3.org/2001/tag/group/track/actions/303
text/html MIME type RFC
[IAB-extension] Design Considerations for Protocol Extensions, B. Carpenter, B. Aboba, S. Cheshire, Internet Architecture Board (Internet Draft, work in progress).
[IAB-success] What Makes for a Successful Protocol?, RFC 5218, D. Thaler, B. Aboba, Internet Architecture Board, July 2008.
[IETF-extensions] Procedures for Protocol Extensions and Variations, RFC 4775, BCP 125, S. Bradner, B. Carpenter, T. Narten, December 2006.
[tag-versioning] Review of TAG work on versioning (email Larry Masinter 4 2009)
((The idea for this appendix is to make a list of the identifiers used in web protocols and languages and their identifier method, including, for registries, the name and location and properties of the registry used. If possible, it would be great to note the versioning and extensibility process is (a) specified (b) used in practice.))
(Some things to talk about more or in related specs -- this whole section should go away or be turned into independent specs.)
Discussions: forking: two specs, different languages, same Internet Media Type, what goes in the registry? Is this covered?
Evolution toward normalization -- how to get registries to match reality over time
URI schemes identify protocols (do they?)
Precision of specifications vs. range of allowable implementations (google+ discussion w/hixie)
Normative requirements in specification relationship tobehavior of implementations
Roles within a specification (reader, writer, client, server, proxy), Balance of power between implementors of different roles
Agents and user agents
Is there more about "living standard" ?
"follow-your-nose" for using URIs vs. registries? Or else convincing registries to have stable URIs for registered values.
Reasons for a "registry":
For example, some protocol designers thought a new URI scheme could cause a lot of extra work. For HTML tags, when you introduce a new section, everyone needs to understand that who implements browsers.
But if you add metadata, it's no skin off anyone's nose. so you have 2 situations - one on which you need whole community to get involved and one in which anyone besides a sub-community can ignore.
(http://lists.w3.org/Archives/Public/www-tag/2011Dec/0049.html) Roy Fielding as calling mustUnderstand-based approaches "socially reprehensible" we need a decision tree - questions to answer to understand what kind of extension you're doing and which of these techniques you should use
Compound extensibility points: when a new version of an exensibility point defines a new context in which old extensibility points are interpreted. (This is "willful violation" territory, if not also "sniffing" territory). see discussion following http://lists.w3.org/Archives/Public/www-archive/2011Nov/0009.html
User-Agent is a protocol element which identifies implementations and their versions
Extensibility through Modularity ... instead of one big spec you have multiple specs so that the individual parts can evolve without having to review revisions the whole thing... Good if you manage cross-references and make sure the modules are aware of requirements.
Implementations, Roles, Conformance, and Evolution: A specification describes a protocol, language, protocol element, and rules for implementations of the specifications are to behave.
This section would talk about forward, bckward compatibility and requirements... Some times new evolutions are with old ones, sometimes not. Compatibility has many meanings: for example, for a language, it is desirable if new documents work reasonably (with some fallback) with old readers and that old documents work reasonably with new readers. The meaning of "reasonably" varies: "works reasonably" is softened to "either works reasonably or gives clear warning about nature of problem (version mismatch)."
Languages don't have versions:: specifications have versions. most languages used on the web don't have versions, in that most implementations of readers of the language are written to try to adapt whatever data they get to they get to whatever the implementors believe is the best they can do to satisfy user's expectations, as well or better than any other implementation, subject to the internal constraints and architecture of the implementation. In these situations, where features are implemented incrementally and are not orthogonal extensions, using a version indicator to distinguish author's intent is unacceptable. The version indicator at best gives you some (but not a very good) idea of who to blame.
Implementations have versions: Implementations have versions, and, in particular, what authors of content might want to know (or select among) is what set of language or protocol features (or versions of those features) are supported (correctly) by the receiving implementations. This leads to doing content-negotiation based on User-agent.
(this section would talk about how a protocol has "roles": client, server, proxy, user agent; specifications describe a language used by many of the roles, and a protocol between muttiple parties, each of which has a role in a particular transaction. Discuss the relationship between strict and loose conformance requirements; specs intended for multiple roles but reviewed only by one, difference between "ease of implementation" vs. "breadth of allowable implementations". http://intertwingly.net/blog/2009/04/08/HTML-Reunification
This section would draw analogies and distinctions between how formal and natural languages evolve (quite a literature on this).
Here lie some past controversies
The HTML version indicator (The HTML doctype)