<?xml version="1.0"?>
<!-- <?rfc strict='yes'?>
     complains about too long lines (2 cases)
     and appendix, but otherwise is okay
-->
<?xml-stylesheet type='text/css' href='rfc2629.css' ?>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc symrefs='yes'?>
<?rfc sortrefs='yes'?>
<?rfc iprnotified="no" ?>
<?rfc toc='yes'?>
<?rfc compact='yes'?>
<?rfc subcompact='no'?>

<!DOCTYPE rfc SYSTEM "rfc2629.dtd">
<rfc ipr="trust200811" docName="draft-masinter-dated-uri-06bis1" category="info">
  <front>
    <title abbrev="The tdb URI scheme">The "tdb" URI scheme: denoting described resources</title>
    <author initials="L." surname="Masinter" fullname="Larry Masinter">
      <organization>Adobe</organization>
      <address>
        <postal>
          <street>345 Park Ave</street>
          <city>San Jose</city><region>CA</region>
          <code>95110</code>
          <country>US</country>
        </postal>
        <phone>+1 408 536 3024</phone>
        <email>LMM@acm.org</email>
        <uri>http://larry.masinter.net</uri>
      </address>
    </author>
    <date day="12" month="July" year="2009" />
    <area>Applications</area>
    
    <abstract>
    <t> This document defines a URI scheme, "tdb" (standing for
   "Thing Described By").  It provides a semantic hook that allows
   anyone at any time to mint a URI for anything that they
   can describe. Such URIs may include a timestamp to
   fix the description at a given date or time.
  </t>
  <t> This URI scheme may reduce the need to
  define new URN namespaces merely for the purpose of creating stable
  identifiers. In addition, "tdb" URIs provide a ready means for identifying
  "non-information resources" by semantic indirection -- a way
  of creating a URI for anything.
</t>
    </abstract>
    <note title="Note">
 <t> This document is not a product of any working group. Many of
  the ideas here have been discussed since 2001. This document has
  been discussed on the mailing list &lt;uri@w3.org&gt;. Previous
  versions have couched "tdb" as a URN namespace, and included
  a "duri" scheme which only timestamped without indirection.
  It was originally written as a thought experiment
  in resolving the use/mention problem in semantic
  web applications, but may have other uses.</t>
    </note>
  </front>
  <middle>
    
<section anchor="overview" title="Overview and Requirements">

<t>  The tdb URI scheme here solves several related problems:
</t>
<section title="Easy assignment of permanent identifiers">
<t>
  The URN specification <xref target="RFC1737" /> allows for many
  URN namespaces, and many have been registered. However, obtaining an
  appropriate URN in any of the currently defined URN namespaces may
  be difficult: a number of URN namespace registrations have been
  accompanied by comments that no other URN namespace was available
  for the class of documents for which identifiers were wanted.
</t>
</section>

<section title="Persistent identifiers">

<t>  <xref target="RFC1737" />
 defines several requirements for Uniform Resource
  Names. In particular, it requires "persistence":
  <list style="empty">
   <t>
     Persistence: It is intended that the lifetime of a URN be
     permanent.  That is, the URN will be globally unique forever, and
     may well be used as a reference to a resource well beyond the
     lifetime of the resource it identifies or of any naming authority
     involved in the assignment of its name.
  </t>
  </list>
 </t>
  <t>  Many people have wondered how to create globally unique and
  persistent identifiers. There are a number of URI schemes and URN
  namespaces already registered. However, an absolute guarantee of
  both uniqueness and persistence is very difficult.
  </t><t>
  In some cases, the guarantee of persistence comes through a promise
  of good management practice, such as is encouraged in <xref 
  target="COOL">"Cool URIs
  don't change"</xref>.  However, relying on a promise of good
  management practice is not the same as having a design that
  guarantees reliability independent of actual administrative
  practice.
  </t><t>
  A primary design goal for URIs is that they are intended to mean the
  same thing, no matter in what context they appear: a "Uniform" way
  to Identify a Resource. However, even when URIs have Uniform meaning
  from the point of view of the source of the reference, they don't
  guarantee stability over time. Despite best efforts and intentions,
  identifying information can change in unpredictable ways: domain
  names can disappear or be reassigned, and name-assigning organizations
  can change structure or responsibility, merge, or disappear.
  </t>
  <t>
  The interpretation of many URNs depends significantly on
  the concept of a "naming authority". The authority is presumably
  some individual or organization that both ensures uniqueness of
  assignment and helps with understanding the meaning of the
  link between the name and the named.
  </t><t>
  However, authorities, whether individuals or organizations, have a
  lifetime, and must be consulted at some point to understand the
  bindings. The functioning of names as unique identifiers and holders
  of meaning depends on having a reliable infrastructure for consulting
  the authority or the authority's records to determine the thing
  referenced.
  </t>
</section>
<section title="URIs for abstractions">
<t>

  The URI specification <xref target="RFC3986" /> describes a
  range for "resource" that is quite broad:
<list style="empty">
 <t>
     This specification does not limit the scope of what might be a
      resource; rather, the term "resource" is used in a general sense
      for whatever might be identified by a URI.  Familiar examples
      include an electronic document, an image, a source of information
      with a consistent purpose (e.g., "today's weather report for Los
      Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a
      collection of other resources.
   A resource is not necessarily
      accessible via the Internet; e.g., human beings, corporations, and
      bound books in a library can also be resources.  Likewise,
      abstract concepts can be resources, such as the operators and
      operands of a mathematical equation, the types of a relationship
      (e.g., "parent" or "employee"), or numeric values (e.g., zero,
      one, and infinity).

 </t>
</list>
</t>
<t>
  One might use a "mailto:" URI containing an email address to identify
  a person, or an "http:" URI to identify an abstract concept.
  However, this leaves the question of how one might identify, within
  the same context, both the system mailbox and the person to whom
  it is assigned, or the web page at an "http:" URI and the
  concept it describes. The "tdb" URI scheme allows ready assignment
  of URIs for abstractions that are distinguished from the media
  content that describes them.
</t><t>
  The goal, then, of the "tdb" URI scheme is to
  provide a mechanism which is, at the same time:
<list style="hanging">
<t hangText="permanent:"> The identity of the resource identified
      is not subject to reinterpretation over time.
</t>
<t hangText="explicitly bound:"> The mechanism by which the identified
      resource can be determined is explicitly included in
      the URI.
</t>
<t hangText="useful for non-networked items:">
 Allows identification of resources outside the network:
      people, organizations, abstract concepts.
</t>
<t hangText="no administration:">
 The mechanism does not depend on reliable administrative processes
      of authorities for either assignment or interpretation.
</t>
</list>
</t>

</section>
</section>

<section anchor="syntax" title="Syntax">
<t>A tdb URI takes the form:</t>

<figure>
<artwork>     tdb:&lt;timestamp&gt;:&lt;URI&gt;
</artwork>
</figure>
<t> where &lt;timestamp&gt; is a sequence of digits representing
a date and time (<xref target="timestamps"/>) and &lt;URI&gt; is
any valid URI. </t>
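<t>As a minimal sketch of how the form above might be taken apart,
the following Python fragment splits a tdb URI at the colon that ends the
digit-only timestamp. The function name split_tdb is hypothetical and is
not defined by this document. Matching the scheme name without regard to
case is an assumption carried over from generic URI syntax
<xref target="RFC3986" />.</t>
<figure><artwork><![CDATA[
```python
import re

# Hypothetical helper for illustration; not defined by the tdb scheme.
# The timestamp is digits only, so the first colon after the digits
# separates it from the embedded URI (which may contain more colons).
_TDB_RE = re.compile(r"tdb:([0-9]*):(.+)", re.IGNORECASE | re.DOTALL)


def split_tdb(uri):
    """Return (timestamp, embedded_uri) for a string of the form
    tdb:<timestamp>:<URI>; raise ValueError for anything else."""
    match = _TDB_RE.fullmatch(uri)
    if match is None:
        raise ValueError("not a tdb URI: %r" % (uri,))
    return match.group(1), match.group(2)
```
]]></artwork></figure>
<t>The sketch does not check that the embedded part is itself a valid URI,
nor that the digits form a well-formed timestamp; both checks are left to
the caller.</t>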
</section>

<section anchor="semantics" title="Semantics">
<t>
  The tdb URI scheme is intended to be useful
  for describing entities, concepts, abstractions, and other items
  which may not themselves be network accessible resources, but have been
  at some point described by network accessible resources.
</t>

<t>
  The meaning of a tdb URI is "the thing described by the resource
  (or fragment) that was identified by the &lt;URI&gt; at the very
  last instant of the date (or time) given".
</t>


<t>
  The intent is to use the inverse of "is a document about".  It is
  common practice to give a reference for a concept by including a
  pointer to a document, segment, or phrase that defines the concept.
  "tdb" attempts to capture this practice in URI space.
</t>
<t>
  For example, one might use "tdb:2008:http://www.ietf.org" as
  a persistent identifier for the Internet Engineering Task
  Force, as described by the resource at "http://www.ietf.org" as of the very
  last instant of the year 2008.
</t>
<t>
  The "tdb" scheme differs from the URN methods for
  identifying abstractions because the designation of what is actually
  identified by a tdb URI doesn't depend on knowing the intention of the
  "assigner" of the identifier. Unlike "tag", "info", "cid", "mid",
  or related schemes, the identification is not dependent on
  the context of use.
</t><t>
  The "tdb" scheme can be thought of as adding a level of
  semantic indirection to URI resolution. 
</t>

</section>

<section title="Use as a Locator">
<t>  A tdb URI is not a resource locator in a practical sense.
It allows one to know that a resource was described at some
point in time, but it does not say whether the description is still available
or whether that description is still meaningful.
</t>
</section>

<section title="Hierarchy">
<t>
  The "thing described by" a network resource may bear little
  relationship to the "thing described by" a relative pointer,
  so the "tdb" URI scheme seems to have no use cases for
  using "/" as a hierarchical delimiter.
</t>
</section>

<section anchor="timestamps" title="Timestamps in tdb URIs">
<t>
  It is traditional for conventional references and citations in printed
  works to include the date of publication; this practice serves the
  important purpose of allowing the context of the naming to be determined.
</t>

<t>While one could imagine using tdb
  without a timestamp, it would leave the possibility that a reference that
  is unambiguous at one time might become ambiguous at some other
  time. There are two ways that the date is useful for "tdb":
  it fixes the time of access of the resource, for variable descriptions,
  and it fixes the time of interpretation, for descriptions whose
  meaning (in natural language) might vary.
</t>  

<t>A timestamp SHOULD be supplied, since the network
resources which provide descriptions can also change over time.
The timestamp may be quite coarse -- only a year --
or as precise as needed. This keeps "tdb" URIs
relatively short.  To avoid ambiguity, a single instant has been chosen --
for tdb this is "the last possible instant of the indicated range".
</t>

<t>
  A timestamp in the tdb scheme is a simple expression
  of a date and an optional time, with arbitrary precision.
  The goal is to allow relatively short
  expressions with no ambiguity, but also with arbitrary
  precision. (Other date formats were considered, but arbitrary
  precision and the syntactic simplicity of
  using only digits were preferred, and time zones were not wanted.)

</t>
<figure>
<artwork>
  date = [ year [ month [ day [ hour [ minute [ second [ fraction ]]]]]]]

   year     = 4digit
   month    = 2digit
   day      = 2digit
   hour     = 2digit
   minute   = 2digit
   second   = 2digit
   fraction = *digit
</artwork>
</figure>
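<t>The grammar above can be checked mechanically. The following Python
fragment is a sketch under stated assumptions: the name parse_timestamp is
hypothetical, "digit" is taken to mean an ASCII digit, and no range checks
(such as months running from 01 to 12) are applied because the grammar does
not state any.</t>
<figure><artwork><![CDATA[
```python
import re

# One nested optional group per field, mirroring the nesting of the
# grammar: a field may appear only if every coarser field is present.
_TIMESTAMP_RE = re.compile(
    r"(?:(?P<year>[0-9]{4})"
    r"(?:(?P<month>[0-9]{2})"
    r"(?:(?P<day>[0-9]{2})"
    r"(?:(?P<hour>[0-9]{2})"
    r"(?:(?P<minute>[0-9]{2})"
    r"(?:(?P<second>[0-9]{2})"
    r"(?P<fraction>[0-9]*))?)?)?)?)?)?"
)


def parse_timestamp(text):
    """Return a dict of the fields present in a tdb timestamp.

    Hypothetical helper; raises ValueError if the text does not fit
    the grammar (for example, an odd number of digits before the
    seconds field is complete).
    """
    match = _TIMESTAMP_RE.fullmatch(text)
    if match is None:
        raise ValueError("not a tdb timestamp: %r" % (text,))
    # Keep only fields that actually matched some digits.
    return {name: value for name, value in match.groupdict().items() if value}
```
]]></artwork></figure>
<t>Note that the grammar as written also admits the empty string, so the
sketch returns an empty dictionary for it rather than raising an error.</t>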
<t>
  The representation of a date or time refers to the (open interval)
  instant just before the end
  of the given date/time range at the resolution supplied.
  For example, "1999", "199912", and "19991231" all denote the same
  instant, just before the start of the year 2000, while "199911" denotes
  the instant just before the start of December 1999.
  If necessary, timestamps can
  include times and even fractional times, so that a generator of
  tdb URIs can be arbitrarily precise.
</t><t>
  Timestamps are interpreted relative to International Atomic Time (TAI)
  <xref target="TAI" />.  The syntax and semantics are similar to
  those in <xref target="RFC2550" />; in particular,
  using TAI avoids ambiguity about time zones and difficulties with
  leap seconds. 
</t>
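<t>The "last instant of the indicated range" rule can be made concrete by
computing the exclusive end of the range that a timestamp names; the
instant denoted is the one just before that bound. The following Python
fragment is a sketch only. The name range_end is hypothetical, the result
is a naive value read on the same time scale as the timestamp (no
conversion between TAI and UTC is attempted), and fractions finer than a
microsecond are rejected because the standard datetime type cannot
represent them.</t>
<figure><artwork><![CDATA[
```python
from datetime import datetime, timedelta

# Width of the range for timestamps that end at a given field,
# keyed by the total number of digits supplied.
_STEPS = {
    8: timedelta(days=1),
    10: timedelta(hours=1),
    12: timedelta(minutes=1),
    14: timedelta(seconds=1),
}


def range_end(ts):
    """Exclusive upper bound of the range named by a tdb timestamp."""
    n = len(ts)
    if not (ts.isascii() and ts.isdigit()):
        raise ValueError("timestamp must be ASCII digits: %r" % (ts,))
    if n < 4 or (n < 14 and n % 2 != 0):
        raise ValueError("incomplete field in timestamp: %r" % (ts,))
    fraction = ts[14:]
    if len(fraction) > 6:
        raise ValueError("this sketch stops at microsecond precision")

    year = int(ts[0:4])
    month = int(ts[4:6]) if n >= 6 else 1
    day = int(ts[6:8]) if n >= 8 else 1
    hour = int(ts[8:10]) if n >= 10 else 0
    minute = int(ts[10:12]) if n >= 12 else 0
    second = int(ts[12:14]) if n >= 14 else 0
    micro = int(fraction.ljust(6, "0")) if fraction else 0
    # Constructing the start also validates the field values.
    start = datetime(year, month, day, hour, minute, second, micro)

    if n == 4:
        return datetime(year + 1, 1, 1)
    if n == 6:
        # December rolls over into January of the next year.
        return datetime(year + month // 12, month % 12 + 1, 1)
    if n in _STEPS:
        return start + _STEPS[n]
    # A fraction of k digits names a range 10**-k seconds wide.
    return start + timedelta(microseconds=10 ** (6 - len(fraction)))
```
]]></artwork></figure>
<t>For instance, range_end("2008") yields the start of 2009, so a tdb URI
with the timestamp "2008" refers to the state of the description just
before that moment.</t>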

<t>

There are actually two dates to consider with "tdb":
the date on which the resource is obtained, and the date on which
the description it provides is read, understood, and used to denote.
Normally in a literary work in natural language which makes
a reference to another work, both the reference itself and the
work referenced are dated, e.g., a footnote in an article
written in 1967 might talk about a "private communication" which
itself had a date. The difference between a URI and a conventional
literary reference is the desire to be able to extract the URI
from its context and still retain its meaning.
</t>

</section>

<section anchor="additional" title="Additional Considerations">
<section title="URI schemes for the description resource">
<t>

  The "tdb" scheme is intended for use with URIs of
   retrievable resources that describe something else --
   these "description resources" are intended as "information
   resources".
</t><t>
  
  For example, "tdb" with an "http" URI can be used to refer to the
  subject of a web page (as it was described at the given time).
  This can be a way of referring to a web site at some time in the past,
  or to an organization that has changed, merged, split, or
  disappeared.
  
</t><t>
  Local systems whose host names are known to be unique can use "file" URIs
  with "tdb", since this use is primarily focused on providing a unique way of
  identifying an abstraction, even if the referent of the abstraction
  is not widely known. For example: </t>
<figure><artwork>
    tdb:20010814142327:file://this.example.com/c|/temp/test.txt
</artwork></figure>
  <t> Using "file" URIs in this way without a fully
  qualified domain name would not be appropriate, because the interpretation
  is not uniform.
</t>
<t>
  One might consider using "tdb" with "data" to designate concepts
  that can be described uniquely and briefly inline. For example, </t>
<figure><artwork>
     tdb:2001:data:,The%20US%20president
</artwork></figure>
  
  <t> names the concept described by the (text/plain) string "The US
  president" at the very last instant of 2001. Of
  course, this practice is only useful if the referent of the data is
  (or was at the time) completely unique. Since "data" does not
  contain a way to designate content-language, the string in question
  would have to be unambiguous as to its language.  In the case of
  "data", there is no assigning authority at all; the interpretation
  of the "tdb" URI depends on the interpreting community.
</t><t>
  Many URIs identify resources which do not clearly describe
  anything at all. The "home page" for an organization isn't
  nearly as good a resource to use to describe an organization
  as the organization's "about" page. But it is up to the minter
  of the tdb URI to choose wisely.
</t>
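<t>The "data" example earlier in this section can be produced mechanically
by percent-encoding a short description. The following Python fragment is a
sketch; the name tdb_for_text is hypothetical, and restricting the input to
ASCII is an assumption made here because a "data" URI without a media type
is read as US-ASCII plain text.</t>
<figure><artwork><![CDATA[
```python
from urllib.parse import quote


def tdb_for_text(timestamp, text):
    """Build tdb:<timestamp>:data:,<percent-encoded text>.

    Hypothetical helper. Only ASCII text is accepted, since the bare
    "data:," form carries no charset parameter.
    """
    if not text.isascii():
        raise ValueError("this sketch only handles ASCII descriptions")
    # safe="" makes quote() escape "/" as well as spaces and others.
    return "tdb:%s:data:,%s" % (timestamp, quote(text, safe=""))
```
]]></artwork></figure>
<t>Calling tdb_for_text("2001", "The US president") reproduces the URI shown
in the example. Whether the resulting description is unique enough to be
useful remains the minter's responsibility.</t>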
</section>

<section title="Useful timestamps">

<t>
  Timestamps far in the future are suspect, because the future
  content of a description resource cannot usually
  be reliably predicted.  Timestamps which precede
  the availability of the description resource should
  not be used either.  For example, using an "http" URI with
  a timestamp from before the description resource existed is
  not recommended.
</t><t>
  However, although these practices are not recommended, there is no
  assurance that they haven't been used; by itself, a tdb does not
  constitute an assertion that the description resource was available or
  assigned at the date specified.
</t><t>
  Note that the use of the "very last instant" accommodates the
  bibliographic convention that a work published
  in 2009 can use "2009" as the date string, to refer to the
  work in the year of publication.
</t>
</section>
<section title="Free assignment">
<t>
  
  Because of the many possible schemes that can be used in the
  &lt;URI&gt; portion, almost any
  computational process should be able to assign tdb URIs at will. Of
  course, there must be some description resource which is
  available at some point in time, and a clock which is
  accurate to the granularity of the frequency of assignment.
  
</t>
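<t>As an illustration of how little machinery assignment needs, the
following Python fragment mints a tdb URI from a description URI and a
clock reading. It is a sketch under stated assumptions: the names mint_tdb
and PRECISIONS are hypothetical, and the caller is assumed to supply a
clock reading already expressed on the time scale this document uses for
timestamps, since an ordinary system clock does not report TAI.</t>
<figure><artwork><![CDATA[
```python
from datetime import datetime

# Field names in order of increasing precision (hypothetical labels).
PRECISIONS = ("year", "month", "day", "hour", "minute", "second")


def mint_tdb(uri, when, precision="second"):
    """Return tdb:<timestamp>:<uri> with the timestamp cut to precision.

    `when` is a datetime supplied by the caller; no time zone or
    TAI/UTC adjustment is made here.
    """
    fields = (
        "%04d" % when.year,
        "%02d" % when.month,
        "%02d" % when.day,
        "%02d" % when.hour,
        "%02d" % when.minute,
        "%02d" % when.second,
    )
    # index() raises ValueError for an unknown precision label.
    count = PRECISIONS.index(precision) + 1
    return "tdb:" + "".join(fields[:count]) + ":" + uri
```
]]></artwork></figure>
<t>Because a timestamp denotes the last instant of its range, a coarse
precision such as "year" refers to a moment that may lie after the time of
minting; a process that needs to describe the resource exactly as it saw it
would choose a precision no coarser than its rate of assignment.</t>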
</section>
<section anchor="resolution" title="Resolution">
<t>
  There are no resolution servers or processes for tdb URIs. However,
  a tdb URI might be "resolvable" in the sense that a resource that was
  accessed at a point in time might have the result of that access
  cached or archived in an Internet archive service. See, for example,
  the "Internet Archive" project <xref target="archive" />. A
  tdb URI is also "resolvable" in the sense that the description resource
  can be accessed and interpreted.
</t>
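<t>One way such archive-based lookup might work is to turn the timestamp
and the embedded URI into a request to an archive. The following Python
fragment is a sketch only. The URL layout
"https://web.archive.org/web/&lt;14 digits&gt;/&lt;URI&gt;" is an assumption
about one archive service's public interface and is not defined by this
document; the name archive_lookup_url is hypothetical; fractions are
dropped; and the difference between TAI and the archive's own time scale is
ignored.</t>
<figure><artwork><![CDATA[
```python
import calendar


def archive_lookup_url(ts, uri):
    """Build an archive URL for the last whole second of the range
    named by a tdb timestamp (assumed URL layout, see lead-in)."""
    n = len(ts)
    if not (ts.isascii() and ts.isdigit()):
        raise ValueError("timestamp must be ASCII digits: %r" % (ts,))
    if n < 4 or (n < 14 and n % 2 != 0):
        raise ValueError("incomplete field in timestamp: %r" % (ts,))
    ts = ts[:14]  # drop any fractional seconds

    # Missing fields are filled with their largest value, so that the
    # request targets the end of the range rather than its start.
    year = ts[0:4]
    month = ts[4:6] or "12"
    if len(ts) >= 8:
        day = ts[6:8]
    else:
        # monthrange() returns (weekday of day 1, days in month).
        day = "%02d" % calendar.monthrange(int(year), int(month))[1]
    hour = ts[8:10] or "23"
    minute = ts[10:12] or "59"
    second = ts[12:14] or "59"

    stamp = year + month + day + hour + minute + second
    return "https://web.archive.org/web/%s/%s" % (stamp, uri)
```
]]></artwork></figure>
<t>An archive may hold no capture near the requested instant, and what it
returns is the description resource, not the thing described; interpreting
the description is still up to the reader.</t>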
</section>
<section title="Why Names with Semantics?">
<t>
  
  There are a number of URI and URN schemes that create otherwise
  unbound "names", where the scheme only provides for uniqueness,
  with some other agent or process or context providing the
  authority to interpret the meaning of the identifier at
  some point in the future. "tdb" is different, in that
  the describer (the agent creating
  the tdb URI) and the receiver of the URI (the agent interpreting
  the tdb URI) agree upon the semantics without any reference
  to any third party.
</t>
</section>
<section title="Avoiding Metadata">
<t>
  One might consider the date in a tdb URI to be just one piece of
  additional metadata about the URI, and consider adding other
  pieces of metadata as annotation.
</t><t>
  However, the use of the date in a tdb URI is intended primarily as a
  mechanism for achieving uniqueness over time. No other piece of
  metadata or description readily serves that purpose. Further, the date
  is not descriptive (an assertion about the URI) but merely
  refining.
</t>
</section>
<section title="Avoiding tdb">
<t>
  Many applications of URIs already provide a timestamp context. For
  example, one could imagine a hypertext system where the URIs contained
  within a document were intended to refer to the resources as of the
  date of the enclosing document. This would be a reasonable
  interpretation of URIs within an Internet archive system, for example.
  
</t><t>
  And some applications of URIs arguably already contain the level of
  interpretive indirection that is explicit with "tdb". For example,
  one might consider the use of URIs as namespace names within XML
  <xref target="namespaces" /> as a reference to the "thing
  described by" the URI used.
  
</t>
</section>

<section title="tdb and levels of indirection">
<t>
  
  The "tdb" scheme introduces a level of semantic indirection.
  Questions of use and mention, name and reference,
  and levels of indirection have been puzzling and amusing for quite a
  while.
  
 <list style="empty">
  <t>"It's long," said the Knight, "but it's very, very beautiful. Everybody that hears me sing it--either it brings the tears into their eyes, or else--"<vspace />
"Or else what?" said Alice, for the Knight had made a sudden pause. <vspace />
"Or else it doesn't, you know. The name of the song is called 'Haddocks' Eyes.'" <vspace />
    "Oh, that's the name of the song, is it?" Alice said, trying to
     feel interested. <vspace />
    "No, you don't understand," the Knight said, looking a little
     vexed. "That's what the name is called. The name really is 'The
     Aged Aged Man.'" <vspace />
"Then I ought to have said 'That's what the song is called'?" Alice corrected herself.<vspace />
"No, you oughtn't: that's quite another thing! The song is called 'Ways and Means': but that's only what it's called, you know!" <vspace />
"Well, what is the song, then?" said Alice, who was by this time completely bewildered.<vspace />
"I was coming to that," the Knight said. "The song really is 'A-sitting On A Gate': and the tune's my own invention."

 <xref target="LOOK" />
</t></list>

</t>

</section>
</section>
<section anchor="spec" title="URI Specification Template">

<t>

<list style="hanging">

<t hangText="URI scheme name:">
      tdb
</t><t hangText="Status:">
      permanent

</t><t hangText="URI scheme syntax:">
      Briefly, the syntax is  <vspace />
          tdb:&lt;date&gt;:&lt;URI&gt;  <vspace />
      The syntax is described in this document.



</t><t hangText="URI scheme semantics:">
      Semantic indirection at indicated date.
      Semantics are described in detail in this document.
</t><t hangText="Encoding considerations:">
      tdb URIs consist of a prefix followed by
      another URI, and have the same
      encoding considerations as other URIs.
</t><t hangText="Applications/protocols that use this URI scheme name:">
      This scheme was designed to resolve some of the
      use/mention ambiguities in semantic web applications
      that wish to "denote" concepts and other ideas
      and not just access resources over the Internet.

</t><t hangText="Interoperability considerations:">
      Existing semantic web applications may have other
      means of fixing meaning at a particular time or
      semantic indirection, but this should not
      in itself cause interoperability difficulties.
</t><t hangText="Security considerations:">
      See <xref target="security" /> of this document.
</t><t hangText="Contact:">
     Larry Masinter
     tdb:2009:http://larry.masinter.net
</t><t hangText="Author/Change controller:">
     as above
</t><t hangText="References:">
      See References of this document.
</t>
</list>
</t>
</section>

<section anchor="iana" title="IANA considerations">
<t>

  This document includes a URI scheme registration (<xref
  target="spec" />) that should be
  entered into the IANA registry of URI schemes as 
  a permanent registration (once approved).
</t>
</section>
<section anchor="security" title="Security Considerations">
<t>
  "tdb" identifiers are not any more reliable because they 
  have dates.  URIs don't contain enough information to supply the
  authority for deciding what was or wasn't at a given URI at a given
  date.
</t>
</section>
<section anchor="acks" title="Acknowledgements">
<t>
  There have been many discussions over several years on the relationship of URLs, URNs, URIs, resources and resource identifiers, with many contributions.
   Particular thanks to Al Gilman, Aaron Swartz, Brian McBride,
   Stuart Williams, Michael Mealling, Ray Denenberg and Pat Hayes.
</t>
</section>
</middle>
<back>

<references title="Normative References">

<reference anchor="RFC3986">
 <front>
  <title>Uniform Resource Identifiers (URI): Generic Syntax</title>
  <author initials='T.' surname='Berners-Lee' fullname='Tim Berners-Lee'>
   <organization />
  </author>
  <author initials='R.T.' surname='Fielding' fullname='Roy T. Fielding'>
   <organization />
  </author>
  <author initials='L.' surname='Masinter' fullname='Larry Masinter'>
   <organization />
  </author>
  <date month='January' year='2005' />
 </front>
<seriesInfo name='RFC' value='3986' />
</reference>

<reference anchor="TAI">
<front>
 <title>International Atomic Time</title>
 <author initials='' surname=''>
   <organization>Bureau International des Poids et Mesures
    </organization>
 </author>
  <!-- http://www.bipm.fr/enus/5_Scientific/c_time/time_1.html -->
  <date month='' year='' />
</front>
</reference>

<reference anchor="namespaces" target="http://www.w3.org/TR/REC-xml-names">
<front>
<title>Namespaces in XML</title>
    <author initials="T." surname="Bray" fullname="Tim Bray">
  <organization /> </author>
    <author initials="D." surname="Hollander" fullname="Dave Hollander">
   <organization /> </author>
    <author initials="A." surname="Layman" fullname="Andrew Layman">
   <organization /> </author>

<date month='January' year='1999' />
</front>
  <seriesInfo name="W3C Recommendation" value="REC-xml-names"/>
</reference>

</references>
<references title="Informative References">

<reference anchor="COOL" target="http://www.w3.org/Provider/Style/URI.html">
<front>
  <title>Cool URIs don't change</title>
  <author initials="T." surname="Berners-Lee">
   <organization>W3C</organization>
   </author>
  <date month="" year="1998" />
 </front>
</reference>
<reference anchor="RFC1737">
<front>
 <title>Functional Requirements for Uniform Resource Names</title>
  <author initials="K." surname="Sollins">
  <organization></organization> </author>
  <date month="December" year="1994" />
  </front>
 <seriesInfo name="RFC" value="1737" />
</reference>
<reference anchor="RFC2550">
<front>
 <title>Y10K and Beyond</title>
  <author initials="S." surname="Glassman">
  <organization /> </author>
  <author initials="M." surname="Manasse">
  <organization /> </author>
  <author initials="J." surname="Mogul">
  <organization /> </author>

  <date day="1" month="April" year="1999" />
  </front>
 <seriesInfo name="RFC" value="2550" />
</reference>

  

  
<reference anchor="archive" target="http://www.sciam.com/0397issue/0397kahle.html">
  <front>
<title>Preserving the Internet</title>
<author initials="B." surname="Kahle" fullname="Brewster Kahle">
<organization>Alexa Internet</organization></author>
  <date month='March' year='1997' />
  
</front>
 <seriesInfo name="Scientific American" value="" />
</reference>

<reference anchor="LOOK" target="http://www.literature.org/authors/carroll-lewis/through-the-looking-glass/chapter-08.html">
<front>
<title>Through the Looking Glass</title>
<author initials="L." surname="Carroll" fullname="Lewis Carroll"><organization /></author>
<date month='' year='1872' />
</front>
</reference>
</references>

</back>
</rfc>
