Abstract
The Web is designed to support flexible exploration of information by
human users and by automated agents. For such exploration to be
productive, information published by many different sources and for a
variety of purposes must be comprehensible to a wide range of Web client
software, and to users of that software.
HTTP and other Web technologies can be used to deploy
resource representations that are, in an important sense, self-describing:
information about the encodings used for each representation is provided explicitly
within the representation.
Starting with a URI, a user agent can apply a standard algorithm to
retrieve and interpret such representations.
Furthermore, representations can be grounded
in the Web, by ensuring that specifications required to
interpret them are determined unambiguously based on the URI, and that explicit
references connect the pertinent specifications to each other.
Web-grounding reduces ambiguity as to
what has been published in the Web, and by whom.
When such self-describing, Web-grounded resources are linked together,
the Web as a whole can support reliable,
ad hoc discovery of information.
This finding describes how document
formats, markup conventions, attribute values, and other data formats
can be designed to facilitate the deployment of self-describing,
Web-grounded Web content.
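
As a rough illustration of the retrieval algorithm mentioned above, the following Python sketch lists what a user agent can learn at each step, starting from nothing but a URI and the Content-Type header of an HTTP response. The function name `describe_representation` and the example URI are hypothetical and used only for illustration; the normative algorithm is defined by the URI, HTTP, and media type specifications, not by this code.

```python
from email.message import Message
from urllib.parse import urlsplit


def describe_representation(uri: str, content_type: str) -> dict:
    """Hypothetical helper: list what a user agent learns at each step.

    Assumption: the representation was retrieved over HTTP and the
    response carried the given Content-Type header value.
    """
    # Step 1: the URI scheme selects the specification that governs retrieval.
    scheme = urlsplit(uri).scheme.lower()

    # Step 2: the Content-Type header names the media type, whose
    # registration in turn points to the format specification.
    header = Message()
    header["Content-Type"] = content_type
    return {
        "scheme": scheme,
        "media_type": header.get_content_type(),
        "charset": header.get_content_charset(),
    }


plan = describe_representation(
    "http://example.org/report", "text/html; charset=UTF-8"
)
print(plan)
```

Each value in the result names something a user agent can look up in a published specification or registry, which is the sense in which the representation is both self-describing and grounded in the Web.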