13.3.3 Weak and Strong Validators

Since both origin servers and caches will compare two validators to decide if they represent the same or different entities, one normally would expect that if the entity (the entity-body or any entity-headers) changes in any way, then the associated validator would change as well. If this is true, then we call this validator a "strong validator."

However, there might be cases when a server prefers to change the validator only on semantically significant changes, and not when insignificant aspects of the entity change. A validator that does not always change when the resource changes is a "weak validator."

Entity tags are normally "strong validators," but the protocol provides a mechanism to tag an entity tag as "weak." One can think of a strong validator as one that changes whenever the bits of an entity change, while a weak validator changes whenever the meaning of an entity changes. Alternatively, one can think of a strong validator as part of an identifier for a specific entity, while a weak validator is part of an identifier for a set of semantically equivalent entities.

Note: One example of a strong validator is an integer that is incremented in stable storage every time an entity is changed.

An entity's modification time, if represented with one-second resolution, could be a weak validator, since it is possible that the resource might be modified twice during a single second.

Support for weak validators is optional. However, weak validators allow for more efficient caching of equivalent objects; for example, a hit counter on a site is probably good enough if it is updated every few days or weeks, and any value during that period is likely "good enough" to be equivalent.
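The distinction between "bits" and "meaning" can be illustrated with a minimal sketch. It assumes, purely for illustration, a server that treats two text bodies as semantically equivalent when they differ only in whitespace. The function names `strong_etag` and `weak_etag`, the SHA-256 hash, and the 16-character truncation are hypothetical choices and are not prescribed by the protocol.

```python
import hashlib


def strong_etag(body: bytes) -> str:
    """Derive a tag from the exact bytes, so any change yields a new tag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def weak_etag(body: bytes) -> str:
    """Derive a tag from a whitespace-normalized body.

    Assumption: this server considers bodies that differ only in
    whitespace to be semantically equivalent.
    """
    normalized = b" ".join(body.split())
    return 'W/"' + hashlib.sha256(normalized).hexdigest()[:16] + '"'


original = b"hello  world\n"
reformatted = b"hello world"

# The bytes differ, so the strong tags differ.
print(strong_etag(original) == strong_etag(reformatted))
# The normalized content is the same, so the weak tags match.
print(weak_etag(original) == weak_etag(reformatted))
```

What counts as a semantically significant change is entirely the server's decision; whitespace normalization is only one possible rule.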

A "use" of a validator is either when a client generates a request and includes the validator in a validating header field, or when a server compares two validators.

Strong validators are usable in any context. Weak validators are only usable in contexts that do not depend on exact equality of an entity. For example, either kind is usable for a conditional GET of a full entity. However, only a strong validator is usable for a sub-range retrieval, since otherwise the client might end up with an internally inconsistent entity.

Clients MAY issue simple (non-subrange) GET requests with either weak validators or strong validators. Clients MUST NOT use weak validators in other forms of request.
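On the client side, this restriction might be applied as in the following sketch. The helper names `range_request_headers` and `revalidation_headers` are hypothetical, and falling back to a plain unconditional GET when only a weak entity tag is available is one possible policy, not something the text above mandates.

```python
from typing import Dict


def is_weak(etag: str) -> bool:
    """An entity tag is weak only when it carries the "W/" prefix."""
    return etag.startswith("W/")


def revalidation_headers(etag: str) -> Dict[str, str]:
    """Headers for a simple (non-subrange) conditional GET.

    Either a weak or a strong validator may be sent here.
    """
    return {"If-None-Match": etag}


def range_request_headers(etag: str, first: int, last: int) -> Dict[str, str]:
    """Headers for fetching bytes first..last of a partly cached entity.

    A weak validator must not be sent in a sub-range request, so this
    sketch returns no headers in that case, which turns the request
    into a plain full GET (an assumed fallback policy).
    """
    if is_weak(etag):
        return {}
    return {"Range": f"bytes={first}-{last}", "If-Range": etag}


print(range_request_headers('"xyzzy"', 0, 499))
print(range_request_headers('W/"xyzzy"', 0, 499))
print(revalidation_headers('W/"xyzzy"'))
```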

The only function that the HTTP/1.1 protocol defines on validators is comparison. There are two validator comparison functions, depending on whether the comparison context allows the use of weak validators or not:

- The strong comparison function: in order to be considered equal, both validators MUST be identical in every way, and both MUST NOT be weak.

- The weak comparison function: in order to be considered equal, both validators MUST be identical in every way, but either or both of them MAY be tagged as "weak" without affecting the result.

An entity tag is strong unless it is explicitly tagged as weak. Section 3.11 gives the syntax for entity tags.
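The two comparison functions can be sketched for entity tags as follows. This is a minimal illustration that assumes the `W/"..."` and `"..."` forms from section 3.11; the `EntityTag` type and the function names are hypothetical, and the parser does not handle escaped quotes inside the opaque string.

```python
from typing import NamedTuple


class EntityTag(NamedTuple):
    opaque: str  # contents of the quoted string, without the quotes
    weak: bool   # True when the tag carries the "W/" prefix


def parse_entity_tag(text: str) -> EntityTag:
    """Parse a single entity tag (simplified: no quoted-pair handling)."""
    text = text.strip()
    weak = text.startswith("W/")
    if weak:
        text = text[2:]
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError(f"not an entity tag: {text!r}")
    return EntityTag(opaque=text[1:-1], weak=weak)


def strong_compare(a: EntityTag, b: EntityTag) -> bool:
    """Equal only if the opaque values match and neither tag is weak."""
    return not a.weak and not b.weak and a.opaque == b.opaque


def weak_compare(a: EntityTag, b: EntityTag) -> bool:
    """Equal if the opaque values match, regardless of weakness."""
    return a.opaque == b.opaque


strong = parse_entity_tag('"xyzzy"')
weak = parse_entity_tag('W/"xyzzy"')
print(strong_compare(strong, strong), strong_compare(strong, weak))
print(weak_compare(strong, weak), weak_compare(weak, weak))
```

Note that under the strong function a weak tag does not even match itself, because both operands are required to be strong.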

A Last-Modified time, when used as a validator in a request, is implicitly weak unless it is possible to deduce that it is strong, using the following rules:

- The validator is being compared by an origin server to the actual current validator for the entity, and

- That origin server reliably knows that the associated entity did not change twice during the second covered by the presented validator.

or

- The validator is about to be used by a client in an If-Modified-Since or If-Unmodified-Since header, because the client has a cache entry for the associated entity, and

- That cache entry includes a Date value, which gives the time when the origin server sent the original response, and

- The presented Last-Modified time is at least 60 seconds before the Date value.

or

- The validator is being compared by an intermediate cache to the validator stored in its cache entry for the entity, and

- That cache entry includes a Date value, which gives the time when the origin server sent the original response, and

- The presented Last-Modified time is at least 60 seconds before the Date value.

This method relies on the fact that if two different responses were sent by the origin server during the same second, but both had the same Last-Modified time, then at least one of those responses would have a Date value equal to its Last-Modified time. The arbitrary 60-second limit guards against the possibility that the Date and Last-Modified values are generated from different clocks, or at somewhat different times during the preparation of the response. An implementation MAY use a value larger than 60 seconds, if it is believed that 60 seconds is too short.
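For a client or an intermediate cache, the 60-second test might look like the following sketch. It assumes both header values use the RFC 1123 date format with a GMT zone; the function name `last_modified_is_strong` and the `margin` parameter are hypothetical. The origin-server case, which depends on the server's own knowledge of the entity, is not covered.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime
from typing import Optional


def last_modified_is_strong(
    last_modified: str,
    date: Optional[str],
    margin: timedelta = timedelta(seconds=60),
) -> bool:
    """Decide whether a cached Last-Modified time may be treated as strong.

    The validator stays implicitly weak when the cache entry has no
    Date value, or when Last-Modified is less than `margin` before it.
    """
    if date is None:
        return False
    modified_at = parsedate_to_datetime(last_modified)
    sent_at = parsedate_to_datetime(date)
    return sent_at - modified_at >= margin


# Exactly 60 seconds apart: "at least 60 seconds" is satisfied.
print(last_modified_is_strong(
    "Tue, 15 Nov 1994 12:45:26 GMT",
    "Tue, 15 Nov 1994 12:46:26 GMT",
))
# Only 30 seconds apart: the validator remains weak.
print(last_modified_is_strong(
    "Tue, 15 Nov 1994 12:45:26 GMT",
    "Tue, 15 Nov 1994 12:45:56 GMT",
))
```

An implementation that considers 60 seconds too short could pass a larger `margin`.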

If a client wishes to perform a sub-range retrieval on a value for which it has only a Last-Modified time and no opaque validator, it MAY do this only if the Last-Modified time is strong in the sense described here.

A cache or origin server receiving a conditional request, other than a full-body GET request, MUST use the strong comparison function to evaluate the condition.

These rules allow HTTP/1.1 caches and clients to safely perform sub-range retrievals on values that have been obtained from HTTP/1.0 servers.
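The requirement that a cache or origin server use the strong comparison function for every conditional request other than a full-body GET could be checked as in this sketch. It assumes that "full-body GET" means a GET request without a Range header; the function name `requires_strong_comparison` is hypothetical.

```python
from typing import Mapping


def requires_strong_comparison(method: str, headers: Mapping[str, str]) -> bool:
    """Return True when the condition must be evaluated with strong comparison.

    Header field names are matched case-insensitively; the method name
    is matched exactly, since HTTP method names are case-sensitive.
    """
    has_range = any(name.lower() == "range" for name in headers)
    is_full_body_get = method == "GET" and not has_range
    return not is_full_body_get


# A plain conditional GET may use the weak comparison function.
print(requires_strong_comparison("GET", {"If-None-Match": 'W/"xyzzy"'}))
# A sub-range GET must use the strong comparison function.
print(requires_strong_comparison("GET", {"Range": "bytes=0-499", "If-Range": '"xyzzy"'}))
# So must a conditional request with any other method.
print(requires_strong_comparison("PUT", {"If-Match": '"xyzzy"'}))
```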