JSGI/Level0/A

From CommonJS Spec Wiki

NOTE: the more recent version of this specification is Draft 2


JSGI Level 0 Proposal A, Draft 1 (in development)

Philosophy

  • A straightforward but more user-friendly mapping to existing gateway interface conventions
  • Suitable for extension via some forthcoming "Level 1" JSGI
  • Addressing all known shortcomings that would be difficult to correct with a "Level 1" layer (header name mangling, header case consistency, raw/decoded values)
  • Tweak JSGI to be more in line with JavaScript idioms (proper environment variable namespacing and dot access, mixedCase keys)

Principles

  • Do not duplicate data in the request or response
  • Optimize for the application developer, not the server or middleware developer

Specification

Applications

A JSGI application is a JavaScript function. It takes exactly one argument, the environment, and returns a JavaScript Object containing three attributes: the status, the headers, and the body.
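As a minimal sketch of that shape (the function name app and the greeting text are illustrative, not part of the specification; the body is written as an array of Strings, which the Response section below describes as the common case):

```javascript
// A minimal JSGI-style application: one environment argument in,
// one object with status, headers and body out.
function app(env) {
    var message = "Hello from " + env.requestMethod + "\n";
    return {
        status: 200,
        headers: {
            "content-type": "text/plain",
            // Header values are Strings. The character count equals the
            // byte count here only because the message is plain ASCII.
            "content-length": String(message.length)
        },
        body: [message]
    };
}

// Hypothetical invocation with a stripped-down environment.
var response = app({ requestMethod: "GET" });
```

A real server would pass a complete environment; only requestMethod is supplied here to keep the sketch short.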

Request

The request environment must be a JavaScript Object instance that includes CGI-like headers. The application is free to modify the environment. The environment is required to include these variables (adopted from PEP333 and Rack), except when they would be empty (exceptions noted below):

  • .requestMethod: The HTTP request method, such as "GET" or "POST". This MUST NOT be an empty string, and so is always required.
  • .scriptName: The initial portion of the request URL's "path" that corresponds to the application object, so that the application knows its virtual "location". This MAY be an empty string, if the application corresponds to the "root" of the server. Restriction: if non-empty, MUST start with "/" and MUST be decoded.
  • .pathInfo: The remainder of the request URL's "path", designating the virtual "location" of the request's target within the application. This may be an empty string, if the request URL targets the application root and does not have a trailing slash. Restriction: if non-empty, MUST start with "/" and MUST be decoded.
  • .queryString: The portion of the request URL that follows the "?", if any. Restriction: MAY be empty but key MUST exist.
  • .serverName, .serverPort: When combined with scriptName and pathInfo, these variables can be used to complete the URL. Note, however, that .headers.host, if present, should be used in preference to serverName for reconstructing the request URL. Restriction: serverName and serverPort can never be empty strings, and so MUST be present.
  • .scheme: URL scheme (per RFC 1738). "http", "https", etc.
  • .body: The request body, which MAY be empty. Requirement: MUST be an input stream.

Variables corresponding to the client-supplied HTTP request headers are stored in the .headers object. The presence or absence of these variables should correspond with the presence or absence of the appropriate HTTP header in the request. All keys in the .headers object MUST be the lower-case equivalent of the request's HTTP header keys.
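The URL reconstruction mentioned for serverName and serverPort could look roughly like this. This is a sketch under assumptions: the helper name requestUrl, the default-port table, and the choice to re-encode path segments with encodeURIComponent are not part of the specification.

```javascript
// Rebuild the request URL from the environment.
// headers.host, when present, is preferred over serverName.
function requestUrl(env) {
    var defaultPorts = { http: "80", https: "443" };
    var host = env.headers.host;
    if (!host) {
        host = env.serverName;
        // Only append the port when it is not the scheme's default.
        if (String(env.serverPort) !== defaultPorts[env.scheme]) {
            host += ":" + env.serverPort;
        }
    }
    // scriptName and pathInfo are decoded (and may be absent when empty),
    // so each path segment is re-encoded before joining.
    var path = ((env.scriptName || "") + (env.pathInfo || ""))
        .split("/")
        .map(encodeURIComponent)
        .join("/");
    var query = env.queryString ? "?" + env.queryString : "";
    return env.scheme + "://" + host + path + query;
}

// Hypothetical environment values, for illustration only.
var env = {
    scheme: "http",
    serverName: "localhost",
    serverPort: "8080",
    scriptName: "/blog",
    pathInfo: "/posts/hello world",
    queryString: "page=2",
    headers: {}
};
var fromServerName = requestUrl(env);
env.headers.host = "example.org";
var fromHostHeader = requestUrl(env);
```

Because the path values are already decoded, a round trip through this function may not reproduce the exact bytes the client sent; it only yields an equivalent URL.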

In addition to this, the request environment MUST include these JSGI-specific variables:

  • .jsgi:
    • .jsgi.version: The Array [0,3], representing this version of JSGI.
    • .jsgi.errors: See below, the error stream.
    • .jsgi.multithread: true if the application object may be simultaneously invoked by another thread in the same process, false otherwise.
    • .jsgi.multiprocess: true if an equivalent application object may be simultaneously invoked by another process, false otherwise.
    • .jsgi.runonce: true if the server expects (but does not guarantee!) that the application will only be invoked this one time during the life of its containing process. Normally, this will only be true for a server based on CGI (or something similar).
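One way to read the required variables together is as a checklist a server could run before invoking an application. The sketch below is illustrative: the helper name missingEnvironmentKeys is hypothetical, it assumes .headers is always present, and the stream values are empty stand-in objects because this draft does not yet define a stream interface.

```javascript
// Report which required parts of a request environment are absent.
function missingEnvironmentKeys(env) {
    var missing = [];
    // These can never be empty strings, so they must always be present.
    ["requestMethod", "serverName", "serverPort"].forEach(function (key) {
        if (typeof env[key] !== "string" || env[key] === "") missing.push(key);
    });
    // These keys must exist, although queryString may be an empty string.
    // scriptName and pathInfo are skipped: they may be omitted when empty.
    ["queryString", "scheme", "body", "headers"].forEach(function (key) {
        if (!(key in env)) missing.push(key);
    });
    var jsgi = env.jsgi || {};
    ["version", "errors", "multithread", "multiprocess", "runonce"].forEach(function (key) {
        if (!(key in jsgi)) missing.push("jsgi." + key);
    });
    return missing;
}

// Hypothetical environment; body and jsgi.errors are stand-ins for streams.
var sampleEnv = {
    requestMethod: "GET",
    scriptName: "",
    pathInfo: "/",
    queryString: "",
    serverName: "localhost",
    serverPort: "8080",
    scheme: "http",
    body: {},
    headers: { host: "localhost:8080" },
    jsgi: {
        version: [0, 3],
        errors: {},
        multithread: false,
        multiprocess: true,
        runonce: false
    }
};
var sampleMissing = missingEnvironmentKeys(sampleEnv);
```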

JSGI server implementations are encouraged to include as much information in the request environment as possible. CGI-like keys (TODO: define?) should be converted to mixedCase and stored at the top level of the environment: lower-case every letter, then remove each underscore and upper-case the letter that follows it (REQUEST_METHOD becomes requestMethod). These keys MUST have string values.
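The key conversion described above can be sketched as a small helper. The name cgiKeyToMixedCase is hypothetical, and which CGI keys a server passes through is left open by this draft.

```javascript
// Convert an UPPER_SNAKE_CASE CGI key to mixedCase:
// lower-case everything, then drop each underscore and
// upper-case the character that followed it.
function cgiKeyToMixedCase(key) {
    return key.toLowerCase().replace(/_([a-z0-9])/g, function (match, ch) {
        return ch.toUpperCase();
    });
}

var convertedKeys = ["REQUEST_METHOD", "PATH_INFO", "REMOTE_ADDR"].map(cgiKeyToMixedCase);
```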

Servers or applications can also store their own data in the request environment. In order to prevent collisions with unspecified CGI keys, custom keys MUST be stored in the .env top level object.

Input Stream

Must be an input stream. (TODO can we reference something in particular?)

Output Stream

Must be an output stream.

Response

.status
The status MUST be an integer greater than or equal to 100. (TODO changed from "if parsed as an integer" -- was this right?)
.headers
The headers MUST be a JavaScript object containing key/value pairs of Strings. All keys SHOULD be lower-case to prevent confusion. The headers object must not contain a Status key, and its keys must not contain : or newlines or end in - or _; keys must consist only of letters, digits, _ or - and must start with a letter. The values must be Strings, consisting of lines (for multiple header values) separated by "\n". The lines must not contain characters below 037.

  • content-type: There must be a Content-Type, except when the Status is 1xx, 204 or 304, in which case there must be none given.
  • content-length: There must not be a Content-Length header when the Status is 1xx, 204 or 304.
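The header rules above can be collected into a single check. This is a sketch, not a normative validator: the function name validateResponseHeaders is hypothetical, "below 037" is read as octal (character codes 0 through 30), and the content-type and content-length lookups assume lower-case keys.

```javascript
// Return a list of rule violations for a response's status and headers.
function validateResponseHeaders(status, headers) {
    var problems = [];
    var noBodyStatus = (status >= 100 && status < 200) || status === 204 || status === 304;
    Object.keys(headers).forEach(function (key) {
        var value = headers[key];
        if (key.toLowerCase() === "status") {
            problems.push("status key is not allowed");
        }
        // Letters, digits, _ or - only; starts with a letter; no trailing - or _.
        if (!/^[A-Za-z]([A-Za-z0-9_-]*[A-Za-z0-9])?$/.test(key)) {
            problems.push("invalid key: " + key);
        }
        if (typeof value !== "string") {
            problems.push("value of " + key + " is not a String");
            return;
        }
        // Multiple values are separated by "\n"; check each line on its own.
        value.split("\n").forEach(function (line) {
            if (/[\x00-\x1e]/.test(line)) {
                problems.push("control character in value of " + key);
            }
        });
    });
    if (noBodyStatus) {
        if ("content-type" in headers) problems.push("content-type must be absent for status " + status);
        if ("content-length" in headers) problems.push("content-length must be absent for status " + status);
    } else if (!("content-type" in headers)) {
        problems.push("content-type is required for status " + status);
    }
    return problems;
}

var okProblems = validateResponseHeaders(200, { "content-type": "text/html" });
var noContentProblems = validateResponseHeaders(204, { "content-type": "text/html" });
```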

Compliant Level 0 servers MUST accept header keys that are lower-case over keys that contain capital letters but are otherwise identical. Servers SHOULD ignore keys that contain capital letters. Servers MAY case-correct keys when serving the response in any manner. (TODO: DL assumed this was decided but it's still up in the air -- if you have an opinion one way or another please show your hand here or on the mailing list)

.body
The Body must respond to forEach and must only yield objects which have a toByteString method (including Strings and Binary objects) (TODO: is this a dependency on the binary spec?). If the Body responds to close, it will be called after iteration. The Body commonly is an array of Strings or ByteStrings.
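A server-side loop over such a body might look like the following sketch. The helpers chunk and sendBody and the write callback are hypothetical, and TextEncoder is used as a stand-in for a real ByteString, since the binary specification is outside this document.

```javascript
// Stand-in for an object with a toByteString method; it returns a
// Uint8Array of UTF-8 bytes rather than a real CommonJS ByteString.
function chunk(text) {
    return {
        toByteString: function () {
            return new TextEncoder().encode(text);
        }
    };
}

// Iterate the body, hand each chunk's bytes to the writer, then close.
// Calling close in a finally block (so it also runs when iteration
// throws) is a design choice of this sketch, not a requirement.
function sendBody(body, write) {
    try {
        body.forEach(function (part) {
            write(part.toByteString());
        });
    } finally {
        if (typeof body.close === "function") {
            body.close();
        }
    }
}

// A streaming body: not an array, but it responds to forEach and close.
var closed = false;
var streamingBody = {
    forEach: function (callback) {
        callback(chunk("Hello, "));
        callback(chunk("world"));
    },
    close: function () {
        closed = true;
    }
};

var written = [];
sendBody(streamingBody, function (bytes) {
    written.push(bytes);
});
```

An array of such chunks works with the same loop, because arrays already respond to forEach and simply have no close method.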

Summary of changes from JSGI 0.2

NOTE: this area is not aligned with proposals in the Open Issues section and is most certainly out of date.

Environment

Top level keys

  • UPPER_SNAKE_CASE to mixedCase
  • No alterations to key names themselves (env.REQUEST_METHOD to env.requestMethod, not env.method)
  • Server or application-specific environment keys are no longer required to contain a "." to differentiate them from CGI keys. Since CGI keys MUST be strings, added top-level keys in the environment MUST be objects.
  • CONTENT_LENGTH and CONTENT_TYPE moved into headers object and case-normalized (TODO: should content-length still be parsed down to an int?)

JSGI variables

  • moved env["jsgi.URL_SCHEME"] to env["scheme"] (rationale: it's not a CGI key so no need to maintain consistency in naming)
  • moved env["jsgi.input"] to env["body"] (rationale: consistent with response object)
  • env["jsgi.*"] to env["jsgi"]["*"]; changed snake_case to mixedCase (technically, all lowercase)
  • lower_snake_case to mixedCase

Headers variables

  • env["HTTP_*"] to env["headers"]["*"]
  • Header keys MUST NOT be mangled in any way, and must be set in lower-case form
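For a server that sits on top of CGI, the move from env["HTTP_*"] to env["headers"]["*"] could be sketched as below. The helper name headersFromCgi is hypothetical; replacing underscores with dashes follows the note for CGI implementers under Open questions, and the mapping is lossy because CGI does not preserve whether a header originally contained an underscore.

```javascript
// Collect request headers from a CGI-style environment into a
// lower-case headers object.
function headersFromCgi(cgiEnv) {
    var headers = {};
    Object.keys(cgiEnv).forEach(function (key) {
        var name;
        if (key.indexOf("HTTP_") === 0) {
            name = key.slice("HTTP_".length);
        } else if (key === "CONTENT_TYPE" || key === "CONTENT_LENGTH") {
            // CGI passes these two without the HTTP_ prefix.
            name = key;
        } else {
            return; // Not a header; left to the top-level environment.
        }
        headers[name.toLowerCase().replace(/_/g, "-")] = cgiEnv[key];
    });
    return headers;
}

// Hypothetical CGI variables, for illustration only.
var cgiHeaders = headersFromCgi({
    REQUEST_METHOD: "POST",
    HTTP_USER_AGENT: "example-client",
    HTTP_HOST: "example.org",
    CONTENT_TYPE: "text/plain",
    CONTENT_LENGTH: "12"
});
```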

Server or application variables:

  • Instead of differentiating these by requiring a ".", we require added top-level keys to contain an object

Notes

The "jsgi" property contains only metadata related to JSGI itself, not HTTP; the top level contains the HTTP information. "input" moves to the top-level "body", and "url_scheme" becomes "scheme", like a servlet's request.

Open questions

The "CGI" problem
JSGI server implementations built atop CGI SHOULD provide a truthy .jsgi.cgi parameter (in case we discover multiple levels of cgi FAIL)
  • DL: What should our NOTE FOR CGI IMPLEMENTERS say? They have to s/_/-/g header keys, for one.
Response key casing
(a) lower-case
(b) UPPER-CASE
(c) Mixed-case
(d) specify a case-insensitive object
(e) unspecified
  • DL: (a) +1; (c), (d), (e) -1
Content-Length -- where should it appear?
(a) .headers["content-length"] as String
(b) .contentLength as integer, added by server if response is buffered
(c) Both
  • DL: (a) +1; (c) -1: including both would violate our principle of non-duplication
Header modification
Should it be noted whether servers or applications are encouraged, discouraged or prevented from adding or modifying headers (e.g. .headers["content-length"])? Is there any rationale for preventing this (e.g. the headers SHOULD reflect the initial state of the request)?
  • DL: -1, I think it should stay unspecified
More spec-like
Should we call out normative references where they have been codified for required environment variable? This would allow us to explicitly override them in specific cases (PATH_INFO, SCRIPT_NAME).
  • DL: +1: I'll add them -- they'll be easy to remove if this is a bad idea
Response Reason
Specifying a .reason or .statusText on the response object would duplicate information. If this is necessary servers could provide a mechanism to make this happen -- does this merit a note?
  • DL: probably not: "The client is not required to examine or display the Reason-Phrase." -- RFC 2616
Replacing the "request" object
Overwriting the "request" object with a new environment -- should this be explicitly disallowed? If L0 middleware does this it could break L1 middleware. But in order for L1 to exist on an L0 server it would need to wrap the request. The function of L1 doesn't belong in this discussion, but we should probably at least discuss the implications here.
  • DL: +1 on explicitly disallowing this
Response object defaults
Should applications and middleware be REQUIRED to return a full response object -- status, headers and body? If not, middleware developers will always have to feature-test the response -- is this a problem?
  • DL: a related question is async testing promises
Async
How much should we say about async usage patterns? Kris Zyp has gone into great detail on the list, discussions we could link to -- but what kind of normative language should the spec contain about async?
Top-level env conflicts
When additional CGI keys are added to the environment, they should exist at the top level and be camelCased -- this opens the door to conflicts with user-added keys. How do we fix this?
  • DL: I've got a suggestion on the ML

Acknowledgements

This specification is adapted from the Rack specification written by Christian Neukirchen.

Some parts of this specification are adopted from PEP333: Python Web Server Gateway Interface v1.0.