Modules/PythonicModules

From CommonJS Spec Wiki

STATUS: PROPOSAL, NO LONGER BEING PURSUED, SEE Modules

Pythonic Modules

This proposal describes a module system similar to the one implemented in RingoJS. I call it Pythonic Modules because it is heavily inspired by the way modules are implemented in Python. It provides protection against name collisions by isolating module scopes, while being reasonably easy to implement in a server or standalone JavaScript runtime.

I have written about this elsewhere, but here I try to restate its essential properties in a more general way. For the purpose of this proposal, I will refer to a generic JavaScript runtime that implements the Pythonic module system. Note that while our JavaScript runtime uses the file system to store and access its modules, this is not a requirement of this proposal.

Scripts are modules

For our JavaScript runtime, every script represents a module. This is true for all scripts, regardless of whether they are part of a core library or a user-written application. There is nothing special a script must contain in order to make it a module.

Module names

The JavaScript runtime locates modules by looking for files within one or more directories, which we call module directories. A module name translates to a file name by adding the .js extension to it. Thus, when our JavaScript runtime tries to load a module named A, it will look for a file called A.js in its module directories. Modules that live in subdirectories of a module directory are accessed using a dotted module name where each element in the module name corresponds to an element in the file path. For example, a module named A.B.C will cause our JavaScript runtime to search its module directories for a file called A/B/C.js.
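The name-to-file mapping described above could be sketched as follows. The function names moduleNameToPath and candidatePaths are hypothetical, and rejecting empty name elements is an assumption of this sketch; the proposal does not say which module names are valid.

```javascript
// Hypothetical helper: turn a dotted module name into a relative file path.
function moduleNameToPath(name) {
  const elements = name.split('.');
  // Assumption: names such as "A..B" or "" are treated as errors.
  if (elements.some(function (element) { return element === ''; })) {
    throw new Error('Invalid module name: ' + name);
  }
  return elements.join('/') + '.js';
}

// Hypothetical helper: list the files the runtime would try, in order,
// for a given list of module directories.
function candidatePaths(name, moduleDirs) {
  const relative = moduleNameToPath(name);
  return moduleDirs.map(function (dir) {
    // Strip trailing slashes so that "lib/" and "lib" behave the same.
    return dir.replace(/\/+$/, '') + '/' + relative;
  });
}

const path = moduleNameToPath('A.B.C'); // 'A/B/C.js'
```

A runtime would then load the first candidate path that actually exists.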

Every module has its own scope

This is maybe the most radical step away from JavaScript as we know it, since the shared global scope is one of JavaScript's more prominent features. But it is also one of the most criticized, and one that seriously hampers the development of truly large-scale applications. As it turns out, giving each script its own top-level scope is both easy and backwards compatible.

When our JavaScript runtime starts up, it creates the familiar global JavaScript object containing the Object, Array, Date, Math, etc. objects. However, whenever the JavaScript runtime loads a module, it doesn't use the global object but instead creates a new, empty JavaScript object to evaluate the module on. This object, which we call the module scope, has two important features:

1. It represents a top-level scope, i.e. its parent scope is set to null.
2. It has the shared global object in its prototype chain.

This ensures that module code will never pollute the shared global object (or any other module scope, for that matter), because the module scope is the top-most object in its scope chain, while module code can still see the standard global objects through the module scope's prototype chain. Users of our JavaScript runtime can just write global functions and variables, even accidentally omitting the var keyword, without any risk of interfering with other modules or global code.
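The prototype-chain half of this setup can be illustrated in plain JavaScript. This is only a sketch: sharedGlobal is a stand-in object rather than the real global, createModuleScope is a hypothetical name, and setting the parent scope to null is an engine-level operation that cannot be expressed here. Property writes on the scope objects stand in for what top-level assignments in module code would do.

```javascript
// Stand-in for the shared global object (assumption for this sketch).
const sharedGlobal = { Math: Math, Array: Array };

// Hypothetical helper: a new, empty object whose prototype is the shared global.
function createModuleScope(globalObject) {
  return Object.create(globalObject);
}

const scopeA = createModuleScope(sharedGlobal);
const scopeB = createModuleScope(sharedGlobal);

// Writes land on the module scope itself, never on its prototype.
scopeA.counter = 1;
// Even shadowing a standard name stays local to scopeA.
scopeA.Math = null;

// Reads fall through to the shared global when the scope has no own property.
const mathSeenByB = scopeB.Math;       // the real Math object
const counterSeenByB = scopeB.counter; // undefined
```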

Importing modules

Since modules are shielded from each other, there must be a well-defined way for one module to load and make use of another module. Our JavaScript runtime provides one global require() function to allow one module to load and use another:

require('modulename')

This causes our JavaScript runtime to load the module with the given name, evaluate it on a module scope, and return the module scope to the caller.

Visibility of loaded modules

One great feature of the separated module scopes is protection against name collisions as described above. Another, maybe equally important feature is the fact that imported modules and module properties are only visible to the very modules that imported them.
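As a rough illustration of this visibility rule, module scopes can be modelled as plain objects that share a prototype. All names here (sharedGlobal, moduleA, moduleB, moduleC, helper) are hypothetical, and the assignment to moduleA.c stands in for module A storing the result of a require() call in one of its own variables.

```javascript
// Stand-in for the shared global object (assumption for this sketch).
const sharedGlobal = {};

// Three module scopes, each with the shared global as prototype.
const moduleA = Object.create(sharedGlobal);
const moduleB = Object.create(sharedGlobal);
const moduleC = Object.create(sharedGlobal);

// Something module C defines at its top level.
moduleC.helper = function () { return 'from C'; };

// Module A imports C: the returned module scope is stored in A's own scope.
moduleA.c = moduleC;

const resultInA = moduleA.c.helper(); // 'from C'
const visibleInB = 'c' in moduleB;    // false
```

Module B would have to import C itself in order to use it.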

Module loading and caching

Our JavaScript runtime keeps a map of loaded modules. Before a module is loaded, the runtime first checks whether the module has already been loaded. If so, the existing module scope is reused. To be able to deal with cyclic imports, our runtime also makes sure that a module scope is registered in the map of loaded modules before the module is evaluated.
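The caching behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the RingoJS implementation: createRuntime is a hypothetical name, and modules are modelled as functions that receive their scope and a require function, because plain JavaScript cannot evaluate source text on an arbitrary scope object.

```javascript
// Hypothetical runtime: "definitions" maps module names to functions that
// stand in for script files (an assumption of this sketch).
function createRuntime(definitions, sharedGlobal) {
  const loaded = new Map(); // module name -> module scope

  function requireModule(name) {
    // Reuse the existing module scope if the module was loaded before.
    if (loaded.has(name)) {
      return loaded.get(name);
    }
    if (!Object.prototype.hasOwnProperty.call(definitions, name)) {
      throw new Error('Module not found: ' + name);
    }
    const scope = Object.create(sharedGlobal);
    // Register the scope BEFORE evaluating, so cyclic imports terminate.
    loaded.set(name, scope);
    definitions[name](scope, requireModule);
    return scope;
  }

  return { require: requireModule, loaded: loaded };
}

// Two modules that import each other.
const evaluations = [];
const runtime = createRuntime({
  a: function (scope, require) {
    evaluations.push('a');
    scope.name = 'module a';
    scope.b = require('b');
  },
  b: function (scope, require) {
    evaluations.push('b');
    scope.a = require('a');        // a's scope, only partially populated
    scope.seenName = scope.a.name; // already set when b is evaluated
  }
}, {});

const a = runtime.require('a');
```

Note that during a cyclic import a module may receive a scope that is not yet fully populated: here, b sees a.name but would not yet see a.b.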

Programming Notes

All top-level variables inside the module are exported. Variables can be hidden by wrapping them in a function (the module pattern) and declaring them with "var". A variable assigned inside the function without "var" is still exported.

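(function() {
  var modulePrivate = 10;  // declared with "var" inside a function: not exported
  moduleExported = 11;     // no "var": ends up on the module scope, exported
})();
var otherModuleExported = 12;  // top-level variable: exported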

Open issues

There was a discussion on IRC about the separator in the require argument: should it be require('A.B') or require('A/B')? Other options exist, but these two seemed to be the most popular.

As discussed on IRC, the argument to require probably shouldn't have a file extension (e.g. ".js" or ".so"), because that ties the calling code to a particular implementation. The JavaScript code should not need to know how a library is implemented.

Should the module lookup path be mutable during program execution? Perhaps an array require.path could be mutated for this purpose. This may be tricky to implement and seems non-essential. Can this be postponed to a later edition of the spec so as not to slow things down?

Should the list of automatically tried module file extensions be mutable during program execution? Something like require.extensions = ["dll", "js"]?

Should there be a require.loaded array-valued property with all the loaded module names or scopes? This seems non-essential. Can this be postponed to a later edition of the spec so as not to slow things down?

What is the value of "this" when evaluated in the module scope?