IO/B/Buffer

From CommonJS Spec Wiki

This proposal defines an io/buffer module that provides three classes: Buffer, StringBuffer, and BlobBuffer.

This proposal depends on Binary/C and uses much of its terminology; refer to Binary/C for more information.

Binary/B originally defined a ByteArray, but none of the prior art actually implemented a ByteArray as Binary/B proposes. Most implementations provided a Blob type similar to Binary/B's ByteString, and those that implemented something called ByteArray actually built something closer to a buffer with a stream-based API than to anything resembling an Array.

This proposal and Binary/C are based on APIs drafted for MonkeyScript (Blob Buffer).

Prior art

Java's java.lang.StringBuffer is a very good reference for prior art. It is made for strings rather than bytes, but it is nonetheless an API designed solely for mutating a string, not one designed for one purpose and hacked to suit another.

Java's StringBuffer grows the buffer by appending and inserting strings with .append() and .insert(). .delete() removes portions of the buffer; .indexOf() and .lastIndexOf() search it; .replace() and .reverse() are available; .length() reports the length of the data itself; .capacity() reports the amount of memory currently allocated; and .substring() extracts a substring from the StringBuffer.

The API

Buffers

Buffers are represented by three classes: Buffer, StringBuffer, and BlobBuffer. Buffer itself is the generic class; calls to it will normally create either a StringBuffer or a BlobBuffer. Both StringBuffer and BlobBuffer should inherit from Buffer, so that buf instanceof Buffer returns true for either.

Buffers may implement smart resizing in the background (e.g. padding arrays to avoid reallocating on each insert), but information about this is not exposed through the JavaScript API.

Buffers will only take their own data type as arguments. If you try to insert a String into a BlobBuffer or a Blob into a StringBuffer a TypeError will be thrown.
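The inheritance and type-checking rules described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Sketch* class names are hypothetical stand-ins, not the io/buffer module itself, and a Uint8Array stands in for Binary/C's Blob.

```javascript
// Hypothetical stand-ins; this is not the io/buffer module itself.
class SketchBuffer {}

class SketchStringBuffer extends SketchBuffer {
  constructor() {
    super();
    this.chars = []; // one entry per character
  }
  append(data) {
    // Text buffers accept only strings; there is no implicit conversion.
    if (typeof data !== "string") {
      throw new TypeError("StringBuffer accepts only String data");
    }
    this.chars.push(...data.split(""));
    return this;
  }
}

class SketchBlobBuffer extends SketchBuffer {
  constructor() {
    super();
    this.bytes = []; // one entry per byte
  }
  append(data) {
    // Assumption: Uint8Array stands in for Binary/C's Blob in this sketch.
    if (!(data instanceof Uint8Array)) {
      throw new TypeError("BlobBuffer accepts only Blob data");
    }
    this.bytes.push(...data);
    return this;
  }
}

const textBuf = new SketchStringBuffer().append("abc");
const binBuf = new SketchBlobBuffer().append(new Uint8Array([1, 2]));
console.log(textBuf instanceof SketchBuffer, binBuf instanceof SketchBuffer); // true true

let rejected = false;
try {
  binBuf.append("text"); // a String offered to a binary buffer
} catch (e) {
  rejected = e instanceof TypeError;
}
console.log(rejected); // true
```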

new Buffer();
No-op. This simply creates an instance of Buffer. On its own the Buffer class does nothing; this form exists only so that prototypes that inherit from Buffer can be made.
new StringBuffer();
Creates a new empty text buffer.
new BlobBuffer();
Creates a new empty binary buffer.
new StringBuffer(len);
Creates a new text buffer of len size.
new BlobBuffer(len);
Creates a new binary buffer of len size.
new Buffer(String);
Creates a new empty StringBuffer.
new Buffer(Blob);
Creates a new empty BlobBuffer.
new Buffer(String, len);
Creates a new StringBuffer of len size.
new Buffer(Blob, len);
Creates a new BlobBuffer of len size.
new Buffer(string);
Creates a new StringBuffer with the same size and contents as the string.
new Buffer(blob);
Creates a new BlobBuffer with the same size and contents as the blob.
buf.length;
buf.length = len;
Get or set the length of the buffer (For binary buffers this is number of bytes, for text buffers this is number of characters).
When length is set the buffer is dynamically resized. If shrunk it is truncated to size discarding items from the end. If grown the buffer is padded with 0 bytes for binary, and '\0' (null characters) for text.
buf.contentConstructor;
Returns Blob from a BlobBuffer to indicate it has binary content, and String from StringBuffer to indicate it has text content. Implementations should make an effort to make this readonly.
blob.codeAt(index);
Extracts a single byte from the blob and returns it as an unsigned integer (Number) in the range 0..255.
buf.splice(offset, length, data, ...);
Remove a section of the buffer and insert chunks of data starting from the place it was removed from. If data is another Buffer memcopy should be used.
buf.slice();
buf.slice(start);
buf.slice(start, end);
Extract a subsection of the buffer and return it as a new sequence. (Behaves the same as the string and blob counterparts)
buf.copy(data, offset, length, [dataOffset]);
Uses memcopy to copy a section of data directly into buf. data may either be another buffer of the same type, or a sequence (String/Blob) of same type as indicated by contentConstructor.
buf.indexOf(sequence, offset=0);
buf.lastIndexOf(sequence, offset=0);
Returns the index within the calling buffer object of the first or last (depending on which method is used) occurrence of the specified value, or -1 if not found.
buf.valueOf();
Return the non-mutable sequence for the buffer.
  • In a BlobBuffer this returns a Blob which matches the contents of the buffer.
  • In a StringBuffer this returns a String which matches the contents of the buffer.
buf.valueAt(index);
Returns a string or blob representing the unit at a specified index.
buf.append(data);
Append a chunk of data to the end of the buffer growing it by data.length. If data is another Buffer memcopy should be used.
buf.insert(data, index);
Insert a chunk of data into a buffer growing it by data.length and shifting the data to the right of the specified index towards the end of the buffer. If data is another Buffer memcopy should be used.
buf.clear(start, length);
Zero out a section of the buffer. Binary buffers have bytes replaced with 0 bytes and text buffers have characters replaced with '\0' (null characters).
buf.fill(start, length, seq);
Fill a section of the buffer with seq, overwriting the bytes (for binary buffers) or characters (for text buffers) that were there.
buf.replace(data, index);
Insert data into the buffer starting from an index overwriting any existing bytes after that index up to the end of the data. Expands the buffer if necessary (index + data.length > buf.length).
buf.remove(offset, length);
Remove a section of the buffer starting at offset and continuing for length units, shrinking it by length.
buf.split();
buf.split(separator);
buf.split(separator, limit);
Splits the buffer on a separator sequence and returns an array of strings or blobs. (For text buffers, implementations may or may not choose to support regular expressions as separators.)
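The length and mutation semantics listed above can be modelled on a plain array. The following is only a sketch for text buffers: TextBufferSketch and its units field are hypothetical names, argument validation and type checks are omitted, and a real implementation would be expected to use memcopy rather than array splicing.

```javascript
// Hypothetical model of the text-buffer semantics; not the real io/buffer module.
class TextBufferSketch {
  constructor(text = "") {
    this.units = text.split(""); // one array slot per character
  }
  get length() {
    return this.units.length;
  }
  set length(len) {
    const old = this.units.length;
    this.units.length = len; // truncates from the end when shrinking
    if (len > old) this.units.fill("\0", old); // pads with null characters when growing
  }
  append(data) {
    return this.insert(data, this.length);
  }
  insert(data, index) {
    // Shifts everything from index onward to the right by data.length.
    this.units.splice(index, 0, ...data.split(""));
    return this;
  }
  replace(data, index) {
    // Overwrites in place; the array grows if data runs past the end.
    this.units.splice(index, data.length, ...data.split(""));
    return this;
  }
  remove(offset, length) {
    this.units.splice(offset, length);
    return this;
  }
  clear(start, length) {
    this.units.fill("\0", start, start + length);
    return this;
  }
  valueOf() {
    return this.units.join("");
  }
}

const buf = new TextBufferSketch("hello");
buf.append(" world"); // "hello world"
buf.insert(",", 5);   // "hello, world"
buf.replace("W", 7);  // "hello, World"
buf.remove(5, 1);     // "hello World"
buf.length = 5;       // "hello"
buf.length = 7;       // "hello" followed by two null characters
console.log(JSON.stringify(buf.valueOf())); // "hello\u0000\u0000"
```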

Ranges

IO/B/Buffer supports an additional pattern for memcopy of data from one mutable or non-mutable sequence to another mutable sequence.

This part of IO/B/Buffer extends Binary/B, the native String, and IO/B/Stream/Raw. Under this extension, all Buffer and stream methods that accept a sequence of data to be placed into the buffer or stream MUST accept an OpaqueRange and SHOULD memcopy that data. In addition, String, Blob, and Buffer MUST provide the following method:

.range(???, ???) -> OpaqueRange
Returns an OpaqueRange which can be used to memcopy the specified range of data from the sequence.

OpaqueRange

An OpaqueRange MUST contain enough information to reconstruct a specific range of data from the sequence .range was called on. Normally this may consist of the original sequence and two numbers that determine which portion of the sequence to memcopy, or it could consist of the chunk of data itself in memory.

An OpaqueRange is NOT required to remain safe after first use, and is NOT required to provide valid data if it is not passed directly to a method for use. In short, the ONLY supported use of .range is one such as bufB.append(bufA.range(???, ???)); where the range is passed directly to a method like .append that consumes it. The result of reusing an already-used range, or of storing an OpaqueRange for any period of time, is not defined and not supported, and may be anything from errors to corrupt or unexpected data to a segfault of the application.
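The single-use rule can be sketched as follows. This is an illustration under stated assumptions, not the proposal's design: the proposal leaves the arguments of .range unspecified ("???"), so the (start, end) pair used here is hypothetical, as are the names OpaqueRangeSketch, rangeOf, and AppendOnlySketch. The proposal leaves reuse undefined; throwing an error is just one of the outcomes it allows.

```javascript
// Hypothetical sketch. The (start, end) arguments are an assumption; the
// proposal does not say what .range takes.
class OpaqueRangeSketch {
  constructor(source, start, end) {
    this.source = source;
    this.start = start;
    this.end = end;
    this.used = false;
  }
  // Called by the consuming method; hands the data over exactly once.
  consume() {
    if (this.used) throw new Error("OpaqueRange already used");
    this.used = true;
    const data = this.source.slice(this.start, this.end);
    this.source = null; // drop the reference so the range cannot be replayed
    return data;
  }
}

// Stand-in for sequence.range(???, ???).
function rangeOf(sequence, start, end) {
  return new OpaqueRangeSketch(sequence, start, end);
}

class AppendOnlySketch {
  constructor() {
    this.text = "";
  }
  append(data) {
    this.text += data instanceof OpaqueRangeSketch ? data.consume() : data;
    return this;
  }
}

const target = new AppendOnlySketch();
target.append(rangeOf("hello world", 0, 5)); // supported: passed straight to append
console.log(target.text); // hello

const stored = rangeOf("hello world", 6, 11); // unsupported: the range is kept around
target.append(stored);
let reuseFailed = false;
try {
  target.append(stored); // second use of the same range
} catch (e) {
  reuseFailed = true;
}
console.log(reuseFailed); // true
```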

Abstract API

Like Binary/C, IO/B/Buffer is able to work abstractly on both strings and blobs. Every method on the Buffer classes is built abstractly. There are a number of useful code stubs:

Create a new buffer with the same type as another form of data (be it a string, blob, another buffer, stream, or anything following the same contentConstructor rule).

var buf = new Buffer(data.contentConstructor);

Cast some data to the same type as the sequence type of a buffer (to string or to blob, or throw an error if bad data; good when writing Buffer prototypes):

buf.contentConstructor(data);

Create a new empty sequence of the same type as the sequence type of a buffer (empty string or blob, like what you'd return to signal EOF).

buf.contentConstructor();

Get a character or byte in sequence form of the same type as a buffer corresponding to a byte or character code. In this case, the most useful purpose is returning the 0 byte or null character for an operation like clearing data.

buf.contentConstructor.fromCode(0);

These are key to string/blob abstract programming and must be supported by implementations.
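A rough sketch of how these stubs let code stay type-agnostic follows. The names TextContent, BinaryContent, zeroUnit, and emptyLike are hypothetical stand-ins: Binary/C defines the real content constructors, fromCode is not a method of the native JavaScript String, and a plain array of byte values stands in for a Blob here.

```javascript
// Hypothetical content constructors, for illustration only.
function TextContent(data = "") {
  if (typeof data !== "string") throw new TypeError("expected a String");
  return data;
}
TextContent.fromCode = (code) => String.fromCharCode(code);

function BinaryContent(data = []) {
  // Assumption: an array of byte values stands in for a Blob.
  if (!Array.isArray(data)) throw new TypeError("expected a Blob");
  return data.slice();
}
BinaryContent.fromCode = (code) => [code & 0xff];

// Neither helper names String or Blob; both rely only on contentConstructor.
function zeroUnit(buf) {
  return buf.contentConstructor.fromCode(0);
}
function emptyLike(buf) {
  return buf.contentConstructor();
}

// Minimal objects that follow the contentConstructor rule.
const textLike = { contentConstructor: TextContent };
const binaryLike = { contentConstructor: BinaryContent };

console.log(JSON.stringify(zeroUnit(textLike)));   // "\u0000"
console.log(JSON.stringify(zeroUnit(binaryLike))); // [0]
console.log(JSON.stringify(emptyLike(textLike)));  // ""
```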

Notes

  • Buffer was made independent of whether the data is binary or text. To avoid implicit string conversion, a TypeError is thrown when data of the incorrect type is given to a buffer. You are still able to write code using Buffer that works on either strings or blobs and doesn't care which mode it is in.
    • Note how Buffer accepts String or Blob to determine its data type. You could write code like var buf = new Buffer(sequence.constructor); and create a buffer based on the type of a sequence without checking what it is.
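The difference between passing a constructor and passing an instance can be sketched for the text case. createTextBuffer is a hypothetical factory standing in for new Buffer(...), and the object it returns is a simplified placeholder rather than a real StringBuffer.

```javascript
// Hypothetical factory mirroring two of the `new Buffer(...)` forms, text only.
function createTextBuffer(arg) {
  if (arg === String) {
    // Constructor given: an empty text buffer.
    return { contentConstructor: String, units: [] };
  }
  if (typeof arg === "string") {
    // Instance given: same size and contents as the string.
    return { contentConstructor: String, units: arg.split("") };
  }
  throw new TypeError("expected String or a string value");
}

const sequence = "abc";
const sameType = createTextBuffer(sequence.constructor); // empty, typed like sequence
const copy = createTextBuffer(sequence);                 // holds "abc"
console.log(sameType.units.length, copy.units.join("")); // 0 abc
```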