TIP 450: Add [binary] subcommand "set" for in-place modification

Login
Bounty program for improvements to Tcl and certain Tcl packages.
Author:		Arjen Markus <arjen.markus895@gmail.com>
State:		Draft
Type:		Project
Vote:		Pending
Created:	18-Jul-2016
Post-History: 
Tcl-Version:	8.7
Keywords:	Tcl, binary data

Abstract

This TIP proposes a simpler extension of the [binary] command than the related [418], namely just a subcommand set that updates an existing byte array in a variable instead of creating a new one like binary format. It does not propose an extension of the various formatting codes.

Rationale

As already argued in [418], the binary command is efficient in creating new objects of binary data or in parsing existing objects with such data. It is not currently efficient in updating existing objects. However, such data objects are commonly used by compiled extensions.

As a consequence, if you want to manipulate such data objects from Tcl directly, it is easier to parse the object into, say, a list of numbers, use list commands like lset to replace individual values and pack it into a new binary array before passing it to a compiled extension.

Specification

This TIP proposes to add a subcommand set to the binary command with the following signature:

binary set varName formatString arg1 arg2 arg3 ...

The effect of this subcommand is that the byte array data contained in the variable "varName" is updated in a manner analogous to lset, but using a format string like binary format. It could be implemented in Tcl as:

set varName [binary format "a*$formatString" $varName $arg1 $arg2 $arg3 ...]

except that this allocates a new block of memory, sets that to null, copies the contents of varName into that new block and then does the update.

The new command will have the effect that the first few steps are not necessary anymore.

Implementation Notes

Besides the nominal case of a variable that contains a binary array that is to be updated within the bounds of that array, three other cases exist and need to be prepared for:

  • The variable varName does not exist yet

  • The variable varName does not contain a binary array

  • The updating would go past the memory allocated for the binary array

Each of these cases and perhaps others will have to be taken care of. The first case might be treated as if binary format was meant. For the second case the implementation can convert the current value.

The third case might either cause an error (we are updating an existing block of memory after all) or silently extend the memory, effectively performing what the Tcl implementation shown above would do. If an error is thrown, then the first case should probably throw an error as well.

Reference Implementation

To be committed to a fossil branch.

A few remarks:

  • The third case is not completely treated yet (see "TODO" in the code)

  • The binary set command does not properly invalidate the string representation. The binary array is, however, updated properly - at least according to the very limited tests that were performed.

  • There are no proper test cases yet.

  • There is no proper performance measurement yet; this could be based on Wiki page http://wiki.tcl.tk/44363.

  • It does not deal with the possibility that the binary array is shared.

Copyright

This document is placed in public domain.

History