TIP 720: Updated Tcl Bytecode Opcodes

Author:         Donal Fellows <dkf@users.sourceforge.net>
State:          Voting
Type:           Project
Created:        09-May-2025
Tcl-Version:    9.1
Tcl-Branch:     no-variable-width-instruction-issue

Abstract

This TIP proposes changing the set of bytecodes used in the Tcl bytecode engine. The primary goal is simplification, to make the bytecode compiler easier to maintain.

Rationale

Tcl bytecode is complex to issue, and quirky in quite a few places. Chief among those are:

  • Some opcodes come in two variants (one with a single-byte operand and one with a four-byte operand), especially the ones relating to jumps. That makes rewriting bytecodes in the optimiser much more difficult, and more confusing too. It also requires a significantly more elaborate scheme for creating the jumps, as the fulfilment of a forward jump may result in previously issued code having to be moved. That's horrible.
  • Some opcodes take operands that are only a single byte wide, a distinct limitation at times. This is particularly a problem for opcodes relating to incr, which only have a single byte for the variable index. Procedures with more than 256 local variables are not the most common case, but are common enough.
  • The INST_RETURN_CODE_BRANCH opcode effectively does address arithmetic using the current Tcl result code, and that gives me the cold shivers.
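
To illustrate the first point, here is a minimal sketch of why a fixed-width jump is easier to issue. It is not the Tcl compiler's code: the ToyCode buffer, the Toy* helpers and the opcode numbers are all hypothetical, and the sketch assumes offsets are stored most significant byte first and measured from the start of the jump instruction. With a four-byte operand reserved up front, a forward jump over 200 bytes (too far for a signed single-byte offset) is resolved by patching in place, so nothing already issued has to move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode numbers for this sketch only. */
enum { TOY_NOP = 0x01, TOY_JUMP = 0x02 };

typedef struct {
    uint8_t code[256];
    size_t len;                 /* Number of bytes issued so far. */
} ToyCode;

/* Store a signed 32-bit operand, most significant byte first. */
static void ToyStore4(uint8_t *p, int32_t value)
{
    uint32_t u = (uint32_t) value;
    p[0] = (uint8_t) (u >> 24);
    p[1] = (uint8_t) (u >> 16);
    p[2] = (uint8_t) (u >> 8);
    p[3] = (uint8_t) u;
}

/* Read back an operand written by ToyStore4(). */
static int32_t ToyLoad4(const uint8_t *p)
{
    uint32_t u = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
            | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
    return (int32_t) u;
}

/* Issue some filler instructions to stand in for a compiled body. */
static void ToyNops(ToyCode *c, size_t count)
{
    assert(c->len + count <= sizeof(c->code));
    while (count-- > 0) {
        c->code[c->len++] = TOY_NOP;
    }
}

/* Issue a jump whose target is not yet known; returns where it was put. */
static size_t ToyForwardJump(ToyCode *c)
{
    size_t at = c->len;
    assert(c->len + 5 <= sizeof(c->code));
    c->code[c->len++] = TOY_JUMP;
    ToyStore4(&c->code[c->len], 0);     /* Placeholder offset. */
    c->len += 4;
    return at;
}

/* Resolve the jump to the current address by patching it in place. */
static void ToyResolveJump(ToyCode *c, size_t jumpAt)
{
    ToyStore4(&c->code[jumpAt + 1], (int32_t) (c->len - jumpAt));
}
```

A caller would issue the jump with ToyForwardJump(), issue the body, then call ToyResolveJump() with the saved position. With a one-byte variant in play, the same resolution step would first have to check whether the distance fits and, if not, grow the instruction and shift the body.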

Additionally, the instruction sequences for some commands (especially try) can be very complex. We should simplify.

Specification

Except where noted below, the tcl::unsupported::assemble command is already transparently aware of these changes.

Deprecations of old opcodes

This TIP proposes to deprecate these opcodes:

  • INST_PUSH1
  • INST_INVOKE_STK1
  • INST_LOAD_SCALAR1
  • INST_LOAD_SCALAR_STK
  • INST_LOAD_ARRAY1
  • INST_STORE_SCALAR1
  • INST_STORE_SCALAR_STK
  • INST_STORE_ARRAY1
  • INST_INCR_SCALAR1
  • INST_INCR_ARRAY1
  • INST_INCR_SCALAR1_IMM
  • INST_INCR_ARRAY1_IMM
  • INST_JUMP1
  • INST_JUMP_TRUE1
  • INST_JUMP_FALSE1
  • INST_APPEND_SCALAR1
  • INST_APPEND_ARRAY1
  • INST_LAPPEND_SCALAR1
  • INST_LAPPEND_ARRAY1
  • INST_RETURN_CODE_BRANCH
  • INST_TAILCALL1 (renamed from INST_TAILCALL)
  • INST_TCLOO_NEXT1 (renamed from INST_TCLOO_NEXT)
  • INST_TCLOO_NEXT_CLASS1 (renamed from INST_TCLOO_NEXT_CLASS)

Where known to be supported by the compiler, these elements of the TclInstruction enumeration will be marked with the deprecated attribute so that uses of them will result in warnings. If REMOVE_DEPRECATED_OPCODES is defined during compilation, they will be entirely elided including their bytecode engine implementations (resulting in a bytecode engine that cannot have bytecodes for Tcl 9.0 loaded into it, a non-issue without the use of tbcload).

Renaming of opcodes

The following opcodes are renamed (with no other change to them):

  • INST_PUSH4 to INST_PUSH
  • INST_INVOKE_STK4 to INST_INVOKE_STK
  • INST_LOAD_SCALAR4 to INST_LOAD_SCALAR
  • INST_LOAD_ARRAY4 to INST_LOAD_ARRAY
  • INST_STORE_SCALAR4 to INST_STORE_SCALAR
  • INST_STORE_ARRAY4 to INST_STORE_ARRAY
  • INST_JUMP4 to INST_JUMP
  • INST_JUMP_TRUE4 to INST_JUMP_TRUE
  • INST_JUMP_FALSE4 to INST_JUMP_FALSE
  • INST_BEGIN_CATCH4 to INST_BEGIN_CATCH
  • INST_APPEND_SCALAR4 to INST_APPEND_SCALAR
  • INST_APPEND_ARRAY4 to INST_APPEND_ARRAY
  • INST_LAPPEND_SCALAR4 to INST_LAPPEND_SCALAR
  • INST_LAPPEND_ARRAY4 to INST_LAPPEND_ARRAY

New replacement opcodes

These replace existing, similarly-named, opcodes with versions with wider operands.

  • INST_INCR_SCALAR
  • INST_INCR_ARRAY
  • INST_INCR_SCALAR_IMM
  • INST_INCR_ARRAY_IMM
  • INST_TAILCALL
  • INST_TCLOO_NEXT
  • INST_TCLOO_NEXT_CLASS

Completely new opcodes

  • INST_SWAP: This swaps the two elements on the top of the stack, and is significantly more efficient than INST_REVERSE 2.
  • INST_ERROR_PREFIX_EQ: This is a special comparison for handling trap clauses in try. Due to it requiring the two arguments to be different objects, this is not exposed by any other mechanism.
  • INST_TCLOO_ID: This is info object creationid; it's cheaply available information.
  • INST_DICT_PUT: This lets code add a key/value to a dictionary value (i.e., the guts of dict replace). It also simplifies try.
  • INST_DICT_REMOVE: This lets code remove a key from a dictionary value (i.e., the guts of dict remove). It completes the set of operations, given that INST_DICT_PUT is there.
  • INST_IS_EMPTY: This provides access to the new Tcl_IsEmpty() function. It is typically introduced by the bytecode optimiser when presented with code like expr {$val eq ""}.
  • INST_JUMP_TABLE_NUM: This is similar to INST_JUMP_TABLE except that keys are integers (up to what can be expressed in a Tcl_Size). This simplifies many cases of try, replaces INST_RETURN_CODE_BRANCH in subst, and is expected to power a proper switch -integer mode in a future TIP.
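
As a rough illustration of the INST_SWAP entry above, the following sketch models an operand stack of plain integers (standing in for Tcl values) and shows that a dedicated swap does the same job as the general reverse operation applied to two elements, without needing a count operand or a loop. The ToyStack type and the Toy* functions are hypothetical and are not taken from the bytecode engine.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int items[16];
    size_t depth;               /* Number of items currently on the stack. */
} ToyStack;

static void ToyPush(ToyStack *s, int value)
{
    assert(s->depth < 16);
    s->items[s->depth++] = value;
}

/* General operation: reverse the order of the top n stack items. */
static void ToyReverse(ToyStack *s, size_t n)
{
    size_t lo, hi;
    assert(n <= s->depth);
    lo = s->depth - n;
    hi = s->depth;
    while (lo + 1 < hi) {
        int tmp;
        hi--;
        tmp = s->items[lo];
        s->items[lo] = s->items[hi];
        s->items[hi] = tmp;
        lo++;
    }
}

/* Special case: exchange the top two items; no count and no loop. */
static void ToySwap(ToyStack *s)
{
    int tmp;
    assert(s->depth >= 2);
    tmp = s->items[s->depth - 1];
    s->items[s->depth - 1] = s->items[s->depth - 2];
    s->items[s->depth - 2] = tmp;
}
```

Applying ToySwap() and then ToyReverse() with a count of 2 to the same stack restores the original order, which is the equivalence the opcode relies on.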

Note: INST_JUMP_TABLE_NUM introduces a new aux data type, where the internal model is a hash table with TCL_ONE_WORD_KEYS that maps Tcl_Size to Tcl_Size.
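
The behaviour of an integer-keyed jump table can be sketched as follows. This is only a model: a linear scan over a small array stands in for the hash table described above, the ToySize type stands in for Tcl_Size, and the "fall through when no key matches" behaviour is an assumption made for the sketch. The keys in the usage note are the standard Tcl result codes (TCL_ERROR is 1, TCL_BREAK is 3, TCL_CONTINUE is 4); the offsets are invented.

```c
#include <assert.h>
#include <stddef.h>

typedef ptrdiff_t ToySize;      /* Stand-in for Tcl_Size. */

typedef struct {
    ToySize key;                /* Integer being dispatched on. */
    ToySize offset;             /* Jump offset to use for that key. */
} ToyJumpEntry;

typedef struct {
    const ToyJumpEntry *entries;
    size_t count;
} ToyJumpTableNum;

/*
 * Look up the jump offset for an integer key. If the key is absent,
 * return the caller-supplied fall-through offset instead.
 */
static ToySize ToyJumpTableNumLookup(const ToyJumpTableNum *tablePtr,
        ToySize key, ToySize fallThrough)
{
    size_t i;
    assert(tablePtr != NULL);
    for (i = 0; i < tablePtr->count; i++) {
        if (tablePtr->entries[i].key == key) {
            return tablePtr->entries[i].offset;
        }
    }
    return fallThrough;
}
```

For a table holding the entries {1, 20}, {3, 35} and {4, 50}, looking up 3 gives 35, while looking up 0 (TCL_OK) gives whatever fall-through offset the caller passed. Dispatching on a result code this way is a plain table lookup rather than arithmetic on the program counter.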

New internal types

There are a number of new internal types in tclCompile.h. The main ones of interest are:

  • Tcl_LVTIndex; an alias for Tcl_Size that specifically contains either a reference to a local variable or TCL_INDEX_NONE.
  • Tcl_AuxDataRef; an alias for Tcl_Size that specifically contains an index into the auxiliary data table.
  • Tcl_ExceptionRange; an alias for Tcl_Size that specifically contains a reference to an exception range.
  • Tcl_BytecodeLabel; an alias for Tcl_Size that specifically is treated as if it contains the address of a jump target (replacing JumpFixup records in many cases).
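
A small sketch of what such aliases buy: they document intent in signatures, but because they are plain typedefs the C compiler treats them as interchangeable. The Toy* names below are hypothetical stand-ins that mirror the list above, and the definitions of ToySize and TOY_INDEX_NONE are assumptions made so the sketch compiles without the Tcl headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for Tcl_Size and TCL_INDEX_NONE. */
typedef ptrdiff_t ToySize;
#define TOY_INDEX_NONE ((ToySize) -1)

/* Aliases that say what the number means, not how it is stored. */
typedef ToySize ToyLVTIndex;        /* Local variable index, or none. */
typedef ToySize ToyAuxDataRef;      /* Index into the aux data table. */
typedef ToySize ToyExceptionRange;  /* Reference to an exception range. */
typedef ToySize ToyBytecodeLabel;   /* Treated as a jump target address. */

/* Does this index refer to an actual local variable? */
static int ToyHasLocal(ToyLVTIndex index)
{
    return index != TOY_INDEX_NONE;
}

/* Convert a valid local variable index to a slot number. */
static ToySize ToyLocalSlot(ToyLVTIndex index)
{
    assert(ToyHasLocal(index));
    return index;
}
```

Since the index is a full-width integer, a slot such as 300 is representable, which a single-byte operand could not hold.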

New instruction issuing macros

Note that some of these were previously used in just one file. For their usage and exact definitions, see the code!

// Issue an instruction without an argument.
#define OP(name)
// Issue an instruction with a single-byte argument.
#define OP1(name,val)
// Issue an instruction with a four-byte argument.
#define OP4(name,val)
// Issue an instruction with a single-byte argument and a four-byte argument.
#define OP14(name,val1,val2)
// Issue an instruction with two four-byte arguments.
#define OP44(name,val1,val2)
// Issue an instruction with a four-byte argument and a single-byte argument.
#define OP41(name,val1,val2)
// Issue a potentially break/continue generating instruction without an argument.
#define INVOKE(name)
// Issue a potentially break/continue generating instruction with a single argument.
#define INVOKE4(name,arg1)
// Issue a potentially break/continue generating instruction with two arguments.
#define INVOKE41(name,arg1,arg2)
// Push a string literal.
#define PUSH(string)
// Push a string whose length is computed with strlen().
#define PUSH_STRING(strVar)
// Push a string from a TCL_TOKEN_SIMPLE_WORD token.
#define PUSH_SIMPLE_TOKEN(tokenPtr)
// Take a reference to a Tcl_Obj and arrange for it to be pushed.
#define PUSH_OBJ(objPtr)
// Take a reference to a Tcl_Obj and arrange for it to be pushed.
// Handles extra flags, typically used for command names.
#define PUSH_OBJ_FLAGS(objPtr, flags)
// Push a general token. Needs which index of its command it is.
#define PUSH_TOKEN(tokenPtr, index)
// Push a token that is an expression.
#define PUSH_EXPR_TOKEN(tokenPtr, index)
// Compile the body of a command (e.g., [if], [while])
#define BODY(tokenPtr, index)
// Set the label to the current address. Typically paired with BACKJUMP.
#define BACKLABEL(var)
// Jump (of given type) backwards to the label defined by BACKLABEL.
#define BACKJUMP(name, var)
// Jump (of given type) forwards to the label defined by FWDLABEL.
#define FWDJUMP(name, var)
// Set the label to the current address. MUST be paired with FWDJUMP.
#define FWDLABEL(var)
// Create an unplaced CATCH exception range.
#define MAKE_CATCH_RANGE()
// Create an unplaced LOOP exception range.
#define MAKE_LOOP_RANGE()
// Wrap the given range around a body of code, placing its start and end.
#define CATCH_RANGE(range)
// Define where caught exceptions in the CATCH range branch to.
#define CATCH_TARGET(range)
// Define where caught BREAKs in the LOOP range branch to.
#define BREAK_TARGET(range)
// Define where caught CONTINUEs in the LOOP range branch to.
#define CONTINUE_TARGET(range)
// Finalize the LOOP exception range, setting the destinations for jumps.
#define FINALIZE_LOOP(range)
// Apply a correction to the stack depth.
#define STKDELTA(delta)
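
To give a feel for how macros in this style read at the point of use, here is a toy re-creation of four of them over a small byte buffer. It is a sketch under several assumptions, not the definitions from tclCompile.h: the ToyEnv type, the Toy* helpers, the TOY_INST_ opcode numbers, the implicit envPtr variable, the pasting of the opcode name onto a prefix, and the convention that a jump offset is relative to the start of the jump instruction are all choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode numbers for this sketch only. */
enum { TOY_INST_DONE, TOY_INST_POP, TOY_INST_PUSH, TOY_INST_JUMP };

typedef struct {
    uint8_t code[64];
    ptrdiff_t next;             /* Address of the next byte to issue. */
} ToyEnv;

typedef ptrdiff_t ToyLabel;

static void ToyEmit1(ToyEnv *envPtr, unsigned value)
{
    assert(envPtr->next < (ptrdiff_t) sizeof(envPtr->code));
    envPtr->code[envPtr->next++] = (uint8_t) value;
}

/* Issue an opcode followed by a four-byte operand (big-endian). */
static void ToyOp4(ToyEnv *envPtr, unsigned opcode, int32_t value)
{
    uint32_t u = (uint32_t) value;
    ToyEmit1(envPtr, opcode);
    ToyEmit1(envPtr, (u >> 24) & 0xFF);
    ToyEmit1(envPtr, (u >> 16) & 0xFF);
    ToyEmit1(envPtr, (u >> 8) & 0xFF);
    ToyEmit1(envPtr, u & 0xFF);
}

/* Read back a four-byte operand stored at the given address. */
static int32_t ToyRead4(const ToyEnv *envPtr, ptrdiff_t at)
{
    const uint8_t *p = &envPtr->code[at];
    uint32_t u = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
            | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
    return (int32_t) u;
}

/* Toy versions of the macros; all rely on a variable called envPtr. */
#define OP(name)            ToyEmit1(envPtr, TOY_INST_##name)
#define OP4(name, val)      ToyOp4(envPtr, TOY_INST_##name, (int32_t) (val))
#define BACKLABEL(var)      ((var) = envPtr->next)
/* The offset is computed before any byte of the jump is issued. */
#define BACKJUMP(name, var) OP4(name, (var) - envPtr->next)
```

With a ToyEnv *envPtr in scope, a loop skeleton then reads as OP4(PUSH, 7); BACKLABEL(top); OP(POP); BACKJUMP(JUMP, top); OP(DONE); where top is a ToyLabel. The backward jump ends up holding a negative offset, and no fix-up record is needed because the target is already known when the jump is issued.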

New macros in tclCompile.c

To keep things clearer and less prone to errors, the following macros are used for building the entries in the tclInstructionTable global constant:

#define TCL_INSTRUCTION_ENTRY(name,stack) \
    {name,1,stack,0,{OPERAND_NONE,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY1(name,size,stack,type1) \
    {name,size,stack,1,{type1,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY2(name,size,stack,type1,type2) \
    {name,size,stack,2,{type1,type2}}

These have no effect other than to make building the entries a bit less error-prone. (There are equivalent DEPRECATED_... ones for the deprecated opcodes, but they're currently otherwise defined identically; they're just visual markers when reading the source code.)
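
As a sketch of how these macros might be used, the following builds a tiny table and checks that each entry's declared size matches its operands. The three macros are copied from above; everything else is an assumption for illustration: the ToyInstructionDesc layout (inferred from the macro bodies), the reduced set of operand types and their widths, the instruction names and sizes, and the ToyCountInconsistent() check.

```c
#include <assert.h>
#include <stddef.h>

/* A reduced, hypothetical set of operand types. */
typedef enum {
    OPERAND_NONE, OPERAND_INT1, OPERAND_INT4, OPERAND_OFFSET4
} ToyOperandType;

/* Field order inferred from the macro bodies; an assumption. */
typedef struct {
    const char *name;
    int numBytes;               /* Opcode byte plus all operand bytes. */
    int stackEffect;
    int numOperands;
    ToyOperandType opTypes[2];
} ToyInstructionDesc;

#define TCL_INSTRUCTION_ENTRY(name,stack) \
    {name,1,stack,0,{OPERAND_NONE,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY1(name,size,stack,type1) \
    {name,size,stack,1,{type1,OPERAND_NONE}}
#define TCL_INSTRUCTION_ENTRY2(name,size,stack,type1,type2) \
    {name,size,stack,2,{type1,type2}}

/* Invented entries; not the real tclInstructionTable contents. */
static const ToyInstructionDesc toyTable[] = {
    TCL_INSTRUCTION_ENTRY("toyDone", -1),
    TCL_INSTRUCTION_ENTRY1("toyPush", 5, +1, OPERAND_INT4),
    TCL_INSTRUCTION_ENTRY1("toyJump", 5, 0, OPERAND_OFFSET4),
    TCL_INSTRUCTION_ENTRY2("toyPair", 6, 0, OPERAND_INT4, OPERAND_INT1)
};

static int ToyOperandWidth(ToyOperandType type)
{
    switch (type) {
    case OPERAND_INT1:
        return 1;
    case OPERAND_INT4:
    case OPERAND_OFFSET4:
        return 4;
    default:
        return 0;
    }
}

/* Count entries whose size disagrees with their operand list. */
static int ToyCountInconsistent(void)
{
    size_t i;
    int bad = 0;
    for (i = 0; i < sizeof(toyTable) / sizeof(toyTable[0]); i++) {
        int bytes = 1, j;
        for (j = 0; j < toyTable[i].numOperands; j++) {
            bytes += ToyOperandWidth(toyTable[i].opTypes[j]);
        }
        bad += (bytes != toyTable[i].numBytes);
    }
    return bad;
}
```

The macros take care of the operand count and of padding the unused operand slots with OPERAND_NONE, which are the two fields most easily left out of step when entries are written by hand.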

New bytecode engine macros

Mostly the changes here are small, but there is one new general macro:

  • NEXT_INST_F0(pcAdjustment, nCleanup)

That's a cut-down version of NEXT_INST_F() for the case when there's no result to handle, which is really quite common and means we can omit quite a bit of code that the compiler would otherwise have to work at to remove. If we're lucky, it just makes the bytecode engine faster to compile. If we're unlucky, it shrinks the size of the built code (due to removal of code that should have been unreachable).

Compatibility

There are no changes to the public Tcl C API. All API changes are strictly internal only.

If REMOVE_DEPRECATED_OPCODES is not defined, full compatibility with Tcl 9.0 is maintained, though possibly with warnings. Code that must handle the old opcodes, such as the bytecode engine, does:

#define ALLOW_DEPRECATED_OPCODES

prior to #include "tclCompile.h" to disable the warnings.
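
The mechanism can be sketched in a single file as follows. This is an assumption about how such gating could be written, not the contents of tclCompile.h: the TOY_DEPRECATED_OPCODE macro, the ToyInstruction enumeration and ToyOpcodeName() are hypothetical, and the attribute syntax shown is the GCC/Clang one for enumerators. The #define at the top plays the role of the line a file like the bytecode engine would put before including the header.

```c
#include <assert.h>

/* A file that must still handle the old opcodes opts in first. */
#define ALLOW_DEPRECATED_OPCODES

/* What a header might do: only annotate when it is safe and wanted. */
#if defined(ALLOW_DEPRECATED_OPCODES) || !defined(__GNUC__)
#define TOY_DEPRECATED_OPCODE
#else
#define TOY_DEPRECATED_OPCODE __attribute__((deprecated))
#endif

typedef enum {
    TOY_INST_DONE,
    TOY_INST_PUSH1 TOY_DEPRECATED_OPCODE,   /* Old single-byte form. */
    TOY_INST_PUSH
} ToyInstruction;

/* Uses the deprecated enumerator; no warning because of the opt-in. */
static const char *ToyOpcodeName(ToyInstruction op)
{
    assert(op <= TOY_INST_PUSH);
    switch (op) {
    case TOY_INST_DONE:
        return "done";
    case TOY_INST_PUSH1:
        return "push1";
    default:
        return "push";
    }
}
```

Removing the opt-in line leaves the enumeration values unchanged but, on compilers that support the attribute, makes the reference to TOY_INST_PUSH1 produce a deprecation warning.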

Code that saves and loads bytecodes is not expected to be able to handle these new opcodes without changes, because of the new auxiliary data record type introduced for INST_JUMP_TABLE_NUM.

To Tcl scripts, there should be no visible changes, other than the lifting of some limits and new opcodes in tcl::unsupported::assemble.

Performance

The purpose of this change was to improve my sanity when reading the bytecode compilation code! However, a simple evaluation of the performance seems to indicate no substantive performance difference, and some increase in size of bytecode (to be expected as many common operations are now always issued with 4-byte operands). This is in line with expectations.

Implementation

See the no-variable-width-instruction-issue branch.

Future Directions

NB: These all lie outside the scope of this TIP.

This TIP lays the groundwork for making more commands be bytecode compiled with expansion present, though more opcodes are likely to be required for much of that project.

There are several proposed routes for removing the deprecated bytecodes:

  1. Do not remove the existing bytecode implementations for now.
  2. Branch remove-deprecated-opcodes-level1 removes the implementations but leaves the other bytecodes as they are. This is compatible with existing code so long as the deprecated bytecodes are not used.
  3. Branch remove-deprecated-opcodes-level2 compacts the remaining bytecodes. This is definitely not compatible with existing bytecodes... but that only matters for code that uses the TDK compiler and tbcload.

Other things examined during the development of this TIP:

  • Adding bytecodes for pushing special constants.
  • Adopting C23 [[deprecated]] annotations. (C23 has some other interesting goodies too.)
  • Adopting the <stdint.h> and <stdbool.h> standard headers.

Copyright

This document has been placed in the public domain.
