TIP 484: Merge 'int' and 'wideInt' Obj-type to a single 'int'

Author:         Jan Nijtmans <jan.nijtmans@gmail.com>
State:          Final
Type:           Project
Vote:           Done
Created:        06-Nov-2017
Post-History:
Keywords:       Tcl
Tcl-Version:    8.7
Tcl-Branch:     no-wideint

Abstract

The 'wideInt' type was invented for Tcl because the 'int' type (which was actually a C 'long') was not sufficient to store numbers larger than [+-]2**31. This TIP proposes to merge the 'int' and 'wideInt' Obj-types, so that 'int' does all internal calculations using Tcl_WideInt instead of long.

Rationale

On 64-bit Linux, the 'wideInt' type gives no advantage: it is #ifdef'd out in the code. On 64-bit Windows, 'long' is the same size as 'int', so it would be an advantage to use __int64 for all internal number processing. Since __int64 calculations are just as efficient as int/long calculations on most (64-bit and 32-bit) processors, this would eliminate many range checks in the Tcl code. That is the main gain of this TIP.
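
The kind of range check at stake can be sketched in plain C. This is an illustrative model only, not Tcl's code: int32_t stands in for a 32-bit C 'long' (as on 64-bit Windows), and the names fits_narrow_tier and add_merged are hypothetical. Overflow beyond 64 bits, which Tcl handles with big integers, is outside the scope of the sketch.

```c
#include <stdint.h>

/* Range check needed when a narrow (32-bit) tier exists next to a wide
 * one: every result must be tested to decide which tier it belongs to.
 * Returns 1 if v fits the narrow tier, 0 if it needs the wide tier. */
static int fits_narrow_tier(int64_t v) {
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* With a single 64-bit tier the sum of two 32-bit operands is always
 * exact and representable, so no tier decision is required. */
static int64_t add_merged(int32_t a, int32_t b) {
    return (int64_t)a + (int64_t)b;
}
```

In the two-tier model every addition is followed by a call like fits_narrow_tier(); in the merged model that call simply disappears.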

Finally, on 32-bit systems, getting rid of the narrower 'int' could be a small disadvantage where each 64-bit calculation has to be split into several 32-bit calculations. There, calculations that previously fit in 32 bits could become slightly slower.

Since the gain for 64-bit platforms (mainly Win64) is expected to outweigh the possible loss on some 32-bit systems, I still propose to follow this road.

Another advantage of this change is maintainability of number handling. Currently the C code contains a lot of #ifdef's which check whether sizeof(long) is the same as sizeof(Tcl_WideInt). Most of those checks can be removed, so the number handling becomes almost identical on all platforms supported by Tcl.
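
As a rough illustration of such platform-dependent code, the following sketch branches at compile time on whether 'long' is as wide as 'long long'. The helper name fits_in_long is hypothetical and the preprocessor test is an assumption made for the example; Tcl's own macros and checks may differ.

```c
#include <limits.h>

/* Returns 1 when a 64-bit value can be stored in a C 'long' on the
 * platform this is compiled for, 0 otherwise. */
static int fits_in_long(long long v) {
#if LONG_MAX == LLONG_MAX
    (void)v;
    return 1;                               /* long is already 64 bits */
#else
    return v >= LONG_MIN && v <= LONG_MAX;  /* long is narrower */
#endif
}
```

Once all integer values are stored as 64-bit, helpers of this shape and the conditional compilation inside them are no longer needed.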

Proposal

Merge 'int' and 'wideInt' Obj-type to a single 'int'. Make the following changes:

  • Change the signature of the internal function TclFormatInt to int TclFormatInt(char *buffer, Tcl_WideInt n) so it can handle the full range of wide integers.

  • Modify the internal TclGetNumberFromObj function to always return TCL_NUMBER_WIDE instead of TCL_NUMBER_LONG. This indicates that internalRep.wideValue contains the actual internal representation of numbers. TCL_NUMBER_LONG is not used anymore, except in the bytecode implementation of string is integer and string is wideinteger. At the script level this change is not detectable.

  • Modify the 'int' type to store its internal representation in internalRep.wideValue instead of internalRep.longValue. This means that the 'int' type gains the full functionality of 'wideInt' on all platforms. This makes all 'wideInt' special code redundant, so it can be removed entirely from the Tcl code base.

  • On 32-bit systems, use 'wideInt' instead of 'int' as the type name; this makes it clearer to extensions that the internal implementation changed. In addition, a new dummy "int" type is introduced to restore compatibility with extensions such as "tbcload", "tclxml" and "VecTcl", which (incorrectly) assume that Tcl_GetObjType("int") will never return NULL.

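  • Starting with Tcl 9.0, no longer register the "int"/"wideInt" types. The only way to get hold of the int type is then:

    const Tcl_ObjType *intType;
    Tcl_Obj *obj = Tcl_NewIntObj(0);   /* throwaway object, refCount 0 */
    intType = obj->typePtr;            /* remember its type pointer */
    Tcl_DecrRefCount(obj);             /* frees the throwaway object */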
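
The first change in the list, widening the integer formatter to take a 64-bit argument, can be sketched as a standalone function. This is not Tcl's implementation: format_wide is a hypothetical name, 'long long' stands in for Tcl_WideInt, and the sketch assumes the return value is the number of characters written, excluding the terminating NUL.

```c
/* Writes the decimal form of n into buffer, followed by a NUL.
 * The buffer must hold at least 21 bytes: sign, 19 digits, NUL.
 * Returns the number of characters written, not counting the NUL. */
static int format_wide(char *buffer, long long n) {
    char tmp[20];
    unsigned long long u;
    int i = 0, len = 0;

    /* Negate in unsigned arithmetic so the most negative value
     * does not overflow. */
    if (n < 0) {
        buffer[len++] = '-';
        u = 0ULL - (unsigned long long)n;
    } else {
        u = (unsigned long long)n;
    }
    /* Produce the digits least-significant first. */
    do {
        tmp[i++] = (char)('0' + (int)(u % 10));
        u /= 10;
    } while (u != 0);
    /* Copy them out in reverse order. */
    while (i > 0) {
        buffer[len++] = tmp[--i];
    }
    buffer[len] = '\0';
    return len;
}
```

The point of the wider parameter is that values such as -2**63 can be passed directly, without a separate code path for numbers that do not fit in a 32-bit 'long'.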
    

Compatibility

Starting with Tcl 9.0:

  • The function call Tcl_GetObjType("int") will return NULL, where it returned non-NULL before.

On systems where sizeof(Tcl_WideInt) != sizeof(long):

  • The function call Tcl_GetObjType("wideInt") will return NULL, where it returned non-NULL before.

No test cases are modified in the 'no-wideint' branch, and all current test cases continue to pass. This shows that keeping separate 'int' and 'wideInt' implementations is not required anywhere for Tcl number handling to behave the way it does now. At the script level, the change is invisible.

Reference Implementation

An implementation of this TIP can be found in the no-wideint branch.

The code is licensed under the same license as Tcl.

Caveats

  • Starting with Tcl 9.0, extensions which assume that Tcl_GetObjType("int") returns non-NULL will no longer work. External code must not make assumptions about implementation details of 'int'/'wideInt'. A few extensions do make this assumption, and it is easy to fix. If any package maintainers need help doing that, just ask.

Some examples (the only 3 I found):

  • tbcload (cmdRead.c)
  • tclxml (tclxslt-libxslt.c)
  • VecTcl (nacomplex.c, vectcl.c, vectcl.tcl.c)
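
The shape of such a fix can be sketched with stand-in types, so that the example compiles without tcl.h. Everything prefixed Mock or mock_ below is hypothetical and only models the behaviour described in this TIP (a registry lookup that returns NULL, and integer objects that still carry a type pointer); it is not Tcl's API and does not describe what any of the listed extensions actually do.

```c
#include <stddef.h>

/* Stand-in types for illustration only; NOT Tcl's definitions. */
typedef struct MockObjType { const char *name; } MockObjType;
typedef struct MockObj {
    const MockObjType *typePtr;
    long long wideValue;
} MockObj;

static const MockObjType mockIntType = { "int" };

/* Models a registry in which the integer type is no longer registered:
 * every lookup by name fails. */
static const MockObjType *mock_get_obj_type(const char *name) {
    (void)name;
    return NULL;
}

/* Models creating an integer object, which always carries its type. */
static MockObj mock_new_int_obj(long long v) {
    MockObj o = { &mockIntType, v };
    return o;
}

/* Robust lookup: try the registry, then fall back to reading the type
 * pointer from a throwaway integer object. */
static const MockObjType *resolve_int_type(void) {
    const MockObjType *t = mock_get_obj_type("int");
    if (t == NULL) {
        MockObj probe = mock_new_int_obj(0);
        t = probe.typePtr;
    }
    return t;
}

/* Fragile check: relies on the registry lookup succeeding. */
static int is_int_fragile(MockObj o) {
    return o.typePtr == mock_get_obj_type("int");
}

/* Robust check: works whether or not the type is registered. */
static int is_int_robust(MockObj o) {
    return o.typePtr == resolve_int_type();
}
```

In real extension code the fallback branch corresponds to the Tcl_NewIntObj(0) idiom shown in the Proposal section.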

Rejected Alternatives

  • It would have been possible to keep the objType named "int" on all platforms. That would mean that 32-bit extensions which access internalRep.longValue directly might fail silently.

  • Another possibility is to remove the registration of the "int"/"wideInt" objType altogether. This would mean that all extensions using Tcl_GetObjType("int") would start failing immediately at start-up. This will be done for Tcl 9.0, but would be too early for 8.7.

Copyright

This document has been placed in the public domain.
