Author:         Jan Nijtmans <jan.nijtmans@gmail.com>
State:          Final
Type:           Project
Vote:           Done
Created:        20-Aug-2018
Post-History:   
Keywords:       Tcl
Tcl-Version:	8.7
Tcl-Branch:     tip-514

Abstract

This TIP proposes to resolve the platform differences between int/wide/entier math functions and commands like "sting is integer"/"string is wide"/"string is entier". At the script level it should not be relevant whether the platform is 32-bit or 64-bit any more.

Most Tcl commands already accept unlimited integers, so there is hardly any command left which need to checked for correct range.

Rationale

Some examples:

% string is int -4294967296
0
% string is int -4294967295
1
% string is int 4294967295
1
% string is int 4294967296
0

So, valid integers appear to range from -(2^32-1) to +2^32-1. Most people learn in school that 32-bit integers range from -2^31 to 2^31-1. Are Tcl's integers 33-bit, but then excluding -4294967296?

% string is wide -18446744073709551616
0
% string is wide -18446744073709551615
1
% string is wide 18446744073709551615
1
% string is wide 18446744073709551616
0

So, valid wide integers appear to range from -(2^64-1) to +2^64-1. Most people learn in school that 64-bit integers range from -2^63 to 2^63-1. Are Tcl's wide integers 65-bit, but then excluding -18446744073709551616?

% expr int(2147483648)          ; #on LP64/ILP64 platforms
2147483648
% expr int(2147483648)          ; #on other platforms
-2147483648
% expr int(9223372036854775808) ; #on LP64/ILP64 platforms
-9223372036854775808
% expr int(9223372036854775808) ; #on other platforms
0
% expr wide(9223372036854775808); #all platforms
-9223372036854775808

So, int() does either 32-bit or 64-bit truncation, but the highest left-over bit becomes the sign-bit. wide() does 64-bit truncation.

Specification

On all platforms, the int() math function is modified not to do truncation any more. int() will thus become synonym for entier(). The wide() and entier() functions will become deprecated in Tcl 9.0, but they will not be removed yet.
On all platforms, "string is integer" will become synonym to "string is entier". So these functions will report true (1) if the number looks like an integer with unlimited range.
On all platforms, "string is wideinteger" will start checking for the proper 64-bit signed range. So this functions will only report true (1) if the value is in the range of -9223372036854775808..9223372036854775807
The "string is wideinteger" and "string is entier" commands will become deprecated in Tcl 9.0, but they will not be removed yet.
The C function Tcl_GetIntFromObj() is changed to return TCL_OK if the Tcl_Obj contains a value in the range of -2147483648..4294967295. So, it succeeds if the number fits in either a platform "int", either a platform "unsigned int" type.
The C function Tcl_GetWideIntFromObj() is changed to return TCL_OK if the Tcl_Obj contains a value in the range of -9223372036854775808..9223372036854775807. So, it succeeds if the number fits in a Tcl_WideInt.
The C function Tcl_GetLongFromObj() is changed to behave like Tcl_GetIntFromObj() if sizeof(long) == sizeof(int), and to behave like Tcl_GetWideIntFromObj() if sizeof(long) == sizeof(Tcl_WideInt)

Implications

"string is integer" can no longer be used to check for a specific range. That doesn't matter any more, because the command argument that was being protected, doesn't throw an exception any more in case of under/overflow since the introduction of bignums.
int() can no longer be used for 32-bit/64-bit (platform-dependent) truncation.
If you still really want to protect some command argument from overflowing, Use Tcl_GetWideIntFromObj() in this command, and use "string is wide" to check for proper range. But - still better - is use Tcl_GetWideIntFromObj(), while falling back to Tcl_GetBignumFromObj() if the range requires it. That's what Tcl itself is doing almost everywhere to prevent under/overflow errors.

Implementation

Currently, the proposed implementation is available in the tip-514 branch.

Copyright