Author: Nathan Coulter <org.tcl-lang.tips@pooryorick.com>
State: Draft
Type: Project
Vote: Pending
Tcl-Version: 9.0
Tcl-Branch: tip-667
Obsoleted-By: 657
Abstract
Although TIP #657 purported to be about making the strict profile the default, it also specified other things that were out of scope, specified unnecessary implementation details, and included a partial alternative to TIP #653 in its Compatibility section (those changes have since been incorporated into TIP #653). This TIP proposes that "strict" become the default encoding profile for all operations.
Rationale
The tcl8 profile was until recently the only option for handling
encoding errors in channel content. Now there are two additional profiles
available, strict and replace.
The most common use case for encoded data is to expect that if the operation
completed without error, the data were correctly encoded and that no data were
lost in the result. This corresponds to the strict encoding profile, so it
makes sense to make this profile the default. Where it is not the default,
data may be silently corrupted, with the corruption coming to light only at
some later date, after collateral damage, possibly including exploitation by
bad actors, has already occurred.
It is expected that scripts that must be adapted due to this change in default
behaviour will fail early and before real damage is done, making it easy to
detect where change is necessary and leading to a more secure and correct
scripting environment overall. Commands like fcopy, read and gets raise an
error as soon as bad data is detected. Where this is not desired it is
easy to remedy through trivial mechanical changes to existing scripts.
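As an illustration of the kind of mechanical change meant here, the following sketch reads a file containing a byte that is not valid UTF-8, first with the strict profile and then with the tcl8 profile selected explicitly. It assumes a Tcl 9.0 interpreter in which channels accept the -profile option; the temporary file and the variable names are invented for the example and are not part of this TIP.

```tcl
# Sketch only: assumes Tcl 9.0, where channels accept the -profile option.
# Create a temporary file holding a byte (0xFF) that can never occur in UTF-8.
set out [file tempfile path]
chan configure $out -translation binary
puts -nonewline $out "abc\xFFdef"
close $out

# Strict profile: the read is expected to fail instead of passing bad data on.
set in [open $path r]
chan configure $in -encoding utf-8 -profile strict
set strictFailed [catch {read $in} strictMsg]
close $in

# Mechanical remedy for a script that really wants the lenient behaviour:
# request the tcl8 profile explicitly on that one channel.
set in [open $path r]
chan configure $in -encoding utf-8 -profile tcl8
set data [read $in]
close $in

file delete $path
```

The only line that differs between the two reads is the -profile setting, which is why adapting an existing script is expected to be a local, one-line change per channel.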
Specification
New channels are by default assigned the strict profile, and both
encoding convertfrom and encoding convertto use the strict profile
by default.
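The difference between the profiles can be sketched with encoding convertfrom and encoding convertto. The example assumes a Tcl 9.0 interpreter in which both subcommands accept -profile, and it names the profile on every call so that the result does not depend on which default is in effect. The sample data and variable names are illustrative only.

```tcl
# Sketch only: assumes Tcl 9.0, where encoding convertfrom/convertto
# accept the -profile option.
set bytes "ab\xFFcd"   ;# 0xFF is never valid in UTF-8

# strict: invalid input is reported as an error.
set strictFailed [catch {encoding convertfrom -profile strict utf-8 $bytes} msg]

# replace: the invalid byte is replaced by U+FFFD (REPLACEMENT CHARACTER).
set replaced [encoding convertfrom -profile replace utf-8 $bytes]

# tcl8: the lenient behaviour of earlier releases; the byte is passed through.
set lenient [encoding convertfrom -profile tcl8 utf-8 $bytes]

# The same applies when encoding: U+00E9 has no representation in ASCII.
set encodeFailed [catch {encoding convertto -profile strict ascii "caf\u00E9"}]
set encodedLossy [encoding convertto -profile replace ascii "caf\u00E9"]
```

Under this TIP, omitting -profile would behave like the strict calls above, so only the calls that need lenient handling have to name a profile.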
Tcl_FSEvalFileEx() uses the strict profile, and therefore source uses
the strict profile. The http package leaves any channels it opens in their
default strict configuration, so it too uses the strict profile.
Tcl_ExternalToUtfDStringEx(), Tcl_UtfToExternalDStringEx(),
Tcl_ExternalToUtf() and Tcl_UtfToExternal() support operation in a mode
where any encoding error that occurs results in an EILSEQ POSIX error. That
mode is now the default. Other modes can be explicitly configured by the
caller to specify how these functions behave when invalid data are encountered.
Any test in the Tcl test suite that requires a channel that is not configured for strict encoding explicitly configures the channel according to its needs.
Further explanation
Compatibility
This is an incompatible change for Tcl_ExternalToUtf()/Tcl_UtfToExternal(),
but since those functions are often called to operate in strict mode, it will
have little effect.
This is an incompatible change for Tcl_Read(), Tcl_Write() and Tcl_Gets().
See TIP #653 for details.
Implementation
The branch trunk-encodingdefaultstrict implements this TIP.
Copyright
This document has been placed in the public domain.