TIP 59: Embed Build Information in Tcl Binary Library

Author:         Andreas Kupries <andreas_kupries@users.sourceforge.net>
State:          Final
Type:           Project
Vote:           Done
Created:        04-Sep-2001
Post-History:   
Tcl-Version:    8.5

Abstract

This TIP provides an interface through which Tcl may be queried for information on its own configuration, in order to extract the information directly instead of reading it from a Bourne shell file. An important reason to do this is to have the information not only available but also tightly bound to the binary configured by it, so that the information doesn't get lost.

Foreword

This TIP proposes a rather small change to Tcl and tries very hard to follow the KISS principle. Given that the casual observer might find it rather long, be assured, the actual specification in here is not very long, nor complicated. Most of the following explanations were added to preserve the KISS principle and head off attempts to extend the TIP beyond its small goal and scope.

Note: All instances of "Tcl library" in the following text refer to the generated installable library and not the script library coming with the core.

Background and Rationale

The main reasons for writing this TIP are the disadvantages inherent in the current way of storing the configuration of Tcl, namely in the file tclConfig.sh.

  • It is a separate file, easily lost or not installed at all, making it difficult for extension developers to access this information.

    Note: The non-installation of development files like tclConfig.sh might even be required by vendor policies and the like, and is thus not under the control of the package author or builder.

  • The name does not convey that tclConfig.sh contains platform and build specific information. When installing different builds this usually leads to clashes. This makes it again difficult for extension developers to find the right file for their current build.

  • Not every extension generates such a file for use by other extensions.

Thus, this TIP proposes:

  • an extension of the public API so that extensions are able to define configuration introspection commands and to declare the returned information during initialization with the information embedded into their installable libraries during compilation.

  • to embed the information about the configuration of the Tcl library as strings into the generated installable library and make them accessible at the script level through Tcl variables, thus allowing developers on any platform Tcl compiles on to access this information.

The file tclConfig.sh is not replaced by this system; both sets of information exist in parallel.

Neither is the variable tcl_platform replaced. This means that some information, like threaded, is held redundantly. Other information in tcl_platform, like user, is runtime and not configuration information. The operating system information is important to a build system, but this is out of the scope of this TIP.

Interface Specification

Any embedded information is made accessible at the Tcl level through a new command. The name of the command used by Tcl itself is ::tcl::pkgconfig. Extensions have to use their own commands. These commands will be named pkgconfig too and have to be placed within the namespaces owned by the extensions initializing them.

At the C-level the public API of the Tcl core is extended with a single function to register the embedded configuration information. This function is added to the public stub table of the Tcl core so that it can be used by Tcl and extensions to register their own configuration information in the system during initialization.

The function takes three (3) arguments; first, the name of the package registering its configuration information, second, a pointer to an array of structures, and third a string declaring the encoding used by the configuration values. Each element of the array refers to two strings containing the key and the value associated with that key. The end of the array is signaled by an empty key.

Formalized, name and signature of this new function are

 Tcl_RegisterConfig (CONST char* pkgName, Config* configuration, CONST char* valEncoding)

 typedef struct Config {
    char* key;
    char* value;
 } Config;
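
The array layout described above can be sketched in plain C without the Tcl headers. This is only an illustration under stated assumptions: the table exampleConfig, its values and the helper ConfigCount are hypothetical, const pointers are used so that string literals can be stored, and both a NULL key and an empty-string key are treated as the terminator because the text only says "an empty key".

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the structure proposed in this TIP (const added for literals). */
typedef struct Config {
    const char *key;
    const char *value;
} Config;

/* Hypothetical table a package might hand to Tcl_RegisterConfig. */
static const Config exampleConfig[] = {
    {"debug",          "0"},
    {"threaded",       "1"},
    {"prefix,runtime", "/usr/local"},
    {NULL, NULL}                      /* terminator: empty key */
};

/* Hypothetical helper: number of entries before the terminator. */
static size_t ConfigCount(const Config *cfg)
{
    size_t n = 0;
    assert(cfg != NULL);
    while (cfg[n].key != NULL && cfg[n].key[0] != '\0') {
        n++;
    }
    return n;
}
```

A package would pass such a table, together with its package name and the name of the value encoding, to Tcl_RegisterConfig during initialization.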

The string valEncoding contains the name of an encoding known to Tcl. All these names use only characters in the ASCII subset of UTF-8 and are thus implicitly in the UTF-8 encoding. It is expected that keys are legible English text and therefore use the ASCII subset of UTF-8. In other words, they are expected to be in UTF-8 too. The values associated with the keys can be any string, however. For these the contents of valEncoding define which encoding was used to represent the characters of the strings.

During compile time the value of valEncoding is specified as a makefile variable for non-configure based build systems and through the new option --with-encoding=FOO of configure otherwise. The default value is iso8859-1.

This approach gives us all we desire with not too many drawbacks.

  • The default case (no special characters) requires no action on part of the builder at all.

  • The non-default case (path containing special characters like Kanji) is supported.

  • Cross-compilation is unimpeded and no more complex than normal compilation.

  • The requirement for conversion of strings is a drawback, but should not have a big impact on performance. It has no impact on the performance of scripts which do not use the embedded information. The impact is even more negligible if the result of the conversion is cached.

The function will

  • create a namespace having the provided pkgName, if not yet existing.

  • create the command pkgconfig in that namespace and link it to the provided information so that the keys from configuration and their associated values can be retrieved through calls to pkgconfig.

The command pkgconfig will provide two subcommands, list and get. The first subcommand, list, takes no arguments and returns a list containing the names of the defined keys. The second subcommand, get, takes one argument, the name of a key, and returns the string associated with that key.
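
The semantics of the two subcommands can be sketched in plain C over the registered array. This is a minimal sketch, not the implementation inside Tcl: the names ConfigGet, ConfigList and sampleConfig are hypothetical, the keys are joined with single spaces instead of being quoted as a proper Tcl list, and no encoding conversion of the values is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Config {
    const char *key;
    const char *value;
} Config;

/* Hypothetical sample data for the sketch. */
static const Config sampleConfig[] = {
    {"debug", "0"}, {"threaded", "1"}, {NULL, NULL}
};

/* "get": return the value stored for key, or NULL if key is unknown. */
static const char *ConfigGet(const Config *cfg, const char *key)
{
    assert(cfg != NULL && key != NULL);
    for (; cfg->key != NULL && cfg->key[0] != '\0'; cfg++) {
        if (strcmp(cfg->key, key) == 0) {
            return cfg->value;
        }
    }
    return NULL;
}

/* "list": write the keys, space separated, into buf.
 * Returns the number of keys, or -1 if buf is too small. */
static int ConfigList(const Config *cfg, char *buf, size_t bufSize)
{
    size_t used = 0;
    int n = 0;
    assert(cfg != NULL && buf != NULL && bufSize > 0);
    buf[0] = '\0';
    for (; cfg->key != NULL && cfg->key[0] != '\0'; cfg++, n++) {
        size_t len = strlen(cfg->key);
        size_t need = len + (n > 0 ? 1 : 0);   /* separator after first key */
        if (used + need + 1 > bufSize) {
            return -1;
        }
        if (n > 0) {
            buf[used++] = ' ';
        }
        memcpy(buf + used, cfg->key, len);
        used += len;
        buf[used] = '\0';
    }
    return n;
}
```

Returning NULL for an unknown key is a choice made for this sketch; the text above does not say how get reacts to a key that was never registered.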

How to Gather the Embedded Information

The information to be embedded is gathered in a platform-specific way and written into the file generic/tclPkgConfig.c. The different platforms may employ platform specific intermediate files to hold the information, but in the compilation phase only tclPkgConfig.c will be used.

  • Under Unix it is determined primarily by the existing configure script. The configuration information coming from the Makefile, or from other compile time means, is embedded into the tclPkgConfig.c file by means of preprocessor statements (#ifdef ... #endif).

  • For the Windows and Mac platforms volunteers may have to create files tclWinConfig.c.some-ext containing this information for each supported build environment, like VC++, Borland, Cygwin, etc.

    tclWinConfig.c.vc = VC++.

    tclWinConfig.c.bc = Borland.

    tclWinConfig.c.in = Cygwin. .in is used because Cygwin can use configure to determine the values and embed them into a template.

  • As for other platforms, these are handled either like Unix or like Mac, depending on the availability and usability of configure.

    Volunteers are required to write the appropriate files for their build environment.
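
The preprocessor-based embedding mentioned for Unix might look roughly like the following sketch. It is an assumption-laden illustration, not the contents of tclPkgConfig.c: the macros TCL_MEM_DEBUG and TCL_THREADS are assumed to be supplied by the Makefile or configure as -D flags, and the CFG_* macros and the table name tclConfigSketch are hypothetical.

```c
#include <stddef.h>

typedef struct Config {
    const char *key;
    const char *value;
} Config;

/* Turn compile-time switches into string values (assumed macro names). */
#ifdef TCL_MEM_DEBUG
#  define CFG_MEM_DEBUG "1"
#else
#  define CFG_MEM_DEBUG "0"
#endif

#ifdef TCL_THREADS
#  define CFG_THREADED "1"
#else
#  define CFG_THREADED "0"
#endif

/* A path would be passed in by the build system, for example
 * -DCFG_RUNTIME_PREFIX="\"/usr/local\"" (hypothetical macro name).
 * When it is not given, the key keeps the empty string as its value. */
#ifndef CFG_RUNTIME_PREFIX
#  define CFG_RUNTIME_PREFIX ""
#endif

static const Config tclConfigSketch[] = {
    {"mem_debug",      CFG_MEM_DEBUG},
    {"threaded",       CFG_THREADED},
    {"prefix,runtime", CFG_RUNTIME_PREFIX},
    {NULL, NULL}
};
```

Because every value is fixed when the file is compiled, the table ends up inside the installable library itself, which is the tight binding this TIP is after.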

Specification of Tcl Configuration Information

The configuration information registered by Tcl itself is specified here. A discussion of the choices made here follows in the next section. Please read this discussion before commenting on the specification.

The values associated with the keys below are all of one of the following types:

  • Boolean flag. Allowed values are all the values which are accepted by Tcl itself. Examples are:

           true, false, on, off, 1, 0
    
  • String. General container for all other information.

  • Templated string. A string that may contain placeholders in the same format as 'Script' below, but that does not have to be a valid Tcl script.

  • Script. A string containing a full Tcl script. The user should handle this string like a procedure body. The script is allowed to contain placeholders to be filled by the user of the string. Placeholders follow the syntax of full-braced Tcl variables, i.e. '${some_name}'. The actual values can be filled in by the user of the configuration information. Possible ways to do so are [regsub], [string map] or [subst -nocommands]. The best way however would be to use the script as a procedure body, with the placeholders as the arguments of the procedure. This will avoid many problems regarding bracing and the protection of special characters.

    Which placeholders are possible for a particular script or template is described together with the meaning of the key.

    Beyond the placeholders a script or templated string is allowed to contain references to other keys returned by the config command. These references use the same variable syntax as the placeholders.
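
For a consumer written in C, filling the placeholders of a templated string could be sketched as below. This is a plain textual replacement in the spirit of [string map], not the procedure-body approach recommended above, and every name in it (Binding, sampleBindings, ExpandTemplate) is hypothetical. Placeholders without a binding are copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Binding {
    const char *name;
    const char *value;
} Binding;

/* Hypothetical bindings supplied by the user of the template. */
static const Binding sampleBindings[] = {
    {"prefix", "/opt/tcl"}, {NULL, NULL}
};

/* Find the value bound to the name of length len, or NULL. */
static const char *Lookup(const Binding *b, const char *name, size_t len)
{
    for (; b->name != NULL; b++) {
        if (strlen(b->name) == len && strncmp(b->name, name, len) == 0) {
            return b->value;
        }
    }
    return NULL;
}

/* Replace each ${name} in tmpl with its bound value.
 * Returns 0 on success, -1 if out is too small. */
static int ExpandTemplate(const char *tmpl, const Binding *b,
                          char *out, size_t outSize)
{
    size_t used = 0;
    assert(tmpl != NULL && b != NULL && out != NULL && outSize > 0);
    while (*tmpl != '\0') {
        const char *piece = tmpl;   /* default: copy one character */
        size_t pieceLen = 1;
        const char *end = NULL;
        if (tmpl[0] == '$' && tmpl[1] == '{') {
            end = strchr(tmpl + 2, '}');
        }
        if (end != NULL) {
            size_t span = (size_t)(end - tmpl) + 1;
            const char *val = Lookup(b, tmpl + 2, (size_t)(end - (tmpl + 2)));
            if (val != NULL) {
                piece = val;
                pieceLen = strlen(val);
            } else {
                pieceLen = span;    /* unknown placeholder: keep verbatim */
            }
            tmpl += span;
        } else {
            tmpl += 1;
        }
        if (used + pieceLen + 1 > outSize) {
            return -1;
        }
        memcpy(out + used, piece, pieceLen);
        used += pieceLen;
    }
    out[used] = '\0';
    return 0;
}
```

Leaving unbound placeholders untouched allows a second pass to resolve references to other configuration keys, which use the same syntax.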

The registered keys follow below. They will always be present with some value. Non-boolean keys not applicable to a particular platform will contain the empty string as their value.

  • Configuration of Tcl itself:

    * debug. Boolean flag. Set to false if Tcl was not compiled to contain debugging information.

    * threaded. Boolean flag. Set to false if Tcl was not compiled as thread-enabled.

    * profiled. Boolean flag. Set to false if Tcl was not compiled to contain profiling statements.

    * 64bit. Boolean flag. Set to false if Tcl was not compiled in 64bit mode.

    * optimized. Boolean flag. Set to false if Tcl was compiled without compiler optimizations.

    * mem_debug. Boolean flag. Set to false if Tcl has no memory debugging compiled into it.

    * compile_debug. Boolean flag. Set to false if Tcl has no bytecode compiler debugging compiled in.

    * compile_stats. Boolean flag. Set to false if Tcl has no bytecode compiler statistics compiled in.

  • Installation configuration of Tcl. In other words, various important locations.

    * prefix,runtime. String. The directory for platform independent files as seen by the interpreter during runtime.

    * exec_prefix,runtime. String. The directory for platform dependent files as seen by the interpreter during runtime.

    * prefix,install. String. The directory for platform independent files as seen by the installer at install-time.

    * exec_prefix,install. String. The directory for platform dependent files as seen by the installer at install-time.
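
A C consumer that reads one of the boolean keys above needs to interpret the returned string. The following sketch recognizes only a small, case-insensitive set of spellings; Tcl itself accepts more forms, so this is a deliberately reduced illustration, and the function name ParseBoolean is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 for true, 0 for false, -1 if the string is not recognized. */
static int ParseBoolean(const char *s)
{
    static const char *const trueWords[]  = {"1", "true", "on", "yes", NULL};
    static const char *const falseWords[] = {"0", "false", "off", "no", NULL};
    char lower[8];
    size_t i, len;

    assert(s != NULL);
    len = strlen(s);
    if (len == 0 || len >= sizeof lower) {
        return -1;
    }
    /* Compare on a lower-cased copy so that "OFF" and "off" are equal. */
    for (i = 0; i < len; i++) {
        lower[i] = (char)tolower((unsigned char)s[i]);
    }
    lower[len] = '\0';
    for (i = 0; trueWords[i] != NULL; i++) {
        if (strcmp(lower, trueWords[i]) == 0) return 1;
    }
    for (i = 0; falseWords[i] != NULL; i++) {
        if (strcmp(lower, falseWords[i]) == 0) return 0;
    }
    return -1;
}
```

Code that already links against Tcl would more naturally hand the string to Tcl's own boolean parsing instead of duplicating it.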

Discussion

The placement of this information into a separate package was proposed but rejected because of the trouble of finding the right information for the right library in the case of multiple configurations installed into the same directory space. Embedding into the library does not cost much space and binds the information tightly to the right spot.

Another reason to do it this way is that this enables us to embed information coming from the Makefile itself (like MEM_DEBUG) or from other compile time means. This would not be possible for a file generated solely by the Tcl configure script. It would also restrict the embedding to the platforms which allow the use of a configure script.

The usage of a separate package to just access the information placed into the Tcl library was also proposed. This was rejected too, due to the overhead for the management of the package in comparison to the small size of the code actually involved.

Another proposal rejected in the early discussions was to have this TIP define an entire build system based upon Tcl. This TIP is certainly a step in this direction and facilitates the building of such a build system (sic!). Still, specifying such here was seen as too large a step right now, with too many issues to be solved and thus delaying the implementation of this TIP.

Only the configuration of the particular variant of the Tcl library or extension which was generated is recorded in the library. No attempt is made to record the information required to allow the compilation of any possible variant of an extension. Doing so would reach again into the bigger topic of specifying a full build system. We've already established that as being out of the intended scope of this TIP.

Note further that the scheme as specified above does not prevent us from adding the full information in a later stage. In other words, it does not restrict the development of a more powerful system in the future.

This should be enough reasoning to allow the acceptance of even this admittedly simple system.

The configuration information registered by Tcl is currently a very small subset of the information in tclConfig.sh. A future TIP is planned to provide the missing information in a regular and generalized manner.

If an extension requires more information than provided by the Tcl configuration it will have to obtain this information itself. For instance, TclBlend requires a CLASSPATH, the name of a Java compiler, etc. whereas the TclPython and TclPerl extensions require paths to those environments, etc. It is not reasonable that the configure script for Tcl itself has to accommodate all requirements of all extensions of Tcl. Instead, the configure scripts, or whatever other means are used to obtain the configuration information for the extensions, should reflect their needs and register the requirements gathered into their own configuration command. Note that an extension is only expected to create variables for information unique to it. Everything else can be had from the configuration command of Tcl and the extensions it depends on.

This TIP is not in opposition to [34] but rather fleshes out one of the many details in the specification which were left open by that TIP.

This TIP also does not propose to change the process for building Tcl itself. The goal is rather to make the building of extensions easier in the future.

A naming convention for keys returned by the config command would have been possible but would also require quite a lot more text, both in careful definition of the general categories and in explanations of the choices made.

Implementation

Work on implementing this feature is tracked at Tcl Patch 507083 at the Tcl project at SourceForge. Implementation effort also takes place on the tip-59-implementation branch in the Tcl CVS repository (see [31]).

Copyright

This document is in the public domain.
