Author: Andreas Kupries <akupries@shaw.ca>
Author: Jean-Claude Wippler <jcw@equi4.com>
Author: Jeff Hobbs <jeffh@activestate.com>
State: Draft
Type: Informative
Vote: Pending
Created: 24-Mar-2004
Post-History:
Abstract
This document is an informational adjunct to [189] "Tcl Modules", describing a number of choices for the implementation of Tcl Modules, pure-Tcl, binary, or mixed. It lists these choices and then discusses their relative merits and problems, especially their interaction with wrapping, i.e. when used in a wrapped application. The main point of the document is to dispel the illusion that the restriction to the "source" command for the loading Tcl Modules is an actual limitation. A secondary point is to make recommendations regarding preferred implementations, based the merits and weaknesses of the various possibilities.
Implementation Choices
A small recap first: Tcl Modules are Packages in a single file, and only source is used to import them into the running interpreter. These restrictions are the backdrop to all implementations discussed here.
Packages Written in Tcl
These are easy.
A package which is implemented in a single file is already in the form required for a Tcl Module and nothing has to be done at all.
+--------------+ | Tcl File | +--------------+
Most packages in Tcllibhttp://tcllib.sf.net can be of this form.
In the case of a package whose implementation is spread over multiple .tcl files the solution is equally simple. Just concatenate all the files into one file when generating the distribution. This is a trivial operation.
+--------------++--------------+...+--------------+ | Tcl File 1 Tcl File 2 Tcl File n | +--------------++--------------+...+--------------+
Some packages in Tcllibhttp://tcllib.sf.net can be of this form, or rewritten into it.
The usage of an compiler/obfuscater like TclPro/Tcl Dev Kit on such Tcl Modules is also no problem. While the result of these compilers contains binary, it is in encoded form, and the file is still a proper Tcl script which can be handled by source. The encoded binary is decoded by an adjunct package, tbcload, whose import is the first action done by the script.
This also points us already to the general solution for binary packages, i.e. usage of supporting packages to handle arbitrary data embedded or attached in some way in/to an initialization script.
The usage of pure-Tcl Modules within wrapped applications poses no problems at all.
Also note that all of the choices available to binary packages, as explained in the next section, are available to pure-Tcl packages as well.
Binary Packages
A binary package consists of a shared library, possibly with adjunct Tcl and data files. These have to be bundled into a single file to be a Tcl Module.
The general approach to this is to combine an init script written in Tcl with binary data attached to it, both sections separated by a ^Z character. This is possible since 8.4, where source was changed to read only up to the first ^Z and ignore the remainder of the file, whereas other file and channel operations will see it.
Embedding a Virtual Filesystem
The most obvious way of doing this is the any-kit approach: a small initialization script in front which loads all required supporting packages and then uses them to mount an attached virtual filesystem containing all the other files. After the mount any package specific initialization can be performed, either in the initialization script itself, or in a separate script file stored in filesystem. The latter is the recommended form as it keeps the main initialization script small and package neutral i.e. it will be only filesystem specific, and not package specific. These two tasks are kept separate, which is good design in general, and becomes more important later on as well.
+-------------||------------------------------------+
| Init header ^Z VFS +------------+ +-----...-----+ |
| ^Z | Shared lib | | Other files | |
| ^Z +------------+ +-----...-----+ |
+-------------||------------------------------------+
A concrete example of this are starkits, except that they use this technique to wrap an application into a single file, and not a package.
When interacting with wrapping this approach runs into problems. It is not possible to simply copy the module file into the wrapped application and then use it. The problem is a limitation in most implementations of alternate filesystem: they are not able to mount a virtual file again as a directory, i.e. nested mounting. This however is required when a Tcl Module using the any-kit approach is placed into a wrapped application. It was thought for a while that this could be a limitation in the VFS core of Tcl itself, but further investigation proved this to be wrong. This proof came in the form of TROFS, the Tcl Read Only Filesystem, by Don Porter. This filesystem supports nesting and thus shows that the Tcl core is strong enough for this as well. It is the implementation of a filesystem which determines if nesting is possible or not, and most do not support this.
See also SF bug report [941872] path<->FS function limitations http://sf.net/tracker/?func=detail&aid=941872&group_id=10894&atid=110894 for more details on this and other problems.
There are two ways to work around this limitation, while it exists. These are explained below. However note that even if the limitation is removed we may still run into performance problems because of a file accessed through several layers of file systems, each with its own overhead. The workarounds we are about to discuss will help with this as well by removing layers of indirection and are therefore of general importance.
The standard initialization script of the module is given code to recognize that it is stored in a virtual filesystem, and will copy the whole file to a temporary location and perform the mount on that. This workaround has to be done by every package.
The wrapper application used to create the wrapped application is extended with code which works around the problem. It would basically convert the Tcl Module into a regular package by copying the virtual filesystem in the module as a directory, and adding all the necessary scripts, like "pkgIndex.tcl". Here the separation of filesystem specific from package specific initialization comes into play as well as it makes the unbundling much easier. The generated package index file can simply refer to the same package initialization script as the filesystem specific header of the bundled module.
It should be noted that unbundling is limited to the filesystems which are recognized by the wrapper application. Because of this a combination of this and the previous approach might be best, as it allows the module to function even if the wrapper application was not able to unbundle it.
More a problem of taste might be that Tcl Modules in this form require additional packages which implement the filesystem they use. This can be remedied in the future by adding additional reflection capabilities to the Tcl core which would allow the implementation of channel drivers, channels transformations, and filesystems in pure Tcl, and then implementing simple filesystems based on that. This would also allow the Tcl core itself to make use of filesystems attached to its shared libraries and executables.
Appending a Shared Library
Should the binary package consist of only one shared library we can forgo the use of a full-blown virtual filesystem and simply attach the shared library to the init script as is. Instead of mounting anything the init script just has to copy the library to a temporary place and then "load" it.
+------------------------------||----------------+
| Init script (p name, p size) ^Z Shared library |
+------------------------------||----------------+
Tcl Modules implemented in this way will have no problems when used in a wrapped application as they will always copy their relevant file to the native filesystem before using it.
The disadvantage is that this is not a very general scheme. There are not that much packages which consist of only one shared library and nothing else.
Note: Should we ever get loading of a shared library directly from memory or from a location in another file, then copying the library to the filesystem won't be necessary anymore either.
Appending a Library and a VFS
An extension of the last approach is to attach the virtual file system not to the init script, but the shared library.
+-------------||----------------++---------------------+
| Init script ^Z Shared library // VFS +-----...-----+ |
| ^Z // | Other files | |
| ^Z // +-----...-----+ |
+-------------||----------------++---------------------+
This approach has the same advantages as the last with regard to its interaction with wrapping, i.e no problems, and additionally handles additional files coming with the shared library. The initialization of the VFS happens in the init script, but after the shared library has been loaded.
I have to admit that I am not sure if this will truly work. In essence the library will have to be told about the directory for its files after its C level initialization has been run.
If the VFS is required during the C level initialization then the VFS has to be initialized and mounted from within the shared library, i.e. at the C level. This is not very convenient as we need an embedded Tcl script for this, and that makes the code of the library more complicated than required.
Recommendations
We currently recommend usage of the any-kit approach for binary packages, despite its problem with nested mounting. This approach has an existing implementation in the metakit-based starkits and is thus well tested in general. The other two approaches are currently purely theoretical, with neither any implementation, nor testing.
Regarding Tcl packages no recommendation is necessary as we have in essence only one possibility for the more complex case, the simple concatenation of multiple files into a single one.
Questions
Comments
[ Add comments on the document here ]
Copyright
This document has been placed in the public domain.