Update of "ZIP virtual file system"

Overview

Artifact ID: 794b224e3948682ee7293cd5e296700fe2ec075f
Page Name: ZIP virtual file system
Date: 2020-10-24 05:34:01
Original User: chw
Parent: 5f2735dc004f8e10d18f6fc30bb360462cda24fb
Content

ZIPFS

AndroWish comes with a special ZIP virtual file system which uses mmap(2) to map a ZIP file (in this case AndroWish's APK, i.e. its own installation package) read-only into the process address space, which shortens startup time and speeds up subsequent read accesses. While this file system was designed primarily for AndroWish, it can be used on other platforms, too: undroidwish uses it on Windows and Linux to mount an archive of Tcl and native extensions which is appended to the executable portion of its binary. It is implemented in the files zipfs.c and zipfs.h in AndroWish's .../jni/tcl/generic folder and enabled in the Tcl core by the presence of the C preprocessor macro ZIPFS_IN_TCL.
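
The mapping step can be sketched in C as follows. This is a minimal illustration and not the code from zipfs.c: the helper names map_readonly and find_eocd are hypothetical, and the sketch assumes only POSIX mmap(2) and the standard ZIP end-of-central-directory signature (the bytes 'P', 'K', 5, 6). Scanning backwards for that record is one way to locate an archive that was appended to an executable, since the record sits at the end of the file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only; returns NULL on failure. */
static const unsigned char *map_readonly(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return NULL;
    }
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        return NULL;
    }
    *len = (size_t) st.st_size;
    return (const unsigned char *) p;
}

/*
 * Hypothetical helper: scan backwards for the end-of-central-directory
 * record. Returns its offset in buf, or -1 if no record is found.
 */
static long find_eocd(const unsigned char *buf, size_t len)
{
    static const unsigned char sig[4] = { 'P', 'K', 5, 6 };
    size_t i;

    if (len < 22) {  /* the fixed part of the record is 22 bytes long */
        return -1;
    }
    for (i = len - 22 + 1; i-- > 0; ) {
        if (memcmp(buf + i, sig, 4) == 0) {
            return (long) i;
        }
    }
    return -1;
}
```

Because the mapping is read-only and backed by the file, the operating system can page archive data in on demand instead of copying it into heap buffers up front.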

Low-level C interface

Tclzipfs_Init(Tcl_Interp *interp)

Performs one-time initialization of the file system and registers it process wide. Additionally, a package named zipfs is provided and supplemental Tcl commands are created in the given interpreter.

Tclzipfs_Mount(Tcl_Interp *interp, const char *zipname, const char *mntpt, const char *passwd)

Mounts the ZIP archive file zipname on the mount point mntpt using the optional ZIP password passwd. Errors during that process are reported in the interpreter interp. If zipname is a NULL pointer, information on all currently mounted ZIP file systems is written into interp's result as a sequence of mount points and ZIP file names.

Tclzipfs_MountBuffer(Tcl_Interp *interp, const char *mntpt, unsigned char *data, int length, int copy)

Mounts the ZIP archive contained in the memory buffer described by data and length on the mount point mntpt. Depending on the copy flag, a private copy of this memory buffer is made and used for the mount operation. Errors during that process are reported in the interpreter interp. If the mount operation succeeds, a string of the form "memory_<size>_<id>" is left in interp's result, identifying the archive from the memory buffer. This string can be passed as the zipname parameter in a later unmount operation. If mntpt is a NULL pointer, information on all currently mounted ZIP file systems is written into interp's result as a sequence of mount points and ZIP file names.

Tclzipfs_Unmount(Tcl_Interp *interp, const char *zipname)

Undoes the effect of Tclzipfs_Mount(), i.e. unmounts the mounted ZIP archive file zipname. Errors are reported in the interpreter interp.

Tcl commands

The zipfs package provides Tcl with the ability to mount the contents of a ZIP file as a virtual file system.

zipfs::exists filename

Returns 1 if the given filename exists in the mounted zipfs and 0 if it does not.

zipfs::find dir

Recursively lists files including and below the directory dir. The result list consists of relative path names starting from the given directory. This command is also used by the zipfs::mkzip and zipfs::mkimg commands.

zipfs::info file

Returns information about the given file in the mounted zipfs. The information consists of (1) the name of the ZIP archive file that contains the file, (2) the size of the file after decompression, (3) the compressed size of the file, and (4) the offset of the compressed data in the ZIP archive file.
Note: querying the mount point gives the offset of the start of the ZIP data in (4), which can be used to truncate the ZIP data off an executable.
Note: the file of a mounted ZIP archive appears as a directory but can be opened and read like a regular file if the mount process detected a non-archive area in front of the ZIP archive, e.g. when the ZIP archive was appended to an executable file. In this case that area can be read using the Tcl open and read commands, but file copy treats the mounted archive as a directory.

zipfs::list ?-glob|-regexp? ?pattern?

Lists files of any or all of the mounted ZIP archives. If pattern is omitted all files are listed. Otherwise pattern is interpreted as a glob or regexp pattern and used to list only files matching this pattern.

zipfs::lmkimg outfile inlist ?password? ?infile?

Like zipfs::mkimg but instead of an input directory inlist must be a list where the odd elements are the original input file names as copied into the archive and the even elements their respective names within the archive.

zipfs::lmkzip outfile inlist ?password?

Like zipfs::mkzip but instead of an input directory inlist must be a list where the odd elements are the original input file names as copied into the archive and the even elements their respective names within the archive.

zipfs::mkimg outfile indir ?strip? ?password? ?infile?

Creates an image (potentially a new executable file) similar to zipfs::mkzip. If the infile parameter is specified, this file is placed in front of the ZIP archive; otherwise the file returned by Tcl_NameOfExecutable(3) (i.e. the executable file of the running process) is used. If the password parameter is not empty, an obfuscated version of that password is placed between the image and ZIP chunks of the output file and the contents of the ZIP chunk are protected with that password.
Caution: highly experimental, not usable on Android, only partially tested on Linux and Windows.

zipfs::mkkey password

Returns an obfuscated string version of the clear text password argument, in the same format as used by the zipfs::mkimg command.

zipfs::mkzip outfile indir ?strip? ?password?

Creates a ZIP archive file named outfile from the contents of the input directory indir (only the regular files it contains) with the optional ZIP password password. While processing the files below indir, the optional prefix given in strip is stripped off the beginning of the respective file name.
Caution: the choice of the indir parameter (less the optional strip prefix) determines the later root name of the archive's content.

zipfs::mount ?zipfile? ?mountpoint? ?password?

zipfs::mount -file zipfile mountpoint ?password?

zipfs::mount -- zipfile mountpoint ?password?

This command mounts a ZIP archive file as a VFS. After this command executes, files contained in zipfile will appear to Tcl to be regular files at the mount point.
In the first command form, if mountpoint is omitted, the mount point for zipfile is returned. If zipfile is omitted as well, all zipfile/mount point pairs are returned. If mountpoint is specified as an empty string, the mount point will be the current directory. If password is specified, files from zipfile are decrypted using this password when read.

zipfs::mount -data bytearray mountpoint

The data in bytearray must represent a ZIP archive which gets mounted on mountpoint. If the mount operation succeeds, the result is a string of the form "memory_<size>_<id>" which can later be used as zipfile parameter in an unmount operation.

zipfs::mount -chan channelId mountpoint

A ZIP archive is read from channel channelId and mounted on mountpoint. If the mount operation succeeds, the result is a string of the form "memory_<size>_<id>" which can later be used as zipfile parameter in an unmount operation.

zipfs::unmount zipfile

Unmounts the mounted ZIP archive file zipfile.

zipfs::unwrap ?filename?

If filename is the root of a mounted ZIP archive, its content is unpacked to a local directory named filename.vfs. This directory must not exist prior to the call. Otherwise, filename is temporarily mounted before the unpack operation takes place and unmounted afterwards. If filename is omitted, the result of info nameofexecutable is used instead, i.e. the main ZIP archive of the running process is unpacked.

The commands described above are available as subcommands in the zipfs ensemble, i.e. zipfs list is equivalent to zipfs::list.

zipfs as Tcl (and Tk) bootstrap file system

On the Android platform zipfs is used to boot Tcl/Tk from the APK by mounting the APK file early on the file system root as seen by Tcl. Since nearly all relevant files within the APK are below the assets folder, this lets Tcl see the directory /assets with its library directories, e.g. the /assets/tcl8.6 directory with Tcl's library modules, encoding tables etc. The path /assets/tcl8.6 is hard-coded into the Tcl shared library, and based on it all other packaged library directories can be found during Tcl initialization.

For standalone apps a similar approach is chosen by hard coding the file /assets/app/main.tcl as the file to be sourced (if present) right after Tcl's initialization. This allows for packaging Tcl based apps as an APK, see the description in AndroWish SDK for instructions.

On other platforms (currently tested on Linux and Windows) the initial mount of an embedded ZIP file system is done on the executable itself, e.g. if /home/john/awish is the Tcl/Tk binary with an included ZIP file system, the Tcl library directory of the file system when mounted becomes /home/john/awish/tcl8.6. Similarly, built-in application code will be started from the file /home/john/awish/app/main.tcl if present. Additionally, the contents of the optional file /home/john/awish/app/cmdline are appended to the command line before Tk is initialized and control is transferred to the main.tcl script. This is useful to set up certain aspects of SDL, e.g. to start in full screen mode with or without a changed display resolution (see the description of SDL startup options in Beyond AndroWish). Another hook is /home/john/awish/app/icon.bmp which (if present) should be a Windows BMP 24 bit RGB bitmap file used as the icon for the SDL root window.

On Windows platforms the drive letter of the base executable is prepended to the respective path names. For the example above this means: C:\home\john\awish.exe is the binary, C:/home/john/awish.exe/tcl8.6 becomes the Tcl library directory, C:/home/john/awish.exe/app/main.tcl is the optional application script, and so on.
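
The path mapping described above can be illustrated with a small C sketch. The helper zip_boot_path is hypothetical and not part of zipfs.c; it only shows how the virtual names relate to the executable path. Converting every backslash to a forward slash is a simplification that is meant for the Windows case only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the virtual path of a file inside the ZIP
 * archive that is mounted on the executable itself. Backslashes are
 * converted to forward slashes, matching the Windows example in the text.
 * Returns 0 on success, -1 if the result does not fit into out.
 */
static int zip_boot_path(const char *exe, const char *rel,
                         char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s/%s", exe, rel);
    char *p;

    if (n < 0 || (size_t) n >= outlen) {
        return -1;
    }
    for (p = out; *p != '\0'; p++) {
        if (*p == '\\') {
            *p = '/';
        }
    }
    return 0;
}
```

Called with the executable C:\home\john\awish.exe and the relative name tcl8.6, the helper yields C:/home/john/awish.exe/tcl8.6, the library directory named in the example above.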

For a small sample script refer to Make minimal vanillawish binary.

Some delicate implementation details

For loading binary Tcl extensions (shared libraries) on certain platforms (Linux and FreeBSD), special handling is attempted:

To improve glob operations the ZIP virtual file system uses two hash-based data structures: ZipEntry for regular files and ZipDirEntry for directories, which additionally contains a hash table to accelerate lookups in this directory. For typical searches, this usually outperforms the native OS functions.
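
The idea of a per-directory hash table can be sketched as follows. The structure layouts, field names, bucket count, and the helpers hash_name, dir_add and dir_lookup are all assumptions made for illustration; the real ZipEntry and ZipDirEntry definitions in zipfs.c may look quite different.

```c
#include <stddef.h>
#include <string.h>

#define NBUCKETS 64

/* Hypothetical, simplified layouts; not the definitions from zipfs.c. */
typedef struct ZipEntry {
    const char *name;             /* name relative to its directory */
    long offset;                  /* offset of the data in the archive */
    struct ZipEntry *next;        /* next entry in the same hash bucket */
} ZipEntry;

typedef struct ZipDirEntry {
    const char *path;             /* full path of the directory */
    ZipEntry *buckets[NBUCKETS];  /* per-directory hash table of children */
} ZipDirEntry;

/* djb2 string hash, reduced to a bucket index. */
static unsigned hash_name(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0') {
        h = h * 33u + (unsigned char) *s++;
    }
    return h % NBUCKETS;
}

/* Insert an entry at the head of its bucket chain. */
static void dir_add(ZipDirEntry *dir, ZipEntry *e)
{
    unsigned b = hash_name(e->name);

    e->next = dir->buckets[b];
    dir->buckets[b] = e;
}

/* Look up a child by name; only one bucket chain is walked. */
static ZipEntry *dir_lookup(const ZipDirEntry *dir, const char *name)
{
    ZipEntry *e;

    for (e = dir->buckets[hash_name(name)]; e != NULL; e = e->next) {
        if (strcmp(e->name, name) == 0) {
            return e;
        }
    }
    return NULL;
}
```

With such a table, checking whether a name exists in a directory touches a single bucket chain instead of scanning every entry of the archive.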