TIP 529: Add metadata dict property to tk photo image

Author:         Harald Oehlmann <oehhar@users.sourceforge.net>
State:          Draft
Type:           Project
Vote:           Pending
Created:        07-Dec-2018
Keywords:       Tk, image
Tcl-Version:    8.7
Tk-Branch:     tip-529-image-metadata

Abstract

An additional property is proposed for photo images to hold a dictionary with image metadata:

myimage cget -metadata
myimage configure -metadata [dict create DPI 300.0]

The content of the dictionary is initialized on image load and used on image save.

Rationale

Image files may contain a lot of metadata such as resolution, comments or GPS location. This metadata should be accessible and settable with the following aims:

  • Make it available at script level after image load.
  • Make it settable within the image.
  • Write its data to the image file.

Image resolution

This TIP especially targets the resolution (DPI) value of the image.

The image resolution included in an image file is crucial for its usage, as many applications (Word and similar) use this field to calculate a default size. One may imagine that image files used in pdf4tcl are automatically scaled to the right resolution (i.e. the resolution saved in the image file).

This information is included in PNG files (supported by the Tk core) and many other image formats included in the Img patch.

I authored an extension to the Img patch to specify the DPI field of a BMP file on file writing. The syntax was agreed with Jeff Hobbs:

myimage write file.bmp -format [list bmp -resolution 300 i]

This may be expressed (once all packages are adapted) by:

myimage configure -metadata [dict create DPI 300.0]
myimage write file.bmp

Comment data

A comment may be used to save custom data in the image file.

An example is a vision automation project where a test procedure is connected to each image. My solution is to use GIF images and to store the test procedure (a Tcl script) in the GIF comment.

Preview extension for new command "image find"

The match functions should also be able to output the metadata dict. This is due to the plan by Paul Obermeier to make the match function alone available on the script level. See the discussion section for the message.

Specification

Metadata Dict

The property "-metadata" is added to each image. It contains a dictionary, where the keys of the dictionary are specific to each photo image format.

The following default keys are proposed:

Key       Description                                   Example image formats
DPI       Horizontal image resolution in DPI (double)   png
aspect    Aspect ratio horizontal/vertical (double)     png, gif
comment   Text comment                                  png, gif

Comments on the key choice:

  • Abbreviations are in upper case.
  • Words are in American English in lower case (except proper nouns).
  • Vertical DPI is expressed as DPI/aspect. The reason is that some image formats may feature aspect and no resolution value.
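
The relation between the DPI and aspect keys can be sketched at script level as follows. The helper name verticalDPI is hypothetical, and treating a missing aspect key as a ratio of 1.0 is an assumption of this sketch, not part of the proposal.

```tcl
# Hypothetical helper: derive the vertical resolution from a metadata dict.
# Returns an empty string if the dict carries no DPI key.
proc verticalDPI {metadata} {
    if {![dict exists $metadata DPI]} {
        return ""
    }
    # Assumption of this sketch: no aspect key means square pixels
    set aspect 1.0
    if {[dict exists $metadata aspect]} {
        set aspect [dict get $metadata aspect]
    }
    # Vertical DPI is expressed as DPI/aspect
    return [expr {double([dict get $metadata DPI]) / $aspect}]
}

puts [verticalDPI [dict create DPI 300.0 aspect 2.0]]   ;# prints 150.0
```

A format that only provides aspect yields an empty result here, which mirrors the reason given above for keeping the two keys separate.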

It is valid to set any key within the application. Any unknown key should be ignored by the application and image format drivers.
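
As an illustration of ignoring unknown keys, a consumer could pick only the keys it understands and leave the rest untouched. This is a minimal sketch in plain Tcl; the helper name knownKeys and the key MyAppKey are hypothetical.

```tcl
# Hypothetical helper: return only those entries of a metadata dict
# whose keys appear in the list of keys the caller understands.
proc knownKeys {metadata known} {
    set result [dict create]
    foreach key $known {
        if {[dict exists $metadata $key]} {
            dict set result $key [dict get $metadata $key]
        }
    }
    return $result
}

# MyAppKey is an application-specific key and is silently skipped
puts [knownKeys {DPI 300.0 MyAppKey 42 comment hello} {DPI aspect comment}]
# prints: DPI 300.0 comment hello
```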

If a particular image does not specify any keys (whether during creation or otherwise) then the dictionary will be empty.

Each photo image format driver may define additional keys and may decide to use them for input (as a parameter for image read and/or image write), output (as an image read result) or both.

The TIP implementation does not aim to implement all possible keys of all image formats for reading and writing immediately. The set of keys per image format may grow over time on a use-case basis.

Commands

The following commands are extended by a -metadata parameter:

image create photo myimage -metadata $metadict
myimage cget -metadata
myimage configure -metadata $metadict
myimage data -metadata $metadict
myimage put -metadata $metadict
myimage read -metadata $metadict
myimage write -metadata $metadict

Any image format handler may use the content of the metadata dict. This may be an ongoing process, especially within the Img patch.

Here is an overview of which commands read or set the metadata dict:

Command             Reads current image   Reads command options   Writes current image   Driver data
                    metadata dict         metadata dict           metadata dict          merged in
image create        no                    yes                     yes                    yes
myimage cget        yes                   no                      no                     no
myimage configure   yes (1)               yes                     yes                    yes
myimage put         no                    yes                     no                     no
myimage read        no                    yes                     no                     no
myimage data        yes (1)               yes                     no                     no
myimage write       yes (1)               yes                     no                     no

Footnotes:

(1) The current metadata is ignored if a metadata dict is given as command parameter.

(3) The current metadata is ignored by the image export if a metadata dict is given as command parameter.

Each command is now discussed within a subchapter:

image create

Image create will parse the image data and create the metadata dict of the image.

As an example, a gif file with a comment would create a comment metadata key within the image:

% image create photo myimage -format GIF -file testwithcomment.gif
% myimage cget -metadata
Comment "This is the image comment"

A metadata dict given on the command line will be merged with the parsed metadata dict, with priority given to the file metadata. This allows default values to be specified for keys which should be present.

An example with the same image as above:

% image create photo myimage -format GIF -file testwithcomment.gif\
   -metadata [dict create User A Comment "Comment from command line"]
% myimage cget -metadata
User A Comment "This is the image comment"
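
The priority rule corresponds to the behaviour of the core dict merge command, where later dictionaries override earlier ones. The following is only a script-level model with illustrative variable names; the actual merge happens inside Tk.

```tcl
# Model of the merge performed by "image create":
# the metadata parsed from the file has priority over the command line.
set fromCommandLine [dict create User A Comment "Comment from command line"]
set fromFile        [dict create Comment "This is the image comment"]

# dict merge lets later dictionaries override earlier ones
set result [dict merge $fromCommandLine $fromFile]
puts $result   ;# prints: User A Comment {This is the image comment}
```

The key User survives as a default because the file does not provide it, while Comment is taken from the file.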

myimage cget

The metadata dict may be retrieved by:

myimage cget -metadata

myimage configure

The metadata dict of the image may be overwritten by:

myimage configure -metadata [dict create Comment "Comment from configure"]

The image data is not touched and no image data interpretation is triggered.

The retrieval methods will return the metadata dict as for any other option:

% myimage configure -metadata
-metadata {} {} {} {Comment "Current comment"}

Setting one of the -format, -data or -file options to a different value will recreate the image with the new parameters. In this case, a -metadata option, if present, will first replace the current metadata of the image. Then the image recreation takes place (using the specified metadata dict, if any) and may add keys to the image metadata dict.

It is not possible to trigger an image recreation by just specifying a metadata dict. This is to avoid unneeded image recreation.

Note: parameters to change the rendered image should use the -format option. The metadata may provide additional data.

When the image is rendered again due to a change of the options -file, -data or -format, the following procedure applies:

  • The current image metadata is replaced by the metadata dict given with the command, if any.
  • The image driver is called with the resulting metadata to render the image.
  • Any metadata key output by the image driver during rendering is set in the image metadata. As a consequence, the result is a merge of the current and the generated metadata.
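
The three steps can be modelled at script level as shown below. This is a sketch under stated assumptions: the proc name rerenderMetadata is hypothetical, the driver call itself is not modelled, and the driver output is simply passed in as a dict.

```tcl
# Hypothetical model of the metadata handling when an image is re-rendered.
#   current   - metadata dict currently stored in the image
#   driverOut - keys reported by the format driver while rendering
#   args      - optionally one metadata dict given with the configure command
proc rerenderMetadata {current driverOut args} {
    # Step 1: a metadata dict given with the command replaces the current one
    if {[llength $args] > 0} {
        set current [lindex $args 0]
    }
    # Step 2 (not modelled): the driver is called with $current
    # Step 3: keys output by the driver are set in the image metadata
    return [dict merge $current $driverOut]
}

puts [rerenderMetadata {User A} {Comment hello}]
# prints: User A Comment hello
puts [rerenderMetadata {User A} {Comment hello} {DPI 300.0}]
# prints: DPI 300.0 Comment hello
```

In the second call the key User disappears, because the dict given with the command replaces the stored one before the driver output is merged in.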

myimage put

The put command sets (parts of) the image data by specified new image data.

The -metadata property of the image is not changed. This is consistent with other parameters like -format.

To replace the whole image including metadata, the configure command may be used by setting the -data option.

Example with gif data containing a comment:

% image create photo myimage -metadata [dict create Comment "Comment from image create"]
% myimage put $GIFWithCommentData
% myimage cget -metadata
Comment "Comment from image create"

A -metadata option may be specified to support the image read. Nevertheless, this metadata is not included in the metadata property of the image.

Example:

% image create photo myimage -metadata [dict create Comment "Comment from image create"]
% myimage put $GIFWithCommentData -metadata [dict create Comment "Comment from put command line"]
% myimage cget -metadata
Comment "Comment from image create"

myimage read

The read command sets (parts of) the image data by new image data read from a file. This command acts like the put command, with the difference that the image data comes from a file.

The -metadata property of the image is not changed. This is consistent with other parameters like -format.

To replace the whole image including metadata, the configure command may be used by setting the -file option.

Example with a gif file containing a comment:

% image create photo myimage -metadata [dict create Comment "Comment from image create"]
% myimage read gifwithcomment.gif
% myimage cget -metadata
Comment "Comment from image create"

A -metadata option may be specified to support the image read. Nevertheless, this metadata is not included in the metadata property of the image. There is currently no practical application for this, but future use cases may need it.

Example:

% image create photo myimage -metadata [dict create Comment "Comment from image create"]
% myimage read test.gif -metadata [dict create Comment "Comment from read command line"]
% myimage cget -metadata
Comment "Comment from image create"

myimage data

The data command writes the image data into a variable.

If the image format supports a specified metadata key, it is included in the output data.

If a -metadata option is given, the metadata property of the image is ignored. Otherwise, the metadata property of the image is used.

Example to write a comment in gif data included in the image properties:

% image create photo myimage -file test.png -metadata [dict create Comment "Comment from image create"]
% myimage data -format "GIF"
... GIF data with comment included

Example to specify the comment with the command options:

% image create photo myimage -file test.png
% myimage data -format "GIF" -metadata [dict create Comment "Comment from data command"]
... GIF data with comment included

myimage write

The write command writes the image data to a file. With respect to metadata, it works the same way as the data command.

Example to write a metadata comment:

% image create photo myimage -file test.png
% myimage write GifwithComment.gif -format "GIF" -metadata [dict create Comment "Comment from write command"]
... GIF data with comment included

Notes on Options to image and metadata creation

The metadata is not suited to pass processing options to the driver. For this aim, options should be added to the "-format" option.

In contrast, a driver may understand options passed by the "-format" option to modify its metadata processing.

Let's consider the following imaginary example: an image driver contains a full EXIF parser which creates many keys as output. This processing is expensive in processing time and output data creation. In consequence, the driver author may decide to create the EXIF output only when a given option is set:

image create photo -file photo.jpg -format "jpg -exif 1"

Image format driver interface

The image format driver interface is changed in the following aspects:

Pass metadata dict as parameter

Each driver function gets a Tcl object pointer "metadataIn" as parameter. This parameter serves to input a metadata dict to the driver function. It may be NULL to flag that the metadata dict is empty.

A typical driver code snippet to check for a metadata key is:

if (NULL != metadataIn) {
    Tcl_Obj *itemData = NULL;
    /* itemData is set to NULL if the key is not present in the dict */
    Tcl_DictObjGet(interp, metadataIn, Tcl_NewStringObj("Comment",-1), &itemData);
}

To strictly fulfill the objective of this TIP, only the Write functions need to receive the metadata. Nevertheless, it is passed in the same way as the -format parameter, which is available to all functions. The reason is to support use cases not yet specified. I feel it would be a pity to limit the functionality by not passing the metadata to all format driver functions.

Receive a metadata dict from the driver (FileRead,StringRead)

The image match and read functions (FileMatch, StringMatch, FileRead, StringRead) may set keys in a prepared metadata dict to return them. Those functions get an additional Tcl object pointer "metadataOut" as parameter.

This parameter may be NULL to indicate that no metadata return is expected (put, read subcommands).

This parameter is initialized to an empty unshared dict object if metadata return is expected (image create command, configure subcommand). The driver may set dict keys in this object to return metadata.

A sample driver code snippet is:

if (NULL != metadataOut) {
    Tcl_DictObjPut(NULL, metadataOut, Tcl_NewStringObj("XMP",-1), Tcl_NewStringObj(xmpMetadata,-1));
}

Image format driver interface

For image format drivers, a new registration procedure is proposed which includes functions with the new parameters. In addition, the parameters are reordered to always have the order interp, input parameters, output parameters, ancillary functions.

The new stubs-enabled function is:

void Tk_CreatePhotoImageFormatVersion3(const Tk_PhotoImageFormatVersion3 *formatPtr)

The function parameters in Tk_PhotoImageFormatVersion3 are as follows:

int (Tk_ImageFileMatchProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
        const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr,
        int *heightPtr, Tcl_Obj *metadataOut);

int (Tk_ImageStringMatchProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
        Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr, int *heightPtr,
        Tcl_Obj *metadataOut);

int (Tk_ImageFileReadProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
        const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn,
        Tk_PhotoHandle imageHandle, int destX, int destY, int width,
        int height, int srcX, int srcY, Tcl_Obj *metadataOut);

int (Tk_ImageStringReadProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
        Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoHandle imageHandle,
        int destX, int destY, int width, int height, int srcX, int srcY,
        Tcl_Obj *metadataOut);

int (Tk_ImageFileWriteProcVersion3) (Tcl_Interp *interp, const char *fileName,
        Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

int (Tk_ImageStringWriteProcVersion3) (Tcl_Interp *interp, Tcl_Obj *format,
        Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

Documentation

The manual page of "Tk_CreatePhotoImageFormat" will describe the new version 3 interface. The following remark about the life cycle of the interfaces is added in a similar way as the current remark about the version 1 (old) interface:

Version 3 format driver interface is recommended for new projects. Expect version 2 format driver interface and the command Tk_PhotoImageFormat to be removed with Tk 9.0.

In addition, it might be a good idea to speak about version 1 (old), version 2 and version 3 interface.

Implementation

Implementation is in branch "tip-529-image-metadata". The following metadata keys are implemented:

  • gif: comment
  • png: DPI, ratio

Thanks to Paul Obermeier, the TkImg package has implemented the new interface and currently uses it for DPI setting and reporting.

A set of test cases is included in the implementation. In addition, the TkImg package features additional tests for this patch which exercise additional features like stub table, image parameters. The TkImg patch links currently against the optional branch (see below), not against the branch tag tip-529-image-metadata.

It was necessary to fix the file tests/earth.gif as the file is incomplete.

Discussion

image find command planned by Paul Obermeier

What about extending and exposing the functionality of the MatchProc function at the Tcl level? That way it would be possible to implement a command like "image info ", where you can retrieve the image size, resolution and additional metadata without explicitly loading the image. In my image browser (http://www.posoft.de/html/poImgBrowseShots.html#Img1) I am currently using a modified version of the Tcllib fileutil::fileType procedure to extract the image size without creating a photo image. Getting that information (and additional metadata) directly from the C-based image parsers would be faster and there would be no need to code that functionality twice.

HaO: This functionality is prepared by the possibility that the match driver functions also may output metadata.

Update DPI metadata property on image script

Paul Obermeier has made the following proposal:

How do you want to handle the physical resolution (DPI) in the case of image scaling? Just keep the original DPI value or adjust the DPI values automatically, maybe using an option.

HaO: Currently, this is not planned and may be implemented by another TIP.

No use of other optional features by Paul Obermeier

I asked Paul whether he sees any use of the optional features below for himself or for TkImg. The answer was: no, I don't see any use.

Rejected Alternatives

During the last two years of development, the following additional ideas were implemented. They are all available in the Tk branch "tip-529-image-metadata-optional". They are not included in the TIP and not included in the main implementation. People may speak up to get any feature back into the main feature branch.

The optional features are:

  • Implement XMP metadata type for gif.
  • Optimize the SVG processing to store the preparsed binary blob in the metadata. This blob may be used for fast scaling of the image.
  • Optimize driver internal communication: provide a DString memory to the image format driver to pass data from the match functions to the read functions.
  • Optimize file access: allow the image driver to indicate, that the file is not needed any more after the match call.

The following subsections discuss the optional features. The format driver interface with all options is shown in an additional subsection after them.

Another rejected alternative, using only one metadata pointer in the interface, follows in the last subsection.

XMP data

Photo images may contain an XMP data structure which may hold structured data. The aim is to make this data accessible. The parsing of the XML structure is not part of this TIP and may be done by other packages.

The metadata key is:

Key   Description      Example image formats
XMP   XMP image data   gif, png

XMP support is implemented for the GIF format. Due to a missing use case, it was removed from the main branch.

SVG optimization by a metadata key holding the preparsed svg blob

Rationale

The application is within the current SVG implementation included in Tk8.7a3.

The used svg routines from the nanosvg project split svg processing into two steps:

  • Step 1: transform the xml data to a binary representation of the splines
  • Step 2: render the splines to an image presentation

When svg files are loaded by:

image create photo i1 -file test.svg -format {svg -scaletoheight 16}

then the file is accessed, the XML data is loaded, and processing steps 1 and 2 are performed.

When the image is scaled by:

i1 configure -file "" -data {<svg source="metadata" >} -format {svg -scaletoheight 32}

then the same steps are performed as on image load, while only step 2 would be necessary. The performance is poor and the file must still be available.

The idea is to store the binary representation of the splines (the result of processing step 1) as a key in the -metadata dict (say SVGBLOB), so that only step 2 has to be performed on scaling.

In addition, svg image may even be "compiled" to the metadata structure, so the following command may work:

image create photo i1 -metadata {SVGBLOB ...} -data {<svg source="metadata" >} -format {svg -scaletoheight 32}

This will only work within the same patchlevel of Tk on the same architecture (endianness, int size), as the format may change, but it may be useful, for example, when packing into a starkit.

In my talk at ETCL 2019, I showed an Android GUI where buttons may be scaled by a pinch-to-zoom gesture. The current performance is quite poor.

Specification

The svg driver returns a preparsed image blob in the metadata key "SVGBLOB". This data is used as image data, if the -data parameter contains the string "<svg data="metadata" />".

A sample script is as follows:

image create photo foo -data $svggradient -format svg
foo configure -file "" -data "<svg data=\"metadata\" />" -format "svg -scale 2"

Internally, the driver uses the driver internal DString to communicate between the match and read functions.

The output of the svg parser is serialized in arrays and put into the memory block. All functions of the rendering functions dealing with the input data are changed to use the array.

Discussion

I see only a small speed-up (around 10%) by this solution. In addition, the version without this option is as fast as the optimized version. So, we have a slowdown for the normal case and no gain for the optimized version.

My test script is as follows:

  • take the file from [https://svgstudio.com/pages/free-sample]
  • use the following script:

image create photo foo -file Freesample.svg -format svg
proc switch {} {
    foo configure -format {svg -scale 2}
    foo configure -format {svg -scale 1}
}
time switch 100
40139.367 microseconds per iteration

  • now activate the use of the metadata:

foo configure -file "" -data "<svg data=\"metadata\" />" -format "svg"
38641.935 microseconds per iteration

I expected that removing the file operations and the parsing would make this an order of magnitude faster, but the measurements show otherwise. Apparently, copying all this data around takes a lot of additional time, the nanosvg parser is very fast, and most of the processing time is taken by image rendering.

Match and read function communication memory

The match functions and the image read functions get an additional parameter "driverInternalPtr" which points to an initialized DString. The DString is cleared by the framework.

Using this DString, the driver match function may pass data to its read function.

The rationale is the current implementation of the SVG driver:

  • The driver currently uses thread-specific data to pass data from the match procedure to the read procedure. For that reason, a simpler alternative is proposed.

Flag that the match function does not need the channel any more

The driver file match function may flag that it no longer needs the channel. Only in this case should the additional output int "closeChannel" be set to 1. A NULL channel is then passed to the read driver function.

The rationale is the current implementation of the SVG driver:

  • The driver does not need the file any more after the match procedure. Thus, any preparing file operations (seek etc) may be omitted and a NULL channel may be passed.

Format driver interface with all options

The function parameters in Tk_PhotoImageFormatVersion3 are as follows:

int (Tk_ImageFileMatchProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
    const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr,
    int *heightPtr, Tcl_Obj *metadataOut, int *closeChannelPtr,
    Tcl_DString *driverInternalPtr);

int (Tk_ImageStringMatchProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
    Tcl_Obj *format, Tcl_Obj *metadataIn, int *widthPtr, int *heightPtr,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageFileReadProcVersion3) (Tcl_Interp *interp, Tcl_Channel chan,
    const char *fileName, Tcl_Obj *format, Tcl_Obj *metadataIn,
    Tk_PhotoHandle imageHandle,
    int destX, int destY, int width, int height, int srcX, int srcY,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageStringReadProcVersion3) (Tcl_Interp *interp, Tcl_Obj *dataObj,
    Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoHandle imageHandle,
    int destX, int destY, int width, int height, int srcX, int srcY,
    Tcl_Obj *metadataOut, Tcl_DString *driverInternalPtr);

int (Tk_ImageFileWriteProcVersion3) (Tcl_Interp *interp, const char *fileName,
    Tcl_Obj *format, Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

int (Tk_ImageStringWriteProcVersion3) (Tcl_Interp *interp, Tcl_Obj *format,
    Tcl_Obj *metadataIn, Tk_PhotoImageBlock *blockPtr);

Single metadata parameter for input and output

A first approach was to use one metadata parameter to the format driver functions which allowed combined input and output. The properties are:

  • no new image format definition required.
  • it is not possible to inform the driver routines, that no metadata output is expected.
  • the image driver function must take care of shared objects and create a copy on modification. Thus, a pointer to an object pointer must be passed.
  • a metadata dict must be prepared even if there is no metadata, and cleaned up after each match round.

This solution was not chosen because of the complicated way in which the format driver functions would have to set a metadata dict. In addition, it is seen as valuable that the information "no metadata output, please" can be transmitted.

The implementation is in branch "tip-529-image-metadata-jan".

Copyright

This document has been placed in the public domain.
