TIP 421: A Command for Iterating Over Arrays

Login
Bounty program for improvements to Tcl and certain Tcl packages.
State:		Final
Type:		Project
Tcl-Version:	8.7
Vote:		Done
Post-History:
Author:		Karl Lehenbauer <karllehenbauer@gmail.com>
Author:		Brad Lanam <brad.lanam.comp@gmail.com>
Author:		Donal K. Fellows <dkf@users.sf.net>
Created:	28-Nov-2012
Updated:	24-Oct-2017
For:		DKF, AF, JN, SL, KBK, DGP, AK
Against:	none
Present:	none
Tcl-Branch:     tip-421

Abstract

This TIP proposes an efficient mechanism for iterating over the contents of a large array.

Rationale

Tcl currently provides three main mechanisms for iterating over the contents of an array, but none are quite perfect when dealing with a large array.

  • array get is simple to use (especially with a two-variable foreach) but requires the contents of the array to be effectively duplicated; even with the use of the Tcl_Obj system for value reference management, this is an expensive operation.

  • array names (with a simple foreach) is also relatively simple to use, but requires producing a list whose size is the same as the number of elements of the array. (This is half the size that would be required with array get, but can still be large.)

  • array startsearch et al. provide a memory-efficient general iteration mechanism, but in a way that is rather difficult to use. It is also subject to significant hazards if the array is modified during iteration (a particular problem for the global env array, as that is regenerated on almost any read).

The authors propose that there be a new subcommand of array which allows for efficient iteration over an array's elements.

Proposed Change

There should be a new command, array for, that has this syntax:

array for {keyVar valueVar} arrayName body

This will iterate internally over the elements of the array called arrayName in array-iteration order (i.e., the same as that used by the other array subcommands), setting the variable keyVar to the name of the element and the variable valueVar to the content of the element before evaluating the script body. The result will be the empty string (excepting errors, return, etc.) and any contained break and continue will have their normal interpretation as loop control operations.

Modifying the array while this command is iterating over it will make the array for command terminate with an error once the current execution of body finishes.

Implementation

Branch: tip-421 (Code is also available here within a comment, but that is not the definitive version.)

The implementation is no faster then using array names or array get, but uses much less memory.

References

Bounty #12, Bounty #27

Copyright

This document has been placed in the public domain.

History