SECTION 7 PROPERTY LISTS AND HASH LINKS 7.1 Property Lists Property lists are entities associated with literal atoms, and are stored on cdr of the atom. Property lists are conventionally lists of the form (property value property value ... property value) although the user can store anything he wishes in cdr of a literal atom. However, the functions which manipulate property lists observe this convention by cycling down the property lists two cdrs at a time. Most of these functions also generate an error, ARG NOT LITATOM, if given an argument which is not a literal atom, i.e., they cannot be used directly on lists. The term "property name" or "property" is used for the property indicators appearing in the odd positions, and the term "property value" or "value of a property" or simply "value" for the values appearing in the even positions. Sometimes the phrase "to store on the property --" is used, meaning to place the indicated information on the property list under the property name --. Properties are usually atoms, although no checks are made to eliminate use of non-atoms in an odd position. However, the property list searching functions all use eq. Property List Functions getproplist[atm] if atm is a literal atom, returns property list of atm. Otherwise, generates ARG NOT LITATOM error. getproplist compiles open without any error checks. setproplist[atm;lst] if atm is a non-NIL literal atom, sets property list of atm to be lst, and returns lst as its value. If atm is NIL, generates an ATTEMPT TO RPLAC NIL (unless lst is also NIL). If atm is not a literal atom, generates an ARG NOT LITATOM error. 7.1 1 putprop[atm;prop;val] puts on the property list of atm, the property prop with value val. val replaces any previous value for the property prop on this property list. Generates an error, ARG NOT LITATOM, if atm is not a literal atom. Value is val. addprop[atm;prop;new;flg] adds the value new to the list which is the value of property prop on property list of atm. If flg is T, new is consed onto the front of value of prop, otherwise it is nconced on the end (nconc1). If atm does not have a property prop, the effect is the same as putprop[atm;prop;list[new]], for example, if addprop[FOO;PROP;FIE] is followed by addprop[FOO;PROP;FUM], getprop[FOO;PROP] will be (FIE FUM). The value of addprop is the (new) property value. If atm is not a literal atom, generates an error, ARG NOT LITATOM. remprop[atm;prop] removes all occurrences of the property prop (and its value) from the property list of atm. Value is prop if any were found, otherwise NIL. If atm is not a literal atom, generates an error, ARG NOT LITATOM. changeprop[x;prop1;prop2] Changes name of property prop1 to prop2 on property list of x, (but does not affect the value of the property). Value is x, unless prop1 is not found, in which case, the value is NIL. If x is not a literal atom, generates an error, ARG NOT LITATOM. 2 getprop[atm;prop] gets the property value for prop from the property list of atm. The value of getprop is NIL if atm is not a literal atom, or prop is not found. Note: the value of getprop may also be NIL, if there is an occurrence of prop but the corresponding property value is NIL. ------------------------------------------------------------------------ 1 listput, listput1, listget, and listget1 are functions similar to putprop and getprop that work directly on lists. they are described in Section 5. 2 getprop used to be called getp. 7.2 Note: Since getprop searches a list two items at a time, the same object can be used as both a property name and a property value, e.g., if the property list of atm is (PROP1 A PROP2 B A C), then getprop[atm;A] = C. getlis[x;props] searches the property list of x, and returns the property list as of the first property on props that it finds e.g., if the property list of x is (PROP1 A PROP3 B A C), getlis[x;(PROP2 PROP3)]=(PROP3 B A C) Value is NIL if no element on props is found. x can also be a list itself, in which case it is searched as above. deflist[l;prop] is used to put values under the same property name on the property lists of several atoms. l is a list of two-element lists. The first element of each is a literal atom, and the second element is the property value for the property prop. The value of deflist is NIL. Note: Many atoms in the system already have property lists, with properties used by the compiler, the break package, DWIM, etc. Be careful not to clobber such system properties. The value of sysprops gives the complete list of the property names used by the system. 7.2 Hash Links The description of the hash link facility in Interlisp is included in the chapter on property lists because of the similarities in the ways the two features are used. A property list provides a way of associating information with a particular atom. A hash link is an association between any Interlisp pointer (atoms, numbers, arrays, strings, lists, et al) called the hash-item, and any other Interlisp pointer called the hash-value. Property lists are stored in cdr of the atom. Hash links are implemented by computing an address, called the hash-address, in a specified array, called the hash-array, and storing the hash-value and the hash-item into the cell with that address. The contents of that cell, i.e. the hash-value and hash-item, is then called 3 the hash-link. Since the hash-array is obviously much smaller than the total number of ------------------------------------------------------------------------ 3 The term hash link (unhyphenated) refers to the process of associating information this way, or the "association" as an abstract concept. 7.3 4 possible hash-items, the hash-address computed from item may already 5 contain a hash-link. If this link is from item, the new hash-value simply replaces the old hash-value. Otherwise, another hash-address (in the same hash-array) must be computed, etc, until an empty cell is 6 found, or a cell containing a hash-link from item. When a hash link for item is being retrieved, the hash-address is computed using the same algorithm as that employed for making the hash link. If the corresponding cell is empty, there is no hash link for item. If it contains a hash-link from item, the hash-value is returned. 7 Otherwise, another hash-address must be computed, and so forth. Note that more than one hash link can be associated with a given hash-item by using more than one hash-array. Hash Link Functions In the description of the functions below, the argument array has one of three forms: [1] NIL, in which case the hash-array provided by the 8 system, syshasharray, is used; [2] a hash-array created by the function harray; or [3] a list car of which is a hash-array. The latter form is used for specifying what is to be done on overflow, as described below. harray[n] creates a hash-array of size n, equivalent to clrhash[array[n]]. ------------------------------------------------------------------------ 4 which is the total number of Interlisp pointers, i.e. in Interlisp- 10, 256K. 5 eq is used for comparing item with the hash-item in the cell. 6 After a certain number of iterations (the exact algorithm is complicated), the hash-array is considered to be full, and the array is either enlarged, or an error is generated, as described below in the discussion of overflow. 7 For reasonable operation, the hash array should be ten to twenty percent larger than the maximum number of hash links to be made to it. 8 syshasharray is not used by the system, it is provided solely for the user's benefit. It is initially 512 words large, and is automatically enlarged by 50% whenever it is "full". See page 7.6. 7.4 clrhash[array] sets all elements of array to 0 and sets left half of first word of header to -1. Value is array. puthash[item;val;array] puts into array a hash-link from item to val. Replaces previous link from same item, if any. If val=NIL any old link is removed, (hence a hash-value of NIL is not allowed). Value is val. gethash[item;array] finds hash-link from item in array, and returns the hash-value. Value is NIL, if no link exists. gethash compiles open. Note that gethash makes no legality checks on either argument. rehash[oldar;newar] hashes all items and values in oldar into newar. The two arrays do not have to be (and usually aren't) the same size. Value is newar. maphash[array;maphfn] maphfn is a function of two arguments. For each hash-link in array, maphfn will be applied to the hash-value and hash-item, e.g. maphash[a;(LAMBDA(X Y) (AND(LISTP Y) (PRINT X)))] will print the hash-value for all hash-links from lists. The value of maphash is array. dmphash[arrayname] Nlambda-nospread that prints on the primary output file a loadable form which will restore what is in the hash-array specified by arrayname, e.g. (E (DMPHASH SYSHASHARRAY)) as a prettydef command will dump the system hash-array. Note: all eq identities except atoms and small integers are lost by dumping and loading because read will create new structure for each item. Thus if two lists contain an eq substructure, when they are dumped and loaded back in, the corresponding substructures while equal 9 are no longer eq. ------------------------------------------------------------------------ 9 circlprint and circlmaker (Section 21) provide a way of dumping and reloading structures containing eq substructures so that these identities are preserved. 7.5 Hash Overflow By using an array argument of a special form, the user can provide for automatic enlargement of a hash-array when it overflows, i.e., is full and an attempt is made to store a hash link into it. The array argument is either of the form [1] (hash-array . n), n a positive integer; [2] (hash-array . f), f a floating point number; [3] (hash-array); or [4] (hash-array . fn), fn a function name lambda expression. In the first case, a new hash-array is created with n more cells than the current hash-array. In the second case, the new hash array will be f times the size of the current hash-array. The third case, (hash-array), is equivalent to (hash-array . 1.5). In the fourth case, (hash- array . fn), fn is called with (hash-array . fn) as its argument. If fn returns a number, the number will be the size of the new hash array. Otherwise, the new size defaults to 1.5 times the size of the old hash array, e.g. fn could be used to print a message, or perform some monitor function. In each case, the new hash-array is rplacaed into the dotted pair, and the computation continues. If a hash-array overflows, and the array argument used was not one of these three forms, the error HASH TABLE FULL is generated, which will either cause a break or unwind to the last errorset, as per treatment of errors described in Section 16. The system hash array, syshasharray, is automatically enlarged by 1.5 when it is full. 7.6