Arc Forumnew | comments | leaders | submitlogin
2 points by almkglor 5725 days ago | link | parent

I made a suggested implementation of lists as unrolled lists which preserves the use of 'scdr on the Arc Forum a long time ago. . Basically it would be effectively arrays.

1 point by almkglor 5725 days ago | link

The problem with the proposed implementation however is that the "cons pointer" is a data structure of at least two cells (a pointer to the root of the underlying array, which includes a pointer to the cdr-table somewhere, and a pointer to the actual entry within the array).

However recently I thought of a method by which all that is necessary would be a (probably tagged) pointer to the actual entry within the array. This would require tagged pointers (i.e. no Boehm-Weiser!).

Basically we would define a tagged pointer type which, when found in a list-as-array, would not be a valid 'car of that entry, but rather would be a pointer to an object containing the real 'car and 'cdr for that object. It might be possible to also use a tagged pointer (potentially the same tag, but say with a null pointer) to denote the end of a list.

Let's call this invalid tag a "NON_CAR_TAG", and let's call the tag for a pointer to an array-as-list-cons-cell a "CONS_ARRAY"

Basically a (list 1 2 3 4) would be represented with the array:

  [0] INTEGER(1) // tagged with an INTEGER
  [1] INTEGER(2)
  [2] INTEGER(3)
  [3] INTEGER(4)
  [4] NON_CAR_TAG( NULL ) // null pointer
A 'cons object would either be a (potentially nontagged) pointer to a real cons cell, or a CONS_ARRAY() tagged pointer to an entry in an array such as the above.

Now suppose we have the operation (= foo (list 1 2 3 4)). 'foo would then contain a CONS_ARRAY() tagged pointer to the above. Suppose the pointer to the above array is 0xDEADBEE0. 'foo would then be CONS_ARRAY(0xDEADBEE0).

Now suppose that a cell is exactly 16 bytes (just an example for easier reasoning). So (cdr foo) would be CONS_ARRAY(0xDEADBEF0), (cdr:cdr foo) would be CONS_ARRAY(0xDEADBF00), etc. Basically, 'cdr would first check if the input is a CONS_ARRAY() tagged pointer, and just add 0x10 (or the cell size). If the resulting address points to an entry that happens to be NON_CAR_TAG(NULL), it should return a NILOBJ, otherwise it returns the result of the addition.

Now, suppose we then do (scdr (cdr foo) 42). 'scdr should first detect if the the pointer it is given is a CONS_ARRAY() tagged pointer. If so, it determines if the value being pointed to is a NON_CAR_TAG() or not. If it's a NON_CAR_TAG(), it gets the underlying cons cell pointed to by the NON_CAR_TAG and modifies that. Otherwise, it allocates a new cons cell, populates it with the existing 'car and the new 'cdr, tags the cons cell pointer with NON_CAR_TAG(), and replaces the entry:

                   +---------> CONS: A INTEGER(2)
  [0] INTEGER(1)   |                 D INTEGER(42)
  [1] NON_CAR_TAG( * )
  [2] INTEGER(3)
  [3] INTEGER(4)
Note however that it has a drawback that l.index is still O(N) T.T