Arc Forumnew | comments | leaders | submitlogin
1 point by stefano 5726 days ago | link | parent

We could give only lists to the user, and implement them internally as arrays when they are frequently used with indexed access and with lists when they are mainly used with car/cdr. Not an easy thing to implement, though.

2 points by almkglor 5725 days ago | link

I made a suggested implementation of lists as unrolled lists which preserves the use of 'scdr on the Arc Forum a long time ago. . Basically it would be effectively arrays.


1 point by almkglor 5725 days ago | link

The problem with the proposed implementation however is that the "cons pointer" is a data structure of at least two cells (a pointer to the root of the underlying array, which includes a pointer to the cdr-table somewhere, and a pointer to the actual entry within the array).

However recently I thought of a method by which all that is necessary would be a (probably tagged) pointer to the actual entry within the array. This would require tagged pointers (i.e. no Boehm-Weiser!).

Basically we would define a tagged pointer type which, when found in a list-as-array, would not be a valid 'car of that entry, but rather would be a pointer to an object containing the real 'car and 'cdr for that object. It might be possible to also use a tagged pointer (potentially the same tag, but say with a null pointer) to denote the end of a list.

Let's call this invalid tag a "NON_CAR_TAG", and let's call the tag for a pointer to an array-as-list-cons-cell a "CONS_ARRAY"

Basically a (list 1 2 3 4) would be represented with the array:

  [0] INTEGER(1) // tagged with an INTEGER
  [1] INTEGER(2)
  [2] INTEGER(3)
  [3] INTEGER(4)
  [4] NON_CAR_TAG( NULL ) // null pointer
A 'cons object would either be a (potentially nontagged) pointer to a real cons cell, or a CONS_ARRAY() tagged pointer to an entry in an array such as the above.

Now suppose we have the operation (= foo (list 1 2 3 4)). 'foo would then contain a CONS_ARRAY() tagged pointer to the above. Suppose the pointer to the above array is 0xDEADBEE0. 'foo would then be CONS_ARRAY(0xDEADBEE0).

Now suppose that a cell is exactly 16 bytes (just an example for easier reasoning). So (cdr foo) would be CONS_ARRAY(0xDEADBEF0), (cdr:cdr foo) would be CONS_ARRAY(0xDEADBF00), etc. Basically, 'cdr would first check if the input is a CONS_ARRAY() tagged pointer, and just add 0x10 (or the cell size). If the resulting address points to an entry that happens to be NON_CAR_TAG(NULL), it should return a NILOBJ, otherwise it returns the result of the addition.

Now, suppose we then do (scdr (cdr foo) 42). 'scdr should first detect if the the pointer it is given is a CONS_ARRAY() tagged pointer. If so, it determines if the value being pointed to is a NON_CAR_TAG() or not. If it's a NON_CAR_TAG(), it gets the underlying cons cell pointed to by the NON_CAR_TAG and modifies that. Otherwise, it allocates a new cons cell, populates it with the existing 'car and the new 'cdr, tags the cons cell pointer with NON_CAR_TAG(), and replaces the entry:

                   +---------> CONS: A INTEGER(2)
  [0] INTEGER(1)   |                 D INTEGER(42)
  [1] NON_CAR_TAG( * )
  [2] INTEGER(3)
  [3] INTEGER(4)
Note however that it has a drawback that l.index is still O(N) T.T