Arc Forum | Newbie questions about list operations

Arc Forum

Newbie questions about list operations

3 points by map 6485 days ago | 16 comments

How would these list operations be done in Arc?

NewLisp has two ways to get the nth element of a list:

  > (set 'lst '(zero one two three four five six))
  (zero one two three four five six)
  > (nth (lst 5))
  five
  > (lst 5)
  five

Is this the best way to do it in Arc?

  (car (nthcdr 5 lst))

To tack on to the end of the list:

  > (push 'seven lst -1)
  seven
  > lst
  (zero one two three four five six seven)

The manual states:

"Repeatedly using push to the end of a list using -1 as the int-index is optimized and as fast as pushing to the front of a list with no index at all. This can be used to efficiently grow a list."

By default, push adds at the beginning of the list:

  > (push 'xxx lst)
  xxx
  > lst
  (xxx zero one two three four five six seven)

To pop from the end of the list:

  > (pop lst -1)
  seven
  > lst
  (xxx zero one two three four five six)

5 points by absz 6485 days ago | link

The nth element of a list is best obtained by (lst n), e.g. (lst 5). Also note that assignment is best written =, not set. As far as I know, there's no library function to push or pop from the end of the list, but (though it's probably unintentional) (++ lst '(seven)) will append the list (seven) to lst, doing what you want. However, here are (untested) pushend and popend functions:

  (mac pushend (elem lst)
    `(++ ,lst (list ,elem)))
  
  (mac popend (lst)
    `(do1
       (last ,lst)
       (zap [cut _ 0 -1] ,lst)))

I hope that clears some things up for you.

-----

4 points by almkglor 6485 days ago | link

Just as an efficiency warning, the above will be reasonably slow and will actually keep reallocating space for the entire list at each pushend/popend.

Generally it's easier to just push everything, then do (zap rev your-variable) at the end.

-----

1 point by absz 6485 days ago | link

Hmm, hadn't thought about that. Good point and you're almost certainly right; I did dash them off in five minutes for this post. I'd like to benchmark the two to be sure, though.

-----

3 points by almkglor 6485 days ago | link

Try building lists that are a reasonably large fraction of your gc-able space; push-rev creates 2N storage, while the [+ lst (list _)] creates about N^2/2 storage.

-----

2 points by absz 6485 days ago | link

Right, analysis pf algorithms--forgot about that :P In all seriousness, though, good point. The time complexity should be the same for both cases: push is O(1), and rev is O(n), so push-rev is O(n); + is O(n), so push-+ is also O(n). (Warning: IANACS (in a few years, I hope), so my analysis could be woefully, laughably wrong.)

Why O(n^2) memory, though? + will reallocate the memory for the list, so that will double the memory requirements; I can't figure out where the squaring is coming from.

-----

3 points by almkglor 6485 days ago | link

let's say we create a list of N elements.

First, we create an empty list (0 elements).

Then, we add an element. + creates a new list of 1 element.

Then we add an element, we discard the previous list, and + creates a list of 2 elements. We've created a total of 3 cons cells (1 discarded)

Then we add an element, we discard the previous list, and + creates a list of 3 elements. We've created a total of 6 cons cells (3 discarded).

Basically this is a summation of an arithmetic progression. Memory used isn't exactly n^2, it's a formula ((n^2 + n)/2 or some such, IIRC), basically equal to sum of i from 1 to n; since the most significant component is the n^2, it's O(n^2).

edit: even processing time complexity is not the same, either - item copy operations still take O(1) time. rev makes O(n) copy operations and + does too, but push-rev does rev only once, while pushend does + at each push.

-----

2 points by absz 6485 days ago | link

OK, I see what you mean. I was referring to the behaviour of pushend on one element (as opposed to using it to create a list in general), which is linear; of course, using pushend to build up a list of n elements has n * O(n) = O(n^2) memory usage. If push-rev were simply

  (def push-rev (elem lst)
    (rev (push elem (rev lst))))

, then a sequence of those would have the same problem: n things that require O(n) memory. But if you're intelligent about it, and reverse your list, push on all the elements, and then reverse again, you come out with O(n) space usage.

(Oh, and the formula is n(n+1)/2.)

Ah well, reading about CS in one's spare time is one thing, but actual study is another. I'm off to study all this next year, so I'll hopefully have a better grasp on it soon :) Thanks for the help.

-----

1 point by map 6485 days ago | link

It's good to see that (lst n) works in Arc; you can't get any more concise than that!

Perhaps NewLisp's fast (push 'seven lst -1) will be added to Arc someday.

-----

3 points by absz 6485 days ago | link

If you want to get a feel for Arc, I recommend either reading or skimming the tutorial: http://ycombinator.com/arc/tut.txt . It covers the basics of Arc and its idioms.

I have three objections to (push obj lst -1), however. First, it's not pushing (since that only refers to the top of the list). Second, the last argument is meaningless: -1? Why that? Can I do (push obj lst 2)? (push obj lst -4)? And third, pushing on the end can't be as efficient: lists don't know their last element. If we push 'a onto '(b c d), we create a list whose car is 'a and whose cdr is '(b c d); if we shove 'a onto the end of '(b c d), we have to iterate until we find the element whose car is 'd and whose cdr is nil, and then set its cdr to '(a). That will take more time, and there's no way around that, so it's not fast, per se.

-----

3 points by map 6484 days ago | link

--- quote

Can I do (push obj lst 2)? (push obj lst -4)?

--- /quote

Yes, NewLisp:

  > (set 'x '(2 4 6))
  (2 4 6)
  > (push 3 x 1)
  3
  > x
  (2 3 4 6)
  > (push 5 x -2)
  5
  > x
  (2 3 4 5 6)

The next test indicates that appending to the end of a list is just as fast as pushing to the beginning of a list (times are in milliseconds).

Add at the beginning of the list:

  > (set 'x '(2))
  (2)
  > (time (push 8 x) 99000)
  40
  > (x 0)
  8
  > (x -1)
  2

Add at the end of the list:

  > (set 'x '(2))
  (2)
  > (time (push 8 x -1) 99000)
  40
  > (x -1)
  8
  > (length x)
  99001

Don't ask me how Lutz did it. (Maybe he keeps a pointer to the end of the list?) I think that lists in NewLisp aren't quite the same as lists in Common Lisp. Lutz wrote:

The concept of a 'cons cell' is about details of early LISP implementations (CPU registers, etc). newLISP also has a basic cell structure which can be linked to a list but has no concept of a dotted pair. When have you last seen a dotted pair used in a real world program? newLISP has 'cons' but it works differently,

By the way, it would be nice if Arc let you get the last element with -1, as NewLisp does:

  arc> ('(2 3 4) -1)
  Error: "list-ref: expects type <non-negative exact
  integer> as 2nd argument, given: -1;
  other arguments were: (2 3 4 . nil)"

-----

2 points by absz 6484 days ago | link

Huh. That (in my humble opinion) is a terribly misnamed function (push already has a specific meaning; what about ins, for insert, which is what it actually does?), but certainly a very useful one.

I'd be curious to see what (time (push 8 x 4)) looks like: is that also constant-time? What sort of data structure is newLISP using, if not a linked list? Linked list + end pointer, perhaps--but wouldn't that break on a push? Ah well. I (unfortunately) don't really have time to construct an in-depth analysis, but I'm still curious.

I'm not sure why the Anarki won't let you do ('(2 3 4) -1); it would indeed be useful. Mayhap there are efficiency concerns? Negative indexing works in cut.

-----

3 points by map 6484 days ago | link

In Ruby, push appends to a list; unshift prepends:

  E:\>irb --prompt xmp
  lst = [3,4,5]
      ==>[3, 4, 5]
  lst.push( 6 )
      ==>[3, 4, 5, 6]
  lst.unshift( 2 )
      ==>[2, 3, 4, 5, 6]
  lst.pop
      ==>6
  lst.shift
      ==>2
  lst
      ==>[3, 4, 5]
  lst << 6
      ==>[3, 4, 5, 6]
  lst += [7]
      ==>[3, 4, 5, 6, 7]
  lst
      ==>[3, 4, 5, 6, 7]

NewLisp can insert in the midst of the list just as fast.

  > (set 'x '(0 1 2 3 4 5 6))
  (0 1 2 3 4 5 6)
  > (time (push 8 x 4) 99000)
  50
  > (set 'x '(0 1 2 3 4 5 6))
  (0 1 2 3 4 5 6)
  > (time (push 8 x 4) 99000)
  40
  > (set 'x '(2))
  (2)
  > (time (push 8 x) 99000)
  40

Ruby has the opposite problem from conventional Lisp: it appends quickly, but prepends slowly:

  t = Time.now; a=[2]; 99_000.times{ a.push( 8 ) }; p Time.now - t
  0.14
      ==>nil
  t = Time.now; a=[2]; 99_000.times{ a.unshift( 8 ) }; p Time.now - t
  70.902
      ==>nil
  t = Time.now; a=[0,1,2,3,4]; 99_000.times{ a.insert(2, 8) }; p Time.now - t
  73.346

That's 73 seconds! NewLisp can insert in a list 1800 times as fast as Ruby can!

-----

1 point by absz 6484 days ago | link

Right. In Ruby, pushing, unshifting, and spliceing are all different operations, as they should be. Meaningful names are important.

It's not quite as fast to insert in the middle, but much faster than it should be. This would, to me, imply that newLISP isn't even using linked lists at all, but is instead using some form of array, which is decidedly unLispy. If that is the case (if, I say), then of course this will only work with "one reference only", as you can't construct lists which share cdrs; "one reference only" being an awful idea, I would consider the speed a fair tradeoff.

-----

4 points by almkglor 6485 days ago | link

I assume that NewLisp can transform looped code of the form:

  (while (something)
    (push obj lst -1))

to:

  (let tl (lastcdr lst)
    (while (something)
      (= (cdr tl) (cons obj nil))
      (= tl (cdr tl))))

-----

1 point by gaah 6480 days ago | link

Have you never seen a good implementation of singly-linked lists? Because accessing the last element is a relatively common operation, a decent implementation will keep a pointer to the last element. Because it's singly-linked, the last element is the only one that pointer is any good for, so (lst -2) is still O(n), but (lst -1) becomes O(1).

The weird thing is, I actually found this good idea in an APCS review book.

-----

3 points by sacado 6484 days ago | link

"It's good to see that (lst n) works in Arc; you can't get any more concise than that!"

Yes, you can be more concise : lst.n This is shorter (in number of chars), as a consequence it can be done with Arc :)

-----