Arc Forum | On a slightly different topic, I find '+ vs 'join really weird. They are only su...

Arc Forum

3 points by zck 4645 days ago | link | parent

On a slightly different topic, I find '+ vs 'join really weird. They are only superficially similar; '+ actually reaches inside its arguments to muck with the cons cells:

  arc> (def check-cars (join-fun)
            (let x (list (list 1))
                 (is (car (join-fun x))
                     (car x))))
  #<procedure: check-cars>

  arc> (check-cars +)
  nil

  arc> (check-cars join)
  t

(originally written up by me at http://zck.me/golf). 'mappend uses '+, so it's also weird. This behavior seems insane to me, and was the cause of some hard-to-track-down bugs in the code at the above link.

2 points by Pauan 4645 days ago | link

Interesting... doing that test in Arc/Nu returns "t" in both cases. I wasn't even aware of that issue but apparently "fixed" it accidentally.

By comparing Arc to Arc/Nu, it seems that the problem is that Arc returns a new list that is 'nil terminated, whereas Arc/Nu uses () for 'nil, so it doesn't have to do any conversion.

-----

1 point by akkartik 4644 days ago | link

It took me a while to understand why this should be. cartesian calls mem which calls testify, and testify uses is.

  (def testify (x)
    (if (isa x 'fn) x [is _ x]))

I think it should use iso. Can y'all think of any reasons that's a bad idea?

If we make that change, I'm not too concerned about whether + is like join. A caller of either shouldn't typically care whether they create new conses or not.

-----

2 points by rocketnia 4644 days ago | link

"I think it should use iso ."

Of course! A quick Google search for [site:arclanguage.org testify iso] turns up a few places Arc Forumgoers have talked about this before:

http://arclanguage.org/item?id=9183 (aw)

http://arclanguage.org/item?id=13678 (me)

http://arclanguage.org/item?id=13267 (me and you)

These were just my first three search results. :)

---

"A caller of either shouldn't typically care whether they create new conses or not."

Does anyone mutate association lists? It might matter there.

-----

1 point by akkartik 4644 days ago | link

There'll totally be places where it matters whether something conses or not. But that should reflect in the (longer) name, IMO. cons-minimal-append, mayhap?

-----

2 points by akkartik 4644 days ago | link

OMG, why didn't I fix anarki? Doing so now.

-----

2 points by zck 4643 days ago | link

It seems ugly that calling '+ changes the elements inside its arguments. Even if 'testify were changed, I still think '+ shouldn't do that.

-----

2 points by Pauan 4643 days ago | link

Yeah, definitely. The reason for the bug is because the Arc compiler uses Racket's "append" to do the join, which means it first needs to convert the Arc list to a Racket list, do the append, then convert back to an Arc list.

So there's two ways to fix this:

1) Change Arc to be like Arc/Nu, so that it doesn't need to do the conversion. I don't expect this to be too hard.

2) Write + in Arc, and have it call "join" when the first argument is a list. Then there's no discrepency between the two.

I actually dislike how the overloaded + is written in Arc, so I think having it call join instead would be great.

-----

1 point by akkartik 4643 days ago | link

Yeah I don't disagree. I just found it more fundamental to ask why it should be so subtle to debug.

Also, if the language used immutable conses then every cons would indeed have to change, so there's some rationale for keeping it purely functional.

I do totally agree that the two should be consistent. That sucks a lot.

-----

3 points by zck 4643 days ago | link

If the language used immutable conses, it still shouldn't have to change every cons. The issue here arises not because the list you pass in has different conses than the list returned, but because the elements of the list you pass in have different conses than the elements of the list returned.

Let's say we call (+ '((1) (2))). I'll draw the list this way:

  ( , . , )                   
    |   `-------> ( , . nil)  
    |               |         
    |               |         
    |               |         
    v               v         
   (1 . nil)       (2 . nil)

To my brain, there are two "kinds" of conses in the list: those that make up the top-level list itself, and those that make up the elements.

  top-level  ( , . , )                   
    conses     |   `-------> ( , . nil)  
               |               |         
               |               |         
               |               |         
   element     v               v         
    conses    (1 . nil)       (2 . nil)

And here's what I think should happen in the call to +

  these conses  ( , . , )
  can change      |   `-------> ( , . nil)
                  |               |
                  |               |
                  |               |
  these conses    v               v
  shouldn't      (1 . nil)       (2 . nil)

Similarly, if you call

  (+ (list (obj 1 2)))

Would you expect the hash table to have its elements copied into a new hash table? No, the same hashtable should be shared between the calling list and the returned list, even if the list itself has different conses. And indeed, this is the case (thanks again, evanrmurphy, for tryarc):

  arc> (def check-cars-obj (join-fun)
              (let x (list (obj 1 2))
                   (is (car (join-fun x))
                       (car x))))
  #<procedure: check-cars-obj>
  arc> (check-cars-obj +)
  t

-----

1 point by akkartik 4643 days ago | link

But the result should be:

  ( 1 . , )                   
        `-------> ( 2 . nil)

So the first of your element conses has to change too.

But you are right that not all the conses need to change. I finally understood Pauan's diagnosis, that it was arc copying nil vs () terminated lists. I've been against that 'feature' for a long time. My private fork just stays with () terminated lists: http://github.com/akkartik/arc

-----

2 points by zck 4643 days ago | link

That's neither '+ nor 'join :

  arc> (+ '((1) (2)))
  ((1) (2))
  arc> (join '((1) (2)))
  ((1) (2))

That's flatten:

  arc> (flat '((1) (2)))
  (1 2)

-----

1 point by akkartik 4642 days ago | link

Sorry, I think we've both managed to confuse each other. + and join on a single arg is just the identity function, right? Aren't these the cases we care about?

  arc> (+ '(1 2) '(3 4))
  (1 2 3 4)

My argument was that in the presence of mutable conses all the conses would have to change. After your response I'm correcting that to "all the outer conses (except in the last arg)".

Does this seem right? I was reading your previous comment without one wrapping set of parens.

-----

2 points by zck 4642 days ago | link

Yes, the outer conses all have to change, except for the last one. But the inner conses don't, and they are changed right now. That, specifically, is my problem with '+ . If you call (+ '(((((((1))))))) '(((((((2)))))))) , you shouldn't have to change fifteen conses, just one or two.

-----

4 points by Pauan 4642 days ago | link

akkartik: "I finally understood Pauan's diagnosis, that it was arc copying nil vs () terminated lists. I've been against that 'feature' for a long time."

Yes, but Arc/Nu manages to implement that Arc feature without conversion, so I don't consider it a problem with the feature, but a problem with the implementation.

---

akkartik: "+ and join on a single arg is just the identity function, right?"

Yes, but Arc does the conversion even if you pass in a single argument. So another possible solution to the problem is to change Arc to return the first argument unchanged when passed in a single argument.

---

zck: "If you call (+ '(((((((1))))))) '(((((((2)))))))) , you shouldn't have to change fifteen conses, just one or two."

Yes, you shouldn't have to. Arc uses ar-nil-terminate and ac-niltree to do the Arc->Racket->Arc conversion. ar-nil-terminate does a shallow copy, but ac-niltree does a deep copy. Simply changing ac-niltree wouldn't work, because I'm pretty sure other stuff depends on the deep copy.

However, the Arc compiler could be changed so that + uses a shallow version of ac-niltree. That's an interesting idea: how much stuff in the Arc compiler is currently using the deep ac-niltree when it could instead use a shallow ac-niltree?

-----

3 points by rocketnia 4642 days ago | link

"Arc uses ar-nil-terminate and ac-niltree to do the Arc->Racket->Arc conversion. ar-nil-terminate does a shallow copy, but ac-niltree does a deep copy.

However, the Arc compiler could be changed so that + uses a shallow version of ac-niltree."

That's the essence of the bug, as I see it. This fix is much shallower than the other fixes discussed, so this fix would make the most sense in a minimally updated variant of pg's Arc.

Should Anarki go for shallow or deep improvement? I've advocated shallow in the past, but now I'm thinking Arc oughta follow through on its promise to "break all your code," which of course means the entire network of hacks, libraries, help resources, and alternate language implementations we've built up so far. It would be nice to see Anarki become a fork of Arc/Nu and undergo deep improvements, without losing the Arc flavor.

-----

1 point by akkartik 4642 days ago | link

I agree. I feel uneasy about the shallow change to +; I'm sure the same bug exists in some other functions since deep niltree is the default.

(Confusing that you're using deep/shallow with two meanings in the same comment :)

-----

4 points by rocketnia 4641 days ago | link

"I feel uneasy about the shallow change to +; I'm sure the same bug exists in some other functions since deep niltree is the default."

From what I can see, there are only three places in the pg-Arc code where 'ac-niltree traverses too far, and they're all in ac.scm. Two are the definitions of + and ar-+2, and one is a misleading comment:

  ; Arc primitives written in Scheme should look like:
  
  ; (xdef foo (lambda (lst)
  ;           (ac-niltree (scheme-foo (ar-nil-terminate lst)))))
  
  ; That is, Arc lists are NIL-terminated. When calling a Scheme
  ; function that treats an argument as a list, call ar-nil-terminate
  ; to change NIL to '(). When returning any data created by Scheme
  ; to Arc, call ac-niltree to turn all '() into NIL.
  ; (hash-table-get doesn't use its argument as a list, so it doesn't
  ; need ar-nil-terminate).

From another point of view, there are only a few places where 'ac-niltree probably needs to be recursive. Those are in the definitions of 'ac-call, 'ac-mac-call, and 'ac-macex, where they deal with quotation and macroexpansion, the two ways literal code is made available to Arc programs.

The other uses of 'ac-niltree are in 'ar-coerce (string to cons), 'dir, and 'timedate, where the lists are flat anyway.

I only looked for uses in pg-Arc, not any code that's been derived from it.

https://github.com/nex3/arc/blob/official/ac.scm

-----

2 points by rocketnia 4641 days ago | link

"Confusing that you're using deep/shallow with two meanings in the same comment :)"

Whoops! My comment was originally going to stand on its own, but I adapted it into a reply when Pauan got to what I wanted to say first. :) I didn't notice the word overlap.

-----

1 point by akkartik 4642 days ago | link

Arc/Nu manages to implement that Arc feature without conversion

Hmm, can you elaborate on how it manages this? Is it just printing nil, but otherwise leaving the actual cons cells ()-terminated? (That option has probably been discussed multiple times: http://arclanguage.org/item?id=3094. I'm sure I mentioned it too at some point.)

-----

2 points by Pauan 4642 days ago | link

"Hmm, can you elaborate on how it manages this? Is it just printing nil, but otherwise leaving the actual cons cells ()-terminated? (That option has probably been discussed multiple times: http://arclanguage.org/item?id=3094. I'm sure I mentioned it too at some point.)"

Yes. And then "nil" is a global variable that is bound to (). I also have to do a few other things like changing quote so that it changes the symbol "nil" to (), but overall it isn't that hard to have perfect Arc compatibility, even without doing the conversion.

-----

2 points by Pauan 4641 days ago | link

In case you're curious, in order to accomodate () as 'nil, Arc/Nu had to change the following Arc functions: coerce, type, disp, write, quote, sread

And the places in the Arc/Nu source that treat () as 'nil:

https://github.com/Pauan/ar/blob/c8dea3f2d3a4f343ec633c81437...

16 lines total. Probably less than it would have taken to do the 'nil -> () conversion. But Arc can't tell the difference, unless I've overlooked something.

-----

1 point by akkartik 4642 days ago | link

ar-nil-terminate does a shallow copy, but ac-niltree does a deep copy.

Insane. I'm bummed I haven't noticed this before :( Has it come up in the past?

---

I also just noticed: It is _utterly_ insane that niltree converts () to nil, while nil-terminate does the opposite.

-----

1 point by Pauan 4642 days ago | link

"I also just noticed: It is _utterly_ insane that niltree converts () to nil, while nil-terminate does the opposite."

It's called "tree" because it recurses on the car. In other words, it traverses sublists. As for why it's called "nil", well, I guess that's because you can't use () in the name without using || quotes. I'd call them "nil->null" and "null->nil". Or perhaps "arc->racket" and "racket->arc".

-----

1 point by akkartik 4642 days ago | link

I've renamed ar-nil-terminate: http://github.com/nex3/arc/commit/c7a27e0157

-----