Arc Forum
1 point by akkartik 4928 days ago | link | parent

#2 is the old foldl vs foldr conundrum[1]: http://en.wikipedia.org/wiki/Fold_%28higher-order_function%2.... I think it's irrelevant which fold you use for best because the answer will only diverge if all the elements in the list are 'equal' according to the metric. Your example is like that.

Then again, you could argue that the original behavior is more 'stable': it's like sorting the list and taking the first element.

[1] Rich Hickey recently called fold/reduce 'complex' because of the variants. http://arclanguage.org/item?id=15269; I didn't expect to be able to cite it so soon :)



2 points by rocketnia 4927 days ago | link

"I think it's irrelevant which fold you use for best because the answer will only diverge if all the elements in the list are 'equal' according to the metric."

If anyone called 'best on an infinite list, I'd expect it to diverge regardless of how it was implemented. You never know if something's the maximal/minimal element unless you've checked all the others.

(If 'best operated on heaps or something, then you might be able to say "this potentially infinite substructure has no better candidates" because the heap structure makes guarantees like that.)
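To make the heap idea concrete, here's a sketch in Python rather than Arc (the `best_of_heap` name is made up for illustration): with a binary min-heap, the heap invariant guarantees no element is smaller than the root, so a "best under less-than" query terminates without visiting the rest of the structure.

```python
import heapq

def best_of_heap(heap):
    """Return the minimal element of a heapified list, or None if empty.

    The min-heap invariant (no child is smaller than its parent) is what
    lets us answer without checking every element."""
    return heap[0] if heap else None

items = [9, 4, 7, 1, 8]
heapq.heapify(items)          # establish the min-heap invariant in place
print(best_of_heap(items))    # -> 1
```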

Anyway, #2 is still irrelevant because it's trivial to fix:

   (def best (f seq)
     (reduce (fn (a x)
  -            (if (f a x) a x))
  +            (if (f x a) x a))
             seq))

Even if we used 'rreduce instead of 'reduce here, this fix would still be necessary (if we consider #2 important to fix at all).
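For anyone who wants to see what the argument swap actually changes, here's the same pair of definitions transliterated into Python (names `best_orig`/`best_fixed` are mine, not from the thread). With a strict comparator the two versions agree except on "tied" elements: the original keeps the later one, the fix keeps the earlier one.

```python
from functools import reduce

def best_orig(f, seq):
    # original argument order: keep the accumulator a when (f a x) holds
    return reduce(lambda a, x: a if f(a, x) else x, seq)

def best_fixed(f, seq):
    # fixed argument order: replace a with x only when (f x a) holds
    return reduce(lambda a, x: x if f(x, a) else a, seq)

longer = lambda a, b: len(a) > len(b)
print(best_orig(longer, ["spam", "eggs", "bacon"]))   # -> 'bacon'
print(best_fixed(longer, ["spam", "eggs", "bacon"]))  # -> 'bacon'
# "spam" and "eggs" tie, so only the tie-breaking differs:
print(best_orig(longer, ["spam", "eggs"]))   # -> 'eggs'  (keeps later)
print(best_fixed(longer, ["spam", "eggs"]))  # -> 'spam'  (keeps earlier)
```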

Here's a fix for #1 as well, along with a return to the original variable names:

   (def best (f seq)
     (when seq
       (reduce (fn (wins elt)
                 (if (f elt wins) elt wins))
               seq)))

This still leaves one inconsistency with the original: this version behaves more intuitively in the presence of continuations, because it doesn't use mutation.

-----

2 points by rocketnia 4927 days ago | link

"If anyone called 'best on an infinite list, I'd expect it to diverge regardless of how it was implemented."

Oops, I didn't consider that 'best in Arc actually does operate on a data structure that has enough information to tell you whether an infinite substructure can contain a better candidate: if any tail of the list is identical (by 'is) to one of the tails you've already encountered, the list is infinite, but it certainly has no better candidates, since you've already seen them all. All infinite lists in Arc have tails that contain themselves as tails, so it's possible to make an Arc 'best that terminates on any infinite list.
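Here's a sketch of that idea in Python rather than Arc (the name `best_cyclic` is mine): represent an Arc-style list as cons cells, where an "infinite" list is a cycle, and track which tails we've already seen by identity. Once a tail repeats, every element has been examined, so the fold can stop.

```python
def best_cyclic(f, seq):
    """'best over a cons-cell list (two-element Python lists), terminating
    even on cyclic ("infinite") lists by remembering seen tails.

    Note: an empty list and a winning element of None are conflated here;
    fine for a sketch."""
    seen = set()
    wins = None
    cell = seq
    while cell is not None and id(cell) not in seen:
        seen.add(id(cell))       # remember this tail by identity
        elt, cell = cell
        if wins is None or f(elt, wins):
            wins = elt
    return wins

# Build the "infinite" list (1 2 3 1 2 3 ...): the tail after 3 loops back.
cycle = [1, None]
cycle[1] = [2, [3, cycle]]
print(best_cyclic(lambda a, b: a > b, cycle))  # -> 3
```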

-----

1 point by rocketnia 4927 days ago | link

"Rich Hickey recently called fold/reduce 'complex' because of the variants."

Because of the variants? The only mention of fold I found was this one:

Rich Hickey (at 30:33): "Imperative loops--fold, even, which seems kinda higher-level? Still has some implications that tie two things together... whereas set functions are simple."

I thought he was referring to the way the calls to the folded function are all lined up in a dependency chain, just like they'd be lined up if you were using an iterative loop.

  (state before the iteration, input to this iteration)
    -> state after the iteration

To illustrate, in a procedural language, it's easy and efficient to define a left fold as an iterative loop:

  (def foldl (first rest func)
    (ret result first
      (each elem rest
        (zap func result elem))))

This is the same as the 'best change we're talking about. But since Arc's continuations make more sense when there's no mutation going on, I recommend not implementing 'foldl and 'best this way.
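The Arc definition above transliterates directly into Python (keeping the same, slightly unusual, `first`/`rest`/`func` argument order): an iterative loop that repeatedly rebinds a single accumulator, which is exactly the (state before, input) -> (state after) dependency chain described earlier.

```python
def foldl(first, rest, func):
    """Left fold as an iterative loop, mirroring the Arc 'foldl above."""
    result = first
    for elem in rest:
        result = func(result, elem)   # each step depends on the last
    return result

print(foldl(0, [1, 2, 3, 4], lambda a, b: a + b))  # -> 10
print(foldl([], [1, 2, 3], lambda a, b: [b] + a))  # -> [3, 2, 1]
```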

With "set functions," you can determine certain properties about the result if you know some subsets' results:

- The result of (some odd (join '(17) foo)) is t, because (some odd '(17)) is t.

- The result of (keep odd (join '(17) foo)) has at least the element 17 in it, because (keep odd '(17)) has that element in it.

Rich Hickey's main point is that these "simple to reason about" properties help humans. But they also mean that if we know enough about these function calls statically (which we never do, in Arc), we might be able to turn them into constants without knowing the value of 'foo, or at least we might be able to parallelize them. These set functions may be implemented as special cases of fold, but fold doesn't have the same reasoning potential in the general case.

Instead of saying "you can compile it better," Rich Hickey would probably spin this as "you can leave the 'how' up to the language implementor."
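A Python sketch of that "reasoning potential" (the name `chunked_max` and the data are mine, purely for illustration): an associative, commutative operation can be folded over independent chunks and combined in any order, and a subset's result can settle the whole answer without knowing the rest.

```python
from functools import reduce

def chunked_max(chunks):
    # Each chunk could be folded on a different thread or machine;
    # because max is associative and commutative, combining the partial
    # results in any order gives the same answer as one sequential fold.
    partials = [reduce(max, chunk) for chunk in chunks]
    return reduce(max, partials)

data = [3, 17, 4, 1, 9, 12]
print(chunked_max([data[:3], data[3:]]) == max(data))  # -> True

# The 'some-style property: one subset's result settles the whole,
# no matter what the (hypothetical) rest of the list contains.
odd = lambda n: n % 2 == 1
unknown_rest = [2, 4, 6]                   # could be anything
print(any(map(odd, [17] + unknown_rest)))  # -> True, because odd(17)
```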

-----

1 point by akkartik 4927 days ago | link

"The only mention of fold I found was.. 30:33"

Look around minute 40. He says they complect what to compute and how to compute it, which I took to mean the need to specify foldl/foldr.

The difference with your version is mostly just phrasing, though.

-----

1 point by rocketnia 4927 days ago | link

Ah, here we go:

"Loops and fold! Loops are pretty obviously complecting what you're doing and how to do it. Fold is a little bit more subtle, 'cause it seems like this nice 'somebody else is taking care of it,' but it does have this implication about the order of things. This left-to-right bit."

I can see how you'd get that. I think our interpretations are related. As far as how they're related goes...

The underlying issue may be that introducing non-commutativity makes an operation more cumbersome when used with sets, and introducing non-associativity makes it even worse.

- Commutative and associative: If you apply '* or 'max pairwise to a set, you can only get one result. (It's possible to implement 'some this way, and a 'keep that returns an unordered bag.)

- Associative: If you apply 'compose or 'join pairwise to a set, you can get many results, but you can choose between them by putting the items in a list. (It's possible to implement 'keep this way.)

- Commutative: If you apply XOR pairwise to a set, you can get many results, but you can choose between them by placing the items on the leaves of a rooted binary tree.

- Neither: If you apply 'cons pairwise to a set, you can get many results, but you can choose between them by placing them on the leaves of a rooted binary tree with a distinction between left children and right children.

Both 'foldl and 'foldr are specific enough to be deterministic when they're used with non-associative, non-commutative operations like 'cons. The fact that they're so specific is why there isn't just one 'fold.
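To see why each fold is deterministic on its own, here are both nestings in Python (using the conventional `func, init, seq` signature rather than Arc's): applied to a non-associative, non-commutative operation like cons, 'foldr and 'foldl pin down two different tree shapes, which is exactly why one fold can't stand in for the other.

```python
from functools import reduce

cons = lambda a, b: (a, b)   # non-associative, non-commutative

def foldr(func, init, seq):
    # right fold: nest from the right end inward
    return reduce(lambda acc, x: func(x, acc), reversed(seq), init)

def foldl(func, init, seq):
    # left fold: nest from the left end outward
    return reduce(func, seq, init)

print(foldr(cons, None, [1, 2, 3]))  # -> (1, (2, (3, None)))  a proper list
print(foldl(cons, None, [1, 2, 3]))  # -> (((None, 1), 2), 3)  left-nested
```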

-----

1 point by rocketnia 4926 days ago | link

Oops, XOR may be commutative, but it's also associative. XD Here's something that isn't:

  arc> (isnt 'zip (isnt 'nada 'nil))
  t
  arc> (isnt (isnt 'zip 'nada) 'nil)
  nil

Since 'isnt over the set { t, nil } is XOR, I jumped to conclusions.

Here's another:

  arc> (is 'zip (is 'nada 'nil))
  nil
  arc> (is (is 'zip 'nada) 'nil)
  t

I was intentionally avoiding 'is as an example, since it has an intuitive meaning when applied to a set (IMO), which is different from applying it pairwise.
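The 'is transcript can be replayed in Python by modeling t and nil as the symbols 't and 'nil, so the comparison's *result* can itself be compared, as in Arc (`iz` is my stand-in name for Arc's 'is). Equality used this way is not associative, while ordinary boolean XOR is:

```python
T, NIL = 't', 'nil'
iz = lambda a, b: T if a == b else NIL   # stand-in for Arc's 'is

# Non-associative: the two groupings disagree, as in the transcript.
print(iz('zip', iz('nada', 'nil')))   # -> 'nil'
print(iz(iz('zip', 'nada'), 'nil'))   # -> 't'

# Boolean XOR, by contrast, really is associative (and commutative):
for a in (True, False):
    for b in (True, False):
        for c in (True, False):
            assert (a ^ b) ^ c == a ^ (b ^ c)
```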

-----

2 points by thaddeus 4928 days ago | link

re: [1], ah yes... he also deemed 'inconsistency' to be 'complect', too. So, at the very least, arc code should be consistent if not simple. :) - lol.

-----

1 point by rocketnia 4927 days ago | link

He wasn't talking about "consistency" in the sense of the principle of least astonishment.

If two parts of the program observe the system at the same time but see different states, that's called inconsistency. Consistency is difficult to achieve in a distributed program, thanks to the fact that "observation" isn't necessarily instantaneous or reliable. Per the CAP theorem (http://en.wikipedia.org/wiki/CAP_theorem), to achieve full consistency in a distributed program you have to sacrifice availability or partition tolerance. These three properties are inherently complected.

A popular sacrifice is to only guarantee a sort of eventual consistency (http://en.wikipedia.org/wiki/Eventual_consistency). That is, for any given state of the system, given enough time to communicate, every part of the system will catch up to that state. It's like saying that a garbage-collected program eventually releases memory.
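As a toy illustration of that convergence (not modeled on any particular system; `Replica` and the message queue are invented for the sketch): replicas apply updates as messages arrive, so they can disagree while a message is in flight, but they agree again once the queue drains.

```python
class Replica:
    """A trivially simple replica: just a key-value store."""
    def __init__(self):
        self.state = {}
    def apply(self, key, value):
        self.state[key] = value

replicas = [Replica(), Replica()]
queue = [(1, 'x', 42)]   # update still awaiting delivery to replica 1

replicas[0].apply('x', 42)                     # replica 0 sees the write now
print(replicas[0].state == replicas[1].state)  # -> False (inconsistent)

for target, key, value in queue:               # "given enough time to communicate"
    replicas[target].apply(key, value)
print(replicas[0].state == replicas[1].state)  # -> True (converged)
```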

-----

1 point by thaddeus 4927 days ago | link

I think you're over-analyzing. Although "consistency" in the general sense was not the only thing he was talking about, it was obvious he was including it; he even dumbed it down to that point within his presentation.

Things that are consistent are simpler than things that are not.

And I do feel that way about arc code. Having half your functions account for nil while ignoring it for the other half would make coding in arc less simple than making sure they all accounted for it (as a general statement, without debating the specifics).

-----

1 point by rocketnia 4927 days ago | link

Sure, I agree he was talking about that kind of consistency too. I spoke too matter-of-factly for once. ^_^

I was more specifically referring to the part at 31:17 where he had Inconsistency vs. Consistency on the slide and he explained the complexity as "taking a set of things that stand apart and trying to think of them all at the same time." He finished that line of thought with "And anybody who's tried to use a system that's eventually consistent knows that."

(Note that he's talking about eventual consistency as being on the "Inconsistency" side, which I guess is because an eventually consistent system is often on the way to consistency rather than being there.)

-----