Arc Forum
2 points by rocketnia 5112 days ago | link | parent

In the iteration library I just linked to, https://github.com/rocketnia/lathe/blob/master/arc/iter.arc, my 'iter-with-yield is basically the same as your 'make-stream, and my 'yielder is basically the same as your 'stream.

I'm amazed at how short your definitions are, but I see two problems, not necessarily counting bugs: First, your generators can't yield nil without appearing to terminate. Second, it seems like the created iterators will take arguments the first time they're called and ignore them thereafter, which makes looping over them a bit more of a pain since the initial loop is treated differently from the others.

I have a suggestion we can both benefit from: Forget generator parameters altogether. While it's tempting to emulate Python and C# with a definition form like this,

  (gen foo (a b c)
    (each elem b
      yield.a yield.elem yield.c))
it's probably simpler to implement, just as useful, and just as flexible to have a parameterless syntax like this one:

  (def foo (a b c)
    (accum-later:each elem b
      yield.a yield.elem yield.c))
Incidentally, 'yield is a good default name for the 'accum variable. I almost always say (accum acc ...), and I've wanted to make an anaphoric version for a long time, but I've never been comfortable having two related names that look like they abbreviate the same word. :-p
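For what it's worth, the anaphoric version would be a one-liner. Here's a sketch, with 'ayield as a stand-in name:

  (mac ayield body
    `(accum yield ,@body))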


1 point by Pauan 5112 days ago | link

"First, your generators can't yield nil without appearing to terminate"

Yeah, I know. I considered that an okay tradeoff, since they're designed to be... well, streams, you know? If necessary, you could wrap the returned values in a list, letting the caller know whether it's nil the value, or nil the terminator.
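Sketching what I mean, with 'it standing for some iterator that wraps its values, and do-something as a placeholder:

  ; each value comes back as (list x), even when x itself is nil;
  ; a bare nil means the stream is exhausted
  (whilet v (it)
    (do-something (car v)))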

---

"Second, it seems like the created iterators will take arguments the first time they're called and ignore them thereafter, which makes looping over them a bit more of a pain since the initial loop is treated differently from the others."

Yeah, that's the bug I was talking about:

  (= it (stream args
          (while t (yield args))))

  (it 5)       -> (5)
  (it)         -> (5)
  (it 5 10 20) -> (5)
---

"I have a suggestion we can both benefit from: Forget generator parameters altogether."

How is that clearer, shorter, or overall better? Looks clunky to me. To be more specific, what do we gain from it, aside from (as you claim) it being easier to implement?

-----

1 point by rocketnia 5112 days ago | link

"what do we gain from it"

Just simplicity. My question is, what do we gain from having these parameters?

What's an iterator good for, except to iterate over? There are only three everyday utilities I see being used with these things: 'each, 'all, and 'some--and 'each and 'all can be implemented in terms of 'some.
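To spell that reduction out, supposing we had a 'some-iter that already walks one of these iterators, the other two fall out roughly like this ('all-iter and 'each-iter are made-up names):

  (def all-iter (f it)
    (no (some-iter ~f it)))

  (def each-iter (f it)
    (some-iter (fn (x) (f x) nil) it)
    nil)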

None of those utilities has a spot to provide extra arguments to the Iterable when making its Iterator (to use the Java terms), nor a spot to provide extra arguments to the Iterator when getting each of the elements. I don't think that means we need to add those spots.

-----

1 point by Pauan 5112 days ago | link

You could go backwards... or reset it to the beginning, or send it values, ala coroutines. Or then there's peek vs. consume, etc.

I see streams as a generalization of iterators. They can be used to create simple iterators, yes, but they can also provide additional features that iterators don't (usually) have.

It's okay that functions like `each` can handle iterators without needing to pass arguments: that's a good thing. In fact, I considered it to be pretty important that passing it no arguments would do a "default action" like retrieve the next element.

---

What do we gain from having the parameters? I believe more extensibility. In fact, here's an idea that I was juggling around: prototypes (surprised?). Basically, it could work something like this:

  (def foo (x y)
    (case x
      'goodbye (+ "goodbye " y)))
                    
  (= bar (make-proto foo
           (fn (fail x y)
             (case x
               'hello (+ "hello " y)
                      (fail)))))
             
             
  (bar 'hello "world")     -> "hello world"
  (bar 'goodbye "world")   -> "goodbye world"
  (bar 'something "world") -> nil
What's going on here? Basically, a prototype is a function that can selectively delegate to another function by calling the `fail` continuation. I note that this is similar to rocketnia's idea of :fail or somesuch.
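Here's a rough sketch of what make-proto itself could be, using Arc's 'point for the escape; the details aren't settled:

  (def make-proto (parent f)
    (fn args
      (point return
        ; the fail thunk bails out of f and delegates to the parent
        ; with the same arguments
        (apply f (fn () (return (apply parent args))) args))))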

All this could be wrapped up in a macro or two, making it quite simple and short:

  (proto bar (foo y)
    'hello (+ "hello " y))
Anyways, when a proto calls `fail`, it then calls the prototype (in this case foo) with the same arguments. It can then do whatever it likes. Why is this good?

Well, for starters, it unifies the concepts of objects, classes, methods, functions, and prototypes... secondly, it can give us more extensibility. Specifically, how do we easily extend all functions of one type without affecting functions of another type? And what if we want to extend a particular function without affecting other similar functions?

One solution would be to create prototypical functions that act as "bases" other functions can build off of. Consider the case of stdin, which would have the 'char and 'peek "methods":

  (def stdin (x)
    (case x
      'peek ...
      'char ...))
If you want your function to behave like stdin, you could define such methods in your function, but using stdin as the prototype:

  (proto my-stdin (stdin)
    'peek ...
    'char ...)
Now, what happens is, if you call (my-stdin 'peek) or (my-stdin 'char) it will behave normally. But... what if you want to add an 'xml-node method to all the input functions? Simple, just extend stdin:

  (extend stdin (x) (is x 'xml-node)
    ...)
Because my-stdin inherits from stdin, it'll see the new method automatically! Oh my, this is looking an awful lot like OOP! Indeed, except rather than using objects, methods, classes, etc... it's all just functions. Functions can inherit from any other function. Also, it's a somewhat strange form of OOP, since it uses prototypes rather than more traditional classes.

Note: this methodology would only be useful in the cases where you want to make a certain "type" of function behave differently. Obviously this won't apply to all circumstances, so ordinary functional programming is still good.

-----

1 point by rocketnia 5111 days ago | link

I have no more doubt that you're going somewhere interesting with this stuff. ^_^ Funny, I woke up this morning out of a dream in which someone online had started making something very similar to Penknife, and here you are, taking at least some degree of inspiration from failcall.

(Actually, the codebase I dreamed about was written in Java, was organized almost exactly the same way as my Arc code (probably thanks to dream magic), and was actually a language for distributed computing rather than just for extensible syntax. If you hit all those marks, then I'd be outright scared of you, lol.)

---

"it unifies the concepts of objects, classes, methods, functions, and prototypes"

A few years ago I was working with someone who used the phrase "add an else-if" when talking about almost any new feature. But between 'extend, failcall, prototype inheritance, scope shadowing, and operator precedence, it turns out we have an awful lot of sophisticated terms and frameworks for what amounts to adding an else-if. ^_^

-----

1 point by Pauan 5111 days ago | link

"(Actually, the codebase I dreamed about was written in Java, was organized almost exactly the same way as my Arc code (probably thanks to dream magic), and was actually a language for distributed computing rather than just for extensible syntax. If you hit all those marks, then I'd be outright scared of you, lol.)"

Sorry, I'm not going anywhere close to Java code. You must be thinking of some other magic person. :P

-----

1 point by Pauan 5112 days ago | link

Okay, let me put it like this. You want a function that behaves like stdin, so calling (peekc my-stdin) will work. But (readc my-stdin) needs to work too... and possibly (readb) as well. Which means that those functions need to extract internal details out of my-stdin, in order to return a useful value.

The problem is that we want my-stdin to behave differently in different situations, but on the surface it just looks like an ordinary function call, so how do the various functions above get at the internal stuff in my-stdin?

This provides a solution: you call the function and tell it what data you want, and then the function provides that data. Another problem with things like (readc) is that it's all-or-nothing: either you extend it so it changes behavior for all input functions, or you jump through some clunky hoops to make it only work on a subset... but even that's not very extensible.

But with prototypes, you get completely fine-grained control. You can extend an individual function, or extend that function's prototype, or extend that prototype's prototype, etc. This not only gives a way to expose the function's internal details, but also allows for fine-grained extensibility.

Rather than extending something based on its `type` (cons, string, fn, etc.), we can extend a function based on its prototype. The functions still have a type of 'fn, but we get fine-grained control to say "these kinds of functions behave differently in these situations".

-----

1 point by rocketnia 5111 days ago | link

"The problem is that we want my-stdin to behave differently in different situations, but on the surface it just looks like an ordinary function call, so how do the various functions above get at the internal stuff in my-stdin?"

In Anarki, here's an easy way to go (if not an easy way to get all the way to what each of us might want):

  (defcall input (self)
    readc.self)
With this in place, we can say (my-stream) to read a character, but all the other things we can do with an input stream are still available.
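That is, given some hypothetical input stream my-stream:

  (my-stream)           ; reads a character, thanks to the defcall
  (peekc my-stream)     ; the usual stream operations still work
  (readline my-stream)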

---

One random thing I'm not comfortable with about the (my-stream 'peek) approach is the fact that it uses a symbol, 'peek, which might mean different things to different libraries. Obviously 'peek itself would be toward the core, but I could easily imagine two pull parsers trying to use 'xml-node.

---

"Another problem with things like (readc) is that it's all-or-nothing: either you extend it so it changes behavior for all input functions, or you jump through some clunky hoops to make it only work on a subset... but even that's not very extensible."

Could you elaborate on that somehow? If I "extend" 'readc, I'm probably only changing it to support a whole new range of stream types, not (intentionally) modifying the behavior for existing ones. It shouldn't be too clunky to determine whether something's in that new range, and I don't have a clue what part of this strategy you're saying is less extensible than your idea.

---

"You can extend an individual function, or extend that function's prototype, or extend that function's prototype, etc."

I know what you're talking about there, at least. ^_^ It's interesting to note that both of our approaches put it on the API designer's shoulders to make sure the system is well-factored into extension points. My approach, with direct use of failcall, and with rulebooks rather than deep dependency chains[1], would promote patterns like this:

  rulebook.readb
  
  ; To simplify the example, we'll assume readc itself isn't a rulebook.
  [fun readc [str]
    [let first-byte rely:readb.str
      ...]]
  
  ; The use of the "rely" macro above makes readc fail if readb.str
  ; fails. The above definition of readc is roughly equivalent to this:
  ;
  ; [= readc [failfn fail [str]
  ;            [let first-byte [failcall fail readb.str]
  ;              ...]]]
  ;
  ; The "fun" macro introduces the "fail" variable anaphorically, and
  ; the "rely" macro expands to refer to it.
  ;
  ; Then "failcall" itself is a macro. It parses its second argument
  ; as an expression, then tries to unwrap it into a function call
  ; (potentially causing an error), then wraps it back up as a function
  ; call with a particular fail parameter subexpression. I'm still
  ; thinking over this design.
  ;
  ; Note that most of this stuff is still just a sketch right now.
Certain rulebooks could have rules that acted as inheritance, in your sense:

  rulebook.my-plus
  [rule* my-plus [] args
           my-plus/+      ; This is the name of the rule.
    [rely:apply + args]]
I expect this to be a common design pattern, common enough that I might end up with macros that capture patterns similar to some of your prototype system's patterns. In fact, depending on how I settle on implementing rulebooks (which is up in the air thanks to module difficulties), that pattern might already be easy to accomplish by going to a lower level:

  rulebook.my-plus
  [add-rule my-plus +]  ; highly speculative
---

[1] By "deep depencency chains," I mean that I assume you're talking about having patterns whereby A is the prototype of B, B is the prototype of C, C is the prototype of D, and people only ever use D most of the time. (A, B, and C might have longer names.) The rulebook equivalent would be to have a rulebook E which keeps track of A, B, C, and D. I think rulebooks probably make it easier to forget about the precedence order of different cases (for better or worse), while also making it possible to add and remove cases without having to rewire their neighbors (say, possible to remove B without fiddling with C's prototype).

-----

1 point by Pauan 5111 days ago | link

"With this in place, we can say (my-stream) to read a character, but all the other things we can do with an input stream are still available."

Yeah, but that doesn't help if you want to extend readc or peekc so they understand your new stream. You need to extend readc and peekc directly. And then how do those functions get at the data they need?

---

"One random thing I'm not comfortable with about the (my-stream 'peek) approach is the fact that it uses a symbol, 'peek, which might mean different things to different libraries. Obviously 'peek itself would be toward the core, but I could easily imagine two pull parsers trying to use 'xml-node."

That is true, but how is that any more arbitrary than saying that the global functions peekc, readc, etc. are reserved for input streams? The symbol 'peek simply means whatever the function wants it to mean. Thus, a function that behaves like an input stream can be called like an input stream. Duck typing.

---

"Could you elaborate on that somehow? If I "extend" 'readc, I'm probably only changing it to support a whole new range of stream types, not (intentionally) modifying the behavior for existing ones. It shouldn't be too clunky to determine whether something's in that new range, and I don't have a clue what part of this strategy you're saying is less extensible than your idea."

Why not modify the behavior of existing ones? In fact, readline does that right now, by relying on readc. Thus, if your input supports readc, it supports readline too.
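Roughly like this; it's not the exact arc.arc definition, just a sketch of a readline built on nothing but readc:

  (def my-readline ((o str (stdin)))
    (whenlet c (readc str)
      (tostring
        (until (or (no c) (is c #\newline))
          (writec c)
          (= c (readc str))))))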

And it's not just readc either; it could be other functions. It's just that this discussion happens to be about input streams. So, yeah, it's possible to do it with the current model, but I think it's clunkier and leads to more (and uglier) code.

Okay... let me try to explain... I want to allow Arc users to create new data types, like input streams. Right now, this is possible only by extending built-in functions like readc, peekc, etc. Something like this:

  (extend readc (x) (isa x 'my-data-type)
    ...)
...but now you end up needing to define a new data type, and you need to extend all the built-in functions individually. You can avoid that by using coerce...

  (def readc (x)
    (do-something (coerce x 'input)))
...the problem is the "do-something" part. Even after readc coerces your custom input type to 'input, how does it extract the data it needs? How does it read a character? How does readc access the internals of your data type? You could special-case it to be a function call, but then what about peek? What about reading bytes?

What I am proposing is a way to solve both problems in one fell swoop: functions are typed not based on their `type`, but based on the pseudo-methods they contain. Duck typing. And by calling these "methods", functions (like readc) can extract the data they need. Now, you can define readc like this:

  (def readc (x)
    (x 'char))
And now any function that supports 'char automatically gets access to readc without needing to extend it. What if a different function also uses 'char? No problem, it works too! Thus, the functions don't care about the type of their argument, they only care that it supports the methods they need to do their job. And by leveraging prototypes, it lets you extend the scope chain at any point, giving more fine-grained control than simply whether data is a string or a cons or whatever.
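For instance, here's a made-up string-input that supports only 'char and 'peek; with the readc above, it just works:

  (def string-input (s)
    (let i 0
      (fn (msg)
        (if (is msg 'char) (when (< i (len s))
                             (do1 (s i) (++ i)))
            (is msg 'peek) (when (< i (len s))
                             (s i))))))

  (readc (string-input "abc"))  ; -> #\a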

By the way, you can combine this approach with conventional types as well:

  (def readc (x)
    ((coerce x 'input) 'char))
The above is more strict, because it not only expects the data to be coercible to 'input, but it also expects it to have a 'char "method". This could potentially handle your use-case about conflicting libraries using the same method names: you can use the function's actual `type` to differentiate between them.

---

Random side note: this of course means that I can define tables so that they are functions of type 'table that have a 'get and 'set method. Then you can create a custom table just by creating an annotated function that has 'get and 'set. And then `apply` could be designed so when you call a table with (foo 'bar), it calls (foo 'get 'bar) instead.

This makes it super-easy for Arc code to define custom table types. Or custom input types. Or custom any-type, really, since all composite data types can be represented as functions with methods.

This is what I'm trying to explain. If you have a table called foo, then calling (foo 'bar) is a "get" action, and calling (= (foo 'bar) 5) is a "set" action. foo is a single data-type, it is self-contained, but it has two different behaviors in two different areas.

If you want to emulate that, you end up needing to jump through some hoops, like the following:

  (def my-table ()
    (obj get (fn (n) ...)
         set (fn (n v) ...)))
...and now you have the fun of extending sref and apply (and probably eval as well) so they behave properly with your custom table. Remember my prototype post a while back? I couldn't even get it to work in pgArc. What a pain. So much easier if all your function needs to do is implement the required methods, and everything just works without needing to extend anything.

And this "functions with methods" idea is applicable to other data types as well, like input streams. So for the same reasons that creating custom table types are a massive pain, creating custom input types are a massive pain. But this proposal makes it super easy.

---

"My approach, with direct use of failcall, and with rulebooks rather than deep dependency chains[1], would promote patterns like this:"

By the way, I would expect my proposal to be completely compatible with rulebooks as well... the only requirement is that the call (foo) return a char, and (foo 'peek) return a char, but not consume the stream. It's completely up to foo how it handles delegation and/or inheritance. I merely provided one potential way: prototypes.

But the proposal works even with ordinary functions that aren't prototypes. In fact, it doesn't even need to be a function per se; for instance, I would expect the following to work:

  (= foo (obj char #\f))

  (readc foo) -> #\f
Neat. Since foo is a table, and (foo 'char) returns #\f, readc was able to use it. This, of course, wouldn't work in the strict version of readc, which tries to coerce its argument to 'input. But this would work:

  (= foo (annotate 'input (obj char #\f)))

  (readc foo) -> #\f
---

"By "deep depencency chains," I mean that I assume you're talking about having patterns whereby A is the prototype of B, B is the prototype of C, C is the prototype of D, and people only ever use D most of the time. (A, B, and C might have longer names.)"

It could be that way, but shallower trees are also quite possible, and in fact I expect those would be the norm. My suggestion to authors of functions would be to only increase the depth of the tree as needed.

-----

1 point by rocketnia 5111 days ago | link

"Yeah, but that doesn't help if you want to extend readc or peekc so they understand your new stream. You need to extend readc and peekc directly."

Sounds like you're solving your own problem. :)

"And then how do those functions get at the data they need?"

Why hide it?

Anyway, I'd put the extensions of 'readc and 'peekc in the same place as the rest of my code that dealt in the concrete details of my custom stream type. That way, I can pretend in all my other code that the data is strictly encapsulated, and when I do change the implementation, everything I need to refactor is in the same page or two of code.
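Concretely, something like this, where 'my-rope-stream and its helpers are stand-ins for whatever the concrete type really is:

  ; everything that knows the representation of 'my-rope-stream
  ; stays on this one page
  (extend readc (s) (isa s 'my-rope-stream)
    (rope-read-char rep.s))

  (extend peekc (s) (isa s 'my-rope-stream)
    (rope-peek-char rep.s))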

---

"That is true, but how is that any more arbitrary than saying that the global functions peekc, readc, etc. are reserved for input streams?"

If you're saying the symbol 'peek is just as arbitrary as the global variable name 'peekc, I agree, but global variables are the more likely thing to be handled by a namespace system. :) If that's not what you're saying, whoops.

---

"Why not modify the behavior of existing ones? In fact, readline does that right now, by relying on readc. Thus, if your input supports readc, it supports readline too."

Huh? Using 'readline doesn't change what happens the next time you use 'readc. I think we're having some word failures here.

Maybe what you mean by "modify" is more of a pure functional thing, where you "change" a list by removing its first element when you call 'cdr. But then I still don't understand what you meant in your original statement that "Another problem with things like (readc) is that it's all-or-nothing."

"you need to extend all the built-in functions individually . You can avoid that by using coerce..."

Darn 'coerce, always enticing people to use it. ^_^ It's actually possible to use a coercion pattern here, but you'll need to check whether you can coerce at all, and go on to other extensions if not. (This is something failcall and rulebooks make convenient.) However, to create a value of type 'input, you still need to specify all the reading and peeking behavior somewhere, and I prefer to specify those behaviors in separate pieces of ravioli, in this case by extending each function individually.

"Even after readc coerces your custom input type to 'input, how does it extract the data it needs?"

Exactly, I wouldn't turn something into an 'input and then turn it back; by extending 'readc directly, I'd preempt the coercion step.

To digress a bit, coercion is a positively useful design pattern when there's more than one sensible set of "axioms" for a set of utilities. If I see utilities A1, A2, B1, B2, B3, and B4, and I notice that the B's can be implemented in terms of the A's, then I can do the following:

1. Write a function C of the form "Take any value, and return a similar value that supports all the A's." I call this a coercion function, but I don't expect everyone to agree with that. ^_^

2. Extend the B's so that they try using C. (If there's no standard way to "try" using a function, such as failcall, then C needs to indicate failure in its own way, or there needs to be another function D that people call to see if they can call C.)

3. Sometimes it's useful (if uninspired) to create a boilerplate concrete type for C to return when there's no better way to make the right A-savvy value. This type tends to take the form of a wrapper containing a function to call for the A1 behavior and a function to call for the A2 behavior. Obviously, the A's should be extended to support this type; that's the only point of it.

After this, if I ever want to extend the B's, there's a good chance I can do it by finding (or making) something that extends the A's instead, and then extending C to do the conversion. After a certain initial cost (making C and C's general-purpose return type), this eventually becomes a net decrease in the number of extensions needed.

...And I've run out of time. I'll get to the rest of your post later! ...And your followups someday. XD

-----

1 point by Pauan 5111 days ago | link

I would like to point out that prototypes and methods are completely separate subjects that serve a similar purpose (extensibility) in completely separate ways. Perhaps I shouldn't have discussed them in the same post.

Methods solve the problem of polymorphism, namely the ability to easily define new types (and new instances of existing types) that can intermesh with existing code (like the built-ins readc and peekc). They do this by implementing duck typing: if the function supports the method, just use it.

This can be augmented by a technique that I've come to like: coercion. Rather than saying "my argument needs to be of type foo", the function just coerces to type 'foo. If it can't be coerced, it will automatically throw an error.

The reason I like the coerce approach is because it means you can easily create completely new types. So if you create a new 'byte type, you can extend coerce so it can be coerced into 'char, 'int, 'num, etc. and existing code will work with it.
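As a sketch (assuming an extend that can reach coerce, like Anarki's, and representing a made-up 'byte type as a plain int underneath):

  (def byte (n) (annotate 'byte n))

  (extend coerce (x totype . args) (isa x 'byte)
    (case totype
      char (coerce rep.x 'char)
           (coerce rep.x totype)))  ; other targets: coerce the raw int

  (coerce (byte 97) 'char)  ; -> #\a
  (coerce (byte 97) 'int)   ; -> 97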

The reason I like the method approach is that it makes it easy to create new instances of existing data types. Like I mentioned, it makes it easy to create custom 'table types. It also maximizes the benefit of prototypes, and in some cases allows completely new types to piggyback off of existing functions, which is made easy with prototypes.

The reason I call it "more extensible" is for the exact same reason I call the coerce approach more extensible. With the coerce approach, the function doesn't care what type it is, it only cares that it's possible to coerce it to the type it wants.

In the case of methods, the function doesn't care what type it is, it only cares that it's possible to extract the data it needs, using the function's methods.

---

Prototypes, however, serve a different purpose. Specifically, they try to solve the problem where you sometimes want to extend a particular function, and sometimes want to extend many functions at once. They're designed to give fine-grained control beyond merely what the type is. Also, by letting one function serve as a "base" for other functions, they try to reduce duplicate code.

All three concepts are designed to enhance extensibility, but they do so from different angles, and in different ways. You'll note that all three attempt to achieve extensibility by ignoring what the type of something is. Instead, they focus on either the desired type, the desired functionality, or the desired ancestor.

The three combine to form a coherent whole, which I think is as it should be.

Perhaps I should write up a post comparing the two approaches (prototypes/methods vs. pure functions). Depending on the results, that could either convince me that my idea is silly, or convince you guys that it has merit.

-----

1 point by Pauan 5111 days ago | link

To put it another way...

Coercion means you only need to define coercion rules in one spot, and existing code can Just Work.

Methods mean you only need to define a method that implements the functionality, and existing code can Just Work.

---

Prototypes get rid of the traditional concept of type entirely. Rather than saying "that's an input stream" or "that's an object of type input" we instead say "that's a function that inherits from stdin."

Of course, I don't plan to get rid of types, since I think they're useful (especially with coercion), but at the same time, I think having more fine-grained control can be useful, so I think there's some synergy between types and prototypes.

-----