1 point by akkartik 4973 days ago | link | parent | on: Wart update

Thanks for trying it out! I'm embarrassed that your experience was so terrible. Here I am, using it every day..

I'll get on fixing all your bug reports. In the meantime I've first added them to manual_tests: http://github.com/akkartik/wart/commit/d8336d2a22; http://github.com/akkartik/wart/commit/ac4462c94d

(Edit a day later: all these bugs should now be fixed. http://github.com/akkartik/wart/commit/e062e2a407)

---

I used to be on a mac, yes, but I don't think that's why I use zsh :) It's just an apt-get away, and it spares me having to worry about bash kludges like "$@" and whatnot. But you're right, I should just use sh to minimize friction. It's not like it's a complex shell script.

-----

1 point by akkartik 4973 days ago | link | parent | on: Wart update

Yeah. I considered |<-, but it seemed too cryptic. It was only when I came up with default that I decided to take the leap on it all.

-----

2 points by Pauan 4973 days ago | link

  n |<- 0
I actually think that looks fine, to be honest.

-----

2 points by akkartik 4980 days ago | link | parent | on: The evolution of Nulan's : and ;

via http://arclanguage.org/item?id=16807

-----

2 points by Pauan 4980 days ago | link

Thanks. Though I'm still trying to pin down the "right" meaning for ":" and ";"

For instance, I've actually changed it since then. I should update that page to describe the new changes.

-----

2 points by iopq 4966 days ago | link

What is the change? I hate parens in any language with a passion. I even suggested a similar approach to Newspeak (a language based on Smalltalk): https://groups.google.com/forum/?hl=en&fromgroups=#!topi...

of course here you can get away with just positional placement because all binary and keyword sends make it fairly clear where the message send is and where the argument is

-----

1 point by Pauan 4966 days ago | link

Let's see... it's been a while, so I'm looking at the Nulan source code... Ah, right. Here's the latest rules:

  : creates a new list that continues until either ; or ( or ) is encountered
  ; creates a new list that continues until either ; or ) is encountered
So for instance:

  foo: bar: qux; corge -> (foo (bar (qux)) (corge))
  foo: bar; qux; corge -> (foo (bar) (qux) (corge))
  foo: bar: qux: corge -> (foo (bar (qux (corge))))
  foo: bar (qux corge) -> (foo (bar) (qux corge))
The differences are:

  : now terminates on (

  ; creates a new list in addition to terminating all previous : and ;
     (before, it only terminated and didn't create a new list)

  ; now terminates ; in addition to :
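
One way to read these rules, as a rough Python sketch rather than Nulan's actual reader. The name read_nulan_line and the frame kinds ("top", "paren", ":", ";") are made up for illustration, and the sketch only handles a single line, ignoring Nulan's significant indentation:

```python
import re

def read_nulan_line(src):
    """Parse one line with the ':' and ';' rules into nested Python lists."""
    tokens = re.findall(r"[():;]|[^\s():;]+", src)
    # Each stack frame is (kind, items); kind is "top", "paren", ":" or ";".
    stack = [("top", [])]

    def close(kinds):
        # Pop frames whose kind is in `kinds`, attaching each to its parent.
        while stack[-1][0] in kinds:
            _, items = stack.pop()
            stack[-1][1].append(items)

    for tok in tokens:
        if tok == ":":
            stack.append((":", []))        # ':' only opens a new list
        elif tok == ";":
            close({":", ";"})              # ';' ends every open ':' and ';' ...
            stack.append((";", []))        # ... and then opens a new list
        elif tok == "(":
            close({":"})                   # ':' terminates on '(' but ';' does not
            stack.append(("paren", []))
        elif tok == ")":
            close({":", ";"})
            if stack[-1][0] != "paren":
                raise SyntaxError("unbalanced ')'")
            _, items = stack.pop()
            stack[-1][1].append(items)
        else:
            stack[-1][1].append(tok)

    close({":", ";"})                      # end of line closes whatever is open
    if stack[-1][0] != "top":
        raise SyntaxError("unbalanced '('")
    return stack[0][1]
```

Under this reading, read_nulan_line("foo: bar (qux corge)") gives ['foo', ['bar'], ['qux', 'corge']], which lines up with the fourth example above.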

-----

1 point by Pauan 4966 days ago | link

I've actually changed my mind so that I don't mind using parens, but only for function calls. Calls to a vau (def, if, and, let, etc.) shouldn't be wrapped in parens:

   mac accessor -> n v
     uniqs %a
       {def n -> %a
         {{%a v} %a}}

   def or: casevau env
     x    -> (eval x)
     x @r -> let x (eval x)
               if x x (eval {or @r})
The above is a hypothetical mockup for new syntax for Nulan. It's radically different than the current syntax, but I think it looks very clean. At least, compared to the current syntax:

  $mac accessor; N V ->
    $uniqs %A
      [$def N; %A ->
        [[%A V] %A]]

  $defvau $or Env
    X    -> eval X
    X @R -> $let X: eval X
              $if X X: eval [$or @R]

-----

3 points by iopq 4965 days ago | link

why would you need two different characters?

foo; bar; qux ;corge (foo(bar(qux))corge

foo; something is foo(something)

something ;foo is something(foo)

-----

2 points by Pauan 4964 days ago | link

Oh, I see what you're saying. You're saying ";" should have different meaning depending on whether it's to the left or the right of the symbol. That could work, but I feel it's making whitespace a bit too significant. No way to know for sure without trying it out, though.

-----

2 points by iopq 4962 days ago | link

well, you already make whitespace significant by saying that alphabeta is not alpha beta

the real crux of the issue is whether it comes before an identifier or after

don't think of it as whitespace disambiguating

nobody says *pointer in C is "too significant whitespace" just because it can't be separated by a space (in which case it becomes multiply!)

-----

1 point by Pauan 4962 days ago | link

Like I said, I think it's only a bit too far, so if somebody wants to run with that idea, go for it. I personally favor the idea of having multiple syntaxes that parse to the same AST.

-----

1 point by akkartik 4962 days ago | link

I didn't know you couldn't have a space after deref!

But, I dunno.. attaching ';' to identifiers is different from attaching '*'. Our brains are trained to treat the semi-colon as punctuation, never part of a word. Even programming languages have only reinforced this pattern. It's going to be a hard habit to break.

-----

2 points by rocketnia 4962 days ago | link

"Our brains are trained to treat the semi-colon as punctuation, never part of a word. Even programming languages have only reinforced this pattern. It's going to be a hard habit to break."

Give it 100 years. ;)

-----

1 point by Pauan 4962 days ago | link

I'm assuming ";" was just an example... I would personally use ":" rather than ";" if I used only one character.

-----

3 points by iopq 4960 days ago | link

I would personally use something like | if it's not used for anything

it looks like an undirected paren

so in cases like a |b| c it translates to a ((b) c) or (a (b)) c

I guess I'd pick the first option (more natural for reading left to right), but for the order of operations it doesn't matter

-----

2 points by akkartik 4960 days ago | link

I like this! It seems the semi-colons were blinding me to the possibilities :)

-----

1 point by akkartik 4964 days ago | link

I agree that this is probably too much. Modern languages are training us to ignore the semi-colons; to have to pay attention to the whitespace around them seems retrogressive.

-----

1 point by Pauan 4964 days ago | link

To be fair, the semicolons in Nulan are mostly used as infix operators, unlike in languages like C where they're required at the end of every statement. And Nulan already uses significant whitespace indentation. So I don't think it'd be a lot worse to make the whitespace significant for ":" or ";". I just think it might be a bit too far.

-----

1 point by Pauan 4965 days ago | link

Because there are situations where you want to terminate things, like with "and" and "or":

  and: foo 1; bar 2; qux 3 -> (and (foo 1) (bar 2) (qux 3))
Keep in mind my system was intentionally designed to get rid of as many parentheses as possible in as many situations as possible.

Now that I'm more lenient toward parens, I'll probably simplify ":" and get rid of ";"

---

There is one other situation where I use ";" in Nulan: functions that continue on the next line:

  $def foo; X Y ->
    ...
But with my new syntax for functions, I can get rid of that:

  def foo -> x y
    ...

-----

2 points by akkartik 4983 days ago | link | parent | on: Infix support in wart

There isn't a unified grammar for the language, I'm afraid. I've built wart in layers:

a) parse lisp as usual. This layer doesn't know about the regular vs infix distinction, so a, a-1 and ++a and ++ are all just tokens.

b) expand infix ops inside symbols, e.g. a+1 => (a + 1)

c) scan for operators inside lists and pinch them with the adjacent elements.

  (.. a ++ b ..) => (.. (++ a b) ..)
Edit: Notice that this is different from your example:

  (a infix b ..) => ((infix a b) ..)
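
Layer (c) could be sketched roughly like this in Python. This is not wart's implementation: OP_CHARS, is_op and pinch are hypothetical names, lists are modelled as Python lists of strings, and an operator without an operand on both sides is simply left where it is:

```python
OP_CHARS = set("+-*/<>=~&|^%:.")  # hypothetical; not wart's real operator set

def is_op(x):
    """A token counts as an operator if it is made only of operator characters."""
    return isinstance(x, str) and x != "" and all(c in OP_CHARS for c in x)

def pinch(items):
    """Wrap each 'a op b' run in a list into [op, a, b], scanning left to right."""
    # Handle nested lists first so their results can serve as operands.
    items = [pinch(x) if isinstance(x, list) else x for x in items]
    out = []
    i = 0
    while i < len(items):
        x = items[i]
        has_left = bool(out) and not is_op(out[-1])
        has_right = i + 1 < len(items) and not is_op(items[i + 1])
        if is_op(x) and has_left and has_right:
            # The left operand may itself be an earlier pinched group,
            # which makes chains left-associative with no precedence.
            out.append([x, out.pop(), items[i + 1]])
            i += 2
        else:
            out.append(x)
            i += 1
    return out
```

With this sketch, pinch(["x", "a", "++", "b", "y"]) gives ['x', ['++', 'a', 'b'], 'y'], and a chain like a + b * c groups as (* (+ a b) c).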

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

Rather to my amazement, these test cases work as expected:

  a-1.0

  a.0-1.0
Any other stress test ideas?

-----

4 points by fallintothis 4982 days ago | link

Any other stress test ideas?

Depending on how you parse number literals, there are the examples at the end of http://arclanguage.org/item?id=10149 which I used for stress-testing ssyntax/number highlighting.

-----

3 points by fallintothis 4982 days ago | link

Just in case there are any that are useful, I also used http://pastebin.com/YqxZydyw to test syntax highlighting. A lot of the tests have to do with recognizing Scheme numeric literals, though.

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

Thanks for all those comments! After mulling them over I think I'll feel better if I can eliminate ssyntax in favor of infix operators. But there are two challenges to that:

  a.b vs dotted lists
  f:g vs :syms
I'm gonna take the rest out next.

-----

1 point by fallintothis 4983 days ago | link

  a.b vs dotted lists
I was trying to think of alternatives, thought "maybe a more complex symbol for one of the uses, like ..?", then wondered about a potential edge case. Really, I'm just thinking of the parsing algorithm---or, rather, lexing. If . was defined as the ssyntax is, would a..b expand into ((a) b)? Without spaces, it's fairly clear that certain arguments are "empty", since it could conceivably be tokenized as two .s. But a++b probably wouldn't tokenize to two +s. Suppose both . and .. were defined; how would a..b be resolved? Longest-operator-first?

  f:g vs :syms
Could always go with another symbol for function composition. | comes to mind, but it's more like "reverse compose" at a Unix shell. On the other hand, as far as the function composition operator is concerned, I've seen mathematicians define it both ways---$(f \circ g)(x) = f(g(x))$ and $(f \circ g)(x) = g(f(x))$. No technical reason you couldn't use a pipe, just conventional.

-----

2 points by akkartik 4983 days ago | link

Yeah, currently:

  a..b
  => ((a) b)
My reflex: I'm ok with breaking this corner case and just treating it as a single op like infix a++b. Juxtaposing infix ops isn't really on my radar.

Update: Hmm, a more valuable use case that I might have to give up:

  f:~g
Update 4 hours later: Ah, perhaps I don't have to give it up! We could just define new operators:

  mac (:~) (f g)
    `(: ,f (~ ,g))

  mac (.-) (f n)
    `(,f (- ,n))
Yeah, this could work. a..b is still challenging to define, though..

-----

1 point by fallintothis 4983 days ago | link

Really, I'm just thinking of the parsing algorithm---or, rather, lexing.

Oh yeah, and how does it work for negative number literals? I assume

  (f n-1)   --> (f (- n 1))
  (f n - 1) --> (f (- n 1))
because the minus either does or does not have spaces around it, but

  (f n -1) --> (f n -1)
because the minus sign only has spaces on one side?

-----

1 point by akkartik 4983 days ago | link

Yeah. I never treat an op as infix if it has whitespace on just one side.

There is one ugly special-case here:

  f.-1   ; => (f -1)
http://github.com/akkartik/wart/blob/8211614d63/014infix.cc#...
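
The one-sided-whitespace rule can be sketched in Python like so. This is only an illustration, not the code behind the link: is_infix is a hypothetical helper, it treats parens and the ends of the string like whitespace (an assumption), and it does not model the f.-1 special case:

```python
def is_infix(src, start, end):
    """Decide whether the operator src[start:end] should be read as infix.

    It is infix when both neighbours are whitespace-like or neither is;
    whitespace on just one side means it is not infix.
    """
    def at_edge(ch):
        # Assumption: parens and string boundaries behave like whitespace.
        return ch is None or ch.isspace() or ch in "()"

    before = src[start - 1] if start > 0 else None
    after = src[end] if end < len(src) else None
    return at_edge(before) == at_edge(after)
```

For example, with src = "(f n -1)" and i = src.index("-"), is_infix(src, i, i + 1) is False, so -1 stays a negative literal.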

-----

1 point by akkartik 4981 days ago | link

Ok, erstwhile ssyntax is now all infix: [1] http://github.com/akkartik/wart/commit/365a2ce3ac

Check out the details below, and give me your reactions. Is this too ugly to be worthwhile?

Excluding tests, this change reclaimed ~50 LoC. In all, this whole experiment has cost 225 LoC. I'm going to play with it for a bit before I decide to back it out.

---

Compromises:

1. In wart, unlike arc, you could mix unquote with ssyntax: car.,x, etc. This had to go.

2. You can no longer use ssyntax with operators: ++. used to expand to (++); now it's just a three-character op. Haskell's prefix syntax is now required to escape infix.

3. list.-1 is now a call to the .- op. As planned (http://arclanguage.org/item?id=16801) I just defined it to do what I mean, but it's a kinda ugly user-space coupling. And it requires handling assignment separately. (http://github.com/akkartik/wart/blob/365a2ce3ac/040.wart#L27; http://github.com/akkartik/wart/blob/365a2ce3ac/047assign.wa...)

As a happy bonus, ++.n is now ++n.

---

Some special-cases are hardcoded into the reader:

1. Periods can be part of operators, but dotted list syntax now uses ..., which is never an operator.

2. Period at end of symbol calls it. prn. is (prn), not (prn .)

3. Colon at start of symbol is part of the symbol. This was always true, for keyword args. It means I can't define :~ to support f:~g; it just didn't seem worth another special-case.

4. Bang at the end of a symbol is part of the symbol, for mac!, reverse!, etc.

5. Bang has another special-case. In keeping with left-associativity, prefix ops are always highest-precedence:

  ~odd?.n  ; => (. (~ odd?) n)
However, ! at the start of a symbol is _lowest_ precedence:

  !cdr.x  ; => (not (. cdr x))
Perhaps I'll get rid of this feature. We'll see.

-----

2 points by fallintothis 4981 days ago | link

1. Periods can be part of operators, but dotted list syntax now uses ..., which is never an operator.

Seems a worthwhile trade-off. Dotted lists are used infrequently enough, and an ellipsis does just as well as a single dot.

2. Period at end of symbol calls it. prn. is (prn), not (prn .)

Hm. So this is like a vestigial instance of ssyntax?

3. Colon at start of symbol is part of the symbol. This was always true, for keyword args. It means I can't define :~ to support f:~g; it just didn't seem worth another special-case.

Yeah, wouldn't want a special case on top of a special case! :)

4. Bang at the end of a symbol is part of the symbol, for mac!, reverse!, etc.

Have you considered a non-operator character for this use, to ditch the special case? I'm partial to mac=, reverse=, etc. I mean, since = isn't used for comparison anyway. And assuming that = is actually not an operator character. Did you ever decide if you wanted = to be an infix operator (and thus character)?

5. Bang has another special-case.

Whoa. Did I miss where this infix notation extended to prefix operators? Or does this work the same way ssyntax did? And if so, in what sense has ssyntax been removed? :)

Is this too ugly to be worthwhile?

Hm...parsing is getting too complicated for my tastes. But then, my taste is for parentheses. :P

Still, carving out special cases so ssyntax still "mostly works" isn't quite what I envision as a way to unify away ssyntax. Basically, is "traditional" (inasmuch as Arc establishes tradition) ssyntax prohibitively useful? Or can we do without some of its uses in the name of a more general infix notation without the complications of special symbol parsing?

-----

2 points by akkartik 4980 days ago | link

Ok, I've tried to be ruthless and take the ssyntax stuff out. '!' is now always part of regular symbols, just like '?'. There's no prefix version like !a. And there's also no infix version, like f!sym.

It turns out this doesn't actually bring back any parens. !x becomes no.x or not.x. And in some situations I can replace a!b with a'b. (Maybe that's too clever.)

I've also dropped support for turning x. into (x). Not even traditional arc has that. Now x. turns into (x .).

Only remaining special-cases: '...', and ':' at start of sym is part of the sym.

Whoa. Did I miss where this infix notation extended to prefix operators?

Good point. This happened as part of the elimination of ssyntax, but I figured it was intuitive to extend left-associativity to prefix ops. However, now I see that:

  (f a + b) => (f (+ a b))
but:

  (- a + b) => (+ (- a) b)
Is that weird?

Thanks for the comments! This really seems like an improvement over my original idea.

-----

2 points by fallintothis 4980 days ago | link

Thanks for the comments! This really seems like an improvement over my original idea.

I'm glad you think so. I try to make my suggestions as nonprescriptive as possible, though (in full disclosure) I'm liable to lead you in circles back to prefix notation if you follow them too far. :P

It was that or lose <=, etc.

Oh, duh. Move along, nothing to see here!

Only remaining special-cases: '...', and ':' at start of sym is part of the sym.

I'm really okay with ..., because it doesn't feel like a "special case" as much as it does a built-in keyword; I wouldn't expect to be able to redefine fn or if, either. I don't really have an opinion on the :keyword symbols.

  (f a + b) => (f (+ a b))
but:

  (- a + b) => (+ (- a) b)
Is that weird?

Maybe, maybe not. It's not like every other language doesn't do mixfix with their "infix" notation. I just wasn't sure how it worked. Do you declare that certain operators are prefix? Or are they all potentially prefix, like

  ( mixfix a b ... )   -->   ( ( mixfix a ) b ... )
where mixfix is any operator, a and b are any normal token, and ... is 0 or more tokens? Or something like that?

  x-1.0
What's the intuitive way to parse this?

I'd say as subtraction of a float: (- x 1.0). If nothing else, I can't imagine a reason to do ((- x 1) 0).

Is it worth getting right, or should we just say, "don't use floats with infix"?

My gut reaction is that it's worth getting right, because programming languages shouldn't be ambiguous.

I notice that a lot of these problems seem to come from using the dot. Thinking back about ssyntax now, it occurs to me that the dot is probably the least-used among them, in Arc. If I were to guess from my own code, I'd rank their usage descending as ~, :, !, ., &. But hey, we can put numbers to that:

  (= arcfiles* '("strings.arc" "pprint.arc" "code.arc" "html.arc" "srv.arc" "app.arc" "prompt.arc")
     allfiles* (rem empty (lines:tostring:system "find ~/arc -name \\*.arc")))

  (def ssyntax-popularity (files)
    (let tallies (table)
      (each symbol (keep ssyntax (flat (map errsafe:readall:infile files)))
        (each char (string symbol)
          (when (find char ":~&.!")
            (++ (tallies char 0)))))
      (sortable tallies)))

  arc> (ssyntax-popularity arcfiles*)
  ((#\~ 13) (#\! 9) (#\: 3) (#\& 1) (#\. 1))

  arc> (ssyntax-popularity allfiles*)
  ((#\! 532) (#\: 144) (#\~ 122) (#\. 58) (#\& 27))
Mind you, it's been awhile, so I have no clue what all is in my personal ~/arc directory. Probably various experiments and output from things like sscontract (http://arclanguage.org/item?id=11179) and so on. All the same, the dot is low on the totem pole. I personally wouldn't be heartbroken to have to write (f x) instead of f.x, and you could reclaim the regular dotted list syntax. Would it be worthwhile to backtrack at this point and get !, :, and ~ functionality without worrying about .? There were some existing issues with ! and : (name collisions). ~ is prefix, but if you have a way of extending the infix notation for subtraction, surely it would apply to ~? Related thought: f ~ g could replace the f:~g you were worried about before.

Anyway, just some random thoughts off the top of my head. Do what you will with them.

-----

1 point by akkartik 4979 days ago | link

Yeah you're right that using period as both an infix op and inside floats is kinda messy. I use it a lot more than you, so I'm still going through the five stages of grief in giving it up. In the meantime I've hacked together support for floats. Basically, the reader tries to greedily scan in a float every time it encounters a sym-op boundary. Some increasingly weird examples:

  wart> e=5
  wart> e-2.0
  3
  wart> e-3e-3
  4.997
  wart> 3e-3-e
  -4.997
Perhaps this is reasonable. We have a rule that's simple to explain, whose implications can be subtle to work out, but which programmers are expected to exercise taste in using. That could describe all of lisp.
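
A minimal Python sketch of that greedy rule, for anyone who wants to poke at the examples. It is not wart's reader: the three regexes and the name tokenize are assumptions made for illustration, and the float pattern is only a small unsigned subset of what C accepts:

```python
import re

# Hypothetical token classes; wart's real character classes may differ.
FLOAT = re.compile(r"\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")
SYM = re.compile(r"[A-Za-z_?!][A-Za-z0-9_?!]*")
OP = re.compile(r"[-+*/<>=.:~&|^%]+")

def tokenize(src):
    """Split src into (kind, text) pairs, trying a greedy float first."""
    tokens, pos = [], 0
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        # A number is tried first at every token boundary, so "3e-3" is
        # swallowed whole before "-" gets a chance to be an operator.
        for kind, pattern in (("num", FLOAT), ("sym", SYM), ("op", OP)):
            m = pattern.match(src, pos)
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                break
        else:
            raise SyntaxError(f"unexpected character {src[pos]!r}")
    return tokens
```

Here tokenize("e-3e-3") yields the symbol e, the operator -, and the number 3e-3, while tokenize("3e-3-e") yields the number 3e-3, the operator -, and the symbol e.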

-----

2 points by rocketnia 4978 days ago | link

"We have a rule that's simple to explain, whose implications can be subtle to work out, but which programmers are expected to exercise taste in using. That could describe all of lisp."

I don't think the syntax for numbers is very easy to explain. That's the weak link, IMO.

If it were me, I'd have no number literals, just a tool for translating number-like symbols into numbers. Of course, that approach would make arithmetic even less readable. :)

I think the next simplest option is to treat digits as a separate partition of characters like the partitions for infix and non-infix. Digits are sufficient to represent unsigned bigints with a simple syntax. Then most of the additional features of C's float syntax could be addressed by other operators:

  -20.002e23
  ==>
  (neg (20.@1002 * 10^23))
This hackish .@ operator, which simulates a decimal point, could be defined in Arc as follows:

  (def dot-at (a b)
    (while (<= 2 b)
      (zap [/ _ 10] b))
    (+ a (- b 1)))
You could avoid the need for this hack by treating . as a number character, but then you lose it as an operator.
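
For readers who don't have Arc handy, here is the same dot-at idea ported to Python as a sketch. The use of Fraction for exact arithmetic is a choice made here, not part of the Arc version, and the port assumes (as the Arc code seems to) that b always starts with a guard digit 1 so leading zeros of the fractional part survive:

```python
from fractions import Fraction

def dot_at(a, b):
    """Simulate a decimal point: dot_at(20, 1002) stands for 20.002.

    b is the fractional digits with a leading 1 in front (assumed convention),
    so 1002 means ".002".
    """
    b = Fraction(b)
    while b >= 2:
        b /= 10          # shift right until only the guard digit is left of the point
    return a + (b - 1)   # drop the guard digit

# -20.002e23 written the long way: (neg (20.@1002 * 10^23))
value = -(dot_at(20, 1002) * 10**23)
```

With exact fractions, dot_at(20, 1002) is 10001/500, i.e. 20.002, and value comes out as the integer -20002 * 10**20.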

-----

1 point by akkartik 4979 days ago | link

"Do you declare that certain operators are prefix? Or are they all potentially prefix?"

Yeah any op can be used in prefix.

-----

2 points by Pauan 4980 days ago | link

"Basically, is "traditional" (inasmuch as Arc establishes tradition) ssyntax prohibitively useful?"

I don't think so. Nulan completely ditched Arc's ssyntax and only uses significant whitespace, ":" and ";". Yet, despite that, it's capable of getting rid of almost all parentheses.

Oh, by the way, ":" in Nulan has a completely different meaning from ":" in Arc. I just chose it because I thought it looked nice.

-----

1 point by akkartik 4980 days ago | link

Actually, there's one major remaining question. This no longer works:

  x-1.0
What's the intuitive way to parse this? Is it worth getting right, or should we just say, "don't use floats with infix"? Especially since wart recognizes whatever formats the underlying C recognizes:

  -2.14e-3
It'll get complex fast to mix that with infix operators..

-----

1 point by akkartik 4980 days ago | link

Did you ever decide if you wanted = to be an infix operator (and thus character)?

Yes, it's always been infix, so wart lost def= and function= when it gained infix ops. It was that or lose <=, etc. The question in my mind was more of style: even if assignment can be in infix, should I always use prefix?

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

  > a . b:c     ; hope you don't have dotted lists?  Or just use a.b:c
Boy do I have dotted lists. You'll take them from my cold dead hands :)

-----

1 point by fallintothis 4983 days ago | link

:) I merely meant that it would break if you had dotted lists---syntax collision.

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

"The key difference [to Haskell] is the precedence & associativity thing (where wart is more like Smalltalk). Is this for simplicity/generality, or are there any technical reasons to avoid precedence rules?"

Hmm, I started out from the perspective in http://sourceforge.net/p/readable/wiki/Rationale that precedence violates homoiconicity. But if it happens in the reader and macros always see real s-expressions I suppose there isn't a technical issue.

My only other response: 9 levels of precedence?! Cut your tomfoolery! :)

---

I momentarily considered Haskell's backticks, but there's a problem finding a reasonable character. And I wanted to not make the language more complex.

-----

1 point by fallintothis 4983 days ago | link

My only other response: 9 levels of precedence?! Cut your tomfoolery! :)

Ha! So, "for simplicity's sake" it is. :P

(Also, you can thank my sleepy attempts at self-censoring "cut that shit out" for my sounding like https://www.youtube.com/watch?v=nltVuSH-lQM)

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

"If you need more than two levels, cut your tomfoolery out and put in some parentheses!"

Exactly! The discussion at http://arclanguage.org/item?id=16749 yesterday was invaluable in bringing me back (relatively) to the fold. And it was also easier to implement :o)

-----

1 point by akkartik 4983 days ago | link | parent | on: Infix support in wart

Link's dead?

Huh, turns out github won't let me shorten tree urls like commit urls. For a second I thought I'd found my first hash prefix collision :) Fixed.

-----
