2 points by rocketnia 2071 days ago | link | parent

This is what I think would be a great way to enter and print tables at the REPL:

  arc> (ob (v name "John Doe") (v age 23) (v id 73881))
  (ob (v age 23) (v id 73881) (v name "John Doe"))

And if they must be compatible with `read` and `write`, I think this would be a great way to render them for that:

  (##ob (v name "John Doe") (v age 23) (v id 73881))

This way it's just about as easy to refactor between `(##ob (v k1 ,v1)) and (ob (v k1 v1)) as it is to refactor between `(,a ,b ,c) and (list a b c).

(The v here is for "value." An alternate syntax, (kv ...), could be used for entries where the key isn't quoted.)

(Note that (##ob ...) here is a reader macro call. I'm using a design for reader macros that puts the macro name on the inside of a parenthesis, rather than the approach taken by things like Racket's #hash(...). That way, reader macro names can be descriptive without pushing the indentation far to the right.)

This approach generalizes to just about any other data structure we want, such as graphs, queues, sorted sets, etc. We don't have to pick out new parentheses for each one, and we don't have to specify idiosyncratic indentation rules either, so pretty printing at the REPL can be very nice automatically:

  (ob
    (v name "John Doe")
    (v age 23)
    (v id 73881))
  
  (ob
    (v name
      "John Doe")
    (v age
      23)
    (v id
      73881))

In terms of Racket implementation, it should be pretty easy to get most of this working using `gen:custom-write`, `make-constructor-style-printer`, and `pretty-print`. Racket's pretty-printer will probably give us results I find slightly less satisfying, but it's a start:

  (ob
   (v name "John Doe")
   (v age 23)
   (v id 73881))
  
  (ob
   (v
    name
    "John Doe")
   (v
    age
    23)
   (v
    id
    73881))
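
A minimal sketch of the `gen:custom-write` approach, for illustration only. The struct names `arc-table` and `arc-entry` are hypothetical wrappers, not anything in Anarki or Amacx, and a real Arc table would presumably wrap a hash rather than a list of entries. Note that `make-constructor-style-printer` only produces the (ob ...) shape under `print`; under `write` it produces an unreadable #<...> form, and under `print` symbol keys will likely come out quoted, so this is a starting point rather than the final rendering.

```racket
#lang racket/base
(require racket/struct   ; make-constructor-style-printer
         racket/pretty)  ; pretty-print, pretty-print-columns

;; Hypothetical wrapper for one table entry; prints as (v key value).
(struct arc-entry (key val)
  #:methods gen:custom-write
  [(define write-proc
     (make-constructor-style-printer
      (lambda (e) 'v)
      (lambda (e) (list (arc-entry-key e) (arc-entry-val e)))))])

;; Hypothetical wrapper for a whole table; prints as (ob entry ...).
(struct arc-table (entries)
  #:methods gen:custom-write
  [(define write-proc
     (make-constructor-style-printer
      (lambda (t) 'ob)
      (lambda (t) (arc-table-entries t))))])

(define john
  (arc-table (list (arc-entry 'name "John Doe")
                   (arc-entry 'age 23)
                   (arc-entry 'id 73881))))

;; Default width: probably fits on one line.
(pretty-print john)

;; A narrow width forces Racket's pretty-printer to break the lines.
(parameterize ([pretty-print-columns 20])
  (pretty-print john))
```

Because the printing is attached to the struct types, these values keep this style even when nested inside Racket lists or vectors.
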
There are only a few other tricky parts:

- We may have to represent Arc tables as their own data structure, rather than directly as Racket hashes, so that they print nicely even when they're nested inside other Racket data structures like lists and vectors. This is one distinct place where, for the best possible Racket interop, we may need to avoid representing Arc values the same way as Racket ones. Then again, I think `port-print-handler` might provide the ability to print parts of Racket values using the Arc style, so it could be possible to get very nice interop here.

- In order to get (##foo to be processed as a call to an Arc reader macro called "foo", we would need to replace the Racket ( readtable entry with an entry that behaves the same as the original in non-## cases. Racket's ( syntax isn't as simple as it might seem, as I found out when I wrote a custom open parenthesis for Parendown, and I would be glad to copy out some of my Parendown code to make this work.

- Of course, it would take some design work to decide on Arc-side interfaces for defining things like reader macros, custom write behaviors, and maybe even custom REPL pretty print behaviors and custom quasiquotation behaviors (to determine where unquotes can go). In Racket, customization of the `write` or `print` behavior is usually done in a per-value-type way using `gen:custom-write`, but I think it would be better to associate these behaviors with the "current writer" or "current printer" somehow, just as the reader and macroexpander use the "current readtable" and the "current namespace." That would allow us to swap out the writer at the same time as we swap out the reader, rather than letting the `read` and `write` behavior get out of sync. Essentially, I would store all these things in the Arc namespace.
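
As a rough illustration of the "current writer" idea in the last point, here is a sketch using an ordinary Racket parameter. The names `current-arc-writer`, `arc-write`, and `call-with-arc-syntax` are hypothetical; nothing like them exists in Racket or Anarki today, and storing the writer in the Arc namespace instead would look different.

```racket
#lang racket/base

;; Hypothetical parameter holding the procedure used to write Arc values.
;; It defaults to Racket's own `write`.
(define current-arc-writer (make-parameter write))

;; Hypothetical entry point: always goes through the current writer.
(define (arc-write v [out (current-output-port)])
  ((current-arc-writer) v out))

;; Swap the reader and the writer in one step so they can't drift apart.
(define (call-with-arc-syntax readtable writer thunk)
  (parameterize ([current-readtable readtable]
                 [current-arc-writer writer])
    (thunk)))

;; Usage: run some code with `display` standing in for a custom writer.
(call-with-arc-syntax (current-readtable) display
  (lambda () (arc-write "John Doe")))
```
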

---

Would it be much trouble if I started working toward some of these things for Anarki or Amacx? If I do work on this, which things would need my help the most or would make the best milestones? Honestly, my top priorities right now are Punctaffy and Cene, so even though I can express opinions about Arc, I might not allocate the time to follow through on them myself. (My desire not to burden people with something that I think of as being only in a half-finished state has always been one of the reasons I commit so rarely to Anarki.)

I know the reader syntax for tables bears very little resemblance to the curly brackets people have been talking about here, and I don't want to trample on that. Maybe tables can `write` with curly brackets while other things tend to use this more general-purpose style.

shawn, are you currently trying to write a full pretty-printer for Arc values from scratch just so Racket hashes can be written using curly brackets? Are you using `port-print-handler` or something? That's another thing I'd rather not trample on if you have an idea underway.



2 points by aw 2071 days ago | link

> Would it be much trouble if I started working toward some of these things for Anarki or Amacx?

My aspiration for Amacx is that it becomes a framework that allows you to create the language you want to create. By analogy: if you're writing a compiler and you'd find LLVM useful, you can use LLVM as part of your toolchain.

Thus, if you (or someone) wanted to create a particular reader and printer syntax for tables (whether ##ob and v or something else), then you certainly should be able to do that.

I have both an Arc reader and printer written in Arc, but they're not yet included in Amacx because currently they're too slow. Working on the reader and printer makes the most sense, I think, after finishing my current work on source location tracking (assuming that works out), both because with a profiler it will be easier to see how to speed up the implementation, and because the reader will need to support source location tracking itself.

There are a lot of "if"s here, but in the happy scenario that everything works out, then hopefully adding ##ob and v (or whatever someone wants) will be easy: just add a few lines of Arc code :-)

-----

2 points by i4cu 2071 days ago | link

Personally, I don't think this is going to make the language more attractive. You've traded better printing for more verbose code.

  current-arc> (obj name "John Doe" age 23 id 73881)

  your-arc> (ob (v name "John Doe") (v age 23) (v id 73881))

maybe?:

  alt-arc> (ob name "John Doe" age 23 id 73881)

returns (assuming you're attempting to have ordered tables?):

  (ob (v name "John Doe") (v age 23) (v id 73881))

-----

2 points by rocketnia 2071 days ago | link

(I hope you don't mind if I change my mind and use `object` instead of `ob`. I just remembered `ob` is a pretty good local variable name for object values.)

Code could still use `obj`, even in the reader. These two things could be parsed as the same value:

  (##obj name "John Doe" age 23 id 73881)
  (##object (v name "John Doe") (v age 23) (v id 73881))

The reason I suggest interspersing extra brackets and v's, when the concise `obj` already exists, is to avoid idiosyncrasies of pretty-printing `obj` for larger examples.
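
To make the "parsed as the same value" claim concrete, here is a small sketch of what the two reader macros might do once the reader has collected their arguments. The helper names `obj-form->hash` and `object-form->hash` are hypothetical, and immutable Racket hashes stand in for Arc tables purely to keep the comparison simple.

```racket
#lang racket/base

;; Hypothetical: body of (##obj k1 v1 k2 v2 ...), given as a flat list.
(define (obj-form->hash args)
  (let loop ([args args] [h (hash)])
    (cond
      [(null? args) h]
      [(null? (cdr args)) (error 'obj "odd number of arguments")]
      [else (loop (cddr args) (hash-set h (car args) (cadr args)))])))

;; Hypothetical: body of (##object (v k1 v1) (v k2 v2) ...).
(define (object-form->hash entries)
  (for/hash ([e (in-list entries)])
    (unless (and (list? e) (= (length e) 3) (eq? (car e) 'v))
      (error 'object "expected (v key value), got ~e" e))
    (values (cadr e) (caddr e))))

;; Both spellings of the example yield equal tables, so this is #t.
(equal? (obj-form->hash '(name "John Doe" age 23 id 73881))
        (object-form->hash '((v name "John Doe") (v age 23) (v id 73881))))
```
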

Here's an example of how a nested table prints in the latest Anarki:

  arc> coerce*
  '#hash((bytes . #hash((string . #<procedure:...ne/anarki/ac.rkt:1128:21>)))
         (char
          .
          #hash((int . #<procedure:integer->char>)
                (num . #<procedure:...ne/anarki/ac.rkt:1133:21>)))
         (cons
          .
          #hash((queue . #<procedure:...t/private/kw.rkt:592:14>)
                (string . #<procedure:...ne/anarki/ac.rkt:1127:21>)
                (sym . #<procedure:...t/private/kw.rkt:592:14>)
                (table . #<procedure:...t/private/kw.rkt:592:14>)))
         (fn
  ...

As a human who can easily apply idiosyncratic rules, here's how I'd probably lay that out if I could only use (##obj ...):

  arc> coerce*
  '(##obj
     
     bytes (##obj string #<procedure:...ne/anarki/ac.rkt:1128:21>)
     
     char
     (##obj
       int #<procedure:integer->char>
       num #<procedure:...ne/anarki/ac.rkt:1133:21>)
     
     cons
     (##obj
       queue #<procedure:...t/private/kw.rkt:592:14>
       string #<procedure:...ne/anarki/ac.rkt:1127:21>
       sym #<procedure:...t/private/kw.rkt:592:14>
       table #<procedure:...t/private/kw.rkt:592:14>)
     
     fn
  ...

There are several idiosyncrasies in action there: I'm choosing not to indent values by the length of their keys, I'm choosing not to indent them further than their keys at all (or vice versa), I am grouping them on the same line when I can, and I'm putting in padding lines between every entry just because some of the keys and values are on separate lines.

Oh, and I'm not indenting things by the length of the "##obj" operation itself, just by two spaces in every case, but that's a more general rule I go by.

As far as Lisp code in general is concerned, those seem like personal preferences. I don't expect anyone to indent this quite the same way. Maybe people could take a shot at it and see if a consensus emerges here. :)

Now suppose I could only use `##object`:

  arc> coerce*
  '(##object
     (v bytes (##object (v string #<procedure:...ne/anarki/ac.rkt:1128:21>)))
     (v char
       (##object
         (v int #<procedure:integer->char>)
         (v num #<procedure:...ne/anarki/ac.rkt:1133:21>)))
     (v cons
       (##object
         (v queue #<procedure:...t/private/kw.rkt:592:14>)
         (v string #<procedure:...ne/anarki/ac.rkt:1127:21>)
         (v sym #<procedure:...t/private/kw.rkt:592:14>)
         (v table #<procedure:...t/private/kw.rkt:592:14>)))
     (v fn
  ...

This saves some lines by not needing whitespace to group keys with their objects. In even larger examples it can cost some lines since it introduces twice as much indentation at every level, so that might be a wash. What really makes a difference here is that all those pairs of parentheses can be pretty-printed just like function calls, so things that process the "##object" syntax don't need to make special considerations for pretty-printing it.

---

"Assuming you're attempting to have ordered tables"

In this thread, the original post's example used unordered tables. It doesn't matter to this design. Ordered tables and unordered tables can coexist with different ## names.

-----