1 point by akkartik 5974 days ago | link | parent

:) We're in a post-apocalyptic tale. Woot! I've always wanted to be in a post-apocalyptic tale.

I'm at the head of anarki. Any help with this session appreciated.

  ;; I want to play with parser combinators.
  arc> (load "lib/parsecomb.arc")
  nil
  ;; Example from parsecomb
  arc> (= bin-digit    (one-of "01")
     binary (seq (char "#") (char "b") (many1 bin-digit)))
  #<procedure: binary>
  arc> (binary "#b01" 0)
  Error: "Function call on inappropriate object #3(tagged mac #<procedure>) ((char \"#\" . nil))"

  ;; Ok, mapdelay isn't working. Let's try something simpler.
  arc> (mac foo (a b) `(+ ,a ,b))
  #3(tagged mac #<procedure>)
  arc> (apply foo '(3 4))
  Error: "Function call on inappropriate object #3(tagged mac #<procedure>) (3 4)"
  arc> (apply (rep foo) '(3 4))
  (+ 3 4)
  arc> (eval (apply (rep foo) '(3 4)))
  7
  ;; But that's ugly. Is there a better way?


1 point by almkglor 5974 days ago | link

parsecomb.arc is no longer maintained. The parser combinator library that is currently maintained is raymyers's lib/treeparse.arc, but you have to coerce the strings you want to parse into 'cons cells.

For that matter the fix in lib/parsecomb.arc should probably be to mapdelay:

  (mac mapdelay (lst)
    `(map [eval:apply (rep delay) (list _)] ',lst))
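
The same trick outside parsecomb, as a minimal sketch: a macro is a tagged object, so (rep m) gets at its expander function, and eval runs the expansion. mac-apply is a made-up helper name here, not part of Arc or parsecomb.

```arc
; a throwaway macro to experiment with
(mac foo (a b) `(+ ,a ,b))

; hypothetical helper (not part of Arc or parsecomb):
; call the macro's underlying expander on args, then evaluate the expansion
(def mac-apply (m args)
  (eval (apply (rep m) args)))

(mac-apply foo '(3 4))      ; expands to (+ 3 4), then evaluates it

; alternative: build the call form and let eval do the expansion
(eval (cons 'foo '(3 4)))
```

Either way the arguments reach the expander unevaluated, and eval works at global scope, so local variables are not visible to the expanded code.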

-----

1 point by akkartik 5974 days ago | link

Yeah, that was the first thing I was going to try, then I got sidetracked :)

Can you elaborate on how I can use treeparse for more generalized parsing? Like if I just have a file with flat text and want to generate a list of strings separated by empty lines?

-----

2 points by almkglor 5974 days ago | link

treeparse is very generalized, so expect to write a lot of code.

As for files: the lib/scanner.arc library allows you to treat input streams (e.g. input files) as if they were a list of characters, which is what treeparse accepts (treeparse can work with "real" lists and scanners).

Treeparse usually returns a list of characters, but this behavior can be modified by using a 'filt parser.

Just off the top of my head (untested!):

  (require "lib/scanner.arc")
  (require "lib/treeparse.arc") ; could use cps_treeparse.arc
  (let (list-str newline line oneline lastline allparse) nil
    ; converts a list of chars to a string
    (= list-str
       [string _])
    ; support \r\n, \n\r, \n, and \r
    (= newline
       ; alt & seq's semantics should be obvious
       (alt
         (seq #\return #\newline)
         (seq #\newline #\return)
         #\newline
         #\return))
    (= line
       ; lots of characters that could be anything...
       ; but *not* newlines!
       (many (anything-but newline)))
    (= oneline
       (seq line
            newline))
    ; in case the last line doesn't end in newline
    (= lastline
       line)
    (= allparse
       (seq
         ; many is greedy, but will still pass
         ; if there are none
         (many
           ; the filter function has to return a list
           (filt [list (list-str _)]
             oneline))
         ; might not exist; will still pass whether
         ; it exists or not
         (maybe
           (filt [list (list-str _)]
             lastline))))
    (def lines-from-scanner (s)
      (parse allparse s)))
  (def lines-from-file (f)
    (w/infile s f
      (lines-from-scanner (scanner-input s))))

p.s. Performance may be slow. cps_treeparse.arc is a more optimized version (maybe 25% faster on average, with most of the speedup in many and seq IIRC, which could be twice as fast), but it does not support the "semantics" feature.

-----

1 point by almkglor 5974 days ago | link

More on using scanners and treeparse:

http://arclanguage.com/item?id=5227

As an aside, Arkani (wiki-arc.arc, not to be confused with Anarki, arc-wiki.git) is now called Arki.

-----

1 point by akkartik 5974 days ago | link

Thanks a bunch, Alan. I've been reading up. I'll be around :)

-----

3 points by almkglor 5974 days ago | link

If you're curious about the CPS variant of treeparse:

http://arclanguage.com/item?id=5321

Achtung! The CPS variant is written in, of all things, continuation passing style, which is hard on the eyes. It is thus expected to be much harder to read than the straightforward treeparse.

Achtung! In a single Arc session, it is safe to use only treeparse or cps_treeparse. Never load both in the same session.

Achtung! Some of treeparse's features are not supported by cps_treeparse.

-----

1 point by almkglor 5974 days ago | link

Note also that for simple parsing needs, using scanners may be "good enough". For example:

  (require "lib/scanner.arc")
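  (def lines-from-scanner (s)
    (drain
      (when s
        (do1
          (tostring
            ; doesn't handle Mac text files though...
            ; writec returns the char, so the loop keeps going
            (while (aand (car s) (if (isnt it #\newline)
                                     (writec it)))
              (zap cdr s)))
          ; step past the newline that ended this line
          (zap cdr s)))))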
  (def lines-from-file (f)
    (w/infile s f
      (lines-from-scanner (scanner-input s))))
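
For the original question (chunks of text separated by empty lines), the resulting lines could be grouped afterwards. A minimal sketch, assuming the lines arrive as an ordinary list of strings; paragraphs is a made-up name, not something from lib/scanner.arc:

```arc
; hypothetical helper: group a list of line strings into paragraphs,
; starting a new paragraph at each run of empty lines
(def paragraphs (lines)
  (with (acc nil cur nil)
    (each l lines
      (if (empty l)
          ; an empty line closes the current paragraph, if there is one
          (when cur
            (push (rev cur) acc)
            (= cur nil))
          (push l cur)))
    ; the last paragraph may not be followed by an empty line
    (when cur (push (rev cur) acc))
    (rev acc)))
```

Usage would be something like (paragraphs (lines-from-file "notes.txt")), where the file name is only an example.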

-----

2 points by akkartik 5965 days ago | link

It seems like:

  (zap cdr s)

doesn't work when s is a scanner -- s remains unchanged. I looked at the code and convinced myself that macros like zap that call setforms won't work for new types. Does that make sense?

-----

2 points by almkglor 5965 days ago | link

  arc> (require "lib/scanner.arc")
  nil
  arc> (= s (scanner-string "asdf"))
  #3(tagged scanner (#<procedure> . #<procedure>))
  arc> (car s)
  #\a
  arc> (zap cdr s)
  #3(tagged scanner (#<procedure> . #<procedure>))
  arc> (car s)
  #\s
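
A minimal sketch of why this is expected, using a plain list in place of a scanner: when the place is just a variable, zap only rebinds the variable, so as far as I can tell the type of the value does not matter as long as cdr accepts it.

```arc
; (zap f place) behaves like (= place (f place)).
; With a plain variable as the place, only the variable is rebound.
(= xs '(1 2 3))
(zap cdr xs)      ; same effect as (= xs (cdr xs))
(car xs)          ; now the second element of the original list
```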

-----

1 point by akkartik 5970 days ago | link

Hmm, just saw this. Thanks for the tip.

I'll work on this some more this weekend, and hopefully have something cogent to say.

-----