Arc Forum
2 points by aw 5262 days ago | link | parent

[edit] added support for #t, #f, and #\nul...

akkartik, I was wondering if you might like to test this approach and see how fast it is compared to your version? (I could test the speed myself if you wanted, but without your JSON data my results might not be representative).

I believe the particular require planet line below requires MzScheme version 4; let me know if you're using version 3...

  (def deepcopylist (xs)
    (if (no xs)
         nil
         (cons (deepcopy (car xs)) (deepcopylist (cdr xs)))))

  (= scheme-f (read "#f"))
  (= scheme-t (read "#t"))

  (def deepcopy (x)
    (if (is x scheme-t)
         t
        (is x scheme-f)
         nil
        (is x #\nul)
         nil
         (case (type x)
           table (w/table new
                   (each (k v) x
                     (= (new (deepcopy k)) (deepcopy v))))
           cons (deepcopylist x)
           x)))

  ($ (require (planet dherman/json:3:0)))

  (= scheme-read-json ($ read-json))

  (def read-json (s)
    (deepcopy (scheme-read-json s)))

The require planet line gave me a bunch of warnings about not having scribble installed, but it worked anyway.

The idea is we run the original unmodified Scheme module, and then convert its Scheme return value to Arc.
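
As a small illustration of that conversion (a sketch only, assuming the deepcopy definition above is loaded; scheme-data is a hypothetical name), Arc's read can produce a list that still contains Scheme's #t and #f, and the intent is that the copy holds Arc's t and nil instead:

  ; Scheme-style data: #t and #f are Scheme booleans, not Arc's t and nil
  (= scheme-data (read "(1 #t (2 #f))"))

  ; the converted copy should use Arc values throughout
  (deepcopy scheme-data)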

I expect this would probably be slower than your code. The question is, how much slower? If it turns out to be, oh, 1% slower for example, we might not care! :) And it would be a lot easier than going into every Scheme module we might want to use and modifying it to return correct Arc values.



1 point by aw 5262 days ago | link

OK, here's a version that uses the lshift parser that you ported. Uncomment the require file line and put in the path to the original json.ss module...

  (def deepcopylist (xs)
    (if (no xs)
         nil
         (cons (deepcopy (car xs)) (deepcopylist (cdr xs)))))

  (= scheme-f (read "#f"))
  (= scheme-t (read "#t"))
  (= scheme-vector? ($ vector?))
  (= scheme-void? ($ void?))
  (= vector->list ($ vector->list))
          
  (def deepcopy (x)
    (if (is x scheme-t)
         t
        (is x scheme-f)
         nil
        (scheme-void? x)
         nil
        (scheme-vector? x)
         (w/table new
           (each (k . v) (vector->list x)
             (= (new (deepcopy k)) (deepcopy v))))
        (acons x)
         (deepcopylist x)
         x))

  ; ($ (require (file "/tmp/json-scheme-20050827134102/json.ss")))

  (= scheme-json-read ($ json-read))

  (def json-read (s)
    (deepcopy (scheme-json-read s)))

I timed a few runs of your port and this version against your data set. Times for both versions varied between 585ms and 831ms on my laptop, and given that spread I couldn't see a difference between the two versions.

-----

1 point by akkartik 5262 days ago | link

Interesting that you see no difference! What platform are you on?

BTW, you can replace deepcopylist with simply (map deepcopy x).
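
A sketch of how the second version's deepcopy might look with that change, assuming the scheme-* helpers defined above are unchanged and that map stops at the end of a Scheme list the same way the (no xs) test in deepcopylist does:

  (def deepcopy (x)
    (if (is x scheme-t)
         t
        (is x scheme-f)
         nil
        (scheme-void? x)
         nil
        (scheme-vector? x)
         (w/table new
           (each (k . v) (vector->list x)
             (= (new (deepcopy k)) (deepcopy v))))
        (acons x)
         (map deepcopy x)   ; replaces the call to deepcopylist
         x))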

-----

1 point by aw 5262 days ago | link

I'm running Linux on my laptop. There could well be a difference; I simply haven't run the tests enough times to be able to tell, given that I'm getting a pretty wide spread of run times for each version. That spread could be, for example, my web browser sucking up some CPU at random times or whatever...

-----

1 point by akkartik 5262 days ago | link

Great idea. I thought of that approach but discounted it without attempting.

I tried running it but got a parse error.

  default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
  default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
  setup-plt: error: during making for <planet>/dherman/json.plt/3/0 (json)
  setup-plt:   default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
  setup-plt: error: during Building docs for /home/pair0/.plt-scheme/planet/300/4.1.3/cache/dherman/json.plt/3/0/json.scrbl
  setup-plt:   default-load-handler: cannot open input file: "/usr/lib/plt/collects/scribble/base.ss" (No such file or directory; errno=2)
  Error: "read: expected: digits, got: ."

I'm going to email you the data set.

-----

1 point by aw 5262 days ago | link

This appears to be a bug in the JSON parser...

  $ mzscheme
  Welcome to MzScheme v4.1.5 [3m], Copyright (c) 2004-2009 PLT Scheme Inc.
  > (require (planet dherman/json:3:0))
  > (read-json (open-input-string "3.1"))
  read: expected: digits, got: .

I'll take a look at the JSON parser you ported.

It will be a better test anyway... the two different JSON parsers might have very different speeds for all we know.

-----

1 point by akkartik 5262 days ago | link

I just tried it with the version I already have.

  (time:deepcopy:w/infile f "x" (json-read f))
  time: 5874 msec.
  (time:w/infile f "x" (json-read f))
  time: 3953 msec.

So at least here it's within a factor of 2. Pretty useable.

-----

1 point by aw 5262 days ago | link

Yeah, be sure to try each test multiple times. You can get that much of a variance simply from the garbage collector running at one time but not another, from the file needing to be read from disk vs. already cached in memory by the operating system, some other process hitting the CPU, or in EC2, the virtual CPU getting fewer cycles at the moment...
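
A rough sketch of one way to do that, timing several runs with Arc's msec and reporting the spread rather than a single number. run-msec and time-runs are hypothetical helper names, and "x" is the data file from the example above:

  ; time one call of the thunk f, in milliseconds
  (def run-msec (f)
    (let start (msec)
      (f)
      (- (msec) start)))

  ; call f n times; return (fastest average slowest) in milliseconds
  (def time-runs (n f)
    (let times (n-of n (run-msec f))
      (list (apply min times) (avg times) (apply max times))))

  (time-runs 10 (fn () (w/infile f "x" (json-read f))))

If the ranges for the two versions overlap heavily, a single pair of timings can't tell them apart.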

-----

1 point by akkartik 5262 days ago | link

Yeah it's fairly consistent. I've tried experiments where I interleave the two versions and compute an average.

-----

1 point by akkartik 5260 days ago | link

I just switched to this version (also in anarki). I was seeing errors with unicode escape sequences, and this bug was just easier to fix.

-----

1 point by aw 5260 days ago | link

Good! I wanted to see if the latest code would work with the hackinator, so I grabbed a copy from Anarki and updated the deepcopy code:

  $ hack ycombinator.com/arc/arc3.1.tar \
         awwx.ws/ac0.hack \
         awwx.ws/scheme-json0.hack

Seems to work. (require (file "lib/.../foo.ss")) appears to be the right thing to do for .ss files; a Scheme load uses Arc's readtable which messes up on square brackets, and a plain (require "lib/.../foo.ss") doesn't like periods in directory names.

-----

1 point by aw 5262 days ago | link

OK.

What version of MzScheme are you running?

-----

1 point by akkartik 5262 days ago | link

v4.1.3 on Ubuntu jaunty on EC2

-----