Bug: possibly in Arc, possibly in http-get
5 points by cchooper 5674 days ago | 5 comments
The lib/http-get library doesn't seem to work with sites served by Arc. For example

  (save-page "http://reddit.com" "foo")
works fine, but

  (save-page "http://arclanguage.com" "foo")
returns a malformed header error, as do http://pageonetimes.com, news.ycombinator.com, and any local site I create. I looked at the header it was generating and couldn't see anything wrong with it. I'm running Anarki on Windows, so it could be that Windows/MzScheme is mangling the line feeds or something, but I don't have another system to test on, so I can't check that.

So I can't work out if the bug is in what http-get is sending or in what srv.arc will receive :(



3 points by stefano 5674 days ago | link

Seems like something weird is going on with 'readline: the call that should read the last header line (which should contain only "\r\n") instead reads a line starting with a newline, which shouldn't happen. For example, for news.ycombinator.com it reads "\n<html><head><link rel=\"stylesheet\" type=\"text/css\" href=\"http://ycombinator.com/news.css\">" instead of an empty line. As a consequence, the header terminator isn't found and the whole page is treated as a header, so parsing it fails. I wonder why...
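To make the failure concrete, here is a minimal sketch of terminator detection over a list of already-read lines. `header-end` is a hypothetical helper, not http-get's actual parser; it assumes a blank line shows up as "" or as "\r", depending on whether the reader strips the carriage return.

```arc
; Hypothetical helper, not part of http-get: return the index of the
; blank line that ends the header block, or nil if there is none.
; A blank line is "" or "\r" (when the reader leaves the CR in place).
(def header-end (lines)
  (pos [in _ "" "\r"] lines))

; Blank separator present: the header block ends at index 2.
(header-end '("HTTP/1.0 200 OK" "Connection: close" "" "<html>"))

; Separator fused into the next line, as described above: the result
; is nil, so the whole response looks like one long header.
(header-end '("HTTP/1.0 200 OK" "Connection: close" "\n<html>"))
```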

-----

4 points by almkglor 5674 days ago | link

It seems to be partly a problem with srv.arc: it doesn't correctly use \r\n, only \n ~.~;

See in srv.arc:

    (def header ((o type textmime*) (o code 200))
      (string "HTTP/1.0 " code " " (statuscodes* code) "
    " serverheader* "
    Content-Type: " type "
    Connection: close"))
Inspecting the file, they seem to be \n's only, not \r\n
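One way to avoid depending on the newlines embedded in the source file is to write the line endings out explicitly. The following is only a sketch under that assumption, not the change that went into srv.arc; `crlf-lines` and `header-crlf` are hypothetical names, and the status text is passed in directly so the example stands on its own.

```arc
; Hypothetical helper, not part of srv.arc: end every line with an
; explicit CRLF instead of a newline typed into the source file.
(def crlf-lines lines
  (apply string (map [string _ "\r\n"] lines)))

; Sketch of a header builder. The trailing "" supplies the blank line
; that terminates the header block.
(def header-crlf (mime (o code 200) (o reason "OK"))
  (crlf-lines (string "HTTP/1.0 " code " " reason)
              (string "Content-Type: " mime)
              "Connection: close"
              ""))
```

Because the result already ends with the blank line, it would need to be written with 'pr rather than 'prn, which appends a bare \n of its own.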

A second problem lies in 'readline:

    (def readline ((o str (stdin)))
      " Reads a string terminated by a newline from the stream `str'. "
      (awhen (readc str)
        (tostring
          (writec it)
          (whiler c (readc str) #\newline
            (writec c)))))
It has to do with the fact that it reads in a character, BUT DOESN'T CHECK WHETHER THAT FIRST CHARACTER IS A NEWLINE. The newline check is only applied from the second character onward.
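A minimal sketch of one possible correction follows. It is not necessarily the fix that was committed; `readline-fixed` is a hypothetical name, and the loop is left exactly as in the original.

```arc
; Hypothetical variant of 'readline: test the first character too, so a
; line consisting only of a newline comes back as the empty string.
(def readline-fixed ((o str (stdin)))
  (awhen (readc str)
    (tostring
      (unless (is it #\newline)          ; blank line: write nothing
        (writec it)
        (whiler c (readc str) #\newline  ; same loop as the original
          (writec c))))))
```

With this version, a stream whose next character is a newline yields "" and leaves the following line untouched. On \r\n input the returned line still ends in "\r", so a caller looking for the header terminator has to accept "\r" as a blank line as well.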

Will fix.

-----

2 points by cchooper 5673 days ago | link

You fixed it! Thanks!

-----

1 point by thaddeus 5499 days ago | link

I've just started to use the http-get library; one note and one question:

the note: the 'str->url' function contains a call to '1+', which doesn't work in Anarki. Changing (1+ port) to (+ 1 port) restores the ability to get webpages from a local port (local8080).

the question: how can I use the 'save-page' function on a page hidden behind a defopl? i.e., how can I pass authentication in to get past the login page of someone else's Arc server page?

Thanks, T.

-----

3 points by skenney26 5674 days ago | link

Replicated the error on a Mac.

-----