There's still one open question regarding filenames. So far we've been writing uploaded files to random filenames, but the uploaded filename is actually present in the POST request. I see three ways to provide it to the programmer (say, when uploading a file in a field called file):
a) (arg req "file") returns just the contents of the uploaded file; the filename is in (arg req "filename"). "filename" is one of the multipart headers, so I could just inline all the various headers into the args table.
b) (arg req "file") returns an alist or table with the various headers for that part of the multipart POST. File contents are in ((arg req "file") 'contents).
c) (arg req "file") returns file contents like in a, and other metadata like filename or creation-date is in some other field of req, say req!multipart.
b generalizes to multiple file uploads in a single form, but it's also a little more work to get at form input values. c addresses this but now you've got stuff for a field scattered in multiple places. What do people prefer?
Does the fnid multipart form test (the second test) and the static multipart test (the third test) work for you?
They do not work for me. I get a "srv thread took too long for <ip address>". I suspect this could be related to nginx again -- though I started the webserver on a different port and ran form-tests through that port.
How come a non-multipart test works and a multipart test does not? Is there something going on with multipart and ports?
I tried having nginx stopped and only one Anarki running, and tests two and three do not work for me. Uploading a 7K file runs forever until the thread is killed (srv thread took too long).
A 150k plaintext file works but a 147k binary pdf does not.
Update: The bug has to do with reading in bytes vs characters. Earlier, srv would readc from the POST body unless type was multipart, and your code (like the webupload example) would readb from the body otherwise. Now srv is always the one reading body, and it always reads characters using readc. As a result it gets confused by binary uploads that don't translate to characters.
Update 2: The sentence beginning 'Earlier' is incorrect, and webupload was always using readc as well. I'm not sure what I was looking at.
$ ls -l bintest
-rw-r--r-- 1 akkartik akkartik 145974 Jun 5 12:39 bintest
$ racket -f as.scm
arc> (w/infile f "bintest" (len:readbytes 200000 f))
145974
arc> (w/infile f "bintest" (len:readchars 200000 f))
141878
readchars does some interpretation but otherwise works fine. However when trying to upload bintest through a socket, it never terminates. Most curious.
Update: I've confirmed that the bytes in the file fail to be encoded as a unicode string, so presumably that's the issue. Another bit of sloppiness is that we're trying to read n characters from the request body where n is the Content-Length in bytes. webupload.arc has always had this problem.
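To make the bytes-versus-characters mismatch concrete, here is a minimal sketch. It is in Python rather than Arc, only because it is easy to run; the sample byte strings are made up for illustration and are not taken from bintest.

```python
# Illustrative data only: bytes that are not valid UTF-8, and a short
# text whose UTF-8 encoding is longer than its character count.
binary_body = bytes([0xFF, 0xFE, 0x00, 0x41])
text_body = "héllo".encode("utf-8")

def try_decode(body):
    """Return the decoded string, or None if the bytes are not valid UTF-8."""
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        return None

# Content-Length counts bytes, so it can exceed the number of characters.
content_length = len(text_body)          # 6 bytes
char_count = len(try_decode(text_body))  # 5 characters

# The binary body cannot be turned into a string at all.
binary_decodes = try_decode(binary_body) is not None  # False
```

A reader that asks for content_length characters from that text body would be asking for one more character than exists, which at least fits the symptom of an upload that never returns. That is a guess at the mechanism, not something confirmed here.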
Your existing code won't work as is. Since it's meaningless to try to convert possibly-binary data to a string, file contents are now returned as a list of bytes. There's a helper called bytes-string for when you're sure you have just ascii data. Otherwise you'll need to know the encoding of text uploads and convert the bytes appropriately.
I should warn you that it's gotten a lot slower. You might need to temporarily up threadlife. I have some ideas for speeding it up, but let's check first if this works for you.
Ah, this is because the fnid field is being read as a list of bytes. I could convert fnid to string as a special case. Another option is to pass it in with the action url like in aform-multi: http://github.com/nex3/arc/blob/46e3820a6b/lib/srv.arc#L560
Update: Ok, I finally found a way to check when it's safe to convert to string, so now all fields (including fnid) will auto-convert to string when possible. If you get a list of bytes you know it has some sort of non-ascii data.
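Roughly, the auto-conversion behaves like the sketch below. This is Python, not the actual srv.arc code, and the check shown (every byte below 128) is an assumption about what "safe" means; the helper name is hypothetical.

```python
def bytes_to_string_if_ascii(byte_list):
    """Hypothetical helper: return a str when every byte is 7-bit ASCII,
    otherwise return the list of bytes unchanged."""
    if all(b < 128 for b in byte_list):
        return bytes(byte_list).decode("ascii")
    return byte_list

# Plain ASCII, as an fnid would be.
fnid_field = list(b"6bl1DEG5yq")
# "%PDF" followed by two high bytes; made-up data standing in for a binary upload.
pdf_like = [0x25, 0x50, 0x44, 0x46, 0xE2, 0xE3]

fnid_value = bytes_to_string_if_ascii(fnid_field)  # a string
pdf_value = bytes_to_string_if_ascii(pdf_like)     # still a list of bytes
```

So application code can test the type of what it gets back: a string means plain ascii, a list means the field needs explicit decoding or should be treated as binary.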
That is really weird. I find myself momentarily out of ideas :( Maybe some sort of linux setting that controls how ports are opened? Are you running iptables or anything like that?
Done. I came up with a way to get the best of both worlds. Multipart request args are now packaged in a table with all metadata, and with the actual body in key "contents". The arg helper is smart enough to deref "contents" in this table, so you can just say (arg req "file") to get its contents. To get at other fields, use the new function multipart-metadata.
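The shape of that design, sketched in Python rather than Arc. The request layout, the field names other than "contents", and the exact return value of multipart_metadata are assumptions made for illustration; the real functions in srv.arc may differ.

```python
# Hypothetical request: multipart fields are dicts holding "contents" plus
# metadata; ordinary form fields are plain values.
req = {
    "args": {
        "file": {"contents": [104, 105], "filename": "hi.txt"},
        "title": "greeting",
    }
}

def arg(req, key):
    """Return a field's value, dereferencing "contents" for multipart fields."""
    value = req["args"].get(key)
    if isinstance(value, dict) and "contents" in value:
        return value["contents"]
    return value

def multipart_metadata(req, key):
    """Return a multipart field's metadata without its body, else None."""
    value = req["args"].get(key)
    if isinstance(value, dict) and "contents" in value:
        return {k: v for k, v in value.items() if k != "contents"}
    return None
```

With this, arg(req, "file") and arg(req, "title") read the same way at the call site, and the filename is one lookup away via multipart_metadata(req, "file").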
I run into trouble whenever I need to update Anarki. Some patches like this one don't apply to the Anarki version I use, and it seems only a brand new git clone gives the latest Anarki version. That then requires me to move around data, stuff in static, etc.
Is there a better way? How can I separate my app from the Anarki changes?
Can one run .arc programs from a directory that is different than the directory Anarki lives in?
If you aren't making many changes to the base arc it might also make sense to stay updated with anarki. You just have to do a git pull every now and then. Keep your app in new files; that'll ensure pulls don't cause conflicts.
In the short term if you send me your srv.arc I'll make the change for you. (email in profile)
Yes, anarki loads news.arc by default. You probably don't want to do that in your repo. Delete the line from libs.arc. Feel free to do that in anarki as well if it'll make your life easier.
Yes the original distribution doesn't load it by default. But I found that many people come to arc wanting to see how HN runs and it seemed worth reducing one step for them.
On a more pragmatic level, I found I was repeatedly making changes and forgetting to test them with news.arc. Loading it by default was at least a superficial sanity check. But this isn't a big deal. Like I said, feel free to delete the load, commit and git push.
I verified that the example code I just provided does not work if nginx 0.7.67-3+squeeze2 from Debian is proxying connections to Anarki with the following configuration:
You can always call Python from Arc to use some libraries.
A Python dict to s-exp converter is about 38 lines. There's a fromjson.arc (breaks over some input) and a tojson.arc (broken, creates an extra }) somewhere that could help talk to web services.
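For a sense of what such a converter involves, here is a small sketch. It is not the 38-line converter mentioned above; the function name and the choice to emit dicts as lists of (key value) pairs are assumptions.

```python
def to_sexp(value):
    """Render nested Python data as an s-expression string (illustrative)."""
    if isinstance(value, dict):
        # Assumption: a dict becomes a list of (key value) pairs.
        pairs = " ".join(f"({to_sexp(k)} {to_sexp(v)})" for k, v in value.items())
        return f"({pairs})"
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(v) for v in value) + ")"
    if isinstance(value, bool):  # checked before numbers: bool is an int subclass
        return "t" if value else "nil"
    if value is None:
        return "nil"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return str(value)

example = to_sexp({"name": "arc", "tags": ["lisp", 3], "ok": True})
# example == '(("name" "arc") ("tags" ("lisp" 3)) ("ok" t))'
```

The Arc side could then read the printed form back in as an ordinary list.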
(w/outfile o f
  (let i ($.make-limited-input-port req!in n)
    (after ($.copy-port i o)
      close.i)))
For all I know, closing i may not be necessary:
(w/outfile o f
  ($.copy-port ($.make-limited-input-port req!in n) o))
Note that this copies n bytes. The code you posted deals with n characters for some reason, even though Arc provides 'readb and 'writeb for dealing with bytes.
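The same idea (copy exactly n bytes and leave the rest of the stream alone) looks like this outside Racket. This is a Python sketch with in-memory streams standing in for the socket and the output file; copy_n_bytes and the sample data are hypothetical.

```python
import io

def copy_n_bytes(src, dst, n, chunk_size=4096):
    """Copy at most n bytes from src to dst; return the number copied."""
    copied = 0
    while copied < n:
        chunk = src.read(min(chunk_size, n - copied))
        if not chunk:  # source ended before n bytes arrived
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied

# A 1024-byte body covering every byte value, followed by unrelated data.
body = io.BytesIO(bytes(range(256)) * 4 + b"next request")
out = io.BytesIO()
copied = copy_n_bytes(body, out, 1024)
leftover = body.read()
```

Because nothing is ever decoded into characters, binary data passes through untouched, and the count lines up with Content-Length.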
It seems that saving with w/outfile writes the file correctly as sent. It's just that Anarki does not return after that.
Also, uploading an 874996-byte file results in only 183045 of those bytes being available in the multipart data that ends up being saved, and the browser reports an error:
The webpage at http://lovelywebsite.com/x?fnid=6bl1DEG5yq
might be temporarily down or it may have moved permanently
to a new web address.
Also, when this error uploading a 874996-byte file occurs, the 'srv thread took too long' error message is not displayed at all: the browser error occurs within 1-2 secs.
That sounds almost like you're coming across a request size limit somewhere in your server (maybe in news.arc or app.arc or srv.arc, maybe in Apache or nginx or whatever)... but probably not since the upload is less than 4MB. All I know is PHP tends to limit uploads to 4MB unless you reconfigure it. :-p
Maybe you're coming across an issue with the fact that bytes are being treated as characters? Do you have another file you could test? (Maybe one you could share with us?)
I thought there might be a request size limit, since there's an nginx in front of Anarki. But then I tried connecting directly to the Anarki web port and got the same result.
Here's an 822K file which I just confirmed fails to be uploaded:
Sorry for leaving you hanging here.... I'm not really in the practice of running news.arc, and when I asked for a file to use for testing it's because I thought maybe someone else would pick up my slack. XD
http://www.arclanguage.org says "Arc is fluid and future releases are guaranteed to break all your code." As a result the community tends not to care about compatibility either. Anarki has numerous little incompatibilities. Since literally anybody can commit to it at any time it's hard to make any assumptions about it whatsoever.
All of us choose to live in one repo; either arc3.1 or anarki or something of our own. My personal set of favorite incompatibilities is at http://github.com/akkartik/arc, for example. My recommendation: jumping back and forth between arc3.1 and anarki is more trouble than it's worth.
I had to gradually accustom myself to how things work here. Even now I monitor new commits to anarki as I pull them. It can't be like a library you blindly rely on.
You're welcome! One of the great benefits of this model is that it is literally frictionless to propose new ideas. I hope you will feel free to make your own edits directly to anarki. I'd love to see you post about them here if the rationale isn't obvious, but it's always ok to make changes first and see if anybody complains :)
This makes serialization of templates more reliable, but now templates aren't hash-tables anymore; instead of (maptable ... x) you have to say (maptable ... rep.x).
Does this sound like some code you wrote? I'm looking for any places in anarki that use table functions on templates.
I just doublechecked news.arc and didn't see an issue. Can you be more specific? Note that you can't mix news.arc from arc3.1 with arc.arc from anarki because of this incompatible change.
That is so, yes. Since arc can change at any time, to use it you have to be open to changing your app anytime you pull in updates. It's one of the reasons arc is small; it has no bloat from historical baggage. I've given up and even keep my own code directly in the arc directory.
I'm curious what your current setup is. Are you running off of anarki? Or copying parts of it over to your own repo?
I just looked in Arc 3.1, in all of Anarki's branches, a few commits back in Anarki's master ac.scm, and I didn't find anything at line 972 that would cause an error like that.
I'm still curious if you were seeing an issue in news.arc on anarki outside of your own code. If so I really should fix it, but I don't see anything wrong so far. When I made the change I tried to be careful and replace save-table with temstore: http://github.com/nex3/arc/commit/c125d0330c.