"With a community this small I think it's easiest to bundle the two together since anyone can contribute to anarki."
I like that in concept. We could just have a GitHub Pages branch on Anarki, and GitHub would automatically let us view it as a web page. However, aw mentioned having trouble with GitHub Pages: http://arclanguage.org/item?id=12934
Someone could rig up a website to serve GitHub raw file views as HTML, but I don't know if that's nice to GitHub. :)
Someone could instead have a website that somehow keeps an up-to-date clone of Anarki (perhaps triggering a "git pull" not only as a cron job but also every time a certain page was viewed) and somehow uses that to determine the website content.
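A rough Python sketch of the "pull on page view, but not too often" idea. The class name, the 60-second default, and the `git_pull` helper are all made up for illustration; the only real thing here is the `git -C <dir> pull --ff-only` command line.

```python
import subprocess
import time


def git_pull(repo_dir):
    # Fast-forward the local clone; raises CalledProcessError if git fails.
    subprocess.run(["git", "-C", repo_dir, "pull", "--ff-only"], check=True)


class ThrottledRefresher:
    """Runs `refresh` at most once per `min_interval` seconds."""

    def __init__(self, refresh, min_interval=60.0, clock=time.monotonic):
        self.refresh = refresh            # e.g. lambda: git_pull(some_dir)
        self.min_interval = min_interval
        self.clock = clock                # injectable so it can be tested
        self.last_run = None

    def maybe_refresh(self):
        """Call from the page handler and from cron; returns True if it ran."""
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.min_interval:
            return False                  # refreshed recently, skip this time
        self.last_run = now
        self.refresh()
        return True
```

The page handler would call `maybe_refresh()` before rendering, so a burst of page views triggers at most one pull per interval.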
One thing to consider is security: If anyone can show their own JS code on this page, they could set tracking cookies or something. If anyone can run Arc code on the server, there's even more to worry about (albeit nothing the racket/sandbox module isn't designed for).
---
"Maybe Anarki contributor should designate a collaborative location that can serve as the official site for documentation for atleast user guides, tutorials and faq's."
However, having a separate place for documentation is only one part of what I'm suggesting. I'm not sure it's worth it unless the separate parts are somehow integrated again--for instance, by showing docs and discussions as you browse code, or by letting user guide writers say {{doc:anarki:lib/util.arc:afnwith}} or somesuch to include a piece of the API reference.
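To make the include idea concrete, here is a small Python sketch that expands directives shaped like the one above. The `{{doc:repo:path:name}}` syntax is only the "or somesuch" suggestion from this comment, and `expand_directives` and the `api_docs` table are hypothetical names, not part of any existing tool.

```python
import re

# Matches directives of the form {{doc:REPO:PATH:NAME}}.
DIRECTIVE = re.compile(r"\{\{doc:([^:{}]+):([^:{}]+):([^:{}]+)\}\}")


def expand_directives(text, api_docs):
    """Replace each directive with the matching API reference entry.

    `api_docs` maps (repo, path, name) tuples to documentation text.
    Unknown directives are left in place so they stay visible to the writer.
    """
    def replace(match):
        doc = api_docs.get(match.groups())
        return match.group(0) if doc is None else doc

    return DIRECTIVE.sub(replace, text)
```

A user guide could then contain `See {{doc:anarki:lib/util.arc:afnwith}}` and pick up whatever the API reference currently says for that entry.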
---
"For discussions I think we would all agree that arclanguage.com is the best place."
Speaking of which, are arclanguage.com and arclanguage.org both legitimate? Both of their WHOIS entries list Paul Graham, but I don't know whether that means anything. I've never logged in anywhere but arclanguage.org, just because it's what most people link to.
Anyway, we totally do use the Arc Forum for discussions now, but I think things would be better if we could incorporate ideas like the ones from this thread: http://arclanguage.org/item?id=12920
I imagine that my complaints about GitHub Pages at the time were probably just growing pains on GitHub's part.
However, precisely because we'll want to implement our own features at some point, such as the cross-references you mention, I expect we're going to want to do our own processing. That suggests GitHub Pages or the arclanguagewiki on Google Sites might be part of the right long-term solution, but only if there's a way to, e.g., insert a piece of an API reference that we're generating ourselves.
Here's a thought: what if we had a server that went out and gathered documentation source material from various places, such as Anarki? (GitHub has http://help.github.com/post-receive-hooks/ so the server could get notified of new pushes to Anarki instead of having to poll.)
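As a sketch of what the notified server might do with a push notification, assuming the payload is JSON with a `commits` list whose entries carry `added`, `modified`, and `removed` path lists (the shape GitHub's push payloads use), and assuming commits are listed oldest first. The function name and the `.arc` filter are illustrative choices.

```python
import json


def changed_arc_files(payload_text):
    """Return (paths to re-scrape, paths to drop) for one push payload."""
    payload = json.loads(payload_text)
    to_scrape, to_drop = set(), set()
    for commit in payload.get("commits", []):
        for path in commit.get("added", []) + commit.get("modified", []):
            if path.endswith(".arc"):
                to_scrape.add(path)
                to_drop.discard(path)    # re-added after an earlier removal
        for path in commit.get("removed", []):
            if path.endswith(".arc"):
                to_drop.add(path)
                to_scrape.discard(path)  # no point scraping a deleted file
    return sorted(to_scrape), sorted(to_drop)
```

Only the files touched by a push need to be re-scraped, which keeps the work per notification small.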
The server would work on the text of the sources, such as docstrings found in the Anarki source code. That way even if someone pushed something malicious to Anarki then we wouldn't have a security problem (either on the server or in the reader's browser). The server would process the documentation source material and generate static HTML files... which could be hosted on S3 or GitHub Pages. This would have an additional advantage that even if the server were down, the documentation itself would still be up and available.
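A minimal Python sketch of "work on the text, emit static HTML". It assumes a docstring is a string literal sitting right after the parameter list of a `def` or `mac` form, and it only handles flat parameter lists; a real scraper would need a proper reader. All function names are hypothetical. The source is never evaluated, and everything scraped is escaped before it reaches the page.

```python
import html
import re

# Naive pattern: (def NAME PARAMS "DOCSTRING" ...) with non-nested PARAMS.
DEF_WITH_DOC = re.compile(
    r'\((?:def|mac)\s+([^\s()]+)\s+(\([^()]*\)|[^\s()]+)\s+"((?:[^"\\]|\\.)*)"'
)


def extract_docstrings(source):
    """Return (name, params, docstring) triples found in Arc source text."""
    return [m.groups() for m in DEF_WITH_DOC.finditer(source)]


def render_page(title, entries):
    """Build a static HTML page; all scraped text is HTML-escaped."""
    lines = [
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8"><title>%s</title></head><body>'
        % html.escape(title),
        "<h1>%s</h1>" % html.escape(title),
        "<dl>",
    ]
    for name, params, doc in entries:
        lines.append("<dt><code>%s %s</code></dt>"
                     % (html.escape(name), html.escape(params)))
        lines.append("<dd>%s</dd>" % html.escape(doc))
    lines += ["</dl>", "</body></html>"]
    return "\n".join(lines)
```

The resulting string can be written to a file and uploaded to wherever the static pages live.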
"The server would work on the text of the sources, such as docstrings found in the Anarki source code."
With this approach, people might be pushing to Anarki way more, sometimes using in-browser file edits on GitHub, and the server would have to scrape more and more things each time. Then again, that would be a good problem to have. :-p
---
"That way even if someone pushed something malicious to Anarki then we wouldn't have a security problem (either on the server or in the reader's browser)."
By the same token, it would be harder for just anyone to update the server, right? Eh, that might be a necessity for security anyway.
Potentially, parts of the server could run Arc code in a sandbox, incorporating the Arc code's results into the output with the help of some format that's known to have no untrusted JavaScript, like an s-expression equivalent of BBCode or something.
Well, code that generates page contents.... Suppose I want to put "prev" and "next" links on several pages, or suppose I want an API reference to automatically loop through and include all the docstrings from a file. Lots of this could be up to the server to do, but I'd like for the documentation itself to have some power along these lines. For instance, someone might write a DSL in Arc and want to set up a whole subsite covering the DSL's own API.
Besides that, it would just be nifty to have the Arc documentation improve as people improved the Arc libraries and vice versa.
"Suppose I want to put "prev" and "next" links on several pages, or suppose I want an API reference to automatically loop through and include all the docstrings from a file."
I'd just have the server code do that.
"For instance, someone might write a DSL in Arc and want to set up a whole subsite covering the DSL's own API."
Sorry, not following you here. How would this be different?
"Besides that, it would just be nifty to have the Arc documentation improve as people improved the Arc libraries and vice versa."
Certainly. Naturally the server code can be written in Arc itself.
Say this DSL is a stack language written in Arc, called Starc, and Starc programs are implemented by lists of symbols. I've set up a global table to map from symbols to their meanings, and I have a 'defstarc macro that submits to that table and supports docstrings.
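For anyone who wants something concrete to picture, here is a Python analogue of that setup (not Arc, and not real Starc code). The global table, the `defstarc` registration helper, and the words `dup` and `+` are all invented for illustration, as is the choice to let non-string items push themselves as literals.

```python
STARC_WORDS = {}  # word name -> (docstring, function that mutates the stack)


def defstarc(name, doc):
    """Register a Starc word together with its docstring."""
    def register(fn):
        STARC_WORDS[name] = (doc, fn)
        return fn
    return register


@defstarc("dup", "Duplicate the top of the stack.")
def _dup(stack):
    stack.append(stack[-1])


@defstarc("+", "Pop two numbers and push their sum.")
def _add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


def run_starc(program, stack=None):
    """Run a list of symbols (modeled as strings) and literals."""
    stack = [] if stack is None else stack
    for token in program:
        if isinstance(token, str):
            STARC_WORDS[token][1](stack)   # look the word up in the table
        else:
            stack.append(token)            # literals push themselves
    return stack


def starc_reference():
    """The (name, docstring) pairs a documentation scraper would want."""
    return [(name, doc) for name, (doc, _) in sorted(STARC_WORDS.items())]
```

The point is that the docstrings live in the DSL's own table rather than in `def` forms, so a documentation server that only understands Arc definitions would not find them without extra help.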
Now I want my language to have documentation support that's seamless with Arc's own documentation. Somehow I need my Starc documentation to be split across multiple pages, with some pages created using the 'defstarc docstrings. I want Starc identifiers to be displayed in a different style than Arc identifiers, but if anything, I want it easier for a Starc programmer to refer to Starc identifiers in the documentation than to Arc identifiers.
So every time I come up with one of these requirements for the documentation, I should submit a patch to the server or something? Fair enough--the code implementing the documentation oughta be documented somewhere too, and keeping it close to the project also makes it more ad hoc and inconsistent--but I think this would present a bit of an obstacle to working on the documentation. I'd rather there be a compromise, where truly ad hoc and experimental things were doable in independent projects and the most useful documentation systems moved to the server code gradually.
This would be more complicated to design, and it could probably be incorporated into a more authoritarian design after it's underway, so no worries.
- you run a copy of the server code you're working on locally, until you see that your "Starc" documentation is being integrated into the rest of the documentation in the way that you want it to
- you push your changes to the server (say, via github for example) and they go live
OK, but what if you're a completely random person, you've never posted anything to arclanguage.org, no one knows who you are, and you want write access to the server so that you "can do stuff"? Alright, fork the repo on GitHub, push your changes there, and send a pull request. Then, when you turn out to be someone who isn't trying to install malicious JavaScript, you are given write access to the server repo yourself. (This is a pretty standard approach in open source projects, by the way.)
But... what if write access to the server repo ends up being controlled by an evil cabal of conservatives who reject having any of this "Starc" stuff added? Fire up your own server, publish the documentation pages yourself, and people will start using your documentation pages because they are more complete than the old stuff.
My concern with the sandbox idea is that I imagine it's going to be hard to create a sandbox that is both A) powerful enough to be actually useful, and B) sufficiently constrained so that there's no possible way for someone to manage to generate arbitrary Javascript.
I'm finding this discussion very helpful, by the way. What I'm spending my time on now is the "pastebin for examples" site. I've been wondering if this project would stay focused on just the examples part (with the ability for other documentation sites to embed examples from the pastebin site) or if it would expand to be a site for complete documentation itself (the "code site for Arc" idea).
For the pastebin site I've thrown away several designs that weren't working and I've found one that so far does look like it's going to work. But the catch is that, by design, it allows the site to execute arbitrary code on the target machine that's running the example. This isn't too terrible by itself (you can always run the example in a virtual machine or on an Amazon EC2 instance etc. instead of on your own personal computer if you want), but it does mean that the "pastebin for examples" site is going to need a higher level of security than an Arc documentation site.
Which in turn implies that while the Arc documentation site can use examples from the pastebin site (if people find it useful), the pastebin site itself shouldn't be expanding to take on the role of the Arc documentation site (since the Arc documentation site can and should allow for a much freer range of contributions).
"But... what if write access to the server repo ends up being controlled by an evil cabal of conservatives who reject having any of this "Starc" stuff added?"
The main thing I'm afraid of is the documentation site becoming stagnant. Too often, someone finds the arclanguage.org website and asks "How do I get version 372 of MzScheme?" Too often, someone who's been reading arcfn.com/doc the whole time finally looks at the Arc source and starts a forum thread to say "Look at all these unappreciated functions!" ^_^
I don't blame pg or kens; I blame the fact that they don't have all the time in the world to do everything they want. I'm in the same position, and I bet it's pretty universal.
---
"Fire up your own server, publish the documentation pages yourself, and people will start using your documentation pages because they are more complete than the old stuff."
That could be sufficient. But then while I'm pretty active on this forum, I'm not sure I have the energy to spare on keeping a server up. If the community ends up having only people as "let someone else own it" stingy as me, we'll be in trouble. >.>;
---
"My concern with the sandbox idea is that I imagine it's going to be hard to create a sandbox that is both A) powerful enough to be actually useful, and B) sufficiently constrained so that there's no possible way for someone to manage to generate arbitrary Javascript."
All I'm thinking of is some hooks where Arc code can take as input an object capable of querying the scrape results and give as output a BBCode-esque representation that's fully verified and escaped before use. But then I don't know if that would be sophisticated enough for multi-page layouts or custom styles or whatnot either. ^^;
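Here is a hedged Python sketch of the "verified and escaped before use" step. It assumes the sandboxed code hands back nested lists like `["p", "text", ["code", "name"]]`; that shape, the tag whitelist, and the `render` function are illustrative guesses at what a BBCode-esque s-expression format could look like.

```python
import html

# Whitelist: markup tag -> HTML element. Anything else is rejected.
ALLOWED_TAGS = {"b": "strong", "i": "em", "code": "code", "p": "p"}


def render(node):
    """Turn a nested-list document into HTML, escaping all text."""
    if isinstance(node, str):
        return html.escape(node)          # text can never become markup
    if not isinstance(node, (list, tuple)) or not node:
        raise ValueError("node must be a string or a non-empty list")
    tag, *children = node
    if not isinstance(tag, str) or tag not in ALLOWED_TAGS:
        raise ValueError("tag not allowed: %r" % (tag,))
    element = ALLOWED_TAGS[tag]
    inner = "".join(render(child) for child in children)
    return "<%s>%s</%s>" % (element, inner, element)
```

Since there are no attributes and no way to name an element outside the whitelist, the output can only contain the four listed elements plus escaped text. Whether that is expressive enough for multi-page layouts or custom styles is a separate question.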
There could also be another Arc hook that helped specify what to scrape in the first place... but in a limited way so that it couldn't do denial-of-service attacks and stuff. ^^; lol
Partly it's just a curiosity for me. I like the thought of letting Arc code be run in a sandbox for some purpose, even if it's only marginally useful. :-p
---
Meanwhile, I had another thought: Even if the server doesn't allow running arbitrary code, people could still develop special-purpose things for it by running their own static site generators and putting up the output somewhere where the server will crawl. I wonder how this could affect the server design.
"But then while I'm pretty active on this forum, I'm not sure I have the energy to spare on keeping a server up."
I'd be happy to run the server, and set up some kind of simple continuous deployment system so that when someone makes a code push to the server repo the code goes live.
Depending on availability and motivation I may (or may not...) end up having time myself to get Ken's documentation into a form where it can be edited (he generously offered last year to let us do this).
A part that I don't have motivation to do myself is writing the code that would crawl Anarki and generate documentation from the docstrings.
"I like the thought of letting Arc code be run in a sandbox for some purpose, even if it's only marginally useful."
I certainly won't prevent someone from adding a sandbox to the server. On the other hand... if you'd like to work on something where a sandbox would be useful ^_^, I'd encourage you to join me in my API project :-)
"The main thing I'm afraid of is the documentation site becoming stagnant. Too often, someone finds the arclanguage.org website and asks "How do I get version 372 of MzScheme?" Too often, someone who's been reading arcfn.com/doc the whole time finally looks at the Arc source and starts a forum thread to say "Look at all these unappreciated functions!" ^_^
I don't blame pg or kens; I blame the fact that they don't have all the time in the world to do everything they want. I'm in the same position, and I bet it's pretty universal."
I think if contributing is open and flexible, people will contribute to keep the site up to date. Complete and simple instructions must exist to help and encourage people to contribute. Some of the obstacle is social: people feel they need "permission" to contribute.
The interesting thing I am seeing among the experimentation and projects people are doing here is the fragmentation. I think experimentation with languages is great and very necessary, but it's tough to see that there isn't a main champion for the community to rally behind.
PS: stupid question, but how are you italicizing quoted text? I tried adding <i>some text</i>, but that didn't work. I haven't had enough time to play with the comments to figure that out.
"The server would work on the text of the sources, such as docstrings found in the Anarki source code. That way even if someone pushed something malicious to Anarki then we wouldn't have a security problem (either on the server or in the reader's browser)."
If it ever got to the point where actually eval'ing the code were necessary/desirable, you could do so in a safe namespace in PyArc (hint hint).