Arc Forum
1 point by rocketnia 5163 days ago | link | parent

> I hope this syntax doesn't apply to every instance of a hyphen before a whitespace character

Oops, I meant to say "a hyphen before a non-whitespace character."

Speaking of which, lots of links have +pluses+ in them as URL-encoded spaces, so that's a troublesome syntax too. What does Textile do in these cases?



4 points by dpkendal 5163 days ago | link

> The fact that you're not escaping HTML characters troubles me. I'd kinda prefer not to risk encountering malevolent JavaScript on a forum, even on a forum where most regulars wouldn't abuse that power.

The original classTextile.php (http://code.google.com/p/textpattern/source/browse/developme... -- warning: big) provides a 'restricted' mode, designed for forum comments etc., in which all input <, > and & characters are escaped, which is what I intend to do here as I continue to work on it.
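For anyone following along, this is roughly what that kind of escaping amounts to. It is only a minimal sketch in Python, not the Arc port and not the code from classTextile.php, and the function name `escape_restricted` is made up for illustration:

```python
def escape_restricted(text: str) -> str:
    """Escape the three characters named above: &, < and >.

    Hypothetical helper for illustration only. The ampersand is
    replaced first so that the '&' in '&lt;' and '&gt;' is not
    escaped a second time.
    """
    return (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )


if __name__ == "__main__":
    print(escape_restricted('<script>alert(1)</script> & more'))
```

The escaping would presumably have to run before any span markup is converted, so that the tags the converter itself emits are left alone.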

> It strikes me that this wouldn't handle nesting well, which may be fine, 'cause these spans are things people almost never need to nest--some exceptions being when attributes are involved or when nesting multiple layers of <sup> and <sub>. To make it a bit less sloppy, I recommend manually incrementing an index through the text, using 'begins to identify start tags and maintaining a stack if necessary, even if it sounds horribly ugly to do it that way. :-p This should also give you a good place to insert attribute-parsing code.

While you're right about the attribute-parsing, I'm not too concerned about nesting issues because the current reference implementation (there's a dingus at http://textile.sitemonks.com/) doesn't handle that either. It uses the regexp method too (see classTextile.php), with a callback for parsing attributes.

> Speaking of spans, I'm not sure why you bother making all the txt-@ global variables. I think all you do is use 'eval to define them and another 'eval to get them back, and you don't even need the second 'eval because you have all the information you need.

Thanks. I've now corrected my copy and I'll use your method in the next version.

> Also speaking of spans, I'm troubled by the fact that there's a -del- span at all. I hope this syntax doesn't apply to every instance of a hyphen before a whitespace character, 'cause that'll mess up a lot of variable names and links.

> Speaking of which, lots of links have +pluses+ in them as URL-encoded spaces, so that's a troublesome syntax too. What does Textile do in these cases?

Good point, that's a special case I missed. The function is now:

    (def txt-span (text st et tag) ; st = start textile; et = end textile -- todo: support span attributes
      (re-replace (string "(?<=\\W|^)" (txt-re-quote st) "(\\S.*?\\S?)" (txt-re-quote et) "(?=\\W|$)") text (string "<" tag ">" #\\ 1 "</" tag ">")))

(notice the lookarounds at each end) which should prevent such issues.
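
To show what the lookarounds buy, here is a rough Python analogue of the function above. It is a sketch under assumptions, not a translation checked against the Arc code: Python's `re` module rejects the variable-width lookbehind `(?<=\W|^)`, so the sketch uses `(?<!\w)` and `(?!\w)`, which assert the same thing (no word character on either side). The name `txt_span` is borrowed from the Arc version.

```python
import re


def txt_span(text: str, st: str, et: str, tag: str) -> str:
    """Wrap st...et spans in <tag>, but only at word boundaries.

    st = start marker, et = end marker (e.g. "-" and "-" for <del>).
    The negative lookarounds stand in for (?<=\\W|^) and (?=\\W|$).
    """
    pattern = (
        r"(?<!\w)" + re.escape(st)
        + r"(\S.*?\S?)"
        + re.escape(et) + r"(?!\w)"
    )
    # A function replacement avoids any backslash handling in `tag`.
    return re.sub(pattern, lambda m: f"<{tag}>{m.group(1)}</{tag}>", text)


if __name__ == "__main__":
    print(txt_span("a -deleted- word", "-", "-", "del"))
    print(txt_span("some-variable-name", "-", "-", "del"))
    print(txt_span("http://example.com/?q=a+b+c", "+", "+", "ins"))
```

The first call wraps the span; the other two leave the hyphenated name and the URL-encoded pluses untouched, because each marker there has a word character right next to it.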

> I realize you're in the earlier stages of getting to know Arc, and I don't mean to discourage you or anything. I only mean to contribute. ^_^

Yes! Thank you for your contribution -- it's for advice like this that I released so early.

> The suggestion that comes to my mind is this, but it's kinda laughable....

Yeah, I'll stick with my `str-split` for now. It should be a library function anyhow, so I'd rather keep my version short and simple until there's a version in `strings.arc` or `arc.arc`.

> Apparently you can just keep using 're-replace but pass a function as the "replacement" argument. It appears to work very much like preg_replace_callback(), except that each captured string is passed as a separate argument rather than in an array

Aha, I don't know why I didn't think to try that. I just mentally assumed I would need another function to do it.
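
For comparison, the same idea can be sketched in Python, where `re.sub` also accepts a function as the replacement. Python hands the function a single match object, so the small wrapper below (a hypothetical helper, `sub_with_groups`) spreads the captures out as separate arguments, which is how the behaviour is described above. The `%(class)text%` syntax in the example is only for illustration and is not meant as a complete account of Textile's attribute rules.

```python
import re
from typing import Callable, Optional


def sub_with_groups(pattern: str, fn: Callable[..., str], text: str) -> str:
    """Call fn with each captured group as its own argument."""
    return re.sub(pattern, lambda m: fn(*m.groups()), text)


def span_with_class(cls: Optional[str], body: str) -> str:
    """Build a <span>, adding a class attribute when one was captured."""
    attr = f' class="{cls}"' if cls else ""
    return f"<span{attr}>{body}</span>"


# Group 1: optional "(class)" right after the opening marker.
# Group 2: the span body, matched the same way as in txt-span.
SPAN_RE = r"(?<!\w)%(?:\((\w+)\))?(\S.*?\S?)%(?!\w)"

if __name__ == "__main__":
    print(sub_with_groups(SPAN_RE, span_with_class, "%(warn)careful% now"))
    print(sub_with_groups(SPAN_RE, span_with_class, "%plain%"))
```

An unmatched optional group arrives as `None`, so the callback has to cope with a missing attribute.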

Thanks for all your help! I really appreciate it.

(There's now a textile repo at https://github.com/dpkendal/textile, into which I'll slowly be putting improvements.)

-----