Arc Forum | aaco's comments
3 points by aaco 4279 days ago | link | parent | on: Clarification about Character Sets

I fail to see where Arc doesn't support Unicode, since it seems to me that Arc is just using MzScheme strings, which are just Unicode strings.

Can someone explain this to me?

Some examples:

  ; ◠ is a multi-byte Unicode character (2 bytes in UTF-16, 3 in UTF-8), but I guess it's escaped in this forum, so replace it with the correct character when testing.
  arc> (len "a◠b") ; Unicode
  arc> (len "axb") ; ascii
  arc> (coerce #\◠ 'int) ; Unicode
  arc> (coerce #\x 'int)  ; ascii
  arc> (subseq "a◠b" 1 2) ; Unicode
  arc> (subseq "axb" 1 2)  ; ascii
Where does Arc not support Unicode?!


3 points by olavk 4279 days ago | link

That just shows how agile PG is. He added Unicode support the minute he saw people request it! :)

Seriously, PG explicitly claims that Arc intentionally doesn't support anything but ASCII, so that might be why people (including me) believed that to be the case.


1 point by aaco 4278 days ago | link

Yes, I think Arc intentionally supports only ASCII simply to avoid dealing with Unicode issues for now.

Anyway, I can't see how Unicode can break in Arc. I'm not a Lisper, but I think you can't extract a single byte from an Arc string (since it's just a MzScheme string); you can only extract a single character. That's a different concept, because in Unicode one character can be encoded as one, two, or more bytes.
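
To make the character/byte distinction concrete, here is a minimal sketch. It is written in Python rather than Arc, purely as a stand-in, since nothing in this thread shows a byte-level API for Arc strings. It assumes the string is encoded as UTF-8 and writes ◠ as the escape `\u25e0` so that the forum's escaping can't mangle it.

```python
# Stand-in illustration in Python (not Arc): characters vs. UTF-8 bytes.
# Assumption: UTF-8 is the encoding of interest; other encodings give other byte counts.
s = "a\u25e0b"  # the string "a◠b", with U+25E0 written as an escape

char_count = len(s)              # length in characters (code points)
utf8_bytes = s.encode("utf-8")   # the same text as an encoded byte sequence
byte_count = len(utf8_bytes)     # length in bytes

middle_char = s[1:2]             # character-based slice, analogous to (subseq "a◠b" 1 2)
middle_bytes = middle_char.encode("utf-8")

print(char_count, byte_count)    # 3 5
print(hex(ord(middle_char)))     # 0x25e0
print(middle_bytes)              # b'\xe2\x97\xa0'
```

The slice returns the whole character, never a fragment of its encoding, which is the behaviour I would expect from a string type that indexes by character.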


2 points by bobbane 4278 days ago | link

Watch out - that's single-portable-implementation thinking. When Paul puts out another release of Arc based on, say, another Scheme implementation, or SBCL, those tricks won't work.


2 points by aaco 4280 days ago | link | parent | on: Ask Arc: What's the Arc symbol going to be?

I'd use a Unicode arc.



It would work as a harbinger of an upcoming Unicode implementation.


2 points by bayareaguy 4280 days ago | link

Character U+2221 (measured angle) from the Mathematical Operators range would be good too.
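
For anyone who wants to check candidate code points, here is a small sketch in Python (again a stand-in, not Arc) using the standard `unicodedata` module. The candidate list is only an assumption for illustration: U+25E0 is the ◠ used in the examples earlier in the thread, and U+2221 is the character suggested above. The block range U+2200..U+22FF is hard-coded because `unicodedata` does not expose block names.

```python
import unicodedata

# Mathematical Operators block, U+2200..U+22FF (hard-coded for this sketch)
MATH_OPERATORS = range(0x2200, 0x2300)

# Hypothetical candidate list for illustration
candidates = [0x25E0, 0x2221]

# Official Unicode character name for each candidate code point
names = {cp: unicodedata.name(chr(cp)) for cp in candidates}

# Whether each candidate falls inside the Mathematical Operators block
in_math_block = {cp: cp in MATH_OPERATORS for cp in candidates}

for cp in candidates:
    print(f"U+{cp:04X} {names[cp]} math-operators={in_math_block[cp]}")
```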