Arc Forum
2 points by bogomipz 6047 days ago | link | parent

I think the strongest reason for separate strings and symbols is that you don't want all strings to be interned - that would just kill performance.

About lists of chars: rather than analyzing lists every time to see if they are strings, what about tagging them? I've mentioned before that I think Arc needs better support for user-defined types built from cons cells. Strings would be one such specialized, typed use of lists.

Also, how do you feel about using symbols of length 1 to represent characters? The main reason I can see not to is if you want chars to be Unicode and symbols to be ASCII only.
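A rough sketch of the tagging idea, in Python rather than Arc. All names here (`Cons`, `Tagged`, `type_of`, `make_string`, `string_to_text`) are hypothetical. The point is only that the type check reads one tag instead of walking the whole list to see whether every element is a char.

```python
class Cons:
    """A plain cons cell; None plays the role of nil."""
    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


def make_list(items):
    """Build a None-terminated chain of cons cells from a Python list."""
    head = None
    for item in reversed(items):
        head = Cons(item, head)
    return head


class Tagged:
    """A type tag wrapped around an ordinary representation."""
    __slots__ = ("tag", "rep")

    def __init__(self, tag, rep):
        self.tag = tag
        self.rep = rep


def type_of(x):
    """Constant-time type check: read the tag, never scan the list."""
    if isinstance(x, Tagged):
        return x.tag
    if isinstance(x, Cons):
        return "cons"
    return type(x).__name__


def make_string(text):
    """A 'string' is just a tagged list of single characters."""
    return Tagged("string", make_list(list(text)))


def string_to_text(s):
    """Walk the underlying cons cells and join the characters."""
    chars = []
    cell = s.rep
    while cell is not None:
        chars.append(cell.car)
        cell = cell.cdr
    return "".join(chars)


s = make_string("abc")
assert type_of(s) == "string"     # the tag says string
assert type_of(s.rep) == "cons"   # the representation is still a list
```

List functions can keep working on the untagged representation, while anything that cares about strings only looks at the tag.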



2 points by sacado 6046 days ago | link

Symbols, ASCII only? No way. I'm writing my code in French, and I'm now used to calling things the right way, i.e. with accents. "modifié" means "modified" and "modifie" means "modifies"; they are not the same thing, and I want to be able to distinguish them. Without accents, you can't.

Furthermore, that would mean coercing symbols into strings would be impossible (or at least the 1:1 mapping would not be guaranteed anymore).
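A small Python sketch of why the mapping stops being 1:1. The function `to_ascii_symbol_name` is hypothetical; it assumes an ASCII-only symbol type would have to drop accents somehow, and stripping them via Unicode decomposition is just one way that could happen.

```python
import unicodedata


def to_ascii_symbol_name(text):
    """Hypothetical lossy conversion: decompose, then drop non-ASCII marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Two different strings collapse onto the same ASCII symbol name,
# so going from the symbol back to the string cannot recover the original.
assert to_ascii_symbol_name("modifi\u00e9") == "modifie"
assert to_ascii_symbol_name("modifie") == "modifie"
```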

-----

2 points by stefano 6047 days ago | link

From the implementation point of view, representing characters as symbols is a real performance issue, because you would have to allocate every character on the heap, and a single character would then take more than 32 bytes of memory.

-----

2 points by sacado 6046 days ago | link

I think that's an implementation detail. You could still keep the character type in the implementation, but write characters as "x" (or 'x) instead of #\x and make (type c) return 'string (or 'sym).

Or, if you take the problem the other way, you could say "length-1 symbols are quite frequent and shouldn't take too much memory, so let's represent them in a special way where they only take 4 bytes".
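One way this could look, sketched in Python with plain integers standing in for 32-bit machine words. The 3-bit tag, the tag value and the function names are all assumptions for illustration, not something an existing implementation is known to use. Since a Unicode code point needs at most 21 bits, it fits in a 4-byte word next to the tag, with no heap allocation.

```python
TAG_BITS = 3                       # assumed width of the low-bit type tag
TAG_MASK = (1 << TAG_BITS) - 1
TAG_CHAR_SYMBOL = 0b101            # hypothetical tag for length-1 symbols


def encode_char_symbol(ch):
    """Pack a length-1 symbol into one 32-bit word: code point plus tag."""
    if len(ch) != 1:
        raise ValueError("only length-1 symbols can be immediates")
    word = (ord(ch) << TAG_BITS) | TAG_CHAR_SYMBOL
    assert word < 2 ** 32          # 21-bit code point + 3-bit tag fits easily
    return word


def is_char_symbol(word):
    """Check the low tag bits to see if the word is an immediate symbol."""
    return (word & TAG_MASK) == TAG_CHAR_SYMBOL


def decode_char_symbol(word):
    """Recover the character from an immediate length-1 symbol."""
    if not is_char_symbol(word):
        raise ValueError("not an immediate length-1 symbol")
    return chr(word >> TAG_BITS)


w = encode_char_symbol("x")
assert is_char_symbol(w)
assert decode_char_symbol(w) == "x"
```

Longer symbols would still live in the intern table; only the runtime has to know which of the two representations it is looking at.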

-----

1 point by stefano 6046 days ago | link

This would require some kind of automatic type conversions (probably at runtime), but characters-as-symbols seems doable without the overhead I thought it would lead to.

-----