altcodes and Chinese IDN domains


Rossin
15th August 2011, 03:55 AM
Hi there!

I'm sure most of you know that alt codes work very well in Google Chrome, but not in Firefox or IE.

Example:

業 (xn--q6v).

We can type Alt + 26989 in Chrome's address bar and get that ideogram quickly and easily.
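For reference, the link between the ideogram, its decimal alt code and the xn-- form can be checked with a short Python sketch. It uses only the standard library; the helper name `describe` is made up for illustration, and the raw `punycode` codec is applied without any IDNA preparation step, which is assumed to be harmless for a single ideogram like this one.

```python
def describe(ch: str) -> dict:
    """Return the decimal code point and the ACE (xn--) label of one character."""
    return {
        # ord() gives the Unicode code point as a decimal number,
        # i.e. the number typed after ALT in the example above.
        "decimal": ord(ch),
        # The raw Punycode form, prefixed with "xn--" as used in IDN labels.
        "ace": "xn--" + ch.encode("punycode").decode("ascii"),
    }


info = describe("業")
print(info)

# Going the other way: from the decimal number back to the character.
print(chr(26989))
```

So 26989 and xn--q6v are two spellings of the same character; only the first one can be typed as an alt code.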

In Firefox and IE we get a wrong result: m.

Do you know a way to get the right result?

I tried changing the encoding (in IE and FF) to the one that works in Chrome (Unicode, UTF-8), but the result is still always "m" and not 業.

Any thoughts? Thanks in advance! ;)

blastfromthepast
15th August 2011, 10:14 AM
Solution: Use a real Chinese input method.

domainguru
15th August 2011, 11:53 AM
Sounds like a very esoteric input method :)

But what you are saying is you have a way to type in the numeric Unicode identifier (the codepoint) for the symbol and Chrome converts that to the actual character. Makes sense I suppose ...

If so, cool, but what do you enter exactly in Chrome to see it? I don't understand the "Alt" part, that can't be normal text as the browser wouldn't know whether it was just "normal text". You have to somehow get into the "context" of entering codepoints directly ..... there must be a series of characters to "escape" out of the normal text input context first....

Not sure you will ever get it to work across browsers though. Might just be a "feature" of Chrome for developers.

Update: bit of info here about valid characters in URLs:

http://www.blooberry.com/indexdot/html/topics/urlencoding.htm

But whatever, I think Chrome has something "extra" working here to allow it to do the conversion. It's certainly not a standard feature of a URL.

Like we know, all sorts of "shenanigans" go on in browser address bars these days, it's not just typing RFC-compliant URLs ....

Bottom line: don't expect it to work in other browsers.

Rossin
15th August 2011, 04:11 PM
Solution: Use a real Chinese input method.

:lol: Sure, it would be a good solution.

But I'm wondering if we can find a "universal" input mode that doesn't depend on what you call a "real Chinese input method".

It could be an interesting alternative way to reach one-character domains (ideograms and letters too) that we can't type with a regular keyboard.
Then we would just need to remember a short(:blink:) number sequence like that one (26989).

Just an idea.

Rossin
15th August 2011, 04:39 PM
:lol: Yes, it's an esoteric input method, but I presume it could work on all standard keyboards, and this is really interesting! That's the relevant point.
On my computer this procedure works very well.

Sorry if I'm explaining something obvious, but I will clarify the "alt" part:

With the cursor in Chrome's URL bar, press the left ALT key and hold it down while you type the Unicode number sequence 26989 on the numeric keypad. Then release the ALT key. This produces the ideogram 業. After that, you just need to add the "dot com" or "dot wherever" and you can reach the site.

But if we do the same procedure in FF or IE, yes, we get a result, but a wrong one. The result is the letter "m".
It would be great if the same code 26989 worked in all browsers' URL bars.
I'm pretty sure that there is a different sequence that works in FF and IE. That's why I'm asking you for any thoughts on this.

I'm running Windows Vista. I don't know the results under Linux or on a Mac. I would appreciate it if some of you could run these tests and tell me.

As I said to blastfromthepast, I think this is an interesting alternative way to reach one-character IDN domains. These domains could be more "marketable". Don't you think?

But again, just an idea.

squirrel
15th August 2011, 04:43 PM
have you thought about virtual keyboards ?

domainguru
15th August 2011, 04:49 PM
ok thanks for the explanation.

1) I don't think there is a Mac equivalent. The keyboard modifiers don't work like a PC. If you hit the Alt (Option) key on a Mac and then a number, symbols come up straight away. But the Mac is completely different for character input. I wouldn't expect it to work the same.
2) Whether it works widely or not, I don't think it makes any difference to the value of one character domains. Remember why domains were invented? So people could abstract away from "numerics" such as IP addresses, to things they could remember. Going back to Unicode codepoints isn't really a step forward!

Still be interested if you can get it working on other PC browsers ....

Rossin
15th August 2011, 04:54 PM
have you thought about virtual keyboards ?

Yes, but I was thinking about an alternative method that we could use for all IDN characters without any need to install software or hardware. Just using a standard keyboard.

I'm wondering about using this specific input method based on Unicode number sequences. I know it is not a "sooo cool" solution, but I think it's not so bad.

Example: you can advertise a Chinese ideogram site like this:
"reach us quickly and easily by typing alt 22222 dot com!!!".

People manage to remember ZIP codes, don't they? :lol:

Just a crazy idea, I know. But who knows?

squirrel
15th August 2011, 04:58 PM
might as well teach people to type in punycode

domainguru
15th August 2011, 05:01 PM
It is a bit crazy to be honest. Which is best?

1) Find us at 業.com, or ....
2) Find us at "alt 268919", and then a paragraph explaining what that means, how it doesn't work on Macs, or most PCs ...

Nah, not a good idea. You might as well go back to ASCII domains as use that method. Why not just purchase 268919.com and be done with it. nearly all 6-digit domains are available.

Your argument appears to be, what if the geezer doesn't have "業" on their keyboard? well, they won't want to visit the site. that's the answer.

also, a "standard" keyboard doesn't exist, that idea was outdated many years ago, totally outdated now ... and totally obsolete in the future when anyone will be able to select 100 virtual keyboards on whatever device they are using.

Full marks for thinking outside the box, but no big fat cigar in this case, sorry.

domainguru
15th August 2011, 05:02 PM
might as well teach people to type in punycode

Shit no. Some of my punycodes are like 20 chars long :p

Rossin
15th August 2011, 05:08 PM
might as well teach people to type in punycode

Yes, you've got a point here.

But remembering a number sequence could be easier than remembering xn--wg8a.
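To compare the two spellings side by side, a small Python sketch can turn an xn-- label back into decimal code points. The function name `ace_label_to_decimal` is hypothetical, and the sketch assumes the label is plain Punycode with the usual "xn--" prefix; it does no IDNA validation.

```python
def ace_label_to_decimal(label: str) -> list:
    """Decode an xn-- label and return the decimal code point of each character."""
    if not label.lower().startswith("xn--"):
        raise ValueError("not an ACE (xn--) label")
    # Strip the prefix and decode the remainder with the standard punycode codec.
    text = label[4:].encode("ascii").decode("punycode")
    return [ord(ch) for ch in text]


# The label mentioned above, and the one from the first post for comparison.
codes = ace_label_to_decimal("xn--wg8a")
print(codes)
print(ace_label_to_decimal("xn--q6v"))
```

For a one-character domain the list has a single entry, which is the number that would be typed after ALT.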

Actually, I really don't think it's a big problem whether people can reach my IDN sites by typing in the letters / ideograms, because Mister Google does the indexing very well if we have a site with relevant content.

I'm just playing with the possibilities. Brainstorming about this could give us another level of interesting solutions, I believe.

Thank you all for the replies!

Rossin
15th August 2011, 05:15 PM
You have a good point here when you talk about 6-digit domains.

When I say "standard keyboard" I mean a keyboard with numbers and an ALT key. I'm not an international guy :rolleyes: but I think we can find these features (alt key and numbers) on most keyboards around the world, can't we? Sorry if I'm wrong.

Rossin
15th August 2011, 05:24 PM
Shit no. Some of my punycodes are like 20 chars long :p

:lol: That's why I'm talking about one-character IDN domains only!

With one-character IDNs, the alt code sequence is 3, 4 or 5 digits long.

A real example:

Type in Alt + 444 + .com
The result is ƽ.com
The domain ƽ.com is easy to reach using this method.
But only with Google Chrome.
(it could be better if the sequence was 555 LOL)
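The Alt + 444 example can be looked up with Python's standard `unicodedata` module. This is only a lookup sketch, not a description of what any browser does internally. Strictly speaking, decimal 444 is the capital form Ƽ (U+01BC) and the small ƽ is 445; host names are case-folded, which is presumably why the address ends up as ƽ.com.

```python
import unicodedata

# The character whose decimal code point is 444.
typed = chr(444)
print(typed, unicodedata.name(typed))

# Its lowercase partner, which is what appears in the domain name.
lowered = typed.lower()
print(lowered, ord(lowered), unicodedata.name(lowered))

# The largest code point of the Basic Multilingual Plane is 65535,
# so a decimal code for any BMP character has at most five digits.
print(len(str(0xFFFF)))
```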

I just want to understand why this procedure doesn't work in FF and IE.

domainguru
15th August 2011, 05:28 PM
Well Macs don't have ALT keys as such, not that operate the same as PCs.

And as you have found out yourself, the feature doesn't even work on PC+IE or PC+FF, or probably PC+Safari.

Dead duck, sorry, move on.

domainguru
15th August 2011, 05:30 PM
I already told you why it doesn't work. Because it's a special feature of Chrome. There is no reason why typing Alt + 444 + .com should result in that unless the browser writers have coded it into their address bar code.

In other words, it's not a feature of any standard, therefore it's not a standard feature.

Cool line that :)

Rossin
15th August 2011, 06:12 PM
I don't think you explained it at all. :p

The reason why typing alt + 444 gives us a "ƽ" (hope you can see the little "5" and not a square) is that the "alt + decimal sequence" input method is an old system feature (of Windows, at least), not something specific to Google Chrome.

The sequence 444 is the decimal code corresponding to "ƽ". You can see it here (look at HTML Entity (Decimal)): http://graphemica.com/%C6%BC

You can use it anywhere under Windows: that "alt code" input method DOES WORK in the FF and IE URL bar or any data field (like here in this forum!), and in Microsoft Office, Notepad, Writer and so on, BUT the result is not the expected character in the FF and IE URL bar. At least on my machine I can't get the right character from the RIGHT DECIMAL code sequence.
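One plausible explanation for the "m" reported earlier in the thread, offered here as an assumption rather than something confirmed by anyone above: legacy Windows alt-code handling keeps only the low byte of the typed number (the number modulo 256) and looks it up in an OEM code page, while some applications accept the full decimal code point. The Python sketch below just does that arithmetic. The function names are made up, and `cp437` is an assumed code page; the real one depends on the Windows locale.

```python
def legacy_alt_code(number: int, codepage: str = "cp437") -> str:
    """Emulate the assumed legacy behaviour: keep the low byte, decode it in an OEM code page."""
    low_byte = number % 256
    return bytes([low_byte]).decode(codepage)


def unicode_alt_code(number: int) -> str:
    """Emulate an application that treats the number as a full Unicode code point."""
    return chr(number)


print(26989 % 256)              # the low byte of 26989
print(legacy_alt_code(26989))   # what the assumed legacy path would insert
print(unicode_alt_code(26989))  # what a Unicode-aware path would insert
```

Under that assumption 26989 wraps around to 109, which is the letter "m", matching the result seen in FF and IE.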

It's ok if nobody knows the answer to my question. But I'm sure that there is an answer.

Thanks Domainguru for your attention! I really appreciate it.

blastfromthepast
15th August 2011, 07:51 PM
I still don't understand why you don't just use a regular Chinese input method for typing Chinese.

PS: ƽ is an obsolete letter, it is no longer used in any language.

Rossin
15th August 2011, 08:53 PM
Yep, thank you, but I do know ƽ is an obsolete letter (and not IDNA2008 compatible). It was just an example. I'm sure you know that every IDN character has its decimal code and that there is an alt code for every character in the world. You can choose another character! It's up to you!

I put this thread here in Chinese IDNs, but I'm thinking of Chinese AND all other IDN characters too, as I've already said. So, again: I'm trying to figure out an alternative and universal input method for single-letter and single-ideogram IDNs that people can't reach UNLESS they use the regular input method. It's a broad idea (I hope you understand my poor English).

Almost everybody in the world understands Hindu-Arabic numerals and how to type them. Not everybody knows how to use what you call "a regular input method". Or am I wrong?
A Chinese person would know how to use the "regular" input method for HIS own language, right??
I'm thinking about something a little bit different: typing ANY IDN letter / ideogram in ANY language in a fast and easy way. No setups, no virtual keyboards, no software or hardware installation. I want to use ONLY the "regular" keyboard.

But really, all of this stuff is just an idea! You don't need to take it seriously! :lol:

I'm just curious :confused: WHY the "alt + decimal" input method works CORRECTLY only in Chrome and not in FF and IE.

If you don't know the answer to my question (and now I know that you don't), let's just move on. And probably you don't know because it's really nonsense and useless information.

Regards!

domainguru
16th August 2011, 07:31 AM
OK gotcha, now I understand how you knew about it in the first place!

Can't be so old though on Windows, surely, as Windows hasn't been Unicode aware for that long? :)

All then I can say is "Kudos to Chrome" for strictly obeying Windows "text input" rules. And ironic Microsoft doesn't obey its own rules. But then again, Microsoft programming was always the sloppiest in the industry by a long way. I'm sure they employed unemployed English majors as programmers 'cos they clearly never employed real programmers when developing their browser.

Big problem, apart from only working on Chrome, is it isn't a Windows world any more......... maybe it works on "Chrome OS" or Android?

andre
16th August 2011, 09:15 AM
Mac OSX Snow Leopard has the Unicode Hex Input keyboard mapping

System Preferences ⇒ Language & Text ⇒ Input Sources

So, for example, to get the Chinese character referred to earlier, select Unicode Hex Input and type alt 696D to get 業
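The decimal code used earlier in the thread and the hex digits typed here are the same number in two bases. A minimal Python sketch of the conversion follows; the helper names are invented for illustration, and it only handles characters in the Basic Multilingual Plane, padded to four hex digits.

```python
def to_hex_input(decimal_code: int) -> str:
    """Convert a decimal code point into four hex digits (BMP characters only)."""
    if not 0 <= decimal_code <= 0xFFFF:
        raise ValueError("only BMP code points are handled in this sketch")
    return format(decimal_code, "04X")


def from_hex_input(hex_digits: str) -> str:
    """Return the character for a string of hex digits."""
    return chr(int(hex_digits, 16))


print(to_hex_input(26989))     # decimal alt code -> hex digits
print(from_hex_input("696D"))  # hex digits -> character
```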

I frequently use Unicode Hex Input to insert the various Unicode formatting & control characters into documents.

All OSX keyboard mappings & Input Methods are system wide so can be used in any app or browser.

André 小山 Schappo
http://blog.sina.com.cn/andreschappo
http://weibo.cn/andreschappo

domainguru
16th August 2011, 09:33 AM
Cool, thanks. Finally managed to get that Chinese character to display!

But now we are into the realms of typing HEX, even further down the esoteric text-input road than the Windows version!

But useful nonetheless :) is there no decimal version? just to keep Rossin happy? ...

blastfromthepast
16th August 2011, 11:18 AM
Mac OS X: How to type Unicode characters, including Symbol or Zapf Dingbat fonts
http://support.apple.com/kb/ht1518

http://www.themaclawyer.com/uploads/image/Character%20Palatte(1).png

blastfromthepast
16th August 2011, 11:18 AM
Using Character Map
You can use Character Map to copy and paste special characters into your documents, such as the trademark symbol, special mathematical characters, or a character from the character set of another language.

http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/app_charmap.mspx

http://tlt.its.psu.edu/suggestions/international/graphics/vista/chaprmapvista.gif

blastfromthepast
16th August 2011, 11:24 AM
I'm thinking about something a little bit different: typing ANY IDN letter / ideogram in ANY language in a fast and easy way. No setups, no virtual keyboards, no software or hardware installation. I want to use ONLY the "regular" keyboard.

There is no fast and easy way to do this because it is not something most people need. What people need is to type in their own language, and occasionally get access to a few characters here and there from outside their regular range. This can easily be accomplished by using the Character Map, Character Palette, hex input, etc. as noted above. None of these methods is particularly fast; the fastest is using a dedicated input system for a particular language. Dedicated input methods are typically standard software on most computers nowadays.

From a domain point of view, copy and paste usually is good enough for people who can't type in the language, and in addition, a parallel shortcut domain in ASCII can be registered and forwarded to the IDN.

domainguru
16th August 2011, 11:25 AM
Yeah, that's nice but was never a problem. Nearly everyone with a Mac must know about the superb character viewer and keyboard viewer to select Unicode characters. It's probably been a feature of Macs for 20 years.

The problem was simply trying to replicate what Rossin was doing i.e. if you start from a place where you know the codepoint, how do you get the character into a text box ....

I know, I know, this discussion's a bit weird, lol.

andre
16th August 2011, 11:38 AM
But useful nonetheless :) is there no decimal version? just to keep Rossin happy? ...

I do not know of a decimal version. Personally, when dealing with Unicode, I find it easier and more natural to use hex.

André 小山 Schappo
http://blog.sina.com.cn/andreschappo
http://weibo.com/andreschappo

Rossin
16th August 2011, 11:17 PM
And ironic Microsoft doesn't obey its own rules. But then again, Microsoft programming was always the sloppiest in the industry by a long way. I'm sure they employed unemployed English majors as programmers 'cos they clearly never employed real programmers when developing their browser.

Yes! I did not realize this! It's really ironic! WHY doesn't Microsoft obey its own rules? :no:
I guess you gave us the right answer above!

Rossin
17th August 2011, 12:10 AM
Hi Andre! Thank you for the information! :)
I know nothing about Mac.

I remember that a few weeks ago we talked about punycode showing on Twitter instead of the real ideogram, do you remember? You wrote a post about it on your blog.

Good to know about this Mac feature, although the hex code is a bit more complex (because it involves letters) than the decimal one (only numbers).

Regards!

Rossin
17th August 2011, 12:22 AM
There is no fast and easy way to do this because it is not something most people need. What people need is to type in their own language, and occasionally get access to a few characters here and there from outside their regular range. This can easily be accomplished by using the Character Map, Character Palette, hex input, etc. as noted above. None of these methods is particularly fast; the fastest is using a dedicated input system for a particular language. Dedicated input methods are typically standard software on most computers nowadays.

From a domain point of view, copy and paste usually is good enough for people who can't type in the language, and in addition, a parallel shortcut domain in ASCII can be registered and forwarded to the IDN.

Yes, I agree with you that it is not something that people need. But needs can be created, that's the magic power of marketing, isn't it? :lol:

The character map is cool, I use it all the time, but it's not fast.
And it's not 100% complete.

You are absolutely right: "copy and paste" works great! :D

Rossin
17th August 2011, 12:26 AM
I know, I know, this discussion's a bit weird, lol.

It's not a bit weird: it's insane!!! But it's cool! :lol:

I'm almost sure that in the future this "alt + decimal" feature will be good for some specific one-character IDN domains. But maybe I'm wrong! Or crazy!

Thank you all! This is a nice discussion!

Rossin
17th August 2011, 12:46 AM
But useful nonetheless :) is there no decimal version? just to keep Rossin happy? ...

:lol: I'm already happy with the information about Mac HEX! But yes, decimal would make me soooo happy!

Domainguru, could you please tell me if you are able to use this feature in the URL bar? Does it work there, or do you need to copy/paste the ideogram from another place (like a text editor)?

domainguru
17th August 2011, 07:04 AM
Yep, works in the address bar on Mac. Switch the input method to "Unicode Hex Input", hold down "alt" and you can merrily whack in any number of strange ideograms by bashing away on the numbers :)

So all good, nice to know, etc. but ..... knowing about the "Unicode Hex Input" alone would get you top table at a geek convention, it's not exactly regular Joe stuff.

Fun though. And since it's a Mac, I don't even need to test different browsers, I just tried my default (Firefox) and it worked. The others will work as well.

blastfromthepast
17th August 2011, 01:26 PM
If you would like to have a decimal input tool, we can develop one for the Mac. But I don't think there is enough demand for such an input method. Would you pay for it?

Rossin
18th August 2011, 04:53 PM
If you would like to have a decimal input tool, we can develop one for the Mac. But I don't think there is enough demand for such an input method. Would you pay for it?

I really appreciate your offer, it's very nice to know that a feature like this is possible, but I'm looking for free features! :rolleyes:
That's why I'm trying to discover how to replicate Google Chrome's behavior (accepting alt + decimal) in FF and IE.

Regards!

Rossin
18th August 2011, 05:01 PM
So all good, nice to know, etc. but ..... knowing about the "Unicode Hex Input" alone would get you top table at a geek convention, it's not exactly regular Joe stuff.

The "Geek convention" part was great!! :lol:

Well, thank you guys for the opinions! Nice! :cool: