IDN Forums - Internationalized Domain Names  
#1
2nd September 2006, 05:06 PM
sevent
Finding out the language of an IDN

Given just the punycode, is it possible to determine the language or character set of a domain?

Also, it seems that Chinese and Japanese can share the same character, but these are different IDNs. From what I have heard, if one is taken, the registry prevents you from reserving the same character in another language. Is that correct?

Thanks!

#2
2nd September 2006, 05:22 PM
bramiozo
Re: Finding out the language of an IDN

Use http://idntools.net/bulkpuny3.php to get the character set from punycode.

Yes, there is a thing called variant blocking; this is described here.
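
For anyone who wants to do the same check offline, here is a minimal sketch in Python. It is not how the idntools.net tool works internally (that is unknown); it only uses the standard `punycode` codec to decode a label and a small hand-written table of Unicode block ranges. The names `BLOCKS`, `decode_label`, `block_of` and `blocks_in_domain` are made up for this example, the table is a deliberately tiny subset, and the nameprep step of full IDNA processing is skipped.

```python
# Illustrative subset of Unicode block ranges; the real list is much longer.
BLOCKS = [
    (0x0000, 0x007F, "BasicLatin"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "CJKUnifiedIdeographs"),
    (0xAC00, 0xD7AF, "HangulSyllables"),
]


def decode_label(label):
    """Decode one domain label; the 'xn--' prefix marks a punycode label."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def block_of(ch):
    """Return the block name for one character, or 'Other' if not in the table."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def blocks_in_domain(domain):
    """Map each decoded label of a domain to the set of blocks it uses."""
    result = {}
    for label in domain.split("."):
        text = decode_label(label)
        result[text] = {block_of(ch) for ch in text}
    return result


if __name__ == "__main__":
    # Build a sample punycode label by round-tripping through the codec.
    ace = "xn--" + "中文".encode("punycode").decode("ascii")
    print(ace, blocks_in_domain(ace + ".com"))
```

Note that the result is a block or script name, not a language: the same code point produces the same punycode no matter which language the registrant had in mind.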

#3
2nd September 2006, 05:54 PM
sevent
Re: Finding out the language of an IDN

Quote:
Originally Posted by bramiozo
Use http://idntools.net/bulkpuny3.php to get the character set from punycode.

Yes, there is a thing called variant blocking; this is described here.

Thanks for the info! I checked a name and it converted fine to the native-looking characters, but the script was labeled:

CJKUnifiedIdeographs

Does that seem right? Is there a way to find out from this info what language you are really talking about (e.g. Chinese Simplified)?

#4
2nd September 2006, 06:05 PM
Drewbert
Re: Finding out the language of an IDN

Quote:
Originally Posted by sevent
CJKUnifiedIdeographs

Does that seem right? Is there a way to find out from this info what language you are really talking about (e.g. Chinese Simplified)?

What do you think might be a way of figuring out if a text string is Chinese, Japanese or Korean, or multiples of them?

#5
2nd September 2006, 06:16 PM
bramiozo
Re: Finding out the language of an IDN

Quote:
Originally Posted by sevent
Thanks for the info! I checked a name and it converted fine to the native-looking characters, but the script was labeled:

CJKUnifiedIdeographs

Does that seem right? Is there a way to find out from this info what language you are really talking about (e.g. Chinese Simplified)?

There are Unicode ranges that are used for several languages (Latin, kanji, romaji, etc.). If every character of a string falls within such an overlapping range, it is impossible to determine the language directly. One would have to rely on character groups, character positions and so on; the statistical occurrence of particular combinations would then give the probabilities of the different languages.

It's possible, but it takes quite some study of the relevant languages if you want to pull it off.
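
As a rough illustration of the first step only, the sketch below (Python) narrows down the candidates from code point ranges: kana points to Japanese, Hangul points to Korean, and a string made up solely of CJK unified ideographs stays ambiguous. The function name `guess_languages` and the ranges used are assumptions for this example. It does none of the statistical work described above, and it cannot tell Simplified from Traditional Chinese.

```python
def guess_languages(text):
    """Return candidate languages for a string, based only on code point ranges."""
    has_kana = any(0x3040 <= ord(c) <= 0x30FF for c in text)    # Hiragana + Katakana
    has_hangul = any(0xAC00 <= ord(c) <= 0xD7AF for c in text)  # Hangul syllables
    has_han = any(0x4E00 <= ord(c) <= 0x9FFF for c in text)     # CJK unified ideographs

    candidates = []
    if has_kana:
        candidates.append("Japanese")
    if has_hangul:
        candidates.append("Korean")
    if has_han and not candidates:
        # Ideographs alone are shared, so every CJK language remains possible.
        candidates = ["Chinese", "Japanese", "Korean"]
    return candidates


if __name__ == "__main__":
    for sample in ["ひらがな", "한국", "中文", "abc"]:
        print(sample, guess_languages(sample))
```

Getting from the ambiguous case to a single language would need the per-language character and combination frequencies mentioned above, which this sketch leaves out.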