Rubber Duck
15th February 2008, 08:04 AM
https://delhi.icann.org/files/Delhi-WS-gTLDEnvironment-14FEB08.txt
ERIC BRUNNER-WILLIAMS: Eric Brunner-Williams from CORE.
Ram, in the 2001/2002 time frame, in the Chinese Domain Name Consortium (CDNC) context, we were grappling not with variations that looked alike but with variations that did not look alike but which had identical meaning. This is the simplified Chinese/traditional Chinese (SC/TC) table debate issue or whatever.
But the enduring thing that we have from there is the notion of equivalence classes across scripts on a per-character basis. Here we're looking not so much at per-character issues but at string issues. We have the possibility of looking at that as a concatenation of equivalent characters, or of recognizing that even if a set of characters is not character-by-character equivalent, as they were in the SC/TC context, the intended meaning of the string or the apparent meaning of the string, as in the four examples (Hindi, Urdu, Arabic and English), could define an equivalence class.
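To make the two notions concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `CHAR_CLASS` table holds two well-known simplified/traditional pairs but is not any official CDNC variant table, and the `STRING_CLASS` entries are placeholder labels standing in for the four examples, which the transcript does not spell out.

```python
# Hypothetical per-character variant table: traditional form -> simplified form.
# Illustrative only; a real SC/TC table is far larger and registry-defined.
CHAR_CLASS = {"國": "国", "學": "学"}

# Hypothetical string-level classes: whole labels that share a meaning
# but are not character-by-character equivalent. Placeholder names only.
STRING_CLASS = {
    "example-hi": "class-1",
    "example-ur": "class-1",
    "example-ar": "class-1",
    "example-en": "class-1",
}


def canonical_label(label):
    """Return a key identifying the equivalence class of a label."""
    # String-level class: the label as a whole was declared equivalent.
    if label in STRING_CLASS:
        return ("string", STRING_CLASS[label])
    # Character-level class: fold each character, then concatenate.
    return ("chars", "".join(CHAR_CLASS.get(ch, ch) for ch in label))


def same_class(a, b):
    """Two labels are equivalent when they share a canonical key."""
    return canonical_label(a) == canonical_label(b)
```

With these tables, `same_class("中國", "中国")` holds through per-character folding, while the placeholder labels are equivalent only because the whole strings were declared so.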
With that concept in mind, recall what we were trying to do in SC/TC: cause a registration for one label to be equivalent to the variants of that label, so that the label effectively appeared in one namespace multiple times, once for each of the variants, with all resolving to the same A record. More interestingly, the two Chinese registries that were cooperating, or at least hypothetically cooperating, the CN registry and the TW registry, were doing cross-registration of the variants automatically, so that a registration of one variant in one registry automatically caused the A record to resolve in the other registry for either variant.
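A rough sketch of that cross-registration idea, in Python. The `Registry` class, the `VARIANTS` table, and `register_bundle` are all hypothetical names invented here; nothing below describes how the CN or TW registries actually implemented it, and the address is taken from the documentation range 192.0.2.0/24.

```python
from dataclasses import dataclass, field

# Hypothetical variant table: each label maps to its other variants.
VARIANTS = {"中国": {"中國"}, "中國": {"中国"}}


@dataclass
class Registry:
    """Toy registry: a name plus a label -> address mapping."""
    name: str
    records: dict = field(default_factory=dict)


def register_bundle(label, address, registries):
    """Register a label and all of its variants in every cooperating registry."""
    bundle = {label} | VARIANTS.get(label, set())
    for registry in registries:
        for variant in bundle:
            # Every variant points at the same address in every registry.
            registry.records[variant] = address
    return bundle


cn = Registry("CN")
tw = Registry("TW")
register_bundle("中国", "192.0.2.1", [cn, tw])
```

After the single call, both toy registries hold both variants, and each variant resolves to the same address regardless of which registry is asked.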
So we have the possibility of looking at equivalence classes, with the experience that we have from the CDNC period, as a way that we can say that these multiple strings actually result in a single zone file, or that they result in multiple zone files: either identical entries in a single unified zone file, or identical entries in multiple discrete zone files.
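The two layouts can be sketched as follows. This is only an illustration under stated assumptions: `plan_zones` is a hypothetical helper, the zone name `unified` and the TTL are made up, the address is from the documentation range, and the A-label conversion uses Python's built-in `idna` codec (IDNA 2003), which may differ from what a given registry would use.

```python
from collections import defaultdict


def plan_zones(bundle, address, unified=True):
    """Map zone name -> record lines for one equivalence class of labels.

    unified=True  : every variant is an entry in one shared zone.
    unified=False : each variant gets its own zone with an identical entry.
    """
    zones = defaultdict(list)
    for label in sorted(bundle):
        # Convert the Unicode label to its ASCII-compatible form (xn--...).
        alabel = label.encode("idna").decode("ascii")
        zone = "unified" if unified else alabel
        zones[zone].append(f"{alabel}. 3600 IN A {address}")
    return dict(zones)


one_zone = plan_zones({"中国", "中國"}, "192.0.2.1", unified=True)
many_zones = plan_zones({"中国", "中國"}, "192.0.2.1", unified=False)
```

Either way the entries are identical in content; the only difference is whether they live in one zone or are spread across one zone per variant.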
So this is a technique that we haven't really explored or talked about. It's kind of been an assumption of some people who are somewhat business minded that each possible variant represents a new business opportunity for ICANN as a revenue collector, or for some business as a gTLD operator.
And this isn't necessarily the case. That's the point about equivalence classes, that we might be looking at proposals that involve folding much of this onto one zone or fewer than the maximum possible number of zones for the maximum possible number of scripts.