2025-04-02 Latin Script Diacritics - Meeting #02
The call for the Latin Script Diacritics team will take place on Wednesday, 02 April 2025 at 13:00 UTC for 90 minutes.
For other places see: https://tinyurl.com/e6fhy5mk
PROPOSED AGENDA
Welcome and SOIs
Summary of First Meeting
Established Scope to Date
Scope Management and Discussion: Similarity
Charter Question 1
Next Steps/AOB
BACKGROUND DOCUMENTS
Unicode Chart (Basic Latin)
Unicode Chart (Latin Supplement)
RECORDINGS
Zoom Recording (including audio, visual, rough transcript and chat)
GNSO transcripts are located on the GNSO Calendar
Notes/Action Items
[OUTCOME]
Agreement to define Latin Script Diacritics per Unicode (Established since Meeting #1 [icann-community.atlassian.net]).
Agreement that this PDP WG should examine the diacritics that are similar but should not define what similarity means or change the existing String Similarity Review process. This WG will use the outcomes of the String Similarity Review Panel, and if strings are found to be confusingly similar, the recommendations from this PDP will become effective.
Agreement to allow for multiple diacritic cases (multiple diacritics in the same letter as well as multiple [3 or more with no upper limits] diacritic versions of TLDs at the same time) within scope while also not restricting the cases to certain TLD types such as real words or to geographic or brand, bearing in mind the process flow.
Agreement to limit the scope to the same entity operating the ASCII/diacritic versions while not limiting it to only applied-for IDN strings of existing ASCII gTLDs; also accepting new gTLD applications.
[ACTION ITEMS]
Leadership/Staff to present a list of Unicode Tables, compiling all the cases that are within scope.
WG to consider the existing body of work to be consistent in the use of terminologies, while also being mindful of the existing policies and the process flow (e.g., Suggestion to refer to AGB language on similarity [itp.cdn.icann.org] and the Applicant Journey [itp.cdn.icann.org]).
Leadership/Staff to invite an ICANN org expert on String Similarity Review process (Sarmad Hussain?) to share the details and criteria of String Similarity Review with the WG.
[NOTES]
Download slides and background docs discussed during meeting (Unicode Tables, AGB on String Similarity Review and Applicant Journey) from here: https://icann-community.atlassian.net/wiki/spaces/gnsolsdpdp/pages/176029697/2025-04-02+Latin+Script+Diacritics+-+Meeting+02 [icann-community.atlassian.net]
Welcome and SOIs
N/A
Summary of First Meeting
Reminder that the Early Input Request message has been sent to each SO/AC/SG/C Leadership.
Proposed Project Plan has been submitted to GNSO Council (28 March) and Prudence Malinki, the GNSO Council Liaison to the PDP WG, will be providing an overview of the Plan to the Council during the April Council Meeting.
Established Scope to Date
When determining what is within scope (with diacritic), the following Unicode Tables can be referred to – A Unicode Character can be decomposed into a base character and a diacritic:
https://www.unicode.org/charts/PDF/U0000.pdf [unicode.org]
https://unicode.org/charts/PDF/U0080.pdf [unicode.org]
Questions on whether cases with multiple diacritics in the same letter (e.g., diệp) and/or multiple (3 or more) diacritic versions of TLDs (e.g., stop, stóp, stòp) at the same time would be allowed/within scope: Members agreed that allowing all these cases should not be a problem.
A suggestion was made to gather the Unicode Tables to be used for this PDP: Noted that a list could be compiled. On the question of providing general principles vs. a full list of diacritics, it was noted that presenting a general rule based on Unicode may be best, with the list provided for convenience. If the list is given in full, it must be made clear that it is not fixed/restricted but rather a set of current examples that may be adjusted if the LGR version were to be adjusted.
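The decomposition mentioned above (a Unicode character split into a base character and one or more diacritics) can be illustrated with a minimal sketch using Python's standard unicodedata module and canonical decomposition (NFD). This is only an illustration of the general Unicode rule, not a method adopted by the WG; the function names decompose, base_string and group_by_base are hypothetical, and the labels are the examples from the discussion. Note that some Latin letters (e.g., ø, U+00F8) have no canonical decomposition in Unicode, so a sketch like this would not treat them as base plus diacritic.

```python
import unicodedata
from collections import defaultdict


def decompose(label):
    """Return (base character, [combining mark names]) for each character.

    Uses canonical decomposition (NFD); characters without a canonical
    decomposition are returned with an empty list of marks.
    """
    result = []
    for ch in label:
        nfd = unicodedata.normalize("NFD", ch)
        base = nfd[0]
        marks = [unicodedata.name(m) for m in nfd[1:] if unicodedata.combining(m)]
        result.append((base, marks))
    return result


def base_string(label):
    """Strip all combining marks, leaving only the base characters."""
    nfd = unicodedata.normalize("NFD", label)
    return "".join(c for c in nfd if not unicodedata.combining(c))


def group_by_base(labels):
    """Group labels that share the same base string (illustrative only)."""
    groups = defaultdict(list)
    for label in labels:
        groups[base_string(label)].append(label)
    return dict(groups)


if __name__ == "__main__":
    # "diệp" carries two diacritics on a single letter.
    for base, marks in decompose("diệp"):
        print(base, marks)
    # Several diacritic versions of the same base string.
    print(group_by_base(["stop", "stóp", "stòp"]))
```

Grouping by base string is shown only to make the "multiple diacritic versions at the same time" case concrete; whether two strings are confusingly similar remains a matter for the String Similarity Review, not for a mechanical comparison like this one.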
Scope Management and Discussion: Similarity
Agreement that terms should be consistent throughout this work per the existing body of work (e.g., “similarity”, “user confusion” rather than “visual similarity”) and that a correct understanding/definition of “similarity” is necessary. A suggestion was made for the WG to refer to the AGB language to align the definition of similarity: https://itp.cdn.icann.org/en/files/policy-development/agb-string-similarity-topic24-10-09-2024-en.pdf [itp.cdn.icann.org]
Agreement that this PDP WG should examine the diacritics that are similar but should not define what similarity means or change the existing String Similarity Review process. This WG will use the outcomes of the String Similarity Review Panel, and if strings are found to be confusingly similar, the recommendations from this PDP will become effective.
Question on whether this PDP could propose a rule saying that the registry should only allocate the same second-level domain to both the ASCII and diacritic TLD: Noted that this PDP would follow the rules set forth in the EPDP-IDNs, as that PDP has already made sure that even though variant TLDs are confusingly similar, the registries operating them will have rules that make user confusion unlikely.
A suggestion was raised to share more information on variant rules compared to the String Similarity Review process: Noted that it may be ideal to invite an ICANN org expert on String Similarity Process (Sarmad Hussain?) to share the criteria and details of String Similarity Review with the WG.
Charter Question 1
A suggestion was made that the focus should be on preventing user confusion (and not only about allowing applicants to move forward) and from this perspective that the multiple TLDs should follow a same entity (same registrant, same registrar, same registry) rule: Noted that the same entity principle is also a part of the EPDP-IDNs solution for variants (second-level) and with the goal to avoid user confusion, the same entity principle will be applicable for this PDP (TLD level) with a possibility for this WG to explicitly determine the same entity concept within its policy recommendation. Either way, WG should be mindful of existing policies.
Question on the meaning of “workaround” for existing ASCII gTLDs instead of the proper IDN string (and vice versa): Noted that there is no need for this WG to define “workaround” anymore as the characters will be defined via Unicode.
Agreement on allowing for multiple diacritic cases while not restricting to certain TLD types such as real words or to geographic or brand, bearing in mind the process flow (reminder of the process flow where the String Similarity Review comes first during the application process; refer to the Applicant Journey: https://itp.cdn.icann.org/en/files/policy-development/applicant-journey-related-to-topics-5-and-16-14-02-2025-en.pdf [itp.cdn.icann.org]). Also, the WG agreed not to set an upper limit on diacritic versions.
Agreement to limit the scope to the same entity operating the ASCII/diacritic versions while not limiting the scope to only applied-for IDN strings of existing ASCII gTLDs; also accepting new gTLD applications.
Next Steps/AOB
Leadership/Staff to present a list of Unicode Tables, compiling all the cases that are within scope.
WG to consider the existing body of work to be consistent in the use of terminologies, while also being mindful of the existing policies and the process flow (e.g., Suggestion to refer to AGB language on similarity [itp.cdn.icann.org] and the Applicant Journey [itp.cdn.icann.org]).
Leadership/Staff to invite an ICANN org expert on String Similarity Review process (Sarmad Hussain?) to share the details and criteria of String Similarity Review with the WG.