2022-07-13 IDNs EPDP String Similarity Review
The call for the IDNs EPDP String Similarity Review team will take place on Wednesday, 13 July 2022 at 12:00 UTC for 60 minutes.
For other places see: https://tinyurl.com/yckr99y2
PROPOSED AGENDA
- Welcome
- Review of Arabic example
- Revisit Chinese example
- Recommendation language & rationale
- Objection process impact
- AOB
BACKGROUND DOCUMENTS
String Similarity Review Visualization.pdf
String Similarity Review Small Group Meeting Slides_13 July 2022.pdf
PARTICIPATION
Notes / Action Items
Meeting materials available here: https://icann-community.atlassian.net/wiki/spaces/epdpidn/pages/97109078/2022-07-13+IDNs+EPDP+String+Similarity+Review
Review of Arabic Example
- In reviewing the two examples, need to consider how many examples to use for write-up and report back to full team.
- This example is important because it demonstrates the risk of failure modes, especially misconnection.
- Slide 46, Two Applied-for Arabic TLDs
- Slide 47, String Similarity Review of Two Applied-for Arabic TLDs
- Slide 48, Two Applied-for Arabic TLDs, Cont.: Demonstrates what the hybrid model may catch. Without the hybrid model, D1 and E1 could both be delegated, even though there may be string similarity against blocked variants.
- Will need to be prepared for questions of whether this example is an edge case.
- String similarity cases are generally “corner” cases, not commonplace.
- From one perspective, the hybrid model will catch the edge cases and is therefore worth the additional effort and complexity.
- One member, speaking as an Arabic speaker, believes the risk of miscommunication is very minor: a user would need multiple keyboard versions to type in the wrong TLD.
- However, need to consider other users of the script (Urdu).
- And need to consider users that are not necessarily fluent in the language.
- The problem is that an Arabic speaker/reader reads the Pashto TLD as Arabic. This helps represent the misconnection risk.
- Implementability is outside of the assignment. Only need to consider security and stability. Ok to look at the likelihood of the risk however.
- A new factor for string similarity in the future is to take into account the RZ-LGR rules. The generation panels were instructed to identify obvious string similarity cases, which are treated as variants of each other. Less obvious cases would not be included in the RZ-LGR, but the string similarity panel could still find two strings to be similar.
- Question: can someone see E1, think it is D10/D17/D24, and then type it as D1?
- There seems to be agreement that the example represents an edge case, but the group cannot put a number on the likelihood. The hybrid model represents the zero-risk model. Will raise the point about balancing risk likelihood versus implementability with the full team. 100% risk prevention is full level 3, so some level of balancing has apparently already been done.
- Question whether or not the group needs to decide and agree on the solution. Preference for agreeing on solution, but can just provide findings to the full team, describing that the proposed solution is designed to address security and stability concerns primarily.
Review of Chinese Example
- Does not demonstrate the misconnection risk as clearly as the Arabic example. Question is whether the example should be brought forward to the full team.
- Support for including and probably helpful to present in the same order as today: Arabic then Chinese example.
Recommendation language & rationale
- Skipped – has not changed substantively and in the interest of time.
Objection process impact
- There are four types of objections. Need to remember that the string confusion objection is tied to string similarity and can contribute to contention sets. String confusion does have a different standard that goes beyond string similarity.
- For the other three objection procedures, a successful objection kicks the application out of the process.
- The SubPro WG affirmed the standard for objections. Single question is to consider the role that non-requested allocatable variants and blocked variants play.
- Start with String confusion first – should the hybrid model apply here?
- Yes, and consistent with the hybrid model, the objection cannot be two level 3 strings against each other.
- Suggestion that the decision is up to the panel.
- One member seemed to imply that a string confusion objection can also be filed based on the confusing similarity between two blocked variants. The EPDP team’s recommendation can help set the baseline, but perhaps it should be up to the panel to determine whether two level 3 strings are confusingly similar.
- If the string similarity review uses the hybrid model without comparing blocked strings against each other, can the string confusion objection be filed based on the confusing similarity between blocked strings?
- Suggestion that blocked variants should also be considered for other objection types, legal rights for instance.
- There are three elements to consider (perhaps in the “appeal process” rather than the objection process):
- Hybrid level comparison is not sufficient
- Hybrid level is too much
- Disagreement with the decision only
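The scoping question raised in this discussion (which pairs of strings get compared) can be illustrated with a small sketch. It assumes, based only on the notes above, that full level 3 compares every label in one variant set against every label in the other, while the hybrid model does the same except that it skips blocked-against-blocked pairs. The disposition names, the function `comparison_pairs`, and the sample labels are all hypothetical and are not taken from the slides or from any ICANN tooling.

```python
from itertools import product

# Hypothetical disposition names, for illustration only.
PRIMARY, ALLOCATABLE, BLOCKED = "primary", "allocatable", "blocked"


def comparison_pairs(set_a, set_b, model="hybrid"):
    """List the (label_a, label_b) pairs a review would compare.

    set_a, set_b: dicts mapping a label to its disposition.
    model: "hybrid" skips blocked-vs-blocked pairs (assumption from the
           notes); "level3" compares every pair.
    """
    pairs = []
    for (label_a, disp_a), (label_b, disp_b) in product(set_a.items(), set_b.items()):
        if model == "hybrid" and disp_a == BLOCKED and disp_b == BLOCKED:
            continue  # the only pairs the hybrid model leaves out
        pairs.append((label_a, label_b))
    return pairs


# Made-up variant sets for two applied-for strings.
set_a = {"a-primary": PRIMARY, "a-alloc": ALLOCATABLE, "a-blocked": BLOCKED}
set_b = {"b-primary": PRIMARY, "b-alloc": ALLOCATABLE, "b-blocked": BLOCKED}

print(len(comparison_pairs(set_a, set_b, model="hybrid")))  # 8
print(len(comparison_pairs(set_a, set_b, model="level3")))  # 9
```

Under these assumptions, the only difference between the two models is the blocked-against-blocked pair, which is exactly the case the group is asking whether a string confusion objection should be able to reach.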
AOB
- None