



Info

Jump to the Background for the history and background of this issue.

...

Testing of new translation tool for LACRALO mailing lists

ICANN Staff have created two mailing lists (New-transbot-en and New-transbot-es) with a select number of persons on those lists for testing purposes.

Some of the key changes implemented in the new translation tool.

...

FY17 update

The TTF filed a budget request with the At-Large FBSC in FY17 for ICANN to finance the hiring of a programmer to assist the volunteer ICANN staff member in fixing outstanding bugs (see At-Large FY17 Budget Development Workspace). The request was approved by the At-Large FBSC and filed with ICANN Finance. On the 2016-08-08 At-Large Technology Taskforce Call, ICANN Staff member corinna.ace confirmed that a programmer/developer had been hired to sort out the remaining bugs.

To test the new translation tool, two test email lists (new-transbot-en and new-transbot-es) were created. TTF volunteers and ICANN staff joined these lists to test the translation and to report bugs on the discussion-of-LACRALO-mailing-list-issues page.

New versions of the translation tool were deployed to these transbot lists in late December 2016 and March 2017. The March 2017 update introduced new features:

  • Translated emails also include attachments (TXT, PDF, WORD, JPEG, PPT, PNG, GIF) from the original email.
  • Text that should not be translated can be enclosed in <DNT></DNT> tags.
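As an illustration of the <DNT></DNT> feature, the following is a minimal sketch of one way such tags could be honoured. It is not the ICANN IT implementation: the function names, the placeholder format, and the stand-in fake_translate function are all assumptions made for this example. The idea is to swap each tagged span for a placeholder before translation and restore the original text afterwards.

```python
import re

# Matches <DNT>...</DNT> spans; case-insensitive matching is an assumption.
DNT_PATTERN = re.compile(r"<DNT>(.*?)</DNT>", re.DOTALL | re.IGNORECASE)


def translate_with_dnt(text, translate):
    """Translate text while leaving <DNT>...</DNT> spans untouched."""
    protected = []

    def stash(match):
        # Remember the protected text and leave a numbered placeholder behind.
        protected.append(match.group(1))
        return f"__DNT_{len(protected) - 1}__"

    masked = DNT_PATTERN.sub(stash, text)
    translated = translate(masked)

    # Put the protected text back, without the tags.
    for index, original in enumerate(protected):
        translated = translated.replace(f"__DNT_{index}__", original)
    return translated


def fake_translate(text):
    # Hypothetical stand-in for a real translation call: it upper-cases
    # the text so that the translated and untranslated parts are easy to see.
    return text.upper()


result = translate_with_dnt("Hello <DNT>At-Large</DNT> world", fake_translate)
```

A real translation service may alter placeholders, so a production version would need a placeholder format that the service is known to pass through unchanged.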

In May 2017, emails from the existing LACRALO mailing lists were reposted to the new transbot lists to get a sense of how the new tool would handle the way current users use the email lists.

Since ICANN59, the TTF chairs have been discussing with Mark Segall and Corinna Ace from ICANN IT and with Silvia Vivanco and Mario Aleman from ICANN At-Large Staff on implementing the new version of the translation tool developed by ICANN IT on the existing LACRALO mailing lists.

To minimize the problems caused by persons posting to both lists at the same time, members of LACRALO will be asked via an online survey at https://goo.gl/forms/sEOEWqacRYLPk2Xc2 to indicate:
* which lac discuss list they wish to RECEIVE emails from (English, Spanish, or both)
* which lac discuss list they wish to be able to SEND emails to (posting is possible to one list only)

A conference call for LACRALO members was held on Tuesday Sept 5 2017 (see recordings at https://icann-community.atlassian.net/wiki/x/TLyKBg) to raise awareness of the planned changes to the translation tool used for the LACRALO mailing lists and what persons on the LACRALO lists need to do to prepare for the changes.

The tool was deployed to the main LACRALO lists on October 6 2017.

...


Status | Date Added | Description | Additional Notes
Status: FIXED
Date Added: 28 Feb 2016

Subject line in body of translated email has the sender and first line of the email on the same line, when the sender and first line should be on separate lines. Two examples:

The converted body text was included immediately after the subject and from line in translated emails. A new line character was inserted between sender name and first body text line to separate them.

Initial testing complete; additional testing in progress.

Example: https://icann-community.atlassian.net/wiki/x/qqGKBg



(Noted by satish.babu)
Status: FIXED
Date Added: 28 Feb 2016

There is an empty space in the beginning of most (but not all) lines in the translated email

After investigation it appears empty spaces may have been added by the email client or by Google Translate API.

Space removed before all translated lines in mailing list emails to resolve the issue.

Tested with mail IDs Outlook, Yahoo, Gmail. Verified that translated emails are not indented; empty spaces are not appearing at the beginning of lines.

Example: https://icann-community.atlassian.net/wiki/x/qKGKBg

(Noted by satish.babu)
Status: FIXED
Date Added: 28 February 2016

At the end of the message, the sender's name starts with a lowercase letter ('dev Anand'), although it is 'Dev Anand' in the original message.
EN (original): http://mm.icann.org/pipermail/new-transbot-en/2016-February/000080.html
ES (translated): http://mm.icann.org/pipermail/new-transbot-es/2016-February/000068.html

Google Translate understands 'Dev' as an abbreviation and as a rule converts it to all lowercase.

Name was hardcoded to fix 'Dev' to begin with a capital letter.

Tested with mail IDs Outlook, Yahoo, Gmail. Verified that name is appearing correctly with first letter capitalized.

Example: https://icann-community.atlassian.net/wiki/x/pKGKBg

(Noted by satish.babu)
Status: FIXED
Date Added: 28 February 2016

'Transbot' is misspelled as 'tansbot' (third line from the bottom).
EN (original): http://mm.icann.org/pipermail/new-transbot-en/2016-February/000080.html
ES (translated): http://mm.icann.org/pipermail/new-transbot-es/2016-February/000068.html

Misspelling was hardcoded. Applied hardcode fix to correct spelling of 'tansbot' to 'transbot' in translated emails.

Tested with mail IDs Outlook, Yahoo, Gmail. Verified that spelling is now appearing correctly as 'transbot.'

Example: https://icann-community.atlassian.net/wiki/x/oqGKBg

Status: FIXED
Date Added: April 17 2016

The transbot can't handle the cedilla. Because the At-Large Staff signature lists a staff member with a cedilla in her name, any message from At-Large Staff is not translated.
See

The April thread with the subject line "CALL FOR MEMBERS: At-Large Public Interest Working Group" on EN : http://mm.icann.org/pipermail/new-transbot-en/2016-April/thread.html and ES : http://mm.icann.org/pipermail/new-transbot-es/2016-April/thread.html showed how the issue was isolated after several variations of the original email were tried.

This is a critical bug, as any cedilla in any word in an email would result in the email not being translated.

The issue has been resolved specific to the reported cases of broken emails caused by the cedilla character in the AL Staff signature.

As a larger issue, it is still in progress. Efforts around the reported case led to wider investigation into how the transbot and email applications handle Unicode characters. This is important UTF-8 compliance work and requires extensive testing.

Recent tests with a wider set of characters using Outlook have been successful. Tests with those same characters have been inconsistent with Gmail and Yahoo. The team is continuing to research, test, and make progress.

Examples: https://icann-community.atlassian.net/wiki/x/rKGKBg
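The notes above do not describe how the transbot decodes message bodies, so the following is only a sketch of the general kind of charset handling that UTF-8 compliance work involves. It uses Python's standard email package to decode a plain-text body according to the charset declared in the message, falling back to UTF-8 with replacement characters. The function name body_as_text and the sample message are assumptions made for this example.

```python
from email import message_from_bytes


def body_as_text(raw_message):
    """Return the first text/plain body of a raw email as a Python string."""
    msg = message_from_bytes(raw_message)

    # For multipart messages, pick the first text/plain part.
    part = msg
    if msg.is_multipart():
        for candidate in msg.walk():
            if candidate.get_content_type() == "text/plain":
                part = candidate
                break

    # decode=True undoes base64 / quoted-printable and returns bytes.
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: keep going rather than drop the message.
        return payload.decode("utf-8", errors="replace")


# Hypothetical sample: a Latin-1 encoded body containing a cedilla (0xE7).
sample = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"fa\xe7ade\r\n"
)
decoded = body_as_text(sample)
```

Decoding with the declared charset, rather than assuming one encoding for every mail client, is one plausible way to make messages from Outlook, Gmail, and Yahoo behave the same.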

Status: FIXED
Date Added: April 17 2016

The phrase "This Working Group is open to interested members of the At-Large community." gets translated to

"Este grupo de trabajo está abierto a los miembros interesados \u200b\u200bde la comunidad de alcance."

Not sure why it repeatedly happens for that phrase: See
EN: http://mm.icann.org/pipermail/new-transbot-en/2016-April/000090.html
ES: http://mm.icann.org/pipermail/new-transbot-es/2016-April/000078.html

The issue was related to zero-width space, which was being injected by Google Translate API.

A zero-width space is used after characters that aren't followed by a visible space, but after which there may be a line break (source). In the translated text it appeared as the escaped Unicode sequence \u200b.

To fix this, every occurrence in the translated text was replaced with nothing (an empty string).

Tested with mail IDs Outlook, Yahoo, Gmail. Verified that phrase is translated correctly without additional characters.

Example: https://icann-community.atlassian.net/wiki/x/pqGKBg
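The zero-width space fix described above can be sketched as follows. This is a minimal illustration, not the deployed code: the function name is an assumption, and the sketch removes both the real U+200B character and the literal six-character escape text, since the notes do not say which form the tool receives.

```python
ZERO_WIDTH_SPACE = "\u200b"            # the actual invisible character
ESCAPED_ZERO_WIDTH_SPACE = "\\u200b"   # the literal text backslash-u-2-0-0-b


def strip_zero_width_spaces(text):
    """Remove zero-width spaces, whether real or shown as escape text."""
    return text.replace(ZERO_WIDTH_SPACE, "").replace(ESCAPED_ZERO_WIDTH_SPACE, "")


broken = "abierto a los miembros interesados \\u200b\\u200bde la comunidad"
cleaned = strip_zero_width_spaces(broken)
```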




Email subject lines can get jumbled and distorted along threads of translated emails.

This issue is related to extra spaces in subject lines, and research shows it is a known issue with Microsoft Office/Mail-man server. There is no known resolution at this time. As a workaround, the new test lists were designed so that subject lines aren't translated and the original subjects are retained.



Attachments are not retained on translated emails.

Changes were made to support attachments on translated emails. The file formats that will be retained between lists are TXT, PDF, WORD, JPEG, PPT, PNG, GIF.
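A check against the list of retained formats could look like the sketch below. The mapping from format names to file extensions (for example, WORD to .doc and .docx) is an assumption made for this example, as is the function name. The notes above list only the format names and do not say how the tool recognises them.

```python
from pathlib import PurePath

# Assumed extensions for TXT, PDF, WORD, JPEG, PPT, PNG, GIF.
RETAINED_EXTENSIONS = {
    ".txt", ".pdf", ".doc", ".docx", ".jpg", ".jpeg",
    ".ppt", ".pptx", ".png", ".gif",
}


def is_retained(filename):
    """Return True if the attachment's extension is in the retained set."""
    return PurePath(filename).suffix.lower() in RETAINED_EXTENSIONS


attachments = ["Agenda.PDF", "photo.jpeg", "archive.zip"]
kept = [name for name in attachments if is_retained(name)]
```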

...