Public Services and Procurement Canada

WeBiText to the rescue

Frances Urdininea
(Language Update, Volume 7, Number 3, 2010, page 31)

As a translator, I have the luxury of two free Canadian resources at my fingertips: TERMIUM Plus®, a terminology data bank with 4 million entries, and the gc.ca domain, which contains over 50 million high-quality bilingual Web pages published by the Government of Canada. However, while consulting TERMIUM Plus® is very straightforward, searching gc.ca for the translation of an expression with a conventional search engine can be much more time-consuming. Typically, it used to take me a few minutes to manually retrieve a single pair of sentences containing the expression and its translation. That was until I discovered a wonderful tool called WeBiText!

Figure 1: The WeBiText user interface

What is WeBiText?

WeBiText is a free bilingual concordancer that allows users to automatically retrieve, in just a few seconds, several pairs of aligned sentences from large, high-quality multilingual websites and view them in a side-by-side bilingual display (see Figure 1). The main advantage over conventional translation memories is that users don’t have to create a bilingual corpus themselves, since the tool is pre-populated with content from existing trustworthy sites. It can also give users access to a wider variety of bilingual texts than may be available in the translation memories of their employer or client.
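
For readers curious about what a bilingual concordance lookup involves, here is a minimal sketch in Python. It is not WeBiText's actual implementation: the three sentence pairs, the names ALIGNED_PAIRS, concordance and show_side_by_side, and the simple substring match are all assumptions made for illustration. The real tool draws on sentences harvested and aligned from large websites, whereas this sketch starts from pairs that are already aligned.

```python
from typing import List, Tuple

# Hypothetical miniature corpus of pre-aligned (English, French) sentence pairs.
# A real concordancer would build this from millions of aligned Web pages.
ALIGNED_PAIRS: List[Tuple[str, str]] = [
    ("The terminology data bank is free of charge.",
     "La banque de données terminologiques est gratuite."),
    ("Please update your bookmarks.",
     "Veuillez mettre à jour vos signets."),
    ("The data bank contains millions of entries.",
     "La banque de données contient des millions de fiches."),
]


def concordance(expression: str,
                pairs: List[Tuple[str, str]] = ALIGNED_PAIRS,
                limit: int = 10) -> List[Tuple[str, str]]:
    """Return up to `limit` aligned pairs whose source sentence contains the expression.

    Matching is a case-insensitive substring test, the simplest possible choice.
    """
    needle = expression.casefold()
    hits = [(src, tgt) for src, tgt in pairs if needle in src.casefold()]
    return hits[:limit]


def show_side_by_side(hits: List[Tuple[str, str]], width: int = 50) -> None:
    """Print each hit with the source on the left and its translation on the right."""
    for src, tgt in hits:
        print(f"{src:<{width}} | {tgt}")


if __name__ == "__main__":
    show_side_by_side(concordance("data bank"))
```

Searching for "data bank" returns the two English sentences that contain it, each shown beside its French counterpart, which is the side-by-side display described above in miniature.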

I have been using WeBiText for several months now in my Spanish translation work, and it has become one of my favourite translation support tools. It allows me to search several multilingual sites, but interestingly enough, I find that even the English-French gc.ca corpus helps me find Spanish equivalents, because Spanish so closely resembles French.*

Who developed it?

WeBiText is the result of three years of research at the Institute for Information Technology of the National Research Council of Canada (NRC). The idea arose from a study where researchers observed translators in their normal day-to-day work. It quickly became apparent that translators often used Web search engines to find equivalents, but that this was a time-consuming manual process that was amenable to automation. In developing WeBiText, the NRC team consulted heavily with translators, including members of the Multilingual Translation and Localization Division of the Translation Bureau, who have been officially collaborating on the project since October 2009.

What sites and languages does it cover?

WeBiText supports bilingual searches in 29 languages, even Inuktitut, a language of the Inuit people of Canada. While the default corpus is gc.ca, the tool also includes the sites of several reliable organizations like the European Parliament and the World Health Organization.

A phenomenal response

Since January 2010, WeBiText has seen an eighteen-fold increase in traffic, growing from 100 queries per day to 1,800, in spite of the fact that it has not been widely publicized. This is a clear indication that the technology is filling an unmet need in the translation industry. So far, the Translation Bureau is the heaviest institutional user, with 15% of all queries, while 65% of queries originate from home-based freelancers. The tool is also popular in translation schools (8% of queries), where professors use it to teach students how to work with large bilingual corpora.

Inspired by this success, the WeBiText team continues improving the tool based on user feedback. Readers are invited to try this free and easy-to-use tool to discover its many advantages for themselves.

To try WeBiText, go to WeBiText Trial.

Remark

* An article on this particular technique will be published in El Rincón Español, the Spanish section of this journal, in December 2010.