My quest for information in 2010

André Guyon
(Language Update, Volume 7, Number 2, 2010, page 30)

My work consists, in part, of finding information on the Web. Millions of pages have been written about searching the Web, and I have to admit I haven’t read them all. Still, I think my approach might be of interest to some readers.

First of all, I’d like to tackle the syntax myth. At the risk of losing a few friends, let me just say that I almost always find what I’m looking for without worrying too much about syntax.

The exact phrase

For me, finding the correct phrase is crucial. I usually don’t search using keywords, but rather exact phrases. Search engines have an annoying habit of finding every single page that contains all or some of the words in your search—and not necessarily in the right order. For example, if you search for shady deal, you won’t just get pages about shady deals, you’ll also get pages about deals of all kinds, not to mention pages about shady areas under trees.

To better illustrate how important the correct phrase is, I suggest you try the following two searches:

"Bureau de la traduction" Weidner

"Bureau des traductions" Weidner

With these searches, I’m trying to find information on machine translation tests that took place before I arrived at the Translation Bureau. Fascinating stuff, isn’t it? Those who have been around long enough will recall that the Translation Bureau was once called Bureau des traductions in French.

In short, by leaving out the quotation marks, I find not only what I’m interested in, but also a whole lot more that doesn’t interest me. When I search for an exact phrase, however, I mainly find pages of interest to me.
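The difference between the two kinds of search can be sketched in a few lines of Python. This is only an illustration, not a description of how any real search engine works: the function names and the three sample pages are invented for the example, and real engines rank results rather than simply accepting or rejecting them.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())

def matches_any_keyword(query, page):
    """Keyword search: the page qualifies if it contains at least one query word."""
    page_words = set(tokenize(page))
    return any(word in page_words for word in tokenize(query))

def matches_phrase(phrase, page):
    """Exact-phrase search: the words must appear side by side, in the same order."""
    wanted = tokenize(phrase)
    words = tokenize(page)
    n = len(wanted)
    return any(words[i:i + n] == wanted for i in range(len(words) - n + 1))

# Hypothetical pages, invented for this illustration.
pages = [
    "The senator denied any shady deal with the contractor.",
    "A great deal on garden furniture this weekend.",
    "Find a shady spot under the trees.",
]

keyword_hits = [p for p in pages if matches_any_keyword("shady deal", p)]
phrase_hits = [p for p in pages if matches_phrase("shady deal", p)]
```

With these sample pages, the keyword search keeps all three, while the exact-phrase search keeps only the first.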

Choice of language

Sometimes, you can find something in one language, but not in another. That’s obvious, you say? I agree. But let me tell you a little story.

In separate conversations about voice recognition with two researchers (one owns a private company, the other works in a research centre), I mentioned an article by Lise Laroque-Divirgilio that appeared in Meta in 1981. Both told me they had found nothing on dictation and productivity.

As it turns out, even though they are both Francophone, they had done their research in English. Today, anything that matters is written in English.*

In short, a word of advice for absolutists: ask yourself where what you’re searching for came from—which could give you an indication of the language in which it was written—then try to find someone who speaks that language to do the searches. Or, if you must, use machine translation tools so that you can search in the desired language and interpret the results.

At the end of the day, my modus operandi in the virtual world looks a lot like my usual MO in the old urban paper jungle.

Thirty years ago, I would visit university libraries. Today, I "visit" the Web. Back then, I would usually start by looking in the subject index. Then I would ask the librarians to help me find my way through the jungle of shelves. I would also often ask them for advice on recommended publications.

Once my harvest was gathered up on a table, I would leaf through each item to see where I had the best chance of finding what I was looking for. I would check whether a book was an original work or a translation, read the author’s profile and so on.

Next, the real research work would begin: feverish reading in search of information that would help me better understand the subject or find the correct expression to use in my translation.

I would later confirm my choices with my network of personal contacts. For example, when translating a text on acid rain in 1981, I asked two engineer friends to tell me what they thought of my choice of reference works and whether they could suggest any others. I also asked them to make sure my translation did not say anything that wasn’t true.

Today, I still turn to the same people from time to time, though more often than not through social media. I have a tendency to limit my circle of friends and the size of my networks, and I wouldn’t mind comparing my personal network with that of others who have thousands of friends on Facebook to see whose is the most effective.

I still don’t know whether I’ll use Twitter or Buzz one day. I’m keeping myself informed and assessing the possibilities.

As a language-technologies specialist, I feel that my most useful contacts are people in the industry, researchers and, above all, users of the technology. If need be, I can write to them in order to validate or invalidate certain hypotheses and sometimes influence what will happen.

For almost a year now, I have been using Google Alerts to keep myself up to date on what’s happening in my neighbourhood. Whenever there is news about projects that interest me (a residential project and a sports centre), I receive emails on recent developments. Of course, the search is in both English and French….

More recently, I also started using Alerts for the types of language products that interest me. I used to use Copernic Agent a fair bit, which kept me up to date on changes to Web pages I was interested in. However, the arrival of RSS feeds** has made Copernic Agent a lot less useful.
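For readers curious about what following a feed involves, here is a minimal sketch in Python. It assumes a feed in the common RSS 2.0 layout (a channel containing item elements, each with a title and a link). The sample feed, its URLs and the function name are made up for the illustration, and a real feed reader would also download the feed and remember what it has already shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, invented for this illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example journalist</title>
    <item><title>First article</title><link>https://example.org/a</link></item>
    <item><title>Second article</title><link>https://example.org/b</link></item>
  </channel>
</rss>"""

def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for feed items whose link has not been seen yet."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if link and link not in seen_links:
            fresh.append((title, link))
    return fresh

# Links already read on a previous visit (assumed for the example).
already_read = {"https://example.org/a"}
unread = new_items(SAMPLE_FEED, already_read)
```

Here only the second article is reported as new, which is the kind of alert a feed provides.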

Sorting

Once I have found links, it’s a bit like when I had a list of books at the library. I like to sort them by checking their content for quality. Is the author reliable? Does the text seem well written and easy to read or would it be a struggle to get through? Does the publisher or host have a good reputation?

Some of the links will inevitably point to Wikipedia. Wikipedia articles are generally well structured, but I still tend to check the references. I would hesitate to cite Wikipedia as a source in a report, as the article cited could be completely different by the time my text ends up being read.

When I was a student, it was recommended that you always have at least three distinct sources of information in order to consider something valid. I apply the same principle to information on the Internet. And I am leery of content whose authors spend their time citing themselves or each other.***
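The three-source rule can be sketched in Python, with one strong simplifying assumption: any two sources linked by a citation, in either direction, are treated as a single voice. That is cruder than the judgment applied in practice, and the function names and source labels below are hypothetical.

```python
def independent_groups(cites):
    """Group sources that are connected by citations.

    cites maps each source to the set of sources it cites.
    Sources in the same group are assumed NOT to be independent.
    """
    # Build an undirected "is linked to" relation, ignoring self-citations.
    linked = {source: set() for source in cites}
    for source, targets in cites.items():
        for target in targets:
            if target in linked and target != source:
                linked[source].add(target)
                linked[target].add(source)

    groups, seen = [], set()
    for source in cites:
        if source in seen:
            continue
        # Collect everything reachable from this source.
        group, stack = set(), [source]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(linked[current] - group)
        seen |= group
        groups.append(group)
    return groups

def looks_corroborated(cites, minimum=3):
    """True if at least `minimum` mutually unconnected groups of sources exist."""
    return len(independent_groups(cites)) >= minimum

# Hypothetical example: A and B cite each other and C and D; E and F stand alone.
sources = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": set(),
    "D": set(),
    "E": set(),
    "F": set(),
}
```

In this example, six sources shrink to three independent groups, so the claim just meets the threshold. Without E and F it would not.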

I am also always on the lookout for conflicts of interest. A legitimate scientist may very well have become affiliated with the company whose work he or she is praising, for example as one of its representatives. When a company publishes the testimonials of users who are extremely satisfied with a new product that it has just put on the market, those users are often its own employees! It’s a bit like letters to the editor appearing in the first issue of a new journal: very shady.

Pharmaceutical companies often have excellent sites describing illnesses. The information on these sites is very reliable—except when it comes to treatment. You need to know who owns the site in order to determine which sections are most likely to be reliable and which sections are likely to be more biased.

Lastly, the information I’m looking for is not always to be found in the pages indexed by search engines—not by a long shot. Again, getting up to speed on how the sites I’m interested in are organized allows me to dig deeper.

However, I sometimes come up against a brick wall. Some information is accessible only to certain users. For instance, everyone is talking about Google Wave, but no one has access to it. The same thing happened when there was a beta version of Google Translator Toolkit. We could read about it, but hardly anyone had access to it. I’m looking to see whether someone in my personal network can invite me, but so far I haven’t had any luck, and I don’t think my luck would have been any better if I had thousands of friends on a social media site.

Read, understand and synthesize

Once I have chosen the pages that interest me and have ensured that the sources are independent, all that I have left to do is read and synthesize. Obviously, I occasionally put my two cents in.

To be honest, this part of the process takes a long time. I have to read carefully and check whether the text contains any indication that further research would lead me to find more information in another source. Companies that sell computer products, for example, very rarely mention their competitors.

However, once I know what category their software falls under, I can do a search on the category and often find "independent" comparisons. I remain wary of these, since supposedly independent analysis by anonymous bloggers could be a new form of advertising in which companies laud their own product by relating the comments of people who don’t exist.

The best way for me to confirm this is to see whether people who don’t like each other very much say the same thing. For instance, in the area of machine translation, I heard a story several years ago about an entrepreneur who specialized in breakdowns (his system broke down whenever someone showed up with real data to test). I was told the same story by people who are no longer on speaking terms.

If the topic I’m interested in also concerns users, I try to get in touch with them. They are my most precious source of information. Instead of just thinking that something might be useful to language professionals, I can find out what makes it useful, what can be improved and so on.

And that’s how I would sum up my quest for information in 2010.

Remarks

  • * There was a time when researchers published in their own language, but these days English is considered the lingua franca of research. We therefore see more and more authors opting for wider readership at the expense of their local culture.
  • ** RSS feeds alert users to changes in Web content (with hyperlinks). For instance, they would allow me to follow a particular journalist’s work. Certain browsers, such as Firefox, add bookmarks to my list when there are new pages.
  • *** Unfortunately, this trend is getting worse all the time. Author A constantly cites authors B, C and D, author B constantly cites authors A, C and D, and so on.