The Government of Canada’s terminology and linguistic data bank.


Record 1 2004-07-12


Subject field(s)
  • Statistical Graphs and Diagrams
  • Information Theory

Nevertheless, clustering real-world data sets often raises problems, since the data space is usually a high-dimensional feature space. A prominent example is the application of cluster analysis to gene expression data. ... In general, most of the common clustering algorithms fail to generate meaningful results because of the inherent sparsity of the data space. In such high-dimensional feature spaces, the data no longer form clusters.
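One way to see the sparsity problem described above is to measure how the gap between the nearest and farthest neighbour of a point shrinks, relative to the nearest distance, as the number of dimensions grows. The following is a minimal sketch, not taken from the quoted source: the function name relative_contrast, the uniform random data, the sample size and the seed are all assumptions chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Points are drawn uniformly from the unit hypercube [0, 1]^dim.
    This synthetic setup is an illustrative assumption, not real data.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        # Euclidean distance between the query and this point.
        dists.append(math.dist(query, point))
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


# The contrast typically drops sharply as the dimension increases.
for dim in (2, 12, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast is close to zero, every point is roughly as far from the query as every other point, which is one reason distance-based clustering algorithms struggle in such spaces.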


The modern world is full of large data sets: census data, multimedia databases, and remote sensing data, to name a few. With the aid of computers it is possible to collect scientific data on a grand scale. These examples are not "large" merely because they have many data points. They are also high-dimensional, meaning that many measurements are taken of each person, multimedia object, or point on the ground. The "data point" representing one person is made up of that person's age, gender, educational level, occupation, and whatever else was recorded. Suppose there are 12 attributes recorded for each person. We can think of that person's record as a point in 12-dimensional space.
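The idea of a record as a point in 12-dimensional space can be sketched in a few lines. Only the first four attribute names below come from the text; the remaining eight (attr_5 to attr_12), the numeric coding of categories, and the helper to_point are hypothetical placeholders for "whatever else was recorded".

```python
# The first four names come from the text; attr_5..attr_12 are
# hypothetical placeholders for the other recorded attributes.
ATTRIBUTES = [
    "age", "gender", "education_level", "occupation",
    "attr_5", "attr_6", "attr_7", "attr_8",
    "attr_9", "attr_10", "attr_11", "attr_12",
]


def to_point(record):
    """Convert a record (dict of numeric values) into a coordinate tuple.

    Assumes categorical attributes have already been coded as numbers.
    """
    return tuple(float(record[name]) for name in ATTRIBUTES)


# One person's record, with made-up values for illustration only.
person = {name: 0.0 for name in ATTRIBUTES}
person.update(age=34, gender=1, education_level=3, occupation=17)

point = to_point(person)
print(len(point))  # number of coordinates equals number of attributes
print(point)
```

Each attribute contributes one coordinate, so a data set of such records is a cloud of points in a space with as many dimensions as there are attributes.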


  • Diagrammes et graphiques (Statistique)
  • Théorie de l'information



Copyright notice for the TERMIUM Plus® data bank

© Public Works and Government Services Canada, 2020
TERMIUM Plus®, the Government of Canada's terminology and linguistic data bank
A product of the Translation Bureau

