Tuesday, March 10, 2015

Measurement Theory

Just because something is expressed as a number, doesn't mean you can do arithmetic with it. Let's say I give you three numbers: 1,2,3. What's the mean of these numbers? (1+2+3)/3=2, right? Well, what if I told you that 1=apple, 2=pear and 3=banana? What's the mean of an apple, a pear and a banana? Plainly, the question is ridiculous. Yet there is still a substantial number of people in science, and in computational intelligence, who fall into this trap, especially when presenting data to models like neural networks.

Neural networks aren't magic: they can't tell what the numbers submitted to their input layers mean. They just multiply those numbers by their connection weights, sum the products and apply a transformation function to the result. So if the numbers you are submitting to them represent classes, rather than measurements, what are they really modelling?
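To make this concrete, here is a minimal sketch of a single neuron fed the fruit classes in two ways. The neuron function, the weights and the one-hot representation are illustrative assumptions rather than anything prescribed above; the point is only that an integer code forces an ordering and spacing onto the classes that the neuron cannot ignore.

```python
import math

# Arbitrary integer codes for the three classes from the example above.
CODES = {"apple": 1, "pear": 2, "banana": 3}

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a logistic function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def integer_input(fruit):
    """Present the class as a single number (treats the code as a quantity)."""
    return [float(CODES[fruit])]

def one_hot_input(fruit):
    """Present the class as one input per class, so no order is implied."""
    return [1.0 if fruit == name else 0.0 for name in CODES]

# With one integer input, the weighted sum for "pear" is forced to sit
# exactly halfway between the sums for "apple" and "banana".
weight = 0.5  # hypothetical weight
sums = {fruit: integer_input(fruit)[0] * weight for fruit in CODES}

# With one input per class, each class gets its own independent weight.
class_weights = [0.5, -2.0, 1.5]  # hypothetical weights
outputs = {fruit: neuron(one_hot_input(fruit), class_weights) for fruit in CODES}
```

With the integer code, whatever weight the network learns, "pear" is always treated as the midpoint of "apple" and "banana". With one input per class, the neuron is free to respond to each class separately.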

Measurement is the most fundamental part of data collection, as all natural data originates as measurements of properties of events. Measurements should represent reality, and relationships between measurements should reflect the relationships between attributes. This is most important if consideration is given to the principle that data represents reality: as the source of data, measurements must yield an adequate representation of reality.

Measurement theory, as originated by Stevens, helps us achieve this. By specifying and formalising what exactly measurement is, we can better use measurement to gather data. By understanding exactly what the numbers mean, we can better analyse and transform the data into information and knowledge, while avoiding such traps as making meaningless statements about the numbers or performing a meaningless transformation on the data. A crucial point to bear in mind is that measurements represent reality but are not the same as reality.

Measurement Scales

At the heart of Stevens's measurement theory is the concept of measurement scales. Four such scales are defined (although others have been added since), and the scales are distinguished according to four characteristics:
  • Distinctiveness: individuals are assigned different values if the property being measured is different.
  • Ordering in magnitude: larger numbers represent greater quantities of the property being measured.
  • Equal intervals: a difference in measurement represents the same difference in the property.
  • Absolute zero: a measurement of zero represents an absence of the property being measured.

These four characteristics define the “strength” of the measurement scale. The scales, from “weakest” to “strongest” are:
  • Nominal
  • Ordinal
  • Interval
  • Ratio
  • Absolute

Also associated with each of the measurement scales are specific permissible statistics and transformations. The term permissible is slightly misleading: if a statistic is not permissible for a certain scale, it is not forbidden. Rather, the results of that statistic or transformation are not reliable, with the degree of unreliability determined by the way in which the measurements were made. Permissible statistics and transformations are simply those that yield reliable results. A permissible statistic tells us something meaningful about the data, while a permissible transformation maintains the properties of the data appropriate to the particular scale. A statistic may still be applied to data from a scale for which it is impermissible, and it may yield useful results, but these results need to be treated with caution and interpreted within the context of the original measurements. Note also that permissible statistics are cumulative across scales: all statistics permissible for a weaker scale are also permissible for a stronger scale. Permissible transformations run the other way: each stronger scale admits only a subset of the transformations permitted on the weaker scales.
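The cumulative nature of permissible statistics can be sketched as a small lookup. The statistics are those named for each scale in the sections that follow; the function and variable names are my own, and the absolute scale is left out because no additional statistics are listed for it here.

```python
# Scales ordered from weakest to strongest.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Statistics first introduced at each scale, as listed in this post.
INTRODUCED = {
    "nominal": ["number of cases", "mode"],
    "ordinal": ["median", "percentiles"],
    "interval": ["mean", "standard deviation",
                 "rank-order correlation", "product-moment correlation"],
    "ratio": ["coefficient of variation"],
}

def permissible_statistics(scale):
    """Return every statistic permissible at `scale`, including weaker scales."""
    if scale not in SCALES:
        raise ValueError(f"unknown scale: {scale!r}")
    allowed = []
    for name in SCALES[: SCALES.index(scale) + 1]:
        allowed.extend(INTRODUCED[name])
    return allowed

def is_permissible(statistic, scale):
    """True if `statistic` yields reliable results for data on `scale`."""
    return statistic in permissible_statistics(scale)
```

For example, is_permissible("mode", "ratio") is true because the mode is inherited from the nominal scale, while is_permissible("mean", "ordinal") is false.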

Nominal Scale

The nominal scale is the weakest of the measurement scales. It possesses only the characteristic of distinctiveness. In other words, if the same attribute of two individuals is assigned the same number, then the attributes are identical. No other conclusions may be drawn from those numbers, as they are simply arbitrary numeric labels. For example, the colours Red, Green, and Blue can be placed on the nominal scale with the measurements Red=1, Green=2, Blue=3. However, two reds do not make a green. They could just as easily be labelled Green=1, Blue=2, Red=3, or any other permutation, without altering their distinctiveness. The only permissible statistics for nominal scale measurements are the number of cases and the mode. Permissible transformations are permutations and one-to-one substitutions.
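A quick way to see why the mode is permissible and the mean is not is to relabel the colours and recompute both. The sample of colours below is made up for illustration; the two codings are the ones given above.

```python
from statistics import mean, mode

# A made-up sample of nominal observations.
colours = ["Red", "Green", "Blue", "Red", "Red", "Blue"]

# The two equally valid codings from the text.
coding_a = {"Red": 1, "Green": 2, "Blue": 3}
coding_b = {"Green": 1, "Blue": 2, "Red": 3}

def summarise(labels, coding):
    """Encode the labels, then report count, mode (decoded) and mean."""
    numbers = [coding[label] for label in labels]
    decode = {number: name for name, number in coding.items()}
    return {
        "count": len(numbers),
        "mode": decode[mode(numbers)],
        "mean": mean(numbers),
    }

summary_a = summarise(colours, coding_a)
summary_b = summarise(colours, coding_b)
```

Both codings agree on the count and on Red as the mode, but the "mean colour" changes with the labelling, which is a sign that it says nothing about the colours themselves.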

Ordinal Scale

Measurements on the ordinal scale have the properties of distinctiveness and ordering in magnitude. In other words, objects are ordered in the scale according to some pair-wise comparison. That is, measurements on this scale can be compared to one another with the equality, greater than or less than operators. However, while we can say that one measurement is greater than or less than another, we cannot say how different they are. Numbers in this scale are categories; they do not have the arithmetic properties of numbers. An example of an ordinal scale measurement is teaching evaluations: a teacher’s performance is evaluated by students over several variables, with the performance being rated from one to five, with one being “Poor” and five being “Excellent”. While it is meaningful to draw the conclusion that a score of four is better than a score of two, it is not meaningful to draw the conclusion that a score of four is twice as good as a score of two, nor is it meaningful to say that a score of five is the same “distance” from a score of three as a score of three is from one. Permissible statistics introduced at the ordinal scale are medians and percentiles. Permissible transformations are monotonic increasing functions, that is, any transformation that maintains the order of the individuals.
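As a rough illustration, the sketch below applies a monotonic increasing transformation (squaring, for positive ratings) to a made-up set of evaluation scores. The median follows the transformation and the mean does not. An odd number of ratings is used deliberately, so that the median is an actual observed rating rather than an interpolated value.

```python
from statistics import mean, median

# Made-up teaching evaluation scores on the one-to-five scale.
ratings = [1, 2, 2, 4, 5]

def square(x):
    """Monotonic increasing for positive inputs, so order is preserved."""
    return x * x

transformed = [square(r) for r in ratings]

# Transforming the median gives the median of the transformed scores.
median_commutes = median(transformed) == square(median(ratings))

# The same is not true of the mean.
mean_commutes = mean(transformed) == square(mean(ratings))
```

Here median_commutes is true and mean_commutes is false: the median depends only on order, while the mean depends on the spacing between scores, which the ordinal scale does not define.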

Interval Scale

Measurements on the interval scale have the characteristics of distinctiveness, ordering in magnitude and equal intervals. In this scale, objects are placed in order on a number line with an arbitrary zero point and an arbitrary interval between objects. While the numerical values have no significance other than as labels, differences between the values do have meaning. An example of an interval scale is the date in years. The common era (CE) scale has an arbitrary zero point (set at the putative time of the birth of Christ) and equally sized intervals (the length of a year does not vary, excepting leap years, which actually make up for errors caused by the slight mismatch between the arbitrary length of the year set at 365 days and the actual length of the Earth’s orbit). It is meaningful to say that 1973 is later than 1928, and that the difference between 1999 and 1973 is twice the difference between 1986 and 1973. It is not meaningful, however, to say that 2004 is twice the year that 1002 was. Permissible statistics introduced at the interval scale are the mean, standard deviation, rank-order correlation and product-moment correlation. Permissible transformations are linear transformations of the form y=ax+b, where x is the measurement and the constant a must be positive. In other words, permissible transformations are those transformations that preserve the order of the objects, and the relative intervals between them.
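The year example can be checked numerically. In this sketch the constants a=3 and b=-500 are arbitrary choices standing in for some other calendar with a different zero point and unit; they do not correspond to any real calendar.

```python
def rescale(x, a=3, b=-500):
    """A linear transformation y = ax + b with a > 0 (constants are arbitrary)."""
    return a * x + b

def ratio_of_differences(p, q, r, s):
    """How many times larger the interval p-q is than the interval r-s."""
    return (p - q) / (r - s)

# Ratios of differences survive the transformation...
before = ratio_of_differences(1999, 1973, 1986, 1973)
after = ratio_of_differences(
    rescale(1999), rescale(1973), rescale(1986), rescale(1973)
)

# ...but ratios of the values themselves do not.
value_ratio_before = 2004 / 1002
value_ratio_after = rescale(2004) / rescale(1002)
```

The ratio of differences is 2 both before and after rescaling. The ratio of values is 2 on the CE scale but not on the rescaled one, so "twice the year" is an artefact of where zero happens to be.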

Ratio Scale

Measurements on the ratio scale have the characteristics of distinctiveness, ordering in magnitude, equal intervals and absolute zero. In this scale, objects are placed in order on a number line with equally sized intervals and a true zero point. A measurement of zero on the ratio scale indicates the absence of the property being measured. A ratio scale can also be defined as the differences between two interval measures: a difference of zero between two interval measurements indicates an absence of difference. In the ratio scale, the values themselves have significance, as do the differences and ratios of those values. Many properties in physics are ratio scale measurements. An example of this is speed. An object with a speed of zero isn’t moving, that is, it has no speed, while an object moving at fifty metres per second is twice as fast as an object moving at twenty-five metres per second.

The permissible statistic introduced at the ratio scale is the coefficient of variation, and permissible transformations are multiplications by a positive constant, that is, y=ax where a is greater than zero.
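Changing units is the everyday case of y=ax. The sketch below converts some made-up speeds from metres per second to kilometres per hour (a=3.6) and checks that ratios and the coefficient of variation are unchanged, then shows that adding a constant breaks both. The sample standard deviation is used here; that choice is an assumption and does not affect the conclusion.

```python
import math
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Made-up speeds in metres per second.
speeds_ms = [10.0, 25.0, 50.0]

# Permissible for the ratio scale: multiply by a positive constant.
speeds_kmh = [3.6 * v for v in speeds_ms]

# Not permissible for the ratio scale: shifting the zero point.
shifted = [v + 100.0 for v in speeds_ms]

ratio_kept = math.isclose(speeds_kmh[2] / speeds_kmh[1], 2.0)
cv_kept = math.isclose(
    coefficient_of_variation(speeds_ms), coefficient_of_variation(speeds_kmh)
)
ratio_broken = not math.isclose(shifted[2] / shifted[1], 2.0)
cv_broken = not math.isclose(
    coefficient_of_variation(speeds_ms), coefficient_of_variation(shifted)
)
```

All four flags come out true: fifty is still twice twenty-five after the unit change, but not after the shift.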

Absolute Scale

Whereas measurements on the ratio scale have an absolute zero point, measurements on the absolute scale have an absolute zero and an absolute upper bound. The classical example of this is probabilities: the probability of an event can range from zero (the event will never happen) to one (the event will always happen). A probability of less than zero or greater than one is meaningless.

Only the identity transformation, y=x, is permissible for measurements on the absolute scale: any rescaling or shift would change the meaning of the values.

Transforming Between Scales

It is possible to transform a measurement made on a particular measurement scale to a weaker scale only. This transformation will involve a loss of information, and cannot be reversed. In other words, it is not possible to transform to a higher measurement scale. For example, consider the heights, in metres, of a group of three people. One person is 1.4 metres tall, the second is 1.8 metres tall, and the third is 2 metres tall. If we say that a person’s height is 1 if they are short, 2 if they are average and 3 if they are tall, then it is possible to transform these ratio scale measurements into the ordinal scale, by assigning the first person’s height a value of 1, the second a value of 2 and the third a value of 3. However, if we know only that a person’s height is 2 on this scale, we cannot determine exactly what their true height is.
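The height example might look like this in code. The cut-off values are hypothetical, since no boundaries between short, average and tall are given above; any other thresholds would make the same point.

```python
# Hypothetical cut-offs in metres; the boundaries are an assumption.
SHORT_BELOW = 1.6
TALL_FROM = 1.9

def height_to_ordinal(height_m):
    """Map a ratio scale height to 1 (short), 2 (average) or 3 (tall)."""
    if height_m < SHORT_BELOW:
        return 1
    if height_m < TALL_FROM:
        return 2
    return 3

heights = [1.4, 1.8, 2.0]
codes = [height_to_ordinal(h) for h in heights]

# Different heights collapse onto the same code, so the mapping has no inverse.
same_code = height_to_ordinal(1.7) == height_to_ordinal(1.8)
```

The three people receive the codes 1, 2 and 3, but heights of 1.7 and 1.8 metres both map to 2, so nothing in the code alone can tell us which height it came from.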


The major implication of this is that data must be collected with great care. Once a measurement is made on a particular measurement scale, it cannot be transformed into a higher scale. Once the measurement is made, no further information can be associated with it.

You must know which scale the measurements belong to, as it will determine what you can meaningfully do with the data. It will also inform how you represent the data for presentation to your models. A working knowledge of measurement theory, therefore, is essential for any serious practitioner in computational intelligence.