# Microscopy colocalization theoretical background

The purpose of a colocalization coefficient is to characterize the degree of overlap between two channels in a microscopy image. Several of these coefficients are widely used in the literature and lend themselves, in principle, to comparison of results obtained in different studies. SVI's Colocalization Analyzer therefore focuses on these established coefficients. It should be noted, however, that specific properties of the coefficients, especially those related to the image background, make cross-study comparison problematic.

In general, the colocalization coefficients depend strongly on a correct estimate of the image background and on the image resolution. For these reasons we strongly recommend computing colocalization coefficients only on deconvolved images. Deconvolution has been shown to noticeably improve colocalization analysis (1, 2); see Blur And Noise Affect Colocalization. If the coefficients need to be computed from raw data, this function can compute the image background on a frame-by-frame basis.

If the image is affected by chromatic aberration, it is advised to correct the image with Huygens Chromatic Aberration Corrector.

See Colocalization Basics for illustrations of the experimental difficulties that affect colocalization.

In the definitions of the coefficients below we use the following naming convention for the two compared channels: R for the first channel and G for the second. The pixel values in the channels are Ri and Gi respectively, with i the pixel index. Since the coefficients do not take spatial relations into account, they hold for both 2D and 3D images; the index i runs from 0 to N - 1, where N is the total number of image elements.

For a discussion on the interpretation of colocalization coefficients, see ( 3 ).

## Coefficients and maps

The coefficients as defined below parametrize the full image, while the maps parametrize the colocalization locally. A single value is calculated per voxel, creating a 3D map that can be represented as a 3D image. The Colocalization Analyzer plots maps in the form of iso-colocalization surfaces, in which all points with a given colocalization level are joined to form 3D surfaces. In this way, regions in which the degree of colocalization exceeds an isosurface threshold become objects that can be analyzed independently.

Conventionally, the two data channels under comparison are called R and G. They are also known as the red and green channels, independently of the wavelength they actually registered.

### Pearson's colocalization coefficient

Pearson's linear correlation coefficient can be used to measure the overlap of the pixels. It is defined as follows:

$$r_\mathrm{p} = \frac {\sum ((R_i - R_{\mathrm{avg}}) (G_i - G_{\mathrm{avg}}))} {\sqrt{\sum (R_i-R_{\mathrm{avg}})^2 \sum (G_i - G_{\mathrm{avg}})^2}}$$

with Ravg and Gavg the averages of the R and G channels respectively, and the summations with index i running over all image voxels. The value of rp lies between -1 and 1. Negative values occur when, for the majority of voxels, values of R above Ravg coincide with values of G below Gavg. Subtracting the averages makes this coefficient independent of a constant image background.

The coefficient describes how well the red and green channels are related by a linear equation (G = a R + b), without saying anything about what that equation is. It will be equal to 1 if, for instance, all the red voxels are exactly twice as intense as the green ones, but also if the ratio is exactly 1.33, 0.7, or any other positive factor (plus an optional additive constant b). The closer rp is to ±1.0, the more linearly related the voxel intensities in the two channels are. The actual linear relationship does not affect the coefficient. Therefore, if for example the green channel detector produces a lower intensity output than the red channel detector, you can still calculate correlations. The coefficient is the ratio between the covariance of the channels and the product of their standard deviations.

A Pearson colocalization map MP consists of the following values:

$$M_{\mathrm{P},i} = \frac {(R_i - R_{\mathrm{avg}}) (G_i - G_{\mathrm{avg}})} {\sqrt{\sum (R_i - R_{\mathrm{avg}})^2 \sum (G_i - G_{\mathrm{avg}})^2}}$$
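The two definitions above can be illustrated with a minimal Python sketch. It assumes the channels are given as flattened lists of equal length; the function name `pearson` and the sample data are hypothetical and are not part of the Huygens software. The coefficient is obtained as the sum of the map values, which follows directly from the two formulas.

```python
import math

def pearson(r, g):
    """Return Pearson's coefficient and the per-voxel Pearson map."""
    n = len(r)
    r_avg = sum(r) / n
    g_avg = sum(g) / n
    # Per-voxel products of the deviations from the channel averages.
    products = [(ri - r_avg) * (gi - g_avg) for ri, gi in zip(r, g)]
    # Common normalization term shared by the coefficient and the map.
    norm = math.sqrt(sum((ri - r_avg) ** 2 for ri in r)
                     * sum((gi - g_avg) ** 2 for gi in g))
    pearson_map = [p / norm for p in products]
    return sum(pearson_map), pearson_map

red = [0.0, 1.0, 2.0, 3.0, 4.0]
green = [5.0, 7.0, 9.0, 11.0, 13.0]   # green = 2 * red + 5
r_p, m_p = pearson(red, green)
```

Because `green` is an exact linear function of `red` with a positive slope, `r_p` evaluates to 1 up to rounding. The sketch does not guard against a constant channel, for which the normalization term is zero.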

### Object Pearson's coefficient

This variant of Pearson's coefficient is available in Huygens version 3.6 and higher. It is the same as the Pearson's coefficient explained above, but with Ravg and Gavg the averages of the Ri and Gi values respectively, and with the index i running over the object voxels only, i.e. those i for which Ri or Gi is larger than a threshold intensity level. In this way the Object Pearson coefficient is no longer biased by large background areas.

$$r_{\mathrm{op}} = \frac {\sum_i ((R_i - R_{\mathrm{avg}}) (G_i - G_{\mathrm{avg}}))} {\sqrt{\sum_i (R_i-R_{\mathrm{avg}})^2 \sum_i (G_i - G_{\mathrm{avg}})^2}}, \quad \text{for } i \text{ such that } R_i > R_{\mathrm{thresh}} \vee G_i > G_{\mathrm{thresh}}$$
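As an illustration of the object restriction, the following Python sketch first selects the voxels where at least one channel exceeds its threshold and then applies the Pearson formula to that subset only. The function name `object_pearson` and the sample data are hypothetical, and a hard (binary) threshold is assumed.

```python
import math

def object_pearson(r, g, r_thresh, g_thresh):
    """Pearson's coefficient restricted to voxels above either threshold."""
    # Keep only voxels where at least one channel exceeds its threshold.
    pairs = [(ri, gi) for ri, gi in zip(r, g) if ri > r_thresh or gi > g_thresh]
    n = len(pairs)
    # The averages are taken over the object voxels only.
    r_avg = sum(ri for ri, _ in pairs) / n
    g_avg = sum(gi for _, gi in pairs) / n
    num = sum((ri - r_avg) * (gi - g_avg) for ri, gi in pairs)
    den = math.sqrt(sum((ri - r_avg) ** 2 for ri, _ in pairs)
                    * sum((gi - g_avg) ** 2 for _, gi in pairs))
    return num / den

# Twenty background voxels followed by a small object in which the
# channels are anti-correlated.
red = [0.0] * 20 + [4.0, 6.0, 8.0]
green = [0.0] * 20 + [8.0, 6.0, 4.0]
r_op = object_pearson(red, green, r_thresh=1.0, g_thresh=1.0)
```

Here `r_op` is -1, reflecting the anti-correlation inside the object. Applied to all 23 voxels, the standard Pearson formula would instead give roughly 0.84, because the shared background dominates the averages.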

### Spearman's coefficient

Spearman's coefficient is available in the Colocalization Analyzer in Huygens version 3.7 and higher. It is computed like Pearson's coefficient, but from the intensity ranks instead of the intensity values. This difference gives Spearman's coefficient the additional property that it can measure any monotonic dependency between two channels, whereas Pearson's coefficient only measures linear dependencies.

The intensity rank (Rr or Gr) of an intensity value (Ri or Gi) is the position of this value in the ordered list of all image intensities of the channel. It does not matter whether the order is decreasing or increasing, since the ranks are compared with the average rank. In Huygens the highest intensity value gets the highest rank (=1).

If there are ties, the rank of these intensities is the average position in the ordered list of image intensities. Spearman's coefficient is then defined as:

$$r_{\mathrm{s}} = \frac {\sum ((R_r - R_{\mathrm{avg}}) (G_r - G_{\mathrm{avg}}))} {\sqrt{\sum (R_r-R_{\mathrm{avg}})^2 \sum (G_r - G_{\mathrm{avg}})^2}}$$

with Ravg = Gavg = $$\frac{n+1}{2}$$ the average rank, and n the image volume (total number of voxels).

The corresponding Spearman colocalization map is then determined by the following values:

$$M_{\mathrm{S},i} = \frac {(R_r - R_{\mathrm{avg}}) (G_r - G_{\mathrm{avg}})} {\sqrt{\sum (R_r - R_{\mathrm{avg}})^2 \sum (G_r - G_{\mathrm{avg}})^2}}$$
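The ranking step, including the handling of ties, can be sketched in Python as follows. The helper names `average_ranks` and `spearman` are hypothetical. The sketch ranks in increasing order, which is an assumption made for simplicity; as noted above, the ordering direction does not change the coefficient.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1   # average of positions start+1 .. end+1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks

def spearman(r, g):
    """Pearson's formula applied to ranks, with average rank (n + 1) / 2."""
    rr, gr = average_ranks(r), average_ranks(g)
    avg = (len(r) + 1) / 2
    num = sum((a - avg) * (b - avg) for a, b in zip(rr, gr))
    den = math.sqrt(sum((a - avg) ** 2 for a in rr)
                    * sum((b - avg) ** 2 for b in gr))
    return num / den

red = [1.0, 2.0, 3.0, 4.0, 5.0]
green = [1.0, 4.0, 9.0, 16.0, 25.0]   # monotonic but non-linear in red
r_s = spearman(red, green)
tied = average_ranks([10.0, 20.0, 20.0, 30.0])
```

Since `green` increases monotonically with `red`, `r_s` equals 1 even though the relation is not linear. In the tie example, the two equal values occupy positions 2 and 3 and therefore both receive rank 2.5.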

For an example of Spearman's coefficient, see the Wikipedia article on Spearman's rank correlation coefficient.

### Object Spearman's coefficient

This coefficient is available in Huygens version 3.7 and higher. It is the same as Spearman's coefficient explained above, but with Ravg and Gavg the averages of the image rankings Rr and Gr of the voxels that belong to the object (i.e. that exceed a certain threshold intensity level). Like the Object Pearson's coefficient, the Object Spearman's coefficient is no longer biased by large background areas. It is therefore defined as follows:

$$r_{\mathrm{os}} = \frac {\sum_r ((R_r - R_{\mathrm{avg}}) (G_r - G_{\mathrm{avg}}))} {\sqrt{\sum_r (R_r-R_{\mathrm{avg}})^2 \sum_r (G_r - G_{\mathrm{avg}})^2}}, \quad \text{such that } R_r > R_{\mathrm{thresh}} \vee G_r > G_{\mathrm{thresh}}$$

### Overlap coefficient

Because the negative values of rp are not easy to interpret, the subtraction of the averages can be omitted to create an overlap coefficient as follows:

$$r_{\mathrm{o}} = \frac {\sum (R_i\ G_i) } {\sqrt{\sum R_i^2 \sum G_i ^2}}$$

The value of ro is between 0 and 1. Like Pearson's coefficient, it does not depend on the relative strengths of the channels; however, it does depend on the background intensity level.

The overlap colocalization map Mo consists of the following values:

$$M_{\mathrm{o},i} = \frac {R_i\ G_i} {\sqrt{\sum R_i^2 \sum G_i ^2}}$$
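The following Python sketch computes the overlap coefficient and its map, and illustrates the two properties mentioned above: insensitivity to the relative channel strength and sensitivity to background. The function name `overlap` and the sample data are hypothetical.

```python
import math

def overlap(r, g):
    """Return the overlap coefficient and the per-voxel overlap map."""
    norm = math.sqrt(sum(ri * ri for ri in r) * sum(gi * gi for gi in g))
    overlap_map = [ri * gi / norm for ri, gi in zip(r, g)]
    return sum(overlap_map), overlap_map

red = [0.0, 1.0, 2.0, 3.0]
green = [0.0, 3.0, 6.0, 9.0]   # green = 3 * red, no background

r_o, _ = overlap(red, green)

# Adding a constant background of 10 to the red channel lowers the value.
r_o_bg, _ = overlap([v + 10.0 for v in red], green)
```

Without background, `r_o` is 1 regardless of the factor 3 between the channels. With the added background, `r_o_bg` drops to about 0.86, although the underlying signal is unchanged.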

### Manders' coefficients

A consequence of the symmetric way in which both channels contribute to ro is that it cannot distinguish between non-colocalized signal being added to the R channel and non-colocalized signal being added to the G channel.

We may be interested in knowing how well the red pixels colocalize with the green ones, and vice versa. It may happen, for example, that all the red pixels overlap with green pixels but many of the green ones are "alone", in regions where no red signal is present (see e.g. the first example in Two Channel Histogram).

To make this distinction ro can be split in the following coefficients:

$$k_1 = \frac {\sum (R_i\ G_i)} {\sum R_i^2}$$

and

$$k_2 = \frac {\sum (R_i\ G_i)} {\sum G_i^2}$$

These coefficients allow distinction between the cases outlined above: addition of non-colocalized signal to G will not affect k1 but will affect k2 .

Still, these coefficients are not without disadvantages: k1 will scale proportionally with the signal strength in G, and similarly k2 will scale with the signal strength in R.

One way to render these coefficients independent of scaling effects is to replace Gi in the definition of k1 by 0 if Gi = 0 and by 1 otherwise. In effect this means taking the sum over all Ri for which Gi > 0. This yields the following coefficients, known as Manders' coefficients, where Rcoloc,i = Ri if Gi > 0 and 0 otherwise (and similarly for Gcoloc,i):

$$M_1 = \frac {\sum R_{\mathrm{coloc},i}} {\sum R_i}$$

and

$$M_2 = \frac {\sum G_{\mathrm{coloc},i}} {\sum G_i}$$

Please note that the upper-case M in these coefficients stands for 'Manders', not 'map'. The colocalization maps of the k1, k2, M1, M2 coefficients are constructed in the same way as for the previous coefficients.
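The difference between the k and M coefficients can be seen in a small Python sketch. It follows the definitions above without any threshold, so a voxel counts as colocalized when the other channel is strictly positive. The function name `manders` and the sample data are hypothetical.

```python
def manders(r, g):
    """Return (k1, k2, M1, M2) for two flattened channels, without thresholds."""
    sum_rg = sum(ri * gi for ri, gi in zip(r, g))
    k1 = sum_rg / sum(ri * ri for ri in r)
    k2 = sum_rg / sum(gi * gi for gi in g)
    # R_coloc,i is R_i where G_i > 0 and 0 elsewhere (and vice versa for G).
    m1 = sum(ri for ri, gi in zip(r, g) if gi > 0) / sum(r)
    m2 = sum(gi for ri, gi in zip(r, g) if ri > 0) / sum(g)
    return k1, k2, m1, m2

red = [2.0, 4.0, 0.0, 0.0]
green = [1.0, 2.0, 3.0, 5.0]   # green also has signal where red is empty
k1, k2, m1, m2 = manders(red, green)

# Doubling the green intensities doubles k1 but leaves M1 unchanged.
k1_scaled, _, m1_scaled, _ = manders(red, [2 * v for v in green])
```

All red signal lies on green voxels, so `m1` is 1, while only 3 of the 11 green intensity units lie on red voxels, giving `m2` of about 0.27. The scaled call shows the scaling dependence of k1 (0.5 becomes 1.0) that M1 avoids.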

### Intersection coefficients

All the coefficients explained above are based on voxel intensities, which in some situations may be difficult to interpret. Simpler (but probably less stable) coefficients can be calculated based just on whether there is some signal in a voxel or not, independent of its actual intensity value. For some illustrations of how these new coefficients arise, see Colocalization Coefficients with practical examples.

A voxel can be considered to have some interesting signal once its value is above a certain threshold intensity level. In such a case its value could be accounted for as 1, independent of its actual intensity, and otherwise it could be accounted for as 0. This in fact implies defining a red binary image Rweight with intensity Rweight,i at voxel i based on the real red intensity Ri and a red threshold intensity level Rthresh as:

$$R_{\mathrm{weight},i}=\begin{cases}0 & \text{ if } R_i\ \leq\ R_{\mathrm{thresh}} \\ 1 & \text{ if } R_{i}\ >\ R_{\mathrm{thresh}} \end{cases}$$

and similarly for Gweight,i, based on Gi and the green threshold intensity level Gthresh.

With noisy images this can generate quickly fluctuating 0-1 values around the background when intensities are close to Rthresh and Gthresh. To have smoother transitions around the background, a Soft Threshold can be defined in such a way that, within a certain range, a partial contribution between 0 and 1 is taken into account for some voxels:

$$R_{\mathrm{weight},i}=\begin{cases}0 & \text{ if } R_i\ \leq\ (R_{\mathrm{thresh}} - \mathrm{range}/2) \\ f(R_{i}) & \text{ if } (R_{\mathrm{thresh}} - \mathrm{range}/2)\ <\ R_i <\ (R_{\mathrm{thresh}} + \mathrm{range}/2) \\ 1 & \text{ if } R_{i}\ \geq\ (R_{\mathrm{thresh}} + \mathrm{range}/2) \end{cases}$$

The simplest function f(Ri) is a first order polynomial: intensities Ri between the range limits Rthresh ± range/2 are linearly mapped to values between 0 and 1. Now the Rweight and Gweight images are not binary anymore, but gray-valued with intensities between 0 and 1.
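A minimal Python sketch of the weight function, assuming the linear (first order) choice for f, is given below. The function name `soft_weight` and the parameter name `rng` (for the threshold range) are hypothetical. A range of 0 falls back to the hard threshold defined earlier.

```python
def soft_weight(value, thresh, rng):
    """Weight in [0, 1]: hard step if rng == 0, linear ramp otherwise."""
    if rng == 0:
        return 1.0 if value > thresh else 0.0
    low = thresh - rng / 2
    high = thresh + rng / 2
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    # First-order polynomial f: maps (low, high) linearly onto (0, 1).
    return (value - low) / (high - low)

samples = (85.0, 90.0, 95.0, 100.0, 110.0, 120.0)
weights = [soft_weight(v, thresh=100.0, rng=20.0) for v in samples]
hard = [soft_weight(v, thresh=100.0, rng=0) for v in samples]
```

With a threshold of 100 and a range of 20, intensities of 95 and 100 receive partial weights of 0.25 and 0.5, whereas the hard threshold assigns them 0.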

The intersection contribution of a given voxel can be defined as the product of Rweight and Gweight. The simplest case (hard threshold) implies that these contributions are always zero or one. In the soft threshold case, some pixels will contribute partially to the total coefficient with values between 0 and 1. In either case, the intersection coefficient is given by:

$$\mathrm{intersection} = \dfrac {\sum (R_{\mathrm{weight},i}\ G_{\mathrm{weight},i})} {\sum R_{\mathrm{weight},i} + \sum G_{\mathrm{weight},i} - \sum(R_{\mathrm{weight},i}\ G_{\mathrm{weight},i}) }$$

The numerator is the total intersecting volume (voxels with intensity in both channels). The denominator is the total volume of both channels together, calculated as the total red volume plus the total green volume minus the intersection volume (to avoid counting it twice).

We can also split the intersection coefficient to report what portion of the red and green volumes are intersecting:

$$i_1 = \frac {\sum (R_{\mathrm{weight},i} \ G_{\mathrm{weight},i})} {\sum R_{\mathrm{weight},i} }$$

$$i_2 = \frac {\sum (R_{\mathrm{weight},i} \ G_{\mathrm{weight},i})} {\sum G_{\mathrm{weight},i} }$$
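Given the two weight images, the three intersection coefficients follow directly from the sums above. The Python sketch below uses hard-threshold (binary) weights for readability, but the same code applies to soft-threshold weights. The function name and the sample weights are hypothetical.

```python
def intersection_coefficients(r_weight, g_weight):
    """Return (intersection, i1, i2) from per-voxel channel weights."""
    both = sum(rw * gw for rw, gw in zip(r_weight, g_weight))
    r_vol = sum(r_weight)
    g_vol = sum(g_weight)
    # Union volume: red plus green minus the doubly counted intersection.
    intersection = both / (r_vol + g_vol - both)
    return intersection, both / r_vol, both / g_vol

r_weight = [1.0, 1.0, 0.0, 0.0, 1.0]
g_weight = [1.0, 0.0, 1.0, 0.0, 1.0]
inter, i1, i2 = intersection_coefficients(r_weight, g_weight)
```

In this example two voxels carry signal in both channels, each channel covers three voxels, and the union covers four, so the intersection coefficient is 0.5 and both i1 and i2 are 2/3.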

### Li's ICQ

The intensity correlation quotient (ICQ) was defined in Li et al. (2004), A Syntaxin 1, Galpha0, and N-Type Calcium Channel Complex at a Presynaptic Nerve Terminal: Analysis by Quantitative Immunocolocalization, The Journal of Neuroscience, 24(16): 4070-4081. For each voxel "i", the following value is calculated:

$$(R_{i} - R_{\mathrm{avg}}) (G_{i} - G_{\mathrm{avg}})$$

By counting the voxels for which this value is positive and dividing this number by the total number of voxels, a ratio between 0 and 1 is generated. Then, 0.5 is subtracted from that ratio to map the ICQ to the -0.5 to +0.5 range. For random or mixed staining this number will tend towards 0, for segregated staining it will tend towards -0.5, and for dependent staining it will tend towards +0.5. By only using the polarity instead of the intensity of each voxel pair, this coefficient has the advantage that biases towards particularly high or low staining intensities are removed.
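The counting procedure can be sketched in Python as follows. The function name `icq` and the sample data are hypothetical. Voxels for which the product is exactly zero are counted as not positive here, which is an assumption of this sketch.

```python
def icq(r, g):
    """Intensity correlation quotient of two flattened channels."""
    n = len(r)
    r_avg = sum(r) / n
    g_avg = sum(g) / n
    # Count the voxels whose deviations from the averages have the same sign.
    positive = sum(1 for ri, gi in zip(r, g)
                   if (ri - r_avg) * (gi - g_avg) > 0)
    return positive / n - 0.5

red = [1.0, 2.0, 4.0, 5.0]
dependent = icq(red, [2.0, 3.0, 7.0, 8.0])    # green rises with red
segregated = icq(red, [8.0, 7.0, 3.0, 2.0])   # green falls as red rises
```

The two calls give the extreme values +0.5 and -0.5 respectively, because every voxel pair deviates from the averages in the same direction in the first case and in opposite directions in the second.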

### Van Steensel's CCF

Van Steensel's cross-correlation function (CCF) was described in van Steensel et al. (1996), Partial colocalization of glucocorticoid and mineralocorticoid receptors in discrete compartments in nuclei of rat hippocampus neurons, Journal of Cell Science 109, 787-792. It is obtained by calculating Pearson's coefficient after shifting the red image over a distance of dx voxels, where -20 ≤ dx ≤ 20. The CCF can be useful to evaluate whether non-random colocalization occurs, since non-random overlap will result in a peak at dx = 0 and non-random exclusion will result in a dip at dx = 0. Uncorrelated distributions will not show any clear peaks or dips in the CCF.
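To illustrate the shifting procedure, the following Python sketch computes a CCF for a one-dimensional line profile. Several simplifications are assumptions of this sketch rather than a description of the Huygens implementation: the data is 1D, only the overlapping part of the shifted profiles is used, the sign convention of dx is arbitrary, and the maximum shift is a parameter so that the short example profile can be used. The function names are hypothetical.

```python
import math

def pearson(r, g):
    n = len(r)
    r_avg, g_avg = sum(r) / n, sum(g) / n
    num = sum((a - r_avg) * (b - g_avg) for a, b in zip(r, g))
    den = math.sqrt(sum((a - r_avg) ** 2 for a in r)
                    * sum((b - g_avg) ** 2 for b in g))
    return num / den

def ccf(r, g, max_shift):
    """Pearson's coefficient for each shift dx of the red profile."""
    n = len(r)
    result = {}
    for dx in range(-max_shift, max_shift + 1):
        # The shifted red voxel r[i - dx] is compared with g[i];
        # only indices where both profiles are defined are used.
        lo, hi = max(0, dx), min(n, n + dx)
        shifted_red = [r[i - dx] for i in range(lo, hi)]
        result[dx] = pearson(shifted_red, g[lo:hi])
    return result

red = [0.0, 1.0, 0.0, 3.0, 9.0, 3.0, 0.0, 1.0, 0.0, 0.0]
green = list(red)   # identical channels: perfect overlap at dx = 0
curve = ccf(red, green, max_shift=3)
```

For identical channels the curve peaks at dx = 0 with a value of 1 and falls off for other shifts, which is the signature of non-random overlap described above.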

## Threshold handling

### Pearson's coefficient

In principle, Pearson's coefficient is unaffected when a constant value (threshold) is subtracted from either of the channels. However, to handle specified threshold settings consistently across all coefficients computed by the Colocalization Analyzer, thresholds are taken into account. If you specify threshold (and threshold range) values in the Colocalization Analyzer, those values will be subtracted from all the pixel intensities when calculating colocalization. Any resulting negative pixel values are set to zero. This means that a user-specified threshold value can change the resulting Pearson's coefficient. To calculate Pearson's coefficient following its standard definition, the threshold and threshold range should both be set to 0.
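The effect of the clipping step can be made concrete with a short Python sketch. It assumes a threshold range of 0, so that only the threshold itself is subtracted; the function names and sample data are hypothetical.

```python
import math

def pearson(r, g):
    n = len(r)
    r_avg, g_avg = sum(r) / n, sum(g) / n
    num = sum((a - r_avg) * (b - g_avg) for a, b in zip(r, g))
    den = math.sqrt(sum((a - r_avg) ** 2 for a in r)
                    * sum((b - g_avg) ** 2 for b in g))
    return num / den

def subtract_threshold(channel, thresh):
    """Subtract a constant threshold and set negative results to zero."""
    return [max(v - thresh, 0.0) for v in channel]

red = [1.0, 2.0, 3.0, 10.0]
green = [2.0, 4.0, 6.0, 20.0]   # green = 2 * red

before = pearson(red, green)
clipped_red = subtract_threshold(red, 2.5)
after = pearson(clipped_red, green)
```

Before thresholding the channels are perfectly linearly related and `before` is 1. After subtracting 2.5 from the red channel, the two lowest voxels are both clipped to zero, the relation is no longer linear, and `after` drops to 89/90 (about 0.989). A pure subtraction without clipping would have left the coefficient unchanged.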

### Object Pearson's coefficient

The per-channel threshold settings in the Colocalization Analyzer can be used to determine the values of Rthresh and Gthresh. A Soft Threshold can also be applied by setting a threshold range value. If binary object thresholding is desired, the threshold range should be set to 0. Similar to the calculation of Pearson's coefficient, negative pixel values will be set to zero (which can change the resulting coefficient).

### Spearman's and Object Spearman's coefficient

The per-channel threshold setting in the Colocalization Analyzer can be used to specify a binary threshold. In the case of Spearman's coefficient, voxels with intensities below this value will be given the same rank (they will be ranked as 'background'). In the case of the Object Spearman's coefficient, voxels below the threshold are not included in the calculation. The threshold range setting has no influence on Spearman's and Object Spearman's coefficients.

### Overlap coefficient

Similar to the handling of Pearson's coefficient, the per-channel threshold and threshold range values are subtracted from the pixel values, where negative pixel values will be set to 0 (which can change the resulting coefficient).

### Manders' coefficients

As the Manders coefficients as defined above are all sensitive to background intensity levels, Ri and Gi can be corrected for the background. For the k1 and k2 coefficients, similar to the threshold handling of Pearson's and the Overlap coefficient, the per-channel threshold and threshold range are subtracted from all pixel values. In cases where this yields a negative value the pixel value is set to zero. For the M1 and M2 coefficients, the Soft Threshold is also applied. Each element will be calculated as follows:

$$R_{\mathrm{coloc},i} = R_i \ G_{\mathrm{weight},i}$$
$$G_{\mathrm{coloc},i} = G_i \ R_{\mathrm{weight},i}$$

To retrieve Manders' original definition, the threshold and threshold range should be set to 0.
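The weighted sums for M1 and M2 can be sketched in Python as follows. The sketch takes the intensities and the weight images as given inputs, so it makes no assumption about how the background correction of Ri and Gi or the weights themselves were obtained. The function name `manders_weighted` and the sample values are hypothetical.

```python
def manders_weighted(r, g, r_weight, g_weight):
    """Return (M1, M2) using the weight of the other channel per voxel."""
    # R_coloc,i = R_i * G_weight,i and G_coloc,i = G_i * R_weight,i
    r_coloc = [ri * gw for ri, gw in zip(r, g_weight)]
    g_coloc = [gi * rw for gi, rw in zip(g, r_weight)]
    return sum(r_coloc) / sum(r), sum(g_coloc) / sum(g)

red = [4.0, 2.0, 6.0, 0.0]
green = [5.0, 0.0, 1.0, 3.0]
r_weight = [1.0, 0.5, 1.0, 0.0]    # example soft-threshold weights
g_weight = [1.0, 0.0, 0.25, 1.0]
m1, m2 = manders_weighted(red, green, r_weight, g_weight)
```

The third voxel shows the effect of the soft threshold: its green weight of 0.25 means that only a quarter of its red intensity is counted as colocalized in M1.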

### Intersection coefficient

As described above, in calculating the intersection coefficient, the per-channel threshold settings can be used to specify the threshold intensity levels Rthresh and Gthresh. A Soft Threshold can also be applied by setting a threshold range value.

### Li's ICQ

Similar to the handling of Pearson's and the Overlap coefficient, the per-channel threshold and threshold range values are subtracted from the pixel values, where negative pixel values will be set to 0 (which can change the resulting coefficient).

### Van Steensel's CCF

For Van Steensel's CCF, the same threshold handling is used as for Pearson's coefficient.

## Further information

### Scaling affects ratiometry

As explained above, most of the colocalization parameters are not affected by scaling the relative strengths of the channels. This does not apply to direct channel ratios, of course. See Ratiometric Images for further information.

### Colocalization in the Huygens Software

See the Tcl Huygens command coloc.