# 16.10 Galaxy Clustering

We live in a universe filled with structure. From stars tracing out the arms of galaxies to galaxies tracing out a lace-like pattern of voids and filaments, gravity shapes material into complex structures on every size scale. One of the largest types of structure is the galaxy cluster. These structures consist of anywhere from a few to a few tens of thousands of galaxies clumped together on the sky and held together by gravity. In the more massive clusters, galaxies regularly interact with one another, and these interactions lead to dust and gas being stripped from the galaxies and falling into the cluster's core. This dense, gravitationally heated material can give off X-rays. Often, a special type of elliptical galaxy called a cD galaxy sits at the center of these galactic cities.

The Coma cluster of galaxies, as seen by the Hubble Space Telescope.

Random chance predicts that if any set of objects is randomly scattered, some of them will end up with nearby neighbors while most will not. In looking for clusters on the sky, astronomers compare the actual distribution of galaxies to theoretical random distributions. This is a two-step task. First, astronomers look for places where the number of galaxies on the two-dimensional sky is greater than expected. Second, when areas with too many galaxies are located, astronomers check whether the galaxies are at the same distance. When galaxies are found clumped together at the same distance and in the same place on the sky, we call these clumps galaxy clusters and galaxy groups.

The significance of these clusters can be measured by comparing the number of observed galaxies directly to the number of galaxies expected in a random distribution:

Clustering amplitude = (N_{observed} – N_{random}) / N_{random}

N is the number of galaxies observed in a particular area of the sky, or the number of galaxies in a particular volume if we can measure galaxy distances as well. It's an average, so clustering can be measured in regions of high and low galaxy density equally well. The clustering amplitude is the excess probability of finding a neighboring galaxy, relative to the probability if the galaxies were randomly distributed.
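As a minimal sketch, the definition above can be written as a small Python function. The function name and the third example (40 galaxies observed where 100 are expected) are illustrative assumptions, not values from any survey.

```python
def clustering_amplitude(n_observed, n_random):
    """Fractional excess of galaxies relative to a random distribution."""
    if n_random <= 0:
        raise ValueError("n_random must be positive")
    return (n_observed - n_random) / n_random


print(clustering_amplitude(115, 100))  # 0.15: a 15% excess of galaxies
print(clustering_amplitude(100, 100))  # 0.0: consistent with a random distribution
print(clustering_amplitude(40, 100))   # -0.6: a deficit (hypothetical numbers)
```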

To make these measurements, astronomers take any galaxy and count its companions out to a certain angular radius. That number is N_{observed}. They then calculate how many galaxies would be found within a circle of the same radius in a random distribution; this follows directly from the average surface density of galaxies and the area of the circle. That number is N_{random}. The difference between the two is the excess number of galaxies within a certain angle. The clustering amplitude is the fractional excess of galaxies caused by clustering, often quoted as a percentage. We can calculate the clustering amplitude around each galaxy in the catalog and combine the numbers. We can also calculate the clustering amplitude on all angular scales. The result is called an angular correlation function. This function describes the amount of clustering on all scales.
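The counting procedure above can be sketched in Python as follows. This is a toy version only: it assumes a small square field where flat-sky geometry is adequate, it skips galaxies too close to the field edge, and both mock catalogs are invented for illustration. The function name `angular_clustering` is hypothetical, and survey analyses use more careful estimators than this.

```python
import math
import random


def angular_clustering(positions, radii, field_size):
    """Counts-in-circles clustering amplitude at several angular radii.

    positions  : list of (x, y) sky positions in degrees (flat-sky assumption)
    radii      : angular radii of the counting circles, in degrees
    field_size : side of the square survey field, in degrees (assumed geometry)
    Returns a dict mapping each radius to the mean clustering amplitude.
    """
    surface_density = len(positions) / field_size**2  # galaxies per square degree
    result = {}
    for radius in radii:
        n_random = surface_density * math.pi * radius**2
        # Keep only centers whose circle lies fully inside the field,
        # so no companions are lost off the edge of the map.
        centers = [p for p in positions
                   if all(radius <= c <= field_size - radius for c in p)]
        companions = 0
        for center in centers:
            # The center lies at zero distance from itself, so subtract 1.
            companions += sum(
                1 for p in positions if math.dist(center, p) <= radius) - 1
        n_observed = companions / len(centers)
        result[radius] = (n_observed - n_random) / n_random
    return result


# Hypothetical mock catalogs in a 10 x 10 degree field.
random.seed(42)
scattered = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000)]
clumped = []
for _ in range(20):
    cx, cy = random.uniform(1, 9), random.uniform(1, 9)
    clumped += [(random.gauss(cx, 0.1), random.gauss(cy, 0.1)) for _ in range(50)]

w_scattered = angular_clustering(scattered, [0.25, 0.5, 1.0], 10.0)
w_clumped = angular_clustering(clumped, [0.25, 0.5, 1.0], 10.0)
for radius in w_clumped:
    print(f"{radius:4.2f} deg  random: {w_scattered[radius]:+.2f}  "
          f"clumped: {w_clumped[radius]:+.2f}")
```

For the randomly scattered catalog the amplitude should stay close to zero at every radius, while the clumped catalog should show a large positive amplitude that shrinks as the circle grows.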

However, we know that galaxies are distributed in three dimensions. In a two-dimensional map, there will be close alignments of nearby and distant galaxies that occur by chance. These chance alignments dilute, or wash out, the correlation signal. If a sample of galaxies has measured redshifts, we can use the redshift as a distance indicator and calculate the clustering amplitude in three dimensions. The procedure and the calculation are the same as given above. This time N_{observed} is the number of companions a galaxy has within a sphere of a certain size. N_{random} is the average number of galaxies in a sphere of the same size if the galaxies are distributed randomly in space. The result is called a spatial correlation function.
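To illustrate the dilution, the sketch below measures the clustering amplitude of the same mock catalog twice: once in three dimensions, and once after discarding the distance coordinate to mimic a flat sky map. Everything here is an assumption made for illustration: the cubic 100 Mpc box, the 20 artificial clumps, the 5 Mpc counting radius, and the hypothetical function name `neighbor_excess`. A real survey works with sky angles and redshifts rather than a simple box.

```python
import math
import random


def neighbor_excess(points, radius, box_size):
    """Mean clustering amplitude within `radius`, for 2-D or 3-D points."""
    dim = len(points[0])
    if dim == 2:
        region = math.pi * radius**2              # area of the counting circle
    else:
        region = 4.0 / 3.0 * math.pi * radius**3  # volume of the counting sphere
    n_random = len(points) / box_size**dim * region
    # Use only centers whose circle or sphere lies fully inside the box.
    centers = [p for p in points
               if all(radius <= c <= box_size - radius for c in p)]
    companions = 0
    for center in centers:
        # Subtract 1 so that a galaxy is not counted as its own companion.
        companions += sum(1 for p in points if math.dist(center, p) <= radius) - 1
    n_observed = companions / len(centers)
    return (n_observed - n_random) / n_random


# Hypothetical mock catalog: 20 clumps of 30 galaxies in a 100 Mpc box.
random.seed(1)
box = 100.0
points_3d = []
for _ in range(20):
    cx, cy, cz = (random.uniform(10, 90) for _ in range(3))
    for _ in range(30):
        points_3d.append((random.gauss(cx, 1.0),
                          random.gauss(cy, 1.0),
                          random.gauss(cz, 1.0)))
points_2d = [(x, y) for x, y, _ in points_3d]  # drop the distance coordinate

xi_3d = neighbor_excess(points_3d, 5.0, box)
xi_2d = neighbor_excess(points_2d, 5.0, box)
print(f"3-D amplitude: {xi_3d:.1f}   projected 2-D amplitude: {xi_2d:.1f}")
```

In this toy setup the projected amplitude comes out well below the three-dimensional one, because the random expectation in projection includes galaxies at every distance along the line of sight.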

If N_{observed} = N_{random} on average, the clustering amplitude is zero. This would mean galaxies are randomly distributed. If N_{observed} > N_{random}, then the clustering amplitude is positive. Astronomers have found that the clustering amplitude is largest on small scales of about 1 Mpc. Galaxies are gregarious: the most likely place to find a galaxy is near another galaxy! The clustering amplitude is somewhat smaller on scales around 10 Mpc. By scales of 100 Mpc or larger, the clustering amplitude is very low, so the universe is smooth on those scales. It is even possible to measure a negative clustering amplitude, where N_{observed} < N_{random}. In this case, galaxies have been pulled out of one volume of space, leaving behind a void, as they fall into a cluster.

It takes a lot of data to measure clustering, especially when looking for small groups. Suppose that we are interested in measuring clustering on a scale of 1 Mpc, but galaxies are so thinly distributed that on average there are only a couple of galaxies in a 1 Mpc sphere. If N_{observed} and N_{random} are both small numbers, then the clustering amplitude will have large fluctuations due to small-number statistics, and the measurement of clustering will be noisy.
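A short simulation can make the noise problem concrete. The sketch below scatters galaxies purely at random, so the true clustering amplitude is zero, and then measures how much the amplitude in a single cell fluctuates from one trial to the next. The catalog size, cell fractions, trial count, and the function name `amplitude_scatter` are all assumptions chosen for illustration.

```python
import random
import statistics


def amplitude_scatter(n_galaxies, cell_fraction, trials=300):
    """Scatter of the clustering amplitude for a purely random distribution.

    Each trial scatters `n_galaxies` at random and counts those landing in a
    cell covering `cell_fraction` of the survey, so the true amplitude is zero.
    """
    n_random = n_galaxies * cell_fraction  # expected count in the cell
    amplitudes = []
    for _ in range(trials):
        n_observed = sum(random.random() < cell_fraction
                         for _ in range(n_galaxies))
        amplitudes.append((n_observed - n_random) / n_random)
    return statistics.pstdev(amplitudes)


random.seed(7)
sparse = amplitude_scatter(2000, 0.001)  # about 2 galaxies expected per cell
dense = amplitude_scatter(2000, 0.1)     # about 200 galaxies expected per cell
print(f"scatter with ~2 expected galaxies:   {sparse:.2f}")
print(f"scatter with ~200 expected galaxies: {dense:.2f}")
```

With only a couple of galaxies expected per cell, the measured amplitude swings by a large fraction of its full range even though no clustering is present. With a couple of hundred expected, the swings are roughly ten times smaller.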

In general, the number of galaxies expected in a random distribution has a statistical error of σ_{Nrandom}, which for simple counting statistics is the square root of N_{random}. To detect clustering, we require the excess number of galaxies to exceed the random fluctuations due to counting statistics, or N_{observed} – N_{random} > σ_{Nrandom}. For example, if N_{observed} = 115 on scales of 10 Mpc and N_{random} = 100, the clustering amplitude is (115 - 100)/100 = 0.15, or a 15% excess of galaxies. However, the excess of 115 - 100 = 15 is not much larger than the statistical fluctuation of σ_{Nrandom} = √100 = 10, so this is not a very convincing measurement. Looking for large clusters, spread over larger areas on the sky, is a bit easier. If N_{observed} = 1150 and N_{random} = 1000 for our larger area on the sky, the clustering amplitude is the same, (1150 - 1000)/1000 = 0.15. But now the excess of 1150 - 1000 = 150 is nearly five times larger than the statistical fluctuation of σ_{Nrandom} = √1000 ≈ 32, so the detection is far more convincing.
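The two worked examples can be checked with a few lines of Python. This sketch assumes simple counting statistics, so the fluctuation is taken as the square root of N_{random}; the function name `detection_significance` is a hypothetical label for the ratio of the excess to that fluctuation.

```python
import math


def detection_significance(n_observed, n_random):
    """Excess galaxy count in units of the counting error sqrt(n_random)."""
    sigma = math.sqrt(n_random)  # assumes simple counting statistics
    return (n_observed - n_random) / sigma


for n_obs, n_rand in [(115, 100), (1150, 1000)]:
    amplitude = (n_obs - n_rand) / n_rand
    print(f"N_obs={n_obs}, N_rand={n_rand}: amplitude={amplitude:.2f}, "
          f"excess={n_obs - n_rand}, sigma={math.sqrt(n_rand):.1f}, "
          f"significance={detection_significance(n_obs, n_rand):.1f}")
# N_obs=115, N_rand=100: amplitude=0.15, excess=15, sigma=10.0, significance=1.5
# N_obs=1150, N_rand=1000: amplitude=0.15, excess=150, sigma=31.6, significance=4.7
```

Both cases have the same clustering amplitude, but only the larger sample puts the excess well clear of the counting noise.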

In practice, the number of galaxies that randomly scatter themselves across the sky — called the background count — is highly variable. The most reliable means for finding gravitationally bound objects is to make three-dimensional galaxy maps and determine the local average galaxy count for different volumes of space, and then look for small cities against this changing background population.