Measuring Capacity of Constellations

Having just the other day finished off my fourth year project (and in doing so, completed my degree! hooray!), I thought I’d share a little tidbit from it which perhaps might be useful to someone else.


Part of the project involved QAM constellation shaping. That is, moving points around in a constellation — in this case, to increase the capacity.
A typical QAM constellation would look something like this, shamelessly taken from wikipedia (created by user Splash):

Whereas here is an example of a shaped constellation which I investigated (coloured by decision region for hard decision decoding):


Comparing these constellations required that I be able to measure their capacities. Here I’m considering a very simple system:
It’s simply a two-dimensional AWGN channel.

Grid Evaluation

This is the first method I tried. It didn’t work very well, and I’ll explain why.

The capacity of the channel is definied by the mutual information between the input and the output:

Here, H(Y|X)=H(X+N|X)=H(N), which is simply the entropy of the gaussian distribution and has an analytic form. In this case, where the gaussian is bivariate and isotropic (that is, the \Sigma matrix is diagonal), this becomes:

H(Y) on the other hand is a nastier affair – it’s the entropy of the noisy signal, or that of a sum of gaussians, -\iint{f_Y(y)log(f_Y(y))dy}. This has no closed form so instead the approach I took was to sample f_Y(y) over a grid, and then sum across it for -\sum{f_Y(y)log(f_Y(y))}

This worked quite nicely for some signals, like this one: pdf-good
Most of the probability mass is included in the sampled grid (in fact it sums to 0.9922 here), so the summation serves as a good approximation to the integral.

But, at low SNR, a lot of probability mass spills over the sides:
Here the probability mass inside the grid only sums to 0.6353. This gave utterly meaningless results (as I’ll show later). I also tried re-scaling the probabilities so that they’d sum to 1, but ultimately this didn’t help either.

Monte Carlo Method

When I mentioned this to my supervisor, he mentioned another possible method, using Monte Carlo instead.

First, flip the mutual information expression around and consider it the other way:

In this case H(X) is easy — X is drawn equiprobably from an alphabet size of M. So H(X)=log(M).

H(X|Y) is the tricky part here, but… we can find it by Monte Carlo.
Pick a load of random points from the joint f_{XY}(x,y):
(This is an example from one of my optimised constellations)

Then for each point, -log(p_{X|Y}(x|y) is calculated and the average is taken. Since the process is ergodic, this tends to the expectation, -\mathbb{E}log(p_{X|Y}(x|y))=H(X|Y)


The Monte Carlo method produced much more sensible results:

Where the grid evaluation method gives meaningless negative capacities at low SNR, the Monte Carlo method gives results tending gracefully to zero.

In hindsight, this could probably have also been done using the original capacity expression, but calculating H(Y) by Monte Carlo… I’ll leave that as an exercise for the reader.


Just for a little bit of fun, here’s a few constellations of different orders compared using the method above.

That’s about it for this post. I’ve run out of enthusiasm to write any more. In fact I did about half an hour ago. Maybe it shows. I almost gave up and deleted the post but I figured I’d finish it in case anyone cared. Do let me know in the comments if you found this useful! It’s good to know what posts interest people, and it helps for motivation too…

Here’s my MATLAB code for the two implementations. It’s not particularly tidy and the call to mvnpdf breaks in octave, but perhaps it will be useful as a reference for someone:
ConstellationInformation.m, ConstellationInformationMC.m