
Notes on Allan Variance and Related Topics (Rev C)

The original application of Allan Variance was to provide a way of quantifying the performance of oscillators. The relevant quantity was the fractional amount of variation in the frequency over a given interval of time. Thus, if the continuous record of the frequency is divided up into intervals of length T and f_i is the average frequency in the i-th time interval, the Allan Variance is

AV(T) = < [ (f_{i+1}/f̄) - (f_i/f̄) ]^2 > / 2

where the < > symbols denote taking the average over all the data and f̄ is the mean frequency over the whole record.

The interesting information is usually in the way in which AV varies as a function of the time interval T, so obviously what one has to do is to go through the data dividing it up into different intervals and calculating AV for each value of T. Note that here the results have been normalised by dividing by the mean frequency, so AV is a dimensionless quantity. This standard definition also includes the factor of 2, which has the effect that for completely white noise (no correlation between the successive values of f) the Allan Variance is just equal to the variance of the individual values of normalized frequency, f_i/f̄.
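As a concrete illustration, here is a minimal sketch in Python of this definition for a single averaging time T (the function and argument names are mine, not from any particular library):

    import numpy as np

    def allan_variance_normalized(f, dt, T):
        """Normalized Allan variance of a frequency record f, sampled every
        dt seconds, at averaging time T (assumed to be a multiple of dt)."""
        m = int(round(T / dt))            # samples per averaging interval
        n = len(f) // m                   # number of complete intervals
        # average frequency in each interval, normalized by the overall mean
        fi = f[:n * m].reshape(n, m).mean(axis=1) / f.mean()
        # half the mean squared first difference, as in the definition above
        return np.mean(np.diff(fi) ** 2) / 2.0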

Since the average frequency over the period from t to t+T is just given by [φ(t+T) - φ(t)] / (2πT), where φ(t) is the phase, the Allan Variance can also be written in terms of the second differences of the phases at the boundaries of the time intervals. That form is often seen in references. (See the Wikipedia article for details and more background.)
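Written out explicitly with the quantities defined above (f̄ the mean frequency and φ(t) the phase), that second-difference form is

AV(T) = < [ φ(t+2T) - 2φ(t+T) + φ(t) ]^2 > / [ 2 (2π f̄ T)^2 ]

which follows directly by substituting the phase expression for each interval-averaged frequency.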

Although the frequency case is relevant to issues like the LO stability when doing VLBI, the main use that we have made of the Allan Variance in ALMA is in characterizing the amplitude stability of the receivers. If one thinks of a typical total power observation, where one integrates for a period T ON-source and then for the same period OFF-source and takes the difference of the mean signal ON minus the mean signal OFF, it is clear that AV(T) is exactly the quantity one is interested in. If the total power output is a sequence of values V, then AV can be calculated using the same expression as above and simply substituting V for f. If it is normalized then it will be a measure of the fractional fluctuations and it will again be dimensionless. See any of the test reports on the ALMA front ends for examples of these plots. (One might instead choose to convert the values V into antenna temperatures and not normalise the variance. One would then, by taking the square root of 2AV(T), get the error in estimating the signal ON - OFF in units of antenna temperature.)

It is clear that we can use a similar method to analyse phase or delay variations that we see when doing interferometry. Looking at the shape of AV(T) is useful in that it helps us see what the dominant source of the variations is (e.g. receiver noise, the atmosphere or something else) and, if it is the atmosphere, it tells us about the spatial and temporal structure of the variations.

In this case it makes no sense to normalize by dividing by the mean value since the mean phase or delay is arbitrary. We should therefore use the form

AV(T) = < (y_{i+1} - y_i)^2 > / 2

Here y could be either the phase or the delay, averaged over the time interval T, and the result will have the same units as y^2, i.e. radians^2 or degrees^2 for phase and sec^2 or, more plausibly, ps^2 for delay.

Implementation

I have a couple of routines that I have used for calculating the Allan Variance in the past; I have dug them out and run some examples.

The first of these was designed to work on very large data sets arising from simulations so it is fast:

Suppose the sequence of values is y_1, y_2, ..., y_N, and they are regularly spaced with interval t.

Set M = 1 and form AV(Mt) = Σ_{i=1,N-1} (y_{i+1} - y_i)^2 / [2(N-1)]. Then substitute y_1 = (y_1 + y_2)/2, y_2 = (y_3 + y_4)/2, etc., and put M = 2M, N = N/2 (truncating to an integer). Now, so long as N is still 2 or greater, form the next sum of squares of differences, and so on.
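A sketch of this algorithm in Python (names illustrative):

    import numpy as np

    def allan_variance_fast(y):
        """Fast decimating Allan variance; returns a list of (M, AV) pairs."""
        y = np.asarray(y, dtype=float)
        out, m = [], 1
        while len(y) >= 2:
            # AV(Mt): half the mean squared difference of adjacent averages
            out.append((m, np.sum(np.diff(y) ** 2) / (2.0 * (len(y) - 1))))
            n2 = len(y) // 2                           # N -> N/2, truncating
            y = 0.5 * (y[0:2 * n2:2] + y[1:2 * n2:2])  # average adjacent pairs
            m *= 2                                     # M -> 2M
        return out

Feeding it something like np.random.normal(0.1, 1.0, 512) reproduces the experiment described next.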


This procedure results in values for AV that are spaced by factors of two in the time T. The next plot is an example of running the algorithm a few times on sequences of 512 random numbers (i.e. uncorrelated) with a Gaussian distribution and variance 1. (The mean is actually 0.1 but of course this has no effect on the result.) The x-axis is the value of M and the y-axis is AV.

[Plot: AV vs M for single runs, log-log; y from 4.88E-04 to 2.00E+00, x from M = 1 to 256.]

This is on log scales and one sees the expected result that the variance goes inversely with M. The other thing that is obvious is that for larger values of M the result becomes increasingly noisy. The reason for this is of course that one only has a small number of differences from which to estimate the variance. So long as one has stationary statistics one can repeat the process a number of times and average the result. Here I have run it 32 times and plotted the mean values of AV. The error bars are the error on the mean.

[Plot: mean AV over 32 runs vs M, log-log, with error bars on the mean; fitted power law y = 1.0185 x^(-1.024).]

To make the errors quantitative I have also calculated the standard deviation S on the estimates of AV and plotted the ratio S/<AV> as a function of M. The calculations are the crosses and the line is the expected value of this ratio, which is roughly sqrt{3/[int(N/M) - 1]}. Here the quantity in [ ] is the number of differences.

[Plot: ratio S/<AV> vs M, log-log; calculated values (crosses) and predicted curve (line).]
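A sketch of that numerical check, reusing the allan_variance_fast routine from above (an assumption of this example):

    import numpy as np

    rng = np.random.default_rng(0)
    N, runs = 512, 32
    avs = {}                                     # M -> list of AV estimates
    for _ in range(runs):
        for m, av in allan_variance_fast(rng.normal(0.1, 1.0, N)):
            avs.setdefault(m, []).append(av)
    for m in sorted(avs):
        a = np.array(avs[m])
        ratio = a.std(ddof=1) / a.mean()         # empirical S/<AV>
        predicted = np.sqrt(3.0 / (N // m - 1))  # the rough prediction above
        print(m, ratio, predicted)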

There are two points that are not altogether satisfactory about this approach: 1) it is not using all the information, and 2) one might wish to sample the function AV(T) more closely than every factor of 2. To see that the first point is true, consider a sine wave whose period is twice that of the integration time, T. If it is in phase with the sampling produced by the integrations, i.e. the zero crossings coincide with the boundaries of the integrations, then the presence of this signal will show up strongly in the differences between neighbouring integrations. If however it is out of phase, the differences will be zero. The situation is similar to evaluating the power spectrum but only using the cosine terms.
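This can be demonstrated directly; the following sketch (the parameter values are mine) computes the simple estimator for a sine of period 2T with the two alignments:

    import numpy as np

    # A sine of period 2T, sampled m times per integration of length T = m*dt.
    m, n_int, dt = 8, 64, 1.0
    t = (np.arange(m * n_int) + 0.5) * dt          # midpoint sampling
    period = 2 * m * dt
    for phase0, label in [(0.0, "in phase"), (np.pi / 2, "out of phase")]:
        y = np.sin(2 * np.pi * t / period + phase0)
        means = y.reshape(n_int, m).mean(axis=1)   # per-interval averages
        av = np.mean(np.diff(means) ** 2) / 2.0
        print(label, av)  # large when zero crossings sit on the boundaries,
                          # essentially zero when shifted by a quarter period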

I have therefore implemented a more general version which avoids these problems. This already exists in the literature and is called the "overlapped" Allan variance. The most efficient way of calculating it is as follows.

First integrate the y values to form a new set of numbers x:

x_1 = 0, x_2 = x_1 + y_1, ..., x_{i+1} = x_i + y_i, and so on up to x_{N+1} = x_N + y_N

Accuracy will be a problem here if the data series is long and there is a lot of low frequency power, so for my Fortran I used real*8 (64-bit floating-point) here. I suppose this will not be an issue with e.g. Python.

Then form the Allan variance by averaging the double difference:

AV(Mt) = Σ_{i=1,N-2M+1} (x_{i+2M} - 2x_{i+M} + x_i)^2 / [2·M^2·(N - 2M + 1)]
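A sketch of this in Python, with the cumulative sum done in 64-bit floating point as discussed above (names illustrative):

    import numpy as np

    def allan_variance_overlapped(y, m_list):
        """Overlapped Allan variance of y at the averaging lengths in m_list."""
        y = np.asarray(y, dtype=np.float64)  # 64-bit, like real*8 in Fortran
        n = len(y)
        x = np.concatenate(([0.0], np.cumsum(y)))  # x_1 = 0, x_{i+1} = x_i + y_i
        out = []
        for m in m_list:
            if n - 2 * m + 1 < 1:
                break
            # double differences x_{i+2M} - 2 x_{i+M} + x_i for all offsets i
            d = x[2 * m:] - 2.0 * x[m:n - m + 1] + x[:n - 2 * m + 1]
            out.append((m, np.sum(d ** 2) / (2.0 * m * m * len(d))))
        return out

For the factor-of-two spacing used so far, m_list would be [1, 2, 4, ..., N//2].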

Here is the result for the same data set as before:

[Plot: overlapped AV vs M for the same data, log-log; same scales as before (y 4.88E-04 to 2.00E+00, x 1 to 256).]

Comparison with the equivalent plot on the previous page shows that the scatter is reduced. This is confirmed by calculating the ratio S/<AV> as before. The orange line is the same prediction as before and the green line is a factor of √2 lower, which is what one would now expect for the fractional error in AV.

[Plot: S/<AV> vs M for the overlapped estimator, log-log, with the two predicted curves.]

(Note that one does not gain the factor of √2 for the first data point, since there is no missing information in that case, and what happens with the last point depends on how many data points one starts with.)


For that exercise I increased the value of M by a factor of 2 each time so we had the same sampling of AV(T) as before. If instead we increase it by, for example, roughly sqrt(2) each time (choosing appropriate integers) then we get something like this:

[Plot: overlapped AV vs M with roughly √2 spacing in M, log-log; same data and scales as before.]

This does indeed give the desired finer sampling but note that, if you compare this to the previous case, you can see that all one is really doing is interpolating AV(T): because of the way the differencing works there is little independent information[1] for sampling that is closer than a factor of 2.
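One possible way of choosing those "appropriate integers" (a sketch, not necessarily what was used here) is to round successive powers of √2 and drop the duplicates:

    import numpy as np

    def m_values(n_max, step=np.sqrt(2.0)):
        """Integer M values spaced by roughly a factor of step."""
        ms, m = [], 1.0
        while int(round(m)) <= n_max:
            if not ms or int(round(m)) != ms[-1]:
                ms.append(int(round(m)))
            m *= step
        return ms

    print(m_values(256))  # [1, 2, 3, 4, 6, 8, 11, 16, 23, 32, 45, 64, 91, 128, 181, 256]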

There is a relationship between the power spectrum of the fluctuations and the form of the Allan variance. From this (and by numerical experiments) one can derive that for a power-law spectrum S(f) = h f^α the Allan variance is also a power law, AV(T) = K T^μ, where μ = -(α + 1), at least over the range 0 > α > -2. One has to be a little careful outside this range because of divergences. For a power-law spectrum with positive slope there is infinite power at high frequencies, and for α > +1 μ saturates at -2. Similarly for α < -2 the Allan variance is divergent for large times. Recall that the case with α = -2 is a random walk, which means that the rms deviation from the starting point increases with the square root of time. Our real signals do not have either of these problems: to the extent that the overall ALMA system is phase stable there must be some timescale after which the fluctuations stop growing and, at least in the case of atmospheric fluctuations, there must also be some shortest timescale on which they occur, roughly the time for a "blob" to cross the beam of an antenna. I have therefore done the numerical cases with low- and high-frequency cut-offs at 1 mHz and 2 Hz respectively. Here is the case with α = -1, which is "flicker" noise.

[Plot: AV vs T for the flicker-noise case (α = -1), log-log; x from 0.06 to 983.04 s, y from 3.91E-03 to 4.00E+00.]

The expected result for the Allan variance is μ = 0, i.e. independent of the integration time, T. It can be seen that this is indeed what we get and that the low- and high-frequency cut-offs come into effect roughly where one would expect them to.

[1] In fact the degree of independence between the estimates for different values of T depends on the nature of the signal being analysed. They are even more correlated for "red" noise of the type that we actually see for atmospheric phase fluctuations. The errors in the estimates are also larger than in the white-noise case. There is an extensive literature on this.
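For reference, synthetic series like these numerical cases can be generated by shaping white noise in the frequency domain. This is a sketch under the cut-offs quoted above, not necessarily how the cases shown here were produced:

    import numpy as np

    def power_law_noise(n, dt, alpha, f_lo=1e-3, f_hi=2.0, seed=0):
        """Real series with power spectrum S(f) ~ f**alpha between f_lo and
        f_hi (and zero power outside that band)."""
        rng = np.random.default_rng(seed)
        f = np.fft.rfftfreq(n, dt)
        amp = np.zeros_like(f)
        band = (f >= f_lo) & (f <= f_hi)
        amp[band] = f[band] ** (alpha / 2.0)     # amplitude ~ sqrt(S(f))
        spec = amp * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, len(f)))
        return np.fft.irfft(spec, n)

Running, say, allan_variance_overlapped(power_law_noise(2**16, 0.06, -1.0), m_values(2**15)) should give the flat AV(T) of the flicker case, with the cut-offs appearing at the ends.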


Here are the results (averaged over 16 runs) for the cases α = +1 ("blue"), α = 0 ("white"), α = -1 ("flick"), α = -2 ("walk") and α = -3 ("run"). I have normalized them to have the same variance at T = 10 sec.

[Plot: AV vs T (0.05 to 500 s, y 0.0001 to 1000), log-log, for the five cases, labelled blue, white, flick, walk and run.]

The behaviour seems to me to be pretty much as expected.

Real Data

I have run the second of these two implementations on some of the data from the roughly one-hour observation (uid A002_X4158c3_Xb6) that I was looking at for the dispersion issue. The first example is for a short baseline (CM06 to CM05) and for the baseband (BB4) that has poor signal-to-noise because it is on an absorption line. The phase (in radians) looks like this.

[Plot: phase (radians) vs time (0 to 4000 s) for CM06-CM05; range roughly -0.4 to +1.6 rad.]

Clearly this is mostly thermal noise with just a hint of some slower fluctuations.

Note that the data consists of eleven 300-second long scans with substantial gaps in between. I will come back to the question of what to do about the gaps, but for now I have simply analysed the individual scans separately. This means that we can only measure AV(T) out to T of a little over 100 seconds.
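A sketch of that per-scan analysis, assuming the allan_variance_overlapped routine from earlier; here scans is stand-in white noise, whereas with the real data it would be the list of per-scan phase arrays (0.96 s sampling):

    import numpy as np

    rng = np.random.default_rng(1)
    dt, ms = 0.96, [1, 2, 4, 8, 16, 32, 64, 128]       # T up to ~123 s per scan
    scans = [rng.normal(size=312) for _ in range(11)]  # eleven ~300 s scans
    per_scan = [dict(allan_variance_overlapped(s, ms)) for s in scans]
    for m in ms:
        vals = [p[m] for p in per_scan if m in p]
        # T, mean AV over the scans, and the error on that mean
        print(m * dt, np.mean(vals), np.std(vals, ddof=1) / np.sqrt(len(vals)))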


[Plot: Allan variance (radians^2) vs T (0.5 to 128 s) for the individual scans, log-log; y from 1.0E-05 to 1.0E-01.]

Here the x-axis is T in seconds and the y-axis is the Allan variance in radians squared. The thick orange line is the mean and the black line has the slope of -1 expected for white noise. Generally it is clear that the receiver noise dominates, except perhaps at the longest integration times where there is a slight turn-up.

If we now go to baseband 4 on the same antenna pair, the signal-to-noise improves by a factor of about 5 and one can clearly see some drifts, which are probably due to the atmosphere or possibly something else.

[Plot: phase (radians) vs time (0 to 4000 s); range roughly 0 to 1.4 rad, with slow drifts visible.]

The Allan variance plot now looks like this.

[Plot: Allan variance vs T for the individual scans, log-log; same scales as the previous AV plot.]

The black line again has slope -1 but with an intercept a factor of 25 lower. (Since this is a variance we expect it to go as the signal-to-noise ratio to the power minus 2.)


Clearly there is excess phase noise that is evident for integration times as small as 2 seconds. It declines slowly with increasing T and then starts to increase at about 50 seconds. This is interesting and it would be worth trying to find out what the cause is, e.g. whether it is in the instrument or the atmosphere.

If we now move out to longer baselines and look at the pair DV03 to DV10, the phase now looks like this. Clearly there is a lot of atmospheric phase variation present.

[Plot: phase (radians) vs time (0 to 4000 s) for DV03-DV10; range roughly -1.5 to +1.5 rad.]

The variance plot now shows a steep rise with integration time, but then appears to flatten out:

[Plot: Allan variance vs T for the individual scans, log-log; y from 1.0E-04 to 1.0E+00, x 0.5 to 128 s.]

I have kept the scales the same as in the previous plots to make comparison easier. Here the black line has a slope of +1 but the same intercept of 0.002 at the sampling interval.

Note that the results for the individual scans have a larger scatter than expected. This is telling us that we are dealing with non-stationary statistics, i.e. the amount of short-term phase fluctuation is itself a function of time. This is not surprising given the complexity of the system that we are looking at here, the atmosphere. It also means that we will need quite large data sets if we want to get meaningful statistics.


Finally, here is a sample of the long-baseline data, PM03 to DV12.

[Plot: phase (radians) vs time (0 to 4000 s) for PM03-DV12; range roughly -5 to +20 rad.]

Note that we are now dealing with nearly 20 radians of phase variation peak to peak. This time I have had to change the vertical scale of the Allan variance plot by a factor of 10. The black line has a slope of +1.5 and an intercept of 0.004, and this is seen to be a good fit to the mean of the data (orange line). We see that the atmosphere is making a significant contribution to the phase fluctuations even on the 0.96 second sampling time (although the loss of coherence would be small); the variance then rises rapidly, reaching ~1 radian^2 at 32 sec.

[Plot: Allan variance vs T for the individual scans on PM03-DV12, log-log; y from 1.0E-03 to 1.0E+01, x 0.5 to 128 s.]

Allan Variance rising as fast as T^1.5 is pretty steep (although remember that this is the variance; the Allan Deviation, AD, which is the square root of AV, would be rising as T^0.75). Recall that this corresponds to a power-law spectrum S(f) going as f^-2.5. There does appear to be a turn-over at the longer times, but it is only marginally significant given the scatter at that end.

Note that we are still only talking about times of up to 120 seconds here, which is the result of the fact that we are just analysing the individual scans. To analyse the whole time series we need a way of bridging the gaps. A possible way of doing that would be to construct artificial data with the same statistical properties as that in the scans and join them on smoothly. Methods of doing that certainly exist but it would be quite elaborate. For now, therefore, I have simply smoothed the data somewhat and then re-sampled it at an interval which is coarse enough to make sure that none of the samples fall in the gaps. See plot on next page.

What I have done there is to average the phases over 15 points and then sample every 69 points. In general sampling like this, where the interval is not equal to the averaging time, will not give us a consistent estimate of the Allan variance (since the definition requires these to be equal). We cannot do that because the averaging would have to extend into the gaps. In this case, however, the fact that the variance is rising

