The Chirplet Transform:
A Generalization of Gabor's Logon Transform
Steve Mann and Simon Haykin
Communications Research Laboratory, McMaster University, Hamilton Ontario, L8S 4K1
e-mail: manns@McMaster.CA
Abstract. We propose a novel transform, an expansion of an arbitrary function onto a basis of multi-scale chirps (swept-frequency wave packets). We apply this new transform to a practical problem in marine radar: the detection of floating objects by their "acceleration signature" (the "chirpyness" of their radar backscatter), and obtain results far better than those previously obtained by other current Doppler radar methods. Each of the chirplets essentially models the underlying physics of motion of a floating object. Because it so closely captures the essence of the physical phenomena, the transform is near-optimal for the problem of detecting floating objects.

1 Gabor's "Logon" Paradigm
envelope. This notion is known formally in the physics literature as the Weyl-Heisenberg group. (The "sliding window Fourier transform" belongs to the Weyl-Heisenberg group, since we can think of the bases as being modulated versions of one parent window.) Gabor emphasised the use of a Gaussian window, since it minimises the uncertainty product $\Delta t\,\Delta f$ (provided that these support measures are quantified in terms of the root-mean-square deviations from their mean epochs [2]). The time-frequency logon diagrams, depicted in figures 1 and 2, show how the Gabor function bases cover the time-frequency space, and how we can trade frequency resolution for improved temporal resolution. Gabor originally used rectangles to designate each of these elementary signals. If each of these rectangles were a pixel, and its brightness were set in accordance with the appropriate coefficient in the signal expansion (for example, by replacing the rectangle with a dither pattern), the logon diagram would be a density plot (image) of the TF distribution.
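The minimum-uncertainty property of the Gaussian window can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of Gabor's formulation: it measures the root-mean-square widths of the energy densities in time and in frequency, and compares their product with the lower bound 1/(4π) that holds under that definition of width. The function names, the sample spacing and the rectangular window used for comparison are assumptions made for illustration only.

```python
import numpy as np

def rms_width(axis, density):
    """RMS deviation of `axis` about its mean, weighted by `density`."""
    weights = density / density.sum()
    mean = np.sum(axis * weights)
    return np.sqrt(np.sum((axis - mean) ** 2 * weights))

def uncertainty_product(window, dt):
    """Return (delta_t, delta_f, delta_t * delta_f) for a sampled window."""
    n = window.size
    t = (np.arange(n) - n // 2) * dt
    # Only the magnitude of the spectrum is used, so the time origin is irrelevant.
    spectrum = np.fft.fftshift(np.fft.fft(window))
    f = np.fft.fftshift(np.fft.fftfreq(n, d=dt))
    delta_t = rms_width(t, np.abs(window) ** 2)
    delta_f = rms_width(f, np.abs(spectrum) ** 2)
    return delta_t, delta_f, delta_t * delta_f

dt = 0.01                                   # assumed sample spacing (seconds)
t = (np.arange(4096) - 2048) * dt
gaussian = np.exp(-0.5 * t ** 2)            # Gaussian window, sigma = 1 s
rectangular = (np.abs(t) <= 1.0).astype(float)

_, _, product_gaussian = uncertainty_product(gaussian, dt)
_, _, product_rectangular = uncertainty_product(rectangular, dt)
lower_bound = 1.0 / (4.0 * np.pi)
print(product_gaussian, product_rectangular, lower_bound)
```

The Gaussian sits at the bound to within discretisation error, while the rectangular window, whose spectrum decays slowly, gives a much larger product.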
Vision Interface '91
1.1 The wavelet transform within the logon paradigm

2 "Wavelets" and Wavelets

3 Chirps

4 The Chirplet at a Single Scale
but with one small modification. Since we desire unit L2 norm, we instead let
it easier to form an orthonormal set, but here we have deliberately introduced a specific form of asymmetry and will exploit this structure in what follows.

4.1 The Nyquist boundary problem
down to -1/2). This chirp will lie on the b = 0 axis, as far to the right as possible (or as far to the left as possible). But then a chirp that has the same value of a but a non-zero intercept (for example, one going from fractional frequency -1/4 to 3/4) will violate the Nyquist limit and give rise to aliasing in frequency space.

5 Multi-Scale Chirping "Wavelets" (GLT basis functions)

We propose a basis of multi-scale chirps. Of course, the Gabor functions (of both the Weyl-Heisenberg and the affine groups) are just special cases in which the chirp rate is zero (the beginning instantaneous frequency equals the ending instantaneous frequency), corresponding to constant velocity (zero acceleration) in terms of Doppler. Our previously described "bowtie" space is also a special case of this generalization, in which the scale is held constant.

6 Application of the GLT to Detection of Floating Objects in Ocean Based Radar
6.1 Fourier based Doppler processing

6.2 Underlying physics of motion

6.3 Projections of the chirplet as indicators of Doppler evolution of floating objects

6.3.1 The single-scale chirplet snapshot revisited
floating object (compare figures 9 and 10), but in an even more pronounced way. Furthermore, we have an explicit measure of the target's "chirpyness". In other words, we can see the acceleration signature of the target, and use this information to fit a physical model. Such physical constraints are useful for verification and also for target tracking. We will refer to this particular choice of two variable parameters as a single-scale chirplet snapshot. The term single-scale refers to the fact that all the bases have the same duration, or physical support (in this case one second). The word snapshot refers to the fact that we are glancing at the target at a particular "instant" (over a short interval of time) and are not tracking the temporal evolution of the acceleration signature.
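A single-scale snapshot of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the implementation used for the radar data: the chirplet is taken to be a Gaussian-windowed linear chirp whose instantaneous frequency runs from a beginning to an ending fractional frequency across the window, normalised to unit L2 norm, and the snapshot is the squared magnitude of the inner product of the signal with each such chirplet. The names `chirplet`, `chirplet_snapshot` and `sigma_frac`, the grid of frequencies, and the synthetic test signal are all hypothetical.

```python
import numpy as np

def chirplet(n, f_beg, f_end, sigma_frac=0.25):
    """Gaussian-windowed linear chirp of n samples with unit L2 norm.

    f_beg and f_end are fractional frequencies (cycles per sample).
    The window shape and its width (sigma_frac * n) are assumptions.
    """
    if max(abs(f_beg), abs(f_end)) > 0.5:
        raise ValueError("instantaneous frequency exceeds the Nyquist limit")
    t = np.arange(n) - (n - 1) / 2.0
    f_center = 0.5 * (f_beg + f_end)
    chirp_rate = (f_end - f_beg) / n
    phase = 2.0 * np.pi * (f_center * t + 0.5 * chirp_rate * t ** 2)
    window = np.exp(-0.5 * (t / (sigma_frac * n)) ** 2)
    g = window * np.exp(1j * phase)
    return g / np.linalg.norm(g)

def chirplet_snapshot(x, freqs):
    """Energy of x projected onto chirplets over a (f_beg, f_end) grid."""
    n = x.size
    snapshot = np.empty((freqs.size, freqs.size))
    for i, f_beg in enumerate(freqs):
        for j, f_end in enumerate(freqs):
            # np.vdot conjugates its first argument, giving <x, g>.
            snapshot[i, j] = np.abs(np.vdot(chirplet(n, f_beg, f_end), x)) ** 2
    return snapshot

# Synthetic "target": a constant-amplitude chirp sweeping from 0.1 to 0.3.
n = 256
t = np.arange(n) - (n - 1) / 2.0
target = np.exp(2j * np.pi * (0.2 * t + 0.5 * (0.2 / n) * t ** 2))

freqs = np.linspace(-0.4, 0.4, 17)
snapshot = chirplet_snapshot(target, freqs)
peak = np.unravel_index(np.argmax(snapshot), snapshot.shape)
f_beg_hat, f_end_hat = freqs[peak[0]], freqs[peak[1]]
print(f_beg_hat, f_end_hat)
```

The peak of the snapshot falls at the grid point whose beginning and ending frequencies match the sweep of the synthetic signal; the distance of the peak from the diagonal `f_beg = f_end` is a direct reading of the chirp rate.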
the chirplet) of the original time series. In figure 11 (a) we show this slice. If, however, all we want is one free parameter, we are far better off taking the slice through the "bowtie" center. In figure 11 (b) we see that the "Average Instantaneous Frequency" spectrum (AIF spectrum) is much sharper than the Fourier spectrum.

7 Classification using the GLT

7.1 GLT snapshot features
distribution to the data. Although the logon is no longer Gaussian in the GLT space, the Gaussian fit allows us to quantify the spread by the determinant of the covariance matrix, |S|. These values correspond to the second-order moments of the distribution.
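The spread measure |S| can be illustrated with a short sketch. It assumes that the Gaussian fit is obtained by moment matching, that is, by treating the non-negative snapshot as an unnormalised density over the two chirplet parameters and computing its weighted mean and covariance. The function name `snapshot_moments`, the axes and the two synthetic blobs are hypothetical and stand in for real GLT snapshots.

```python
import numpy as np

def snapshot_moments(snapshot, f_beg_axis, f_end_axis):
    """Weighted mean, covariance S and det(S) of a non-negative 2-D snapshot."""
    weights = snapshot / snapshot.sum()
    fb, fe = np.meshgrid(f_beg_axis, f_end_axis, indexing="ij")
    mean = np.array([np.sum(weights * fb), np.sum(weights * fe)])
    d_fb, d_fe = fb - mean[0], fe - mean[1]
    s_bb = np.sum(weights * d_fb * d_fb)
    s_be = np.sum(weights * d_fb * d_fe)
    s_ee = np.sum(weights * d_fe * d_fe)
    cov = np.array([[s_bb, s_be], [s_be, s_ee]])
    return mean, cov, np.linalg.det(cov)

# Two synthetic snapshots: a compact blob and a more diffuse one.
axis = np.linspace(-0.5, 0.5, 101)
fb, fe = np.meshgrid(axis, axis, indexing="ij")
tight = np.exp(-0.5 * (((fb - 0.1) / 0.05) ** 2 + ((fe + 0.1) / 0.10) ** 2))
diffuse = np.exp(-0.5 * (((fb - 0.1) / 0.10) ** 2 + ((fe + 0.1) / 0.20) ** 2))

mean_tight, cov_tight, det_tight = snapshot_moments(tight, axis, axis)
mean_diffuse, cov_diffuse, det_diffuse = snapshot_moments(diffuse, axis, axis)
print(det_tight, det_diffuse)
```

A compact, well-localised snapshot yields a small determinant and a smeared one yields a large determinant, so |S| serves as a single scalar feature describing how concentrated the energy is.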
7.2 Classification Results
We have used Fisher's Linear Discriminant (FLD) [10] in preference to Principal Components Analysis (PCA), which looks only at the variance of the features and requires scaling in accordance with assumed a priori knowledge of the feature importances.

7.3 The Neyman-Pearson Paradigm

7.4 Weighted k-Nearest Neighbours
from the corresponding exemplars. If, for example, the distance from the input feature vector to the "winner" was just a bit less than the distance to the "runner up", the first and second weights would be nearly equal. If, on the other hand, the winner was much closer than all the others, its class would be weighted very highly compared to the classes of all the others.

8 Skeletonisation and Time Evolution of the GLT Snapshot
evolution of the GLT snapshot itself. The images in figures 9 and 10 were just "snapshots" of the GLT evaluated at a fixed temporal center. If we move the center epoch of the bases through the data, we can see, basically, a single prominent elementary chirp, moving around on an elliptical locus. (We developed software to display a "movie" or animation of the successive GLT snapshots in sequence on the computer screen. Sixty-four of these snapshots appear in figure 12.) We have coined the term "hypermatrix" for this structure, which we say has 64 "pages", and the rows and
columns of each page make up the image. In figure 13 the loci of this movement are shown, for two different sets of 64 snapshots, projected onto the temporal center axis. (By projection, we mean the average, summing over all pages of the hypermatrix, to reduce three "index dimensions" to two.)
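The projection just described amounts to averaging the hypermatrix along its page axis. The sketch below uses purely synthetic data, not the radar snapshots: each of 64 pages contains a single bright element that moves around an ellipse, standing in for the prominent elementary chirp. The array layout (pages, rows, columns), the page size and the ellipse parameters are assumptions made for illustration.

```python
import numpy as np

def project_pages(hypermatrix):
    """Average over the page axis, reducing three index dimensions to two."""
    return hypermatrix.mean(axis=0)

pages, rows, cols = 64, 32, 32
hypermatrix = np.zeros((pages, rows, cols))
for k in range(pages):
    theta = 2.0 * np.pi * k / pages
    r = int(round(16 + 10 * np.cos(theta)))   # row of the bright element
    c = int(round(16 + 6 * np.sin(theta)))    # column of the bright element
    hypermatrix[k, r, c] = 1.0

projection = project_pages(hypermatrix)
locus = np.argwhere(projection > 0)           # (row, col) pairs on the ellipse
print(projection.shape, len(locus))
```

The non-zero entries of `projection` trace out the elliptical locus, while the interior of the ellipse stays empty.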
If we visualise this path in three dimensions, we have a helix. A slice through this hypermatrix along the plane $f_{\mathrm{beg}} = f_{\mathrm{end}}$ results in the time-frequency squiggle we saw in figure 7.

9 Current Research
process on the screen as either constant-area ellipses moving and twisting to fit the TF distribution, or as bowties moving around in the chirplet space.

References

[2] D. Gabor. Theory of communication. J. Inst. Elec. Eng., 93:429-456, 1946.
[5] D. Thomson. Spectrum estimation and harmonic analysis. Proc. IEEE, Sept. 1982.
[10] R. Duda and P. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.