Background

Researchers and investment analysts often want to know if their data (see Time Series Concepts) contain significant amounts of signal at frequencies of interest to them. These frequencies may correspond to driving forces, cyclical or quasi-periodic (seasonal) market forces, environmental constraints or system responses, and may result from intrinsically non-linear processes.

Often, mathematical or phenomenological models exist to explain some observed behavior, and experimental data is collected to verify whether or not the model is correct. System Identification is the name given to attempts to determine the underlying realities sampled in a data set.

Similarly, a transient waveform may exist in a long data stream: a brief but possibly recurring signal. Your requirement may be to isolate all occurrences of the unique waveform in time. Examples include characteristic trading patterns, deformations in manufacturing processes, or heart-wave EKG signature fluctuations. To some extent a waveform may be partially degraded or obscured by another signal (either deterministic or noise).

Spectre is a unique data analysis tool which determines precisely how an ensemble of sine waves makes up a data set or time series: the periods, amplitudes and phases, and the data set "energy" in each sine wave. Specify a table of periods (cycle lengths) you believe are present in your data, or let Spectre's AutoGen feature generate a table of relevant periods for you from a quick spectral/temporal estimate.

Spectre tests these periods for their optimal contribution to the data set, and graphically assembles the selected, precisely phased sine waves into a revealing, evolving portrait: an educational visualization of your data you can't readily get from a Fast Fourier Transform, not without specialized training and a lot of demanding effort.

A plot of the original data and its reconstruction can be printed, and the reconstruction can be saved as a data set, on the same scale as the original data. The reconstruction can even be extrapolated into the "future".
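
As a rough illustration of what "reconstruction" and "extrapolation" mean here, the sketch below sums a handful of sine waves over the observed times and then over times beyond the last sample. The function name `reconstruct`, the component table, and the phase convention (radians, `sin(2*pi*t/period + phase)`) are assumptions made for this example; they are not taken from Spectre's file formats or output.

```python
import numpy as np

def reconstruct(t, components, offset=0.0):
    """Evaluate offset + sum of amplitude * sin(2*pi*t/period + phase)."""
    t = np.asarray(t, dtype=float)
    y = np.full_like(t, offset)
    for period, amplitude, phase in components:
        y += amplitude * np.sin(2.0 * np.pi * t / period + phase)
    return y

# Hypothetical table of (period, amplitude, phase in radians).
components = [(50.0, 2.0, 0.0), (12.5, 0.5, np.pi / 2)]

t_observed = np.arange(100.0)        # times covered by the data
t_future = np.arange(100.0, 150.0)   # times beyond the last sample

y_fit = reconstruct(t_observed, components, offset=10.0)
y_future = reconstruct(t_future, components, offset=10.0)
```

Because the same sinusoids are simply evaluated at later times, the extrapolation is only as trustworthy as the assumption that the identified periods persist beyond the data.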

If you want to get to work right away, Spectre is highly automated. And if you don't have a good math background, Spectre will be an educational experience. Use the Tools/Generate Data Set option to synthesize sine waves, square waves, sawteeth, noise, trends, offsets, etc. in a graphical environment. Spectre does all the work; you just do the thinking to set up new "experiments" and analyze "why". There are many menu options under PreProcess to alter, combine and shape the "synthetic" data sets you make. Often, you can learn to appreciate what the data is telling you, in a manner no other analysis tool can supply.
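
For readers who want to see what such a synthetic data set looks like outside Spectre, here is a minimal sketch that builds a sine, square or sawtooth wave and adds a linear trend, a constant offset and Gaussian noise. The function `synthesize` and all of its parameter names are hypothetical, chosen for this example only; Spectre's own generator may define its waveforms differently.

```python
import numpy as np

def synthesize(n_points, spacing, period, amplitude=1.0, phase=0.0,
               wave="sine", trend=0.0, offset=0.0, noise=0.0, seed=0):
    """Return (t, y) for one wave plus trend, offset and additive noise."""
    t = np.arange(n_points) * spacing
    cycles = t / period + phase / (2.0 * np.pi)
    frac = cycles - np.floor(cycles)          # position within a cycle, [0, 1)
    if wave == "sine":
        y = np.sin(2.0 * np.pi * cycles)
    elif wave == "square":
        y = np.where(frac < 0.5, 1.0, -1.0)   # high first half, low second half
    elif wave == "sawtooth":
        y = 2.0 * frac - 1.0                  # ramps from -1 up to +1
    else:
        raise ValueError(f"unknown wave type: {wave}")
    rng = np.random.default_rng(seed)
    y = amplitude * y + trend * t + offset + noise * rng.standard_normal(n_points)
    return t, y

# Two cycles of a clean square wave, then a noisy sine riding on a trend.
t_sq, y_sq = synthesize(8, 1.0, 4.0, wave="square")
t_sin, y_sin = synthesize(500, 0.1, 5.0, amplitude=2.0, trend=0.05,
                          offset=1.0, noise=0.3)
```

Superposing several waves is then just a matter of adding the `y` arrays from separate calls that share the same `n_points` and `spacing`.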

As a demonstration, on first running Spectre choose Analyze/Frequency Search from the menu and select a supplied data file. Breath.dat is a physiological data set. The synthesized curve is not a bad approximation to the data, as you can see while the reconstruction builds before your eyes, even though only about 2/3 of the "energy" in the data is accounted for by the identified periods, amplitudes and phases, analyzed in a few seconds. Breath.tbl was edited by a researcher to contain several periods of interest in his work. Refining this table of candidate periods might find even better periods to use, but it would be difficult to beat the capabilities of AutoGen.

The most important experience while running Spectre is in watching the highest-power, long period sines define major trends in the data and seeing the short-period sines fill in the peaks and valleys. On viewing this unique visualization for the first time, you can achieve a very intuitive sense for what the data means.

Spectre will line-plot the actual data in bright green, and the evolving "reconstruction" from the selected periods as a yellow dotted line. Plots are automatically rescaled when a window is resized; in fact they are recomputed every time the window is painted, to optimize the information content of the displayed data.

Each time a new period is selected as containing the next most significant amount of energy, the yellow curve is redrawn to include the new information. You'll get a good sense for periods to include in the .tbl file, and which to exclude, by watching the analysis and then examining the .out file that pops up on the screen after each analysis is complete.

In fact, watching the earlier, higher-energy sinusoids define the major features of your data is an insightful experience; one researcher termed it a "spectacular visualization" of his data. The side slopes of the major sine curves will lie along major trends in the data, although the peaks won't necessarily correspond to peaks in the data. That is left to shorter period components, which when properly phased with the long period components will ride up and down the peaks in your data with increasing fidelity.

For large data sets where the expected periodicities may vary with time, the Analyze/Partition Large Data Set option allows analysis of (optionally overlapping) sections of the data file.
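
The idea of analyzing optionally overlapping sections can be sketched as follows. The function `partition` and its arguments are hypothetical and only illustrate one common way to cut a long series into sections; how Spectre itself chooses section boundaries is not described here.

```python
def partition(n_points, section_length, overlap=0):
    """Return (start, stop) index pairs covering n_points samples.

    Consecutive sections share `overlap` samples; the last section may be
    shorter than section_length.
    """
    if not 0 <= overlap < section_length:
        raise ValueError("overlap must be >= 0 and smaller than section_length")
    step = section_length - overlap
    sections = []
    start = 0
    while start < n_points:
        stop = min(start + section_length, n_points)
        sections.append((start, stop))
        if stop == n_points:
            break
        start += step
    return sections

halves_overlapping = partition(10, 4, overlap=2)
no_overlap = partition(10, 4)
```

Each `(start, stop)` pair can then be used to slice the data and run a separate analysis on that section, so that periods which drift over time show up as differences between sections.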

Spectre has been in use by data analysts for over 12 years. As users registered their copies, they were asked what features would be valuable to them in future upgrades, and so Spectre contains many of the features suggested to us. We encourage purchasers of this release to make us aware of their intended application, and of how Spectre could be made better to accommodate the surprisingly diverse user community.

Spectre is used by investors seeking to identify cyclical behavior in trading patterns, by vibration control engineers, by animal sound researchers, by neurophysiologists, and even by a gentleman in Australia who uses it to help handicap greyhound races. It seems as though each new user applies Spectre to data analysis in a different manner, inspired by a new mind-set to solve problems they could formerly only wistfully dream of solving. We're amazed and pleased at how often users say they can now confidently tackle analysis projects which formerly daunted them.

We feel that the original developer of the underlying mathematics, Dr. Michael J. Korenberg of Queen's University in Kingston, Ontario, has created a very powerful and flexible System ID research tool. He has earned our collective gratitude and appreciation.

Overview

Researchers, investment analysts and engineers often want to know if their data contain significant amounts of signal at frequencies of interest to them. These frequencies may correspond to driving forces, cyclical market forces, environmental constraints or system responses, and may result from intrinsically non-linear processes.

Often, theoretical or phenomenological models exist to explain some observed behavior, and experimental data is collected to verify whether or not the model is correct. System Identification is the name given to attempts to determine the underlying realities which are sampled in a data set.

Similarly, a known but transient waveform may exist in a long data stream: a brief but possibly recurring signal.
Your requirement may be to isolate all occurrences of the waveform in time. Examples include characteristic trading patterns or heart-wave EKG signature fluctuations. To some extent a waveform may be partially degraded or obscured by another signal, either deterministic or noise.

Spectre is a unique data analysis tool which determines precisely how an ensemble of sine waves makes up a data set or time series: the periods, amplitudes and phases, and the data set "energy" in each sine wave. Simply specify a table of periods (cycle lengths) you believe are present in your data, or let Spectre's AutoGen feature generate a table of relevant periods for you from a quick spectral/temporal estimate.

Spectre tests these periods for their optimal contribution to the data set and graphically assembles the selected, precisely phased sine waves into a revealing, evolving portrait: an educational visualization of your data you can't get from a Fast Fourier Transform, even with specialized training and a lot of demanding effort.

A plot of the original data and its reconstruction can be printed, and the reconstruction can be saved as a data set on the same scale as the original data. The reconstruction can even be extrapolated into the "future".

If you want to get to work right away, Spectre is highly automated. And if you don't have a solid math background, Spectre will be an educational experience. Use the Tools/Generate Data Set option to learn about sine waves, square waves, sawtooth waves, noise, trends, offsets, etc. in a graphical environment. Spectre does all the work; you just do the thinking to set up new "experiments" and analyze "why". There are many Tools/Pre-Process options to alter and combine the "synthetic" data sets you make with real data of interest to you. You rapidly learn to appreciate what the data is telling you, in a manner no other analysis tool can supply.

As a demonstration, on first running Spectre choose from the menu Analyze/Frequency Search.
Select the supplied table of candidate periods BREATH.TBL and data file BREATH.DAT, and accept the defaults in the Search Parameters Dialog. This operation is also encapsulated as File/Run Yoga Breath Demo. BREATH.DAT is a physiological data set. The synthesized curve is not a bad approximation to the data, as you can see while the reconstruction builds before your eyes, even though only about half of the "energy" in the data is accounted for by the identified periods, amplitudes and phases, analyzed in well under 1 minute.

BREATH.TBL was edited by a researcher using Spectre's built-in editor to contain several periods of interest in his work. Four of those periods are the dominant ones identified. Refining the BREATH.TBL file might find even better periods to use, but the periods supplied are pretty good, or they wouldn't be used in this convincing demonstration. Compare the results using this hand-made table with the AutoGen feature, which generates the period table automatically from a statistical analysis of the data.

Frequency Search is a real learning experience to watch in action, which Spectre lets you do. To freeze the action, hit the Alt key to pause the search. Windows halts all action until you hit the escape key (Esc). Also, use Tools/Copy Data Set to capture the analysis at any point into a new Graph Window, including the intermediate Reconstruction.

The most important visualization experience while running Spectre is watching the highest-energy, long period sines define major trends in the data and seeing the short-period sines fill in the peaks and valleys. Viewing this unique perspective for the first time yields a very intuitive understanding of what the data contains.

Spectre will line-plot the actual data in bright green, and the evolving "reconstruction" from the selected periods as a yellow dotted line.
Plots are auto-rescaled on window resizing, actually recomputed every time the window is painted to optimize the displayed data information content.

Each time a new period is selected as containing the next most significant amount of energy, the yellow curve is redrawn to include the new information. You'll get a good sense for periods to include in the .TBL file, and which to exclude, by watching the analysis and then examining the .OUT file that pops up on the screen after each analysis is complete.

In fact, watching the earlier, higher-energy sinusoids define the major features of your data is an insightful experience; one researcher termed it a "spectacular visualization" of his data. The side slopes of the major sine curves will lie along major trends in the data, although the peaks won't necessarily correspond to peaks in the data. That is left to shorter period components, which when properly phased with the long period components will ride up and down the peaks in your data with increasing fidelity.

The usual, tedious frequency analysis approach is to pre-process your data, apply a Fast Fourier Transform to the series, and plot the spectrum magnitude. Peaks in the FFT spectrum may correspond to interesting frequencies. However, it is difficult for anyone but a signal processing expert to know how much energy (or "activity") in the time series is actually accounted for by a given frequency. It is even harder to resolve nearby, overlapping broad peaks. More often than not, spectral leakage or noise distorts the spectrum and cannot readily be de-coupled from signals of interest.

Pre-processing (filtering, tapering and windowing) of data sets is a very demanding discipline, and many of the rules to assure the validity of pre-processing operations are difficult to apply, as such operations actually modify the characteristics of the manipulated data.
Further, most researchers with a need for waveform analyses do not have formal training in the subject, and many feel uncomfortable with having to use a host of implicit assumptions.

There is an alternative, nearly painless method available to perform such frequency analyses. The researcher first prepares (as a text .TBL file) a table of candidate periods, or Spectre will construct (AutoGen) such a table for you. On initial creation, the table usually contains a fairly large number of entries, as there may be no prior knowledge of what is really important to you in the data. Periods may be longer than the time series, or as short as twice the time interval between points. But the researcher often knows what to look for based on theory or existing work, and the table will contain several periods in the regions of interest. Spectre searches your data using the table of candidate periods in the .TBL file which you chose, selecting those periods which account for the most "energy" or "activity".

Spectre is not just a curve-fitter, nor is it simply an FFT. To use Spectre productively, you should know something about your data, but you need nearly no data analysis background. Spectre uses an adaptation, called Fast Orthogonal Search (FOS), of the Orthogonal Search Method developed by Michael J. Korenberg at Queen's University in Ontario during the late 1980s (see References). The algorithm is applied to your data set using an associated table of candidate periods. The precise energy, amplitude, and phase of sine waves corresponding to each selected entry in the table are displayed as soon as they are computed. The objective is to determine if frequencies of interest to the researcher are present in significant measure, and report the results both graphically and in a tabular format.
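
To make the idea of searching a table of candidate periods concrete, here is a minimal sketch of a greedy selection loop. It is not Korenberg's Fast Orthogonal Search and not Spectre's implementation: it simply refits an ordinary least-squares model for every remaining candidate and keeps the one that lowers the mean square error the most. All function names, parameters (`max_terms`, `min_gain`) and the test signal are assumptions made for this illustration.

```python
import numpy as np

def design_matrix(t, periods):
    """Columns: a constant, then one cosine and one sine column per period."""
    columns = [np.ones_like(t)]
    for period in periods:
        omega = 2.0 * np.pi / period
        columns += [np.cos(omega * t), np.sin(omega * t)]
    return np.column_stack(columns)

def fit(t, y, periods):
    """Least-squares coefficients and mean square error for these periods."""
    A = design_matrix(t, periods)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, float(np.mean((y - A @ coef) ** 2))

def greedy_period_search(t, y, candidates, max_terms=5, min_gain=0.01):
    """Pick periods one at a time by largest mean square error reduction."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    total = float(np.mean((y - y.mean()) ** 2))    # variance about the mean
    if total == 0.0:
        return []
    chosen, gains, mse = [], [], total
    while len(chosen) < max_terms:
        trials = [(fit(t, y, chosen + [p])[1], p)
                  for p in candidates if p not in chosen]
        if not trials:
            break
        best_mse, best_period = min(trials)
        gain = (mse - best_mse) / total            # fraction of variance removed
        if gain < min_gain:
            break                                  # no further significant reduction
        chosen.append(best_period)
        gains.append(gain)
        mse = best_mse
    coef, _ = fit(t, y, chosen)
    table = []
    for i, period in enumerate(chosen):
        a, b = coef[1 + 2 * i], coef[2 + 2 * i]    # cosine, sine coefficients
        # a*cos(wt) + b*sin(wt) == amplitude * sin(wt + phase)
        table.append((period, float(np.hypot(a, b)),
                      float(np.arctan2(a, b)), gains[i]))
    return table

# Hypothetical test signal: two sinusoids plus a constant offset.
t = np.arange(200.0)
y = 5.0 + 3.0 * np.sin(2 * np.pi * t / 40.0 + 0.5) + np.sin(2 * np.pi * t / 7.0)
table = greedy_period_search(t, y, [5.0, 7.0, 10.0, 20.0, 40.0, 80.0])
for period, amplitude, phase, gain in table:
    print(f"period={period:g} amplitude={amplitude:.3f} "
          f"phase={phase:.3f} gain={gain:.1%}")
```

Refitting from scratch for every candidate is wasteful compared with an orthogonal search, but it keeps the sketch short and shows the same selection logic: a candidate period is kept only if it explains a worthwhile share of what earlier selections left unexplained.
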
The algorithm analyzes a time series stepwise, determining the ability of each period to explain a significant portion of the total Mean Square Error Reduction, or MSER: roughly, the data set's "energy" content in each selected period. It then orthogonally removes the sinusoid explaining the largest percentage of the variance between the data and the sinusoid. This process is repeated on the residuals until there is no further significant error reduction or until a specified number of periods have been identified.

The algorithm is capable of much greater time resolution than a Fourier transform, and is not limited to harmonics (multiples) of a fundamental frequency. It is also quite insensitive to noise, as all data elements are used only in series-wide averages over the orthogonal basis functions (sine waves, in our case). Finally, it tolerates missing data points, irregularly-spaced data sets, and short data segments. In many nonlinear or biological systems, the signal frequencies change, or breathe, as the system evolves, so short segments are necessary for system identification.

Spectre is a Multiple Document Interface application, so it will display the results from multiple Frequency Searches on-screen at the same time, including the reconstructions (dot plots on top of data line graphs) and text output files. An FFT is included, also used internally for the AutoGen feature to automatically generate a period table (AUTOGEN.TBL), partly from the FFT's crude energy distribution.

Time Series Concepts

A time series is a succession of data values at stated intervals. Typically, a time series consists of pairs of numbers: one member of each pair is the value, the other member is the time at which the value was measured. But the values can be from just about any source, and the other member need not be time (for example, in a trading environment you can plot price vs. volume).
A time series may be created by an instrument recording sound, heartbeats, light intensity, price fluctuations -- any phenomenon at measured intervals of time or of position, or any event at all which can be recorded. The expectation is that the recording will contain information to assist understanding or controlling the phenomenon. Time series analysis is a collection of techniques to extract that information.

The intervals do not have to be regular: a data value may be recorded only on the occurrence of a heartbeat, which may be irregular, or even skipped. Many devices log, or record, data in this way. In Spectre, the menu option Tools/Pre-Process/Interpolate may be used to regularize a data set.

A data set is made of pairs of X, Y values -- so-called ordered pairs, as the X value is always stated first (X is usually time). The relationship is usually expressed Y = Y (X), and the parentheses here imply that Y is to be considered a function of X, to depend on X. X is usually considered the independent variable, Y the dependent variable: each value in a succession of recorded events depends on the time it was recorded. The X and Y values of the pairs which comprise a data set are, in graph parlance, referred to as the Abscissa and the Ordinate of a graph: the abscissa X values run out along the horizontal axis, and each corresponding ordinate Y value is plotted above its X, at a height proportional to the Y value.

In Spectre, the graph is the fundamental element, not the individual X,Y pairs, or even the X or Y values themselves: a data set exists as a collection, and the Graph Window is the simplest, most complete visualization of the data.

The main menu option Tools/Generate Synthetic Data allows a researcher to emulate real data using superposed pure sines, square waves, sawtooth waves, squirt waves, a linear trend (constant sloping line), a constant (often called DC, for Direct Current) offset, and additive noise. The researcher first specifies a name for the contrived test data set.
The number of data points, temporal resolution (spacing), noise level, trend, the type of wave, and the periods, amplitudes and phases of the waves are then entered. A new Graph Window then appears on screen.

Association and Visualization

Spectre plots a data set in green, and the synthesized data (the Reconstruction from a Frequency Search) for the selected periods as a yellow dotted line. Plots are auto-rescaled on window resizing, actually recomputed every time the window is painted to optimize the displayed data information content. Each time a new period is selected as containing the next most significant amount of energy, the yellow curve is redrawn. You can get a good sense for periods to include in the .TBL file, and which to exclude, by watching the analysis and then examining the .OUT file which pops up on the screen after each analysis is complete.

In fact, watching the earlier, higher-energy sinusoids define the major features of your data is an insightful experience. The side slopes of the sine curve will lie along major trends in the data, and the peaks won't necessarily correspond to peaks in the data. That is left to shorter period components, which when properly phased with the long period components will ride up and down peaks in the data with increasing fidelity as more sinusoids are identified.

Select Data Set as Abscissa
Select Data Set as Ordinate

These features enable the user to associate two data sets, using one as the abscissa, or independent variable, and the other as the ordinate, or dependent variable. Click the System Menu in the upper left corner of each data set graph window, then click again on "Select Data Set As Abscissa" or "Select Data Set As Ordinate" (selections are checkmarked on the menus). As soon as a pair has been selected, a new window will be opened showing the relationship. In general, a positive (upward) sloping elliptical blob of green line segments implies a positive correlation between the two data sets.
A downward-sloping trend indicates anti-correlated sets. A more or less uniform disk indicates a lack of correlation.

Note when running a Fast Fourier Transform on a data set that the spectral magnitude is symmetric (and the spectral phase is anti-symmetric) about zero frequency for real data sets. An asymmetric spectral magnitude would indicate a non-zero imaginary component to the data. You can see this for yourself, and maybe get some ideas about how to "package" your data sets, if you try the Graph Window options Select Data Set as Real Part and Select Data Set as Imaginary Part, available on the System Menus, for two data sets that seem to have a lot in common, though they may not be from identical sources. See also Lissajous Figures.

Try Tools/Combine Graphs to subtract or divide two data sets. The resulting graph will emphasize, literally, the differences between the data sets. Also, play with the Tools/Pre-Process Data Set options. These can "bend", shift, and scale your data. Use the Clip tool to look at a small region of the data: first hold and drag the mouse over the region (double-click and hold down; it takes practice!) to define a "rubber-band" rectangle around it, then select Clip. The region will appear in a new graph, magnified.

At the most elementary level, you don't need to scan columns of numbers to see very subtle differences between two data sets. By juxtaposing two data sets in some way (not necessarily just on the screen or in print), relationships between data sets from very different sources may be seen which would not likely be discovered through conventional analyses. Sometimes, simply tiling the data sets' graphs appropriately is sufficient: use Window/Tile Vertically to look at differences in peak and trough values between two graphs, or Window/Tile Horizontally to look at differences between two graphs in the locations of X-axis crossings (zero crossings) and the X-coordinate (abscissa) values at peaks and valleys.
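
The remark above about FFT symmetry for real data sets can be checked numerically. The sketch below uses NumPy's `numpy.fft.fft` on random test data (an assumption of this example, not a Spectre data file) and compares each positive-frequency bin with its negative-frequency partner.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
real_data = rng.standard_normal(n)

# For real input, bin n-k (negative frequency) is the complex conjugate of
# bin k: equal magnitude, opposite phase.
spectrum = np.fft.fft(real_data)
k = np.arange(1, n)
magnitude_symmetric = np.allclose(np.abs(spectrum[k]), np.abs(spectrum[n - k]))
conjugate_symmetric = np.allclose(spectrum[n - k], np.conj(spectrum[k]))

# Adding a non-zero imaginary part breaks the magnitude symmetry.
complex_data = real_data + 1j * rng.standard_normal(n)
complex_spectrum = np.fft.fft(complex_data)
complex_symmetric = np.allclose(np.abs(complex_spectrum[k]),
                                np.abs(complex_spectrum[n - k]))

print(magnitude_symmetric, conjugate_symmetric, complex_symmetric)
```

This is the same effect you would look for on screen: a magnitude plot that is not mirror-symmetric about zero frequency tells you the input had an imaginary component.
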
Lissajous Figures

Choose a sine wave graph as abscissa, and another, different sine wave graph as ordinate from the two graph windows. Click the upper left corner (System Menu) of one window using Select Data Set as Abscissa and the other using Select Data Set as Ordinate. A new graph is created, where instead of the regular, monotonically increasing X values, the abscissa now is the Y value from the second graph. Striking swirling patterns called Lissajous Figures appear in the new window. They represent the interaction between two quantities which are not usually combined in this way. Even more striking patterns are created by making a Fourier Transform from the new graph.

Program Environment

Due to the modular, low-level nature of each processing function, Spectre is an in-depth tool-kit for educational purposes as well as serious work. It is not a trivial program to learn to use productively, but the feedback is immediate and visual.

Spectre was written using the Windows Multiple Document Interface to allow any number of graph and text windows to appear on-screen at once, either overlapped or tiled, up to memory limits. Functions may be applied to Synthesize (Tools) or Analyze a data set or time series, and the results will be left on-screen as one or more new Graph Windows for assessment. Original data is never modified unless you expressly save a Graph Window by over-writing the original data; see Edit/Rename Data Set.

The intermediate data set graphs can themselves be operated on and displayed simultaneously on-screen, allowing a degree of controllability not available in programs which allow only one graph on-screen at a time. Any graph may be saved to disk with a couple of mouse clicks or keystrokes, as an internal naming system keeps track of the operation which generated each intermediate graph. Graphs can be iconized, tiled, or stretched to fill the entire screen.
Machine Environment

Installing Spectre does not modify ANY of your system files, and de-installing Spectre is accomplished by simply deleting the Spectre Program Group and the files, which are usually found in C:\Spectre.

CAUTION: Disable any Screen Savers (Program Manager/Control Panel/Desktop) before running a Frequency Search on a long data set. Screen Savers suck up large amounts of CPU time.

To use Spectre productively, your hardware environment should be at least a 386 running at 33 MHz, with 8 MB of 70 - 80 ns DRAM, a hard drive with better than 20 - 25 ms seek time, and a Windows-compatible mouse. A math co-processor (Intel 80x87 or equivalent) is required: although Spectre will run without one, it would be too slow to be useful on a large data set. Processing speeds are noted (subjectively) where relevant. A Pentium 100 is at least twelve times faster than a 386/33.

Windows 3.1 or better is required. Any DOS version 3.5 or later should work. Time series data processing is intrinsically memory-intensive, so having at least 8 megabytes of fast DRAM installed is not a luxury. Spectre video display utilizes the standard Windows video drivers as well as any specialized drivers which are supplied with video/graphics adapters, as long as they were registered with Windows Setup. Even EGA/VGA 16 color video modes are supported. Spectre has been run successfully under Windows 95.

Each (X,Y) data point requires 16 bytes from the global heap (2 double-precision floating point numbers). Another 16 bytes per point, plus change, for a rough total of 30,000 points per MByte, is used for working arrays ONLY during the Analyze/Frequency Search and Waveform Identification. All working arrays are returned to the system (global heap) after each analysis. Stack usage is minimized by block declarations and dynamic allocations on the global heap for most objects. Each graph window stores its own auxiliary data, such as the data set mean, the extrema, the RMS value, etc.
to save recomputing each time a data set statistic is wanted.

Spectre began as a DOS program; the main reason for porting to Windows was to get an 84,000 point datafile analyzed for a physiological researcher. Spectre will analyze a 200,000 point data set on an 8 Mbyte system, over half a million points in 16 Mbytes, without swapping to disk for virtual memory. Use Partition Large Data Set to analyze truly huge files with millions of points.

Spectre closes all files immediately after each access, but be aware that Windows uses many files internally. Typically, you will want to insert statements into CONFIG.SYS to set the FILES= and BUFFERS= limits to between 30 and 40.

Help For Spectre Users

Extensive help is available on all features from the menu selection Help/Contents, or hit Shift + F1. For specific menu selections, the ordinary context-sensitive help accesses Help topics directly for any menu option. Hit the Alt key to enter the menu system and hit Shift+F1 (together) to evoke a Help Cursor; use the arrow keys or drag the mouse over the menus to highlight the relevant option. This loads the full WinHelp engine and the Spectre.HLP helpfile, opened to the section selected by the special Help Cursor. Although this is the mechanism preferred by Microsoft to display context-sensitive help, it's a bit intrusive, as the entire screen is taken over. If the Help Cursor is no longer needed, hit the Esc key to restore normal cursor functioning.

From within Help, the Search option allows the user to scroll through all Spectre help entries. Finally, some dialog boxes are supported by a Help button which either displays a text window of help information or opens the WinHelp system to the relevant topic.

If unusual program behavior occurs, perhaps accompanied by speaker "chirps", Spectre will activate a text diagnostics file, DIAG.TXT.
Registered purchasers of Spectre can call CoDebris (858-755-4492, 9am - 7pm Pacific Time) or send e-mail to codebris@ucsd.edu with any questions. We can probably steer you to the right selection or sequence of selections, or make arrangements for new features. As there is often more than one way to do something, discussions about time series processing are usually illuminating to both parties. CoDebris will also recommend one of the other available commercial packages if it becomes clear you need to spend that kind of money (typically over a thousand dollars). We gain insight into what people really need this way, so new features and interfaces can be added to future versions.

CoDebris is particularly interested in supporting more data acquisition devices: storage scopes, A/D converters, digital I/O boards, sound boards, etc. Let us know what you're using and if there is sufficient interest we will contact the device manufacturer for interface specifications.

Contact CoDebris for information on making user specifications for a custom version upgrade: special purpose filters, proprietary data formats, new features. Source code for portions of Spectre is available, and many parts of Spectre can be exported as an object library (LIB) or Dynamic Link Library (DLL). For a negotiable fee, any desired feature or customization tailored to your requirements can be incorporated, either in the retail or a private version. If the feature is of general utility, the fee may be small or even waived completely.

We are very interested in enlarging the variety of data set formats Spectre can recognize: let us know if you want Spectre to read your spreadsheet or data base files or accept data from acquisition hardware: storage scopes, sound boards, DSP boards, A/D or digital I/O boards.

If you run Spectre and like it, let us know. If you run it and don't like something about it, let us know that too.
Spectre is becoming a fairly comprehensive waveform analysis and synthesis package, and most features are added in response to user suggestions: in the works are more powerful digital filters, multi-variate data sets and non-linear dynamics tools. Users are highly encouraged to help prioritize additions, and to add their own requirements.

Future upgrades to Spectre will add these features:

- Multi-variate data sets in a multi-graph window (e.g. simultaneous heart rate, EEG, and cardiovascular output vs. time, or simultaneous price and volume in different markets vs. time).
- Stacked "3-D" FFT plots from a partitioned large data file.
- Full-resolution (not scaled bitmap) printed graphs, color optional, with settable titles, legends, and captioning.
- Expanded capabilities for Waveform Identification: multiple active templates, adaptive incorporation of user acceptance or veto criteria in succeeding analyses.

Frequency Search Introduction

This is a new and greatly improved version of the Spectre Frequency Search and Data Synthesis program. Install Spectre in a convenient Program Group under Windows and click the icon to run, or select File/Run under Program Manager and fill in C:\Spectre.

For a demonstration run, select File/Run Demo..., or choose Analyze/Frequency Search and select a (supplied) ASCII table of candidate periods (TBL) and a supplied data file (DAT). The TBL file can be edited and saved in Spectre. A time-stamped output file (OUT) contains summary information including the selected periods, their computed amplitudes and precise phases. The OUT file can be loaded into Spectre for viewing, editing or printing.

Take a look at WHEAT88.DAT, a commodity price dataset. It was Frequency Searched with the AutoGen option, letting Spectre automatically decide which periods are likely to be significant using a crude FFT and a Period/Amplitude estimate.
The Reconstruction (dotted yellow) curve is an excellent approximation to the data, with about 90% of the "energy" in the data accounted for by nine identified periods, amplitudes and phases, analyzed in less than a minute. The remaining six periods, all less than 1% of the data, fill in more details. Although the bulk of the data set energy is long-period contributions, the low-energy features are revealing from a technical analysis viewpoint: mixed short- and long-term cyclical behavior (roughly: month, quarter, semi-annual and annual), where several periodic factors converge every so often to yield a significant change even though none of the cycles is individually very strong. Way too subtle for a Fourier Transform.

Frequency Search is a real learning experience to watch in action, which Spectre lets you do: to freeze the action, hit the Alt key to pause the search. Windows halts all action until you hit the escape key (Esc). Or, use Tools/Copy Data Set to capture the analysis at any point into a new Graph Window, including the intermediate Reconstruction. True analysis power.

Full text editing for files smaller than 32kB is built into Spectre. Note that the actual data files are not meant to be directly editable in Spectre, but they may be altered in the pre-processor (menu selection Tools/Pre-Process Data Set). This allows outlier (artifact) removal, partitioning, segmentation, decimation, scaling, filtering, etc. to be applied to named data sets.

References

Korenberg, M. J. "Fast orthogonal identification of nonlinear difference equation and functional expansion models" Proc. 30th Midwest Symp. Cir. Sys. 1:270-276 8/87.

Korenberg, M. J. "A robust orthogonal algorithm for system identification and time-series analysis" Biol. Cybernetics, 60:267-276 1989.

Korenberg, M. J. "Identifying nonlinear difference equation and functional expansion representations: The fast orthogonal algorithm" Ann. Biomed. Eng. 16:123-142 1988.

Korenberg, M. J.; Bruder, S. B.; McIlroy, P. H. "Exact orthogonal kernel estimation from finite data records: extending Wiener's identification of nonlinear systems" Ann. Biomed. Eng., 1988.

Korenberg, M. J. "Orthogonal identification of nonlinear difference equation models" Proc. 28th Midwest Symp. Cir. Sys. 1:90-95 8/85.

Ljung, L. "System Identification: Theory for the User", Prentice-Hall, Englewood Cliffs, NJ 1987.

Pincus, Steven M. "Approximating Markov Chains", Proc. Natl. Acad. Sci. USA, 89:1-5, 5/92.

Uchida, Sunao et al "Computerization of Fujimori's method of Waveform Recognition. A review and methodological considerations for its application to all-night sleep EEG", J. Neuroscience Methods, 64:1-12, 1996.

Zawadzki, Eugene M. "Identification and Extraction of Waveform Signatures From Electro-Encephelogram Data", CoDebris 95-2, 8/28/95

2. Program Navigation

Finding a comfortable set of methods to use with complex software is usually a trial-and-error process, as documentation is usually specific to a given task and doesn't offer a section which looks at the software from the user's viewpoint. Here is Spectre from the keyboard, mouse button and cursor point of view:

Graph Window

A Graph Window contains a plot of the X,Y values from a data set as a bright green line plot on a black background. If a Frequency Search was performed on the Graph Window, the superposed, reconstructed sine waves identified during the search are plotted as a yellow dotted curve on the same graph. If the data file was loaded as "Scrolling", clicking on the left or right side of the graph "moves" the data within the window, loading only enough data into RAM to fill the window.

Abscissa and Ordinate

Abscissa: the set of X values of a Graph Window. The horizontal "time" axis of a time series. Every X value has a corresponding Y value in a data set.

Ordinate: the set of Y values of a Graph Window. The vertical axis of a time series. Every Y value corresponds to a definite X value in a data set.
Typically, Y is the "event" and X is the time the event occurred.

Scrolling Graph Window

A Graph Window which can be scrolled, for data files too large for physical memory (RAM). This is done to avoid using Virtual Memory (swapping to the hard disk) with the associated slowing of routine operations... a factor of up to 100,000 slower! Scrolling can also be used to provide a detailed view of data files too large to be viewed in the few hundred pixels across a screen.

Click the left mouse button on the right or left half of the Graph Window to move data forward or back in the window. The keyboard right/left arrow keys also scroll data, and may be held down for nearly-continuous scrolling. All Spectre operations on the data will occur only on the section currently in the window. To operate on an entire large data file, use the Analyze/Partition Large Data Set options.

Text Window

A Text Window contains editable text. The text can be altered and saved to disk. If the text is a Period Table (.TBL file), then any edits made to the entries MUST be saved to disk before a Frequency Search is performed using the modified Period Table, as candidate periods are always loaded from file. The same caveat applies to edited script (INI) files.

Running Spectre

As with most new programs, use the defaults first time through and play with the options as experience accumulates. Using Spectre can be an insightful and educational experience, giving you great control over some standard time series processing elements, and introducing a few new ones.

After loading a data set (File/Open Data File, or creating a synthetic data set with Tools/Generate Synthetic Data), exercise all options which seem even remotely applicable to your line of inquiry, and discard unsatisfactory intermediate graphs. Time series processing is often accompanied by surprises which yield real insight into your data, and is a whole lot more fun than word processing.
Save interesting results to file often for comparison with later results. You may want to create an empty subdirectory to ease cleanup and discarding of intermediate results which have been superseded by improved versions. If you don't want to use the built-in file naming convention, choose unique filenames which will allow easy cleanup: AB1.DAT, AB2.DAT, AB2.SYN, etc. are backed up or deleted with a single command: COPY AB*.* \SAVE or DEL AB*.*.

Use the context-sensitive help (Alt Shift F1) for menu options often while in the menu system. Dialogs requiring user input also have context-sensitive Help buttons.

Diagnostic Sounds

Sounds from your system speaker during a Spectre session are usually intended to tell one of two things: either the system could not satisfy a request of some sort from you or the program, or something needs to be done by you and you're not doing it.

The occurrence of a "beep" from the speaker often indicates an action needs to be accomplished before you can move on. Maybe a dialog box or message window needs its "OK" or "Cancel" button pressed. Usually, hitting the keyboard "Enter" key will suffice, as the default option is the one you'd normally take, and it is highlighted with a small, dotted rectangle around the text printed on the key.

A sound like a "chirp", with some variation in the tone and duration, is used by Spectre to indicate a degradation in performance. Either too many system resources are in use, or requests for memory are failing. The solution is commonly to shut down any other applications running in background. If things have sunk low enough (windows are not being painted, are broken up or do not appear when expected), save your work, exit Spectre, close down everything else, exit and re-start Windows itself.

A diagnostic and log file DIAG.TXT is produced whenever a chirp occurs. The file also contains a log of all significant operations run during the session. DIAG.TXT is over-written each time Spectre is started.
Monitor the status of your system memory, hard drive, and resources occasionally from the Help/About Spectre... menu selection, especially if operations are getting to be slow and uneven.

Scripted Operations

To automate a series of operations, a capability to run scripted commands exists in Spectre. An operation is typically of the form "command, datafile, parameters". A sample INI file is provided with Spectre:

; SPECTRE.CFG
; Sample editable configuration file 1/20/2000
; Copyright (c) 2000 CoDebris, all rights reserved.

; Backup your configuration file before editing it, and
; save your changes using File/Save or Save As with a
; personalized file name.

; Free-form comments to end of line after a ";" are for your
; use, the ";" may be anywhere. Comments, blank lines,
; tabs, spaces are ignored. Use comments liberally to help
; recall what your entries mean.

; Section header names (tokens) are between brackets [...],
; missing sections use defaults. Data after a section header
; is for that section only.

; Commands are of the form "command, datafile, parameter, ..."
; (note commas) All datafiles must have a Spectre header to
; auto-load for scripted commands

[commands]                            ; Executed in listed sequence
search, kc41.dat, kc1.tbl
open, kc22.dat
save, active
extract, kc41.dat, kc1.dat, autogen
rms, m80.dat, 1024                    ; Creates m80.par

[datapath]        ; Data files directory, usually whole path
c:\freq           ; from root (first "\"). Can be a
                  ; subdirectory. "c:" may be any drive.
                  ; Trailing "\" optional.
                  ; Default is "." (current dir)

[templatepath]    ; Template files directory (waveform
c:\freq           ; identification samples)

Keyboard Mouse Emulation

With no mouse, most processing is difficult, but possible. The keypad arrow keys will move the cursor over a graph. For mouse emulation, make sure the Num Lock key is not active (unlit). The keypad will also permit precise one-pixel-at-a-time position adjustments over a Graph Window when mouse movement is awkward or mouse response is "sticky".
The rate of movement increases as an arrow key is held down. The arrow keys can always be used in conjunction with the mouse for greater positional control.

By the way, all menu responses can be invoked from the keyboard as an "Alt key sequence": Window/Tile Horizontally is selected with Alt W H. The mnemonic letters are not always initials on pop-up menus with several options, as naming conflicts often result. The relevant key code is underlined on each menu entry. Several Menu selections also have dedicated "virtual keystrokes". Ctrl X, for example, is usually used to Edit/Cut highlighted text from a text window.

Graph Cursor

A special cursor appears automatically over a Graph Window as the mouse is moved around. It is a cross-hair type, but with an open region in the center to allow precise alignment over any pixel. As long as the mouse left button or a keypad cursor key is depressed, the cursor cannot leave the plot surface, allowing you to get into corners or right up to plot edges without triggering the system move/size cursors.

Dragging the mouse draws a "rubber-band" over the graph. The left, right, upper and lower edges are used to select a clipping region for Tools/Pre-Process/Clip Outside Rubber-Band.

Clipping Rubber-Band

Drag the mouse (double-click and hold down; takes some practice) to draw a rectangular "rubber-band" over part of a graph. The rubber-band edges are used to select a clipping region (Tools/Pre-Process/Clip). Regions left of, right of, above or below the clipping rectangle are excluded. A very useful tool for making a new data set from a piece of an existing Graph Window, either to work on or simply for a "zoom lens" close look.

Graph Coordinates Display

Select Window/Display Graph Coordinates. Move the mouse over any Graph Window. The x,y coordinates, data set index, and corresponding pixel position can be noted. The keyboard cursor keypad can be used to move in precise increments across the graph.
Here is a good place to note that if there are fewer pixels available across the screen than the number of data points, not all of them will get painted (displayed). If you want to see point-to-point variations in your data, try Tools/Pre-Process/Clip to zoom in on a small section of the graph and expand it into a larger Graph Window, as discussed above.

Right Mouse Button

The right mouse button draws a vertical line over the active graph window at the cursor location. Click at that X value again to remove the bar (use the keypad arrows to align cursor). In all other situations, the right mouse button either has no effect, or produces the same response as the left button.

Activating Graph Windows

When more than one Graph Window is open on-screen, clicking anywhere over the window surface is sufficient to activate it. Clicking over the Title Bar (also called the Caption Bar) and dragging the mouse is used to move a window. Try Ctrl-Tab to activate graph windows in succession to "thumb through" a screenful of data sets. It can bring up graph windows from the bottom of the "z-order" window stack, and "sinks" the formerly active window to the bottom. See Window Keys.

Graph Window Sizing

You can, of course, drag the border edges or corners to resize a Graph Window. Spectre will completely recalculate the plot to optimize the information displayed. Keep in mind that a large data set may have many more (or quite a few fewer) points than there are horizontal pixels on the screen: you can't expect to see every datum unless you slice up some data sets. The Tools/Preprocess/Interpolate, Segment or Clip options can help here.

The keyboard can also be used to move or re-size graph windows. See Help/Keyboard for more information on using the sometimes much more effective keyboard in Windows applications.

3. File Menu

File/Open Data File

Open a data file into a new Graph Window, where it appears as a bright green plot over black background.
The data set in any active (highlighted title bar) Graph Window is directly available for any processing option in Spectre.

After selecting the data file name, if the data file does not have a Spectre header attached, you are presented with a Data Type Dialog, where you are asked to specify the data format. Spectre accepts both ASCII text and binary formats. Binary data files are assumed to be Y-only values (ordinates only), but you can select "X,Y Pairs" to over-ride this default if you know your binary data file layout contains data in the abscissa followed by ordinate order. For ASCII data, Spectre auto-detects whether "X,Y Pairs" (2 columns) or "Y-Only" (1 column) data is present. If the file already has a Spectre-style header, the Data Type Dialog is skipped, as all the info necessary to load the file is in the header.

Very large files (too large for physical memory) can be opened into a Scrolling Graph Window. Enter the number of points to show on-screen in each scroll. Only binary data files may be scrolled. For scrolling, convert long ASCII data files to binary by loading them non-scrolling and saving as binary. Data may be scrolled forward or back by clicking on the Graph Window right or left of center, or by using the right/left arrow keys on the keyboard. All Spectre operations occur on the data currently in the scrolling window.

You select the scroll size, in terms of data points displayed on-screen. Also select the calibration options: Auto-Calibrate (default) means the data is always displayed so it fills the Graph window. If this option is de-selected then you can enter the lower and upper calibration limits. If for example -32768 to 32767 (the range of a 16-bit integer) is chosen, then as the data is scrolled it will maintain a uniform appearance, and only limit-up or limit-down excursions will touch the Graph Window edges. Otherwise, every time you scrolled the data set it would change its vertical scaling to fill the window.
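The difference between the two calibration modes can be illustrated with a minimal Python sketch. It is not Spectre's drawing code; the function names `calibrate` and `to_pixel_row` and the linear mapping are assumptions made purely for illustration.

```python
def calibrate(section, auto=True, limits=(-32768, 32767)):
    """Return (lo, hi) display limits for one scrolled section of data.

    auto=True mimics Auto-Calibrate: limits follow the section's own range.
    auto=False keeps the fixed limits, so every section is scaled alike.
    """
    if auto:
        return min(section), max(section)
    return limits


def to_pixel_row(y, height, lo, hi):
    """Map a Y value to a pixel row (0 = top of the window).

    Values outside [lo, hi] are clipped for display only; the data
    itself is never modified.
    """
    if hi == lo:
        return height // 2  # flat section: draw it mid-window
    clipped = min(max(y, lo), hi)
    frac = (clipped - lo) / (hi - lo)
    return round((1.0 - frac) * (height - 1))


if __name__ == "__main__":
    section = [100, 200, 300]
    for auto in (True, False):
        lo, hi = calibrate(section, auto=auto)
        rows = [to_pixel_row(y, 101, lo, hi) for y in section]
        print("auto" if auto else "fixed", lo, hi, rows)
```

With automatic limits the section spans the full window height; with fixed 16-bit limits the same section occupies only a narrow band, but its appearance stays comparable from one scroll to the next.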
Any data outside the calibration range appears clipped, but the actual data is not modified.

ASCII data files are not intended to be directly editable as text, but a minor subterfuge will allow you to bring in at least a piece of one for viewing as a Text Window. Just open the data file using the File Filter list box option "All Files (*.*)" in the File/Open Output File... dialog and click on the file name or enter the data file name directly in the file name edit box. If the file is larger than 32kB, only a portion will fit in the text buffer, so don't edit it and don't try to save it.

Spectre accepts a few proprietary binary data file formats (EEGs, some instrumentation) in this version. Users with requirements for these or other special formats should contact CoDebris for assistance.

Data Type Dialog

On opening a data file, you are asked to specify the data type (ASCII text, binary, or proprietary) and the data layout (whether the data is arranged in two columns as pairs of X Y numbers, or as a single column of Y data values). The data may be opened into a normal Graph Window or (for binary data files only) into a Scrolling Graph Window. You need not specify the data layout for ASCII (plain-text) data files; Spectre figures it out based on the number of columns of data.

Binary data files consist of 8-, 16-, or 32-bit integers, or 32-bit floating point numbers in the universal IEEE format. Binary data is usually produced as an ordinate-only (Y-only) file. It can be difficult, if you don't know the data type, to choose the correct option; Spectre tries to determine the correct format on loading the file and offers an option to go ahead or try another format if an anomalous data layout is found. The most common binary format is as 16-bit integers, or occasionally as 32-bit floating point.

For binary files, the data is assumed to be arranged in rows of X Y pairs, or as ordinate (Y) data only.
In the first format, Spectre assumes the X value (the abscissa) precedes the Y value (the ordinate) on each row. Binary data files contain NO delimiters, just a continuous array of values (XYXYXY... or YYY...).

The Y-only format consists of a single number, the Y value, in each row. Many research and engineering analysis instruments produce data files in this format; you are expected to know that the X values are uniformly spaced, and what values to use for the initial X value and the interval between points. If you aren't sure which values to use, use the defaults (initial X = 0.0, interval = 1.0) and proceed.

In most cases, the X values determine what numbers to choose as candidate periods in the Period Table (see menu option Analyze/Frequency Search) when performing a Frequency Search. The important consideration is to be internally consistent in your choice. Here again, using the AutoGen feature will greatly help.

Knowing the range of your data may help you to decide the data type. 8-bit data has a range of 0 to 255 or -128 to +127. 16-bit data has a range of 0 to 65535 or -32768 to 32767. 32 bits extend the range to 0 to over 4 billion, or roughly -2 billion to +2 billion. As floating point data is in exponential (also called engineering or scientific) notation (e.g. 1.234e-19), the range can be huge.

Choosing the wrong format can result in a graph which looks like a scatter diagram, or may exhibit a "ladder" or "comb" effect with valid (ordinate) data interspersed with invalid (abscissa) points. Spectre tries to determine the data type and layout when it reads the data file. If Spectre thinks you have chosen the wrong type or layout, you will be informed and may continue or retry loading with another format.
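The integer ranges and the two binary layouts described above can be checked with a short Python sketch. It is only an illustration, not Spectre's loader: the function names `int_range` and `decode_int16` are hypothetical, and little-endian byte order is an assumption, since the byte order Spectre expects is not stated here.

```python
import struct


def int_range(bits, signed):
    """Return (min, max) representable by an integer of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def decode_int16(raw, layout="y", x0=0.0, dx=1.0):
    """Decode delimiter-free 16-bit integers (assumed little-endian).

    layout="xy": values alternate X, Y, X, Y, ...
    layout="y":  values are ordinates only; X is rebuilt from the
                 initial value x0 and the uniform interval dx.
    """
    count = len(raw) // 2
    values = struct.unpack("<%dh" % count, raw[:count * 2])
    if layout == "xy":
        return list(zip(values[0::2], values[1::2]))
    return [(x0 + i * dx, y) for i, y in enumerate(values)]


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(bits, int_range(bits, False), int_range(bits, True))

    # Two X,Y pairs written as a continuous array: X0 Y0 X1 Y1
    raw = struct.pack("<4h", 0, 10, 1, 20)
    print(decode_int16(raw, layout="xy"))
    # The same bytes read as Y-only: abscissa values show up as ordinates,
    # which is the "ladder" or "comb" effect of a wrong layout choice.
    print(decode_int16(raw, layout="y"))
```

Reading the same bytes with both layouts is a quick way to see why a wrong choice produces interleaved valid and invalid points.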
If you decide to open the data into a Scrolling Graph Window you can select Auto-Calibration, where Spectre sizes the display of each scrolled section to the minimum and maximum Y-values, or you can set the minimum and maximum limits to keep each scrolled section uniformly scaled.

Finally, note that unlike an FFT, Spectre will accept data files with unequally-spaced, or missing, X (abscissa) values. The Frequency Search algorithm doesn't mind, but you will want to convert to equi-spaced data using the Tools/Pre-Process/Interpolate option before using the Fast Fourier Transform.

Additional information: Data File (Time Series) Format; Preparing for a Frequency Search

Data File (Time Series) Format

Data files accepted by Spectre contain a time series formatted either as X,Y pairs of (time, value) coordinates or simply as a list of ordinate (Y) values. Data can be read from ASCII (text) files, or sequential binary 8-, 16- or 32-bit integers, or 32-bit floating point values. "Time", of course, can be any variable relevant to your work (e.g. angle, position, heartbeat), not necessarily just physical time; the usage "time series" is traditional. Any valid DOS filename will work, although Spectre first filters files with a .DAT extension.

Note that for a Frequency Search the times need not be uniformly spaced, and there may be gaps. Times, however, MUST be uniformly spaced with no gaps (or made so) for a Fast Fourier Transform or for digital filtering to work. Freedom from this FFT limitation is one of the key advantages of the Frequency Search technique. An irregularly-spaced data set may be regularized by using the Tools/Pre-Process/Interpolate option with the default entries.

For ASCII text data files, absolute spacing of columns is not important. All entries are normal ASCII (man-readable as well as computer-readable) text, not binary-coded. Integer, decimal, or exponential notations are acceptable. An example of the "Abscissa, Ordinate" format:

0.0    -47.3
0.5    -22.4
1.0     14.6
. . .   . . .
If your data set is of the form "Ordinate Only", it will look like the example below for ASCII text data. If you are getting your data from a spreadsheet or data base, creating data in the "Ordinate Only" format should not be a problem. Most binary data sets from instruments, or from so-called streaming data, are in the "Ordinate Only" format.

You will be asked to supply the starting time and the time interval between points as the file is being loaded into memory. Spectre accepts negative start times. If you just don't know the start time or the time interval between points, use the defaults: start time 0, interval 1. You will have to construct the PERIOD.TBL and interpret the Frequency Search results realizing that the "1" interval could mean milliseconds, seconds, weeks, furlongs, or heartbeats. Know your data's source intimately and it won't lie to you.

Y-Only data format:

-47.3
-22.4
14.6
. . .

Remember that although the terms 'time' and 'time series' are used, the X variable in a (X,Y) data set can be any quantity on which the Y values depend. However, be VERY careful over your choice of units (degrees, radians; or seconds, minutes, heartbeats)... it matters greatly in your analysis. Strive for internal consistency during a Spectre session, and throughout your work.

Try "walking through" a data processing session on paper, using printout of computer input and output files for raw material: check that numerical answers are appropriate to your assumptions. Spectre can be of great assistance here, as you can design your own data set in the Tools/Generate Synthetic Data dialogs. Anything you don't understand about your data can be simulated and run through a Frequency Search or FFT. You will learn a lot in a short session about your assumptions concerning nuts and bolts stuff: units, intervals, bandwidths, sampling rates, etc.

File/Open Period Table

Opens a Table of Candidate Periods (.TBL) into an editable Text Window.
If edited, the table must be saved (File/Save) before it can be used in a Frequency Search, as it will be read in from disk before the search.

IMPORTANT: period tables are not data files. The distinction needs to be made clear for many users. See Period Table Format for more details.

AutoGen Period Table

The AutoGen capability allows you to produce a reasonably complete Frequency Search analysis automatically, without having to specify a range of candidate periods in a .TBL file. AutoGen bootstraps the table generation using an FFT of the data to be analyzed (see Fast Fourier Transform) to roughly localize the significant energy. The FFT is supplemented by "Period/Amplitude" analysis, as the FFT by itself can never have sufficient resolution to generate all periods having substantial contribution to the data. The AutoGen table is automatically saved to file as AUTOGEN.TBL, and appears in an editable text window on-screen.

Mean Square Error Reduction (MSER: roughly, waveform energy) results using AutoGen are typically several percent more than when using a hand-made table, but may not precisely identify exact periods you know to be present in your data. If you know that a certain period is expected to be in the data, edit AUTOGEN.TBL to include this period, and invoke File/Save As to write it to disk as a named file. The added period need not be ordered with the others; it may appear anywhere in the list.

Period Table Format

A Frequency Search requires a table of candidate periods in a text file FILENAME.TBL to be created by the user (Spectre contains a rudimentary text editor for creating, altering and saving files). A typical PERIOD.TBL has the form (for example):

30.0
20.0
15.5
2.25
8.85
. . .

where 30.0 is the first candidate period, 20.0 is the next candidate period, and so on. The units for period are up to the user, but must be consistent with the data abscissa values in dimensioning (e.g. seconds, heartbeats, inches).
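To make the terms concrete, here is a minimal Python sketch of what testing one candidate period against a data set can mean: an ordinary least-squares fit of a single sine wave, returning its amplitude, phase, and the share of the data's "energy" (sum of squares) it accounts for. This is NOT Spectre's Frequency Search algorithm, which is not described here; the function name `fit_period` and the choice of plain least squares are assumptions for illustration only. Note that the X values may be unevenly spaced, and the period must be in the same units as X.

```python
import math


def fit_period(xs, ys, period):
    """Least-squares fit of a*cos(w*x) + b*sin(w*x) to ys, w = 2*pi/period.

    Returns (amplitude, phase, energy_fraction), where the fitted wave is
    amplitude * cos(w*x - phase) and energy_fraction is the share of
    sum(y**2) removed by the fit.
    """
    w = 2.0 * math.pi / period
    c = [math.cos(w * x) for x in xs]
    s = [math.sin(w * x) for x in xs]
    # Normal equations for the two unknowns a and b.
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    cy = sum(u * v for u, v in zip(c, ys))
    sy = sum(u * v for u, v in zip(s, ys))
    det = cc * ss - cs * cs
    a = (cy * ss - sy * cs) / det
    b = (sy * cc - cy * cs) / det
    resid = sum((y - a * ci - b * si) ** 2 for y, ci, si in zip(ys, c, s))
    total = sum(y * y for y in ys)
    return math.hypot(a, b), math.atan2(b, a), 1.0 - resid / total


if __name__ == "__main__":
    # Unevenly spaced X values; a pure wave of period 5.0 in the same units.
    xs = [0.0, 0.7, 1.1, 2.0, 3.4, 4.1, 5.5, 6.0, 7.3, 8.8, 9.1, 10.4]
    ys = [3.0 * math.cos(2.0 * math.pi * x / 5.0 - 0.5) for x in xs]
    for period in (30.0, 20.0, 5.0, 2.25):
        print(period, fit_period(xs, ys, period))
```

Running the candidates through such a fit shows the period actually present accounting for essentially all of the energy, while the others account for less.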
The periods may be in any order, not necessarily monotonic. To help you associate a period table with a particular data file, create it with the same filename, followed by the .TBL extension (e.g. EXPR33.DAT, EXPR33.TBL).

If you are only interested in determining whether a given set of periods is found in your data, you can include only those numbers in the Period Table. The amplitudes, %MSE, etc. will be correct, but the reconstruction won't look very convincing.

Some users load a larger number of candidate periods into the table, running from several times the length of the time series to be searched all the way down to a few times the interval between data points, and covering the region between in a quasi-logarithmic way: say 5 points above 1000.0, 5 points in the hundreds, 5 more from 10.0 to 100.0, etc. After a preliminary run (perhaps on an abridged data set if it is very long, or one interpolated to have fewer points; see Tools/Pre-Process/Interpolate), examine the .OUT file for likely candidates to refine, and others to drop from further consideration.

Selecting appropriate values depends greatly on the characteristics of your data set and the Period Table (see Period Table Format) you have created. If your data set is long, you may want to practice with a lower-resolution version of the data: use Pre-Process/Interpolate and enter, for the number of data points, no more than (say) 1000. You'll get a "rough copy" of the data set to refine your analysis skills; save it to disk with File/Save As with an easily recalled name, including the DAT extension so it appears in the File/Open Data File dialog when you repeat your tests.

File/Open Output File

A text output file is produced by Frequency Search (.OUT), by Pre-Process Data Set and by Generate Synthetic Data (.SYN) or Reconstruct From Search (.RCO). These are informational files only, and are not used directly in any time series processing.
As they are time-stamped and reasonably detailed, they can be used as a record of processing steps. In addition to logging error diagnostics, DIAG.TXT is a record of almost all activity during a session. As it is over-written each run, you can use File/Save As to rename it for storage. See Interpreting The OUT File.

By the way, any ASCII text file up to 32kB in size may be opened and edited in Spectre using the list box entry "List Files of Type: All Files (*.*)".

File/Open 3-D Plot File

You can view the results of a previously-run Frequency Search on a large data file (Analyze/Partition Large Data Set) as a rotatable and scalable Energy vs. Period vs. Time plot. The PLT file has 5 columns: Time, Period, Amplitude, Phase, MSER.

The Energies (actually %MSER) and Periods are those identified from the search over short sections of a large data set, typically sized to identify characteristic waveforms. If a certain waveform has, say, a length of 120 points the interval chosen would be about twice that. Only periods identified to match a "Template" waveform (to settable tolerance) would be included, usually producing a sparse plot, or all periods found might be included, which could generate a very dense (and slow) plot.

The keypad cursor keys rotate the plot about axes in the screen plane, the Insert/Delete keys rotate it about the normal to that plane (called the Z-axis). Working in pairs, the "/" or "*" keys, the "Home" or "End" keys, the "+" or "-" keys, the "PgUp" or "PgDn" keys and the "<" or ">" keys scale or move the plot, which is drawn in perspective. Each point is marked by a colored ball, and points are optionally connected by a colored solid line. See Settings/Screen Layout.

The results are printable as a graph, but are only a simple wire-frame plot. Printable colored or shaded plots with hidden-surface removal, legends and axis values are in the works.
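For users who want to post-process a PLT file outside Spectre, the five-column layout can be read with a few lines of Python. This is a hedged sketch: only the column order (Time, Period, Amplitude, Phase, MSER) comes from the description above; that the file is whitespace-separated ASCII text is an assumption, and the names `PlotPoint`, `parse_plt` and `strongest` are hypothetical.

```python
from collections import namedtuple

# Column order as listed for the PLT file.
PlotPoint = namedtuple("PlotPoint", "time period amplitude phase mser")


def parse_plt(text):
    """Parse rows of Time, Period, Amplitude, Phase, MSER.

    Assumes whitespace-separated ASCII values, one point per row.
    Blank, malformed or non-numeric rows are skipped.
    """
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            points.append(PlotPoint(*map(float, fields)))
        except ValueError:
            continue
    return points


def strongest(points, min_mser):
    """Keep only points whose %MSER is at least min_mser (a sparser plot)."""
    return [p for p in points if p.mser >= min_mser]


if __name__ == "__main__":
    sample = """
    0.0    30.0  4.2  0.10  55.0
    0.0     8.85 0.9  1.30   3.5
    240.0  30.0  3.8  0.25  48.0
    """
    pts = parse_plt(sample)
    for p in strongest(pts, 10.0):
        print(p.time, p.period, p.mser)
```

Filtering by %MSER before plotting is one simple way to thin out a very dense plot.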
File/Open Script File

Opens an editable script .INI file as a Text Window which can use the Clipboard, be edited, and saved to a named file (File/Save As or File/Save). Retain the .INI extension. Script files are used to tailor the Program Environment for a particular facility or for uniform analyses of related groups of subjects. Any directory structure you want is OK, even just leaving all files in the Spectre local directory (the default if no [datapath] or [templatepath] is specified). An example file is shown below.

; SPECTRE.CFG 1/20/2000
; Sample editable command and configuration file
; Copyright (c) 1998 CoDebris, all rights reserved.

; Backup your configuration file before editing using
; File/Save As with a unique file name, and save your
; changes using File/Save before using File/Run Script
; Command to execute commands.

; Free-form comments to end of line after a ";" are for your
; use, the ";" may be anywhere. Blank lines, tabs, spaces are
; ignored. Use comments liberally to help recall what entries
; really mean, or to "store" unused commands, etc. by
; preceding them with ";".

; Section header names (tokens) are between brackets [...],
; missing sections use defaults. Data after a section header is
; for that section only. All entries may be in mixed case.

[datapath]        ; Data files directory, usually whole path
c:\Spectre        ; from the root (first "\"). Can be a
                  ; subdirectory. The "c:" may be any
                  ; drive. Trailing "\" is optional.
                  ; Default is current directory, usually
                  ; c:\Spectre

[templatepath]    ; Waveform identification template files
c:\Spectre        ; directory

; Commands have the form "command, datafile, parameter, ..."
; (note commas). Parameters are the values you would enter
; into dialogs if running the command interactively from the
; menu system. For scripted commands all datafiles must have
; previously been saved with a Spectre header to auto-load.
; To speed up all Analyze/Partition runs, minimize or shrink the
; large data file Graph Window.

[commands]                                ; Executed in listed sequence
rms, a80.dat, 500                         ; RMS Reduction, 500 points

; Other examples of command usage:
;extract, kc41.dat, kc1.dat, autogen      ; Extract, file, template, tbl
;pause                                    ; Wait for "OK"
;search, kc41.dat, autogen                ; Search, datafile, period tbl
;longsearch, kc41.dat, 200, kc1.tbl       ; Search 200 points at a time
;longfilter, ktest.dat, 500, 21, 0.1, 0   ; file, points, taps, lo, hi cutoff
;open, kc1.dat                            ; Open file and display
;save, kc1.dat                            ; Save active Graph Window to file
;closeall                                 ; Close all windows

; Demos mostly use hard-coded Settings, but equivalent
; command strings can use any Settings (within reason)
;demo1
;demo2
;demo3
;demo4
;search, wheat88.dat, autogen             ; Demo1 (Commodity)
;search, breath.dat, breath.tbl           ; Demo2 (Yoga Breath)
;extract, ktest.dat, kc2.dat, autogen     ; Demo3 (Waveform Ident)
;search, test3031.dat, test3031.tbl       ; Demo4 (FFT comparison)

File/Convert Multi-Column File

Convert an ASCII-only multicolumn data file into individual data files, one for each column. The files are named COLxxx.DAT, where xxx starts at 001 for the first (left-most) column. Up to 512 columns may be split out from, for instance, a large spreadsheet. The output files are presently loaded into new Graph Windows; this will soon be an option.

Output files are created with a Spectre header, and are stored as a continuous stream of float (IEEE standard 32-bit coded decimal) numbers. To remove the Spectre header, or convert the data format, use File/Save or Save As and de-select the Spectre header checkbox or change the output layout and format.

As it is not certain column 1 will contain abscissa values for the other columns, the data is output Y-only (ordinates only) with an X (abscissa) interval of 1.
This too can be changed by the Graph Window System Menu options Select Data Set as Abscissa and Select Data Set as Ordinate, which produce a single data set from separate X and Y files if the X (abscissa) values are unevenly spaced and you want to retain the unevenness. You will also find this option illuminating for "association" of two sets of ordinates with one another on a single graph. Use Tools/Pre-Process/Scale X or Y Data to change the interval (X spacing) to a new, but uniform, value.

File/Close

Closes the active Text or Graph Window. If data has been altered in a Text Window, you are prompted to save it before closing. A Graph Window is never "dirty", as processing always produces a new, titled Graph Window. No cautionary message appears before closing graphs.

File/Save

Writes the contents of the active Text or Graph Window to disk. If the window title is not a valid DOS file name, you are presented with the File/Save As dialog to enter or select a filename. For data sets, the Save Data Dialog presents several options to format the data for storage. Use the "Attach Spectre Header" option to continue using the file in Spectre.

File/Save As

Presents a dialog to select or enter a file name under which to save the active Text or Graph Window. For graphs, the Save Data Dialog presents several options to format the data for storage. Use the "Attach Spectre Header" option to continue using the file in Spectre.

Save Data Dialog

Before writing a data file to disk, you are asked to specify the data type (ASCII text, 16-bit integer or 32-bit floating point binary) and whether the data is saved as pairs of X Y numbers, or as Y data values only. No file compression is supported.

You can also choose to attach a Spectre-style header record to the data file, which means you won't have to fill in the Data Type dialog when re-loading the file into Spectre. This isn't only a time saver, it means you won't have to REMEMBER the data format, layout or display mode.
The proprietary, plain-text header is required for running scripted commands. If you want to strip the header from an existing data file, load it and then save it with the Attach Header check-box de-selected (clicked off). Then you'll have to remember whether the data is ASCII, 8- or 16-bit integer, 32-bit float, X,Y pairs or Y-only. Most users like the header, and only strip it off when exporting data to another application.

In the X,Y layout, Spectre assumes each X value (the abscissa) precedes each Y value (the ordinate) and writes the data to file that way, with no delimiters: XYXYXY... The other layout format consists of Y values only, with no delimiters: YYYY... Information on the X-coordinate values is not saved when using this layout.

For ASCII text or binary files, the data will be arranged in rows of X Y pairs (separated by blanks, that is, white space), or as ordinate (Y) data only. For the widest range of values, data files should be saved as ASCII or as 32-bit floating point numbers in the universal IEEE format. Binary data is usually produced as an ordinate-only (Y-only) file, but you can elect to save the data set as X,Y pairs. Avoid ASCII for files larger than (say) 2000 points, as loading will be slow and the file large.

File/Save Graph Bitmap

The currently active (highlighted title) Graph Window is saved to disk as an uncompressed bitmap, in the .BMP format suitable for pasting into word processors and other applications, like PaintBrush (many such applications do not know how to deal with the various compression modes). The saved bitmap is gray-scale colored for printing: white background, black plot, gray axes and labels.

The aspect of the saved bitmap will be identical to the aspect of the screen Graph Window, allowing you to tailor its size and aspect for insertion without distortion by re-sizing the window. While saving to a BMP file, the gray-scale bitmap appears over its parent Graph Window.
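Returning to the headerless X,Y and Y-only binary layouts described under Save Data Dialog, the following is a minimal sketch of how another application might decode a header-stripped 32-bit float export. It is not Spectre code. The function name unpack_floats is hypothetical, little-endian byte order is an assumption (the manual does not state the byte order), and for the Y-only layout the X values are assumed to be the sample index 0, 1, 2, ... since that layout stores no X information.

```python
import struct


def unpack_floats(raw, layout="Y"):
    """Decode a headerless stream of 32-bit IEEE floats into (x, y) pairs.

    layout "Y"  -> values are YYYY...; X is assumed to be the sample index.
    layout "XY" -> values alternate XYXY..., each X preceding its Y.
    Little-endian byte order is an assumption, not stated in the manual.
    """
    if len(raw) % 4:
        raise ValueError("length is not a multiple of 4 bytes")
    values = struct.unpack("<%df" % (len(raw) // 4), raw)
    if layout == "Y":
        return [(float(i), y) for i, y in enumerate(values)]
    if layout == "XY":
        if len(values) % 2:
            raise ValueError("odd number of values in an X,Y stream")
        return list(zip(values[0::2], values[1::2]))
    raise ValueError("layout must be 'Y' or 'XY'")


# Round-trip demonstration on in-memory bytes (no file needed).
raw_y = struct.pack("<3f", 1.5, 2.5, 3.5)
raw_xy = struct.pack("<4f", 0.0, 1.5, 0.5, 2.5)
points_y = unpack_floats(raw_y, "Y")
points_xy = unpack_floats(raw_xy, "XY")
```

To read a real file, pass the bytes obtained from open(path, "rb").read(). Because the layout cannot be detected from the bytes alone, this is exactly the information you must remember once the header has been stripped.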
File/Print Text

Print the contents of the active Text Window. Graph Windows and 3-D Plot Windows are printable as scalable bitmaps using Print Graph.

File/Print Graph

The currently active (highlighted title) Graph Window or 3-D Plot Window is printed with a white background, black plot, gray axes and labels. The size of the printed bitmap can be identical to the size of the screen Graph Window (pixels equal half-tone dots), or you can tailor the size and aspect for printing.

A dialog to select printing options includes Best Fit, Stretch To Page, and independent X, Y Scaling. Stretch To Page will probably distort the graph's aspect. To preserve the aspect, or to choose a size on the printed page, use Best Fit or the Scaling option. As long as the X Scale and Y Scale are the same, no distortion will occur.

For printing a maximized Graph Window to nearly fill the printed page, use Landscape Mode (see Printer Setup) if supported by your printer (virtually all Windows printers support it). The upper left corner of the graph will always lie near the upper left corner of the page.

During and after printing, the as-printed gray-scale bitmap appears over its parent Graph Window. If you Cancel the print operation, the graph will remain gray-scale until you recover the original screen coloring, by re-tiling the screen, or re-sizing or maximizing the Graph Window.

File/Printer Setup

The standard Windows Printer Setup dialog is presented. Depending on your printer, you can select Portrait or Landscape orientation, Black Density, Print Quality, and other aspects of your printer's behavior. The next version of Spectre will support user-selectable options for color printing.

File/Run Demos

Depending on the intended audience, several demos can be run from the File Menu. In this version, you can run:

Commodity Demo: a Frequency Search on an annual commodity price data set.
The "energy" cutoff is set quite low, to allow Spectre to find small-amplitude cycles (roughly, monthly, quarterly, semi-annual) which occasionally coalesce in phase to produce a meaningful excursion. Looking only at the data set, or running a simple FFT on it, none of these cycles is evident. Only the cumulative effect, or breakout, can be seen, not the individual cycles which underlie the phenomenon. Spectre not only calculates these cycles, but precisely aligns them, properly phased, on top of the data during the search. You can pause the action by hitting Alt and return to the movie with Esc. Settings/Frequency Search allows the user to "tune" the analyses to his particular needs. A text output file is also generated. The dotted yellow line Reconstruction can be extrapolated into the "future" with a few clicks. Spectre includes several conventional and some very unconventional research tools.

Yoga Breath Demo: a Frequency Search on a physiological data set from a Yoga researcher. No Period Table was specified; instead Spectre used its AutoGen capability to generate one from the crude FFT and some low-level techniques, without effort from the user.

Waveform Identification Demo: a Waveform Identification from a time series containing repeated, slightly degraded samples of an electroencephalogram waveform (K complex). A rotatable, sizeable 3-D Plot produced at the end is a map of computed energy in each identified period, at the time each "event" occurs. If the waveform had been simply repeated and identical to the template used, the 3-D Plot would have perfectly aligned.

File/Run Script

To facilitate the processing of one or more data files and one or more operations, a scripted command capability is built-in. On selection, Spectre automatically and sequentially executes entries under the [commands] section of the selected configuration (.INI) script file. You can specify data directories to hold data and template files under [datapath] and [templatepath].
If not specified, their default directory is the Spectre.EXE directory. Analyze menu options can run as script commands, except ApEn.

Commands have the form "command, datafile, parameter, ..." (note commas). For scripted commands, all datafiles must have previously been saved with a Spectre header to auto-load. To speed up scripted Analyze/Partition runs, minimize or shrink the large data file Graph Window.

You also can script a sequence of commands to execute. For most commands, additional parameters are given on the same line as the command, comma-delimited. Scripted commands generally require the same parameters you would normally enter into dialogs during interactive (menu-driven) operation.

For response with fewer keystrokes or clicks, the default script file Spectre.CFG is pre-loaded in the Open Script File dialog's file name edit control.

If the program chirps when you run a script command, check the directory and filename spellings in the script INI file, and ensure that the directories exist and actually contain the requested files. Examine the diagnostic text file DIAG.TXT (in the same directory as Spectre.EXE) for more information.

Analyze/Partition Large Data Set commands available at this time are, for long data files:

Waveform Identification: specify "extract", the large data file long.dat, a short template data file (sample waveform to match in each scrolled window) and a period table previously chosen to characterize template.dat (or say AUTOGEN). The number of points on-screen in the scrolling Graph Window will be on the order of (but not less than) the number of points in the template. Example:

extract, long.dat, template.dat, template.tbl

Frequency Search: specify "longsearch", the large data file long.dat, the points parameter and period.tbl. The "points" parameter is the number of points on-screen in each scrolled Graph Window during the run. Example:

longsearch, long.dat, 512, period.tbl

RMS Reduction: specify "rms", the large data file, and the points per partition.
A partitioned Graph window (.PAR) is produced. Example:

rms, long.dat, 512

Continuous Filtering: specify "longfilter", the large data file, the points per partition, the number of taps (delays), and the lower and upper cutoff frequencies fL and fU. For fL = 0 the filter is lowpass, for fU = 0 the filter is highpass, and if both are non-zero the filter is bandpass. The example shows a 21-tap highpass filter with cutoff 7.5 (Hz, or whatever frequency units you're using) on long.dat with 512 points on-screen in the Scrolling Graph Window:

longfilter, long.dat, 512, 21, 7.5, 0

To run a Frequency Search using a normal-length (around a thousand points or so) data file, specify "search" and datafile.dat with a table of candidate periods period.tbl. Example:

search, datafile.dat, period.tbl

5. Analyze Menu

Spectre presents the novel and useful Frequency Search (or Fast Orthogonal Search), which greatly exceeds the power and resolution of the ordinary Fourier Transform, and with more flexible control over the process. Information is made available in both the spectral and temporal domains. Frequency Search is a superb data visualization tool.

Partition Large Data Set allows Frequency Search, Waveform Identification, RMS Reduction or Continuous Filtering operations on sequential, selectably short segments of very large data files. This is partly a means of avoiding time-consuming swapping of memory to disk, but the algorithms have significant analysis power for data that "evolves" with time. Each algorithm is written with data visualization as a primary interface goal.

Approximate Entropy measures the "regularity" of a data set. A Graph Window is produced from the analysis; each point is a measure of the "regularity" of the corresponding data set partition segment. The "regularity" is a mathematically and statistically valid data set "signature", which has proven useful in detecting Sudden Infant Death Syndrome (SIDS) episodes from cardiac data.
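To make the idea of "regularity" concrete, here is a minimal sketch of the commonly published Approximate Entropy definition (embedding dimension m, tolerance r, maximum-difference distance, self-matches counted). This is not Spectre's implementation: the manual does not say which m and r Spectre uses, so the defaults below (m = 2, r = 0.2 times the standard deviation) are assumptions, and the function names are hypothetical.

```python
import math
import random


def _phi(u, m, r):
    """Average log-fraction of length-m windows lying within r of each window."""
    n = len(u) - m + 1
    windows = [u[i:i + m] for i in range(n)]
    total = 0.0
    for a in windows:
        # Count windows whose largest element-wise difference from a is <= r.
        # The window always matches itself, so the count is never zero.
        matches = sum(
            1 for b in windows
            if max(abs(p - q) for p, q in zip(a, b)) <= r
        )
        total += math.log(matches / n)
    return total / n


def approximate_entropy(u, m=2, r=None):
    """ApEn(m, r) = phi(m) - phi(m + 1); lower values mean a more regular series."""
    if r is None:
        mean = sum(u) / len(u)
        sd = math.sqrt(sum((v - mean) ** 2 for v in u) / len(u))
        r = 0.2 * sd  # assumed default tolerance
    return _phi(u, m, r) - _phi(u, m + 1, r)


# A strictly alternating series is highly regular; pseudo-random values are not.
regular = [0.0, 1.0] * 50
rng = random.Random(1)
noisy = [rng.random() for _ in range(100)]
apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
```

Applied to each partition of a long file in turn, a function like this yields one regularity value per segment, which is the kind of series the Graph Window described above displays. The direct pairwise comparison costs on the order of N squared operations per partition, so it is only practical for modest partition sizes.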
Analyze/Frequency Search

A Frequency Search is the most powerful analysis tool in Spectre, and the reason Spectre was written. There is no other comparable technique for researchers to determine the information in their data. Fourier series and related techniques are difficult to interpret, have drastic limitations, and often require that finite-length (that is, all) data sets be extensively modified prior to application.

On selecting the Analyze/Frequency Search menu option, Spectre will present the user with the File/Open Period Table dialog for the period table and, if necessary, the File/Open Data File dialog for the data file. If period table text windows and/or data set graph windows are already open, Spectre assumes that the last active windows of each type are to be used in the Frequency Search.

Search Progress

During the Frequency Search a dialog box appears on-screen displaying the progress of the search: the last identified period, its relative Mean Square Error Reduction (MSER, or wave "energy"), the computed amplitude and phase for the ID'd period, and the cumulative MSER. The candidate period currently being tested is also shown. Hit Abort Search to stop processing at any point; results to that point are presented.

Fast Orthogonal Search

Researchers and analysts often need to know if experimental data (a time series) contains significant amounts of signal at frequencies of interest to them. These frequencies may correspond to driving forces, environmental constraints (e.g. boundary conditions), or system responses, and may result from intrinsically non-linear processes.

Many times, a mathematical or phenomenological model exists to explain some observed behavior, and experimental data is collected to verify whether or not the model is correct. A frequency search, or decomposition, is a fundamental approach to performing these analyses.
The usual approach is to pre-process the data, apply a Fast Fourier Transform to the series, and plot the spectrum magnitude. Peaks in the FFT spectrum may correspond to interesting frequencies. However, it is difficult for anyone but a signal processing expert to know how much energy in the time series is actually accounted for by a given frequency. It is even harder to resolve nearby, overlapping broad peaks. More often than not, "noise" dominates the data and cannot be de-coupled from signals of interest. Pre-processing (filtering and windowing) of time series data sets is a demanding discipline, and many of the rules to assure the validity of pre-processing operations are difficult to apply.

There is an alternative, well-researched and nearly painless method available to perform such frequency analyses. You first prepare, as a text file, a table of candidate periods, or have Spectre do it for you. On initial creation, the table usually contains a fair number of entries, as there may be no prior knowledge of what is really present in the data. Periods may be longer than the time series, or as short as twice the time interval between points. But the researcher often knows what to look for based on theory or existing work, and the table will contain several periods in the regions of interest.

An algorithm called the Fast Orthogonal Search is applied to a time series, using the table of candidate periods, and the precise energy, amplitude, and phase of sine waves corresponding to entries in the table are displayed graphically over the original data graph. The objective is to determine if frequencies of interest to the researcher are present in significant measure.

The measure employed is the "Mean Square Error Reduction" or MSER. The MSER is a measure of the "energy" in the waveform, and is computed for all candidate frequencies. The most significant results are reported in a text .OUT file after the run.
You control the criteria for significance via an interactive dialog before the search.

The Fast Orthogonal Search (FOS) algorithm is an adaptation of the original Orthogonal Search Method developed by Michael J. Korenberg at Queen's University, Kingston, Ontario (see References). FOS searches a user-supplied table of candidate periods and selects the best candidate periods to fit an input time history. Results are stored to an OUT file formatted to be read by the user. Search results can also be saved as a data set by invoking the menu option Tools/Reconstruct Search Results to make a reconstructed time series, which can optionally be extended forward into the "future".

The program examines a table of candidate or search frequencies created by the investigator or automatically by Spectre (AutoGen), determines the ability of each period to explain a portion of the total variance (mean square error, or MSE) and orthogonally removes the sine explaining the largest percentage of the time series variance. The process is repeated on the residuals until there is no further error reduction or until a specified number of periods have been identified.

Fast Orthogonal Search (FOS) allows a researcher to select from a time series some assumed periods of interest, which are tested by FOS for the Mean Square Error Reduction (roughly the "energy") accounted for by sinusoids of computed amplitude and phase. The 'time', of course, may be any monotonic physical variable characterizing the data, e.g. position.

FOS analyzes a time series stepwise, reducing it to a combination of sines. Unlike a Fourier Transform, the most common analysis tool, FOS does not have a limitation that the sines be related in a harmonic series. In FOS, identified periods can be arbitrarily close. FOS is thus not subject to spectral "leakage" which occurs when the actual frequencies comprising the data are not harmonically related, which is almost always true.
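The select-and-remove loop described above can be illustrated with a short sketch. This is a simplified greedy residual fit, not Korenberg's Fast Orthogonal Search and not Spectre's code. At each step it least-squares fits one sinusoid per candidate period to the current residual, keeps the candidate giving the largest percent Mean Square Error Reduction, subtracts it, and repeats. True FOS also orthogonalizes each candidate against the terms already selected, which this sketch omits. All function and key names are hypothetical, and the phase convention (amplitude times cos(wt minus phase)) is an assumption.

```python
import math


def fit_sinusoid(t, y, period):
    """Least-squares fit of a*cos(w t) + b*sin(w t) to y; returns (a, b) or None."""
    w = 2.0 * math.pi / period
    c = [math.cos(w * ti) for ti in t]
    s = [math.sin(w * ti) for ti in t]
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    cs = sum(ci * si for ci, si in zip(c, s))
    cy = sum(ci * yi for ci, yi in zip(c, y))
    sy = sum(si * yi for si, yi in zip(s, y))
    det = cc * ss - cs * cs
    if abs(det) < 1e-12:  # degenerate normal equations, skip this period
        return None
    return (cy * ss - sy * cs) / det, (sy * cc - cy * cs) / det


def greedy_search(t, y, periods, max_terms=None, cutoff_pct=2.0):
    """Pick periods one at a time by largest percent MSE reduction."""
    mean = sum(y) / len(y)
    resid = [yi - mean for yi in y]
    total = sum(r * r for r in resid)  # total "energy" about the mean
    remaining = list(periods)
    limit = len(remaining) if max_terms is None else max_terms
    found = []
    while remaining and len(found) < limit and total > 0.0:
        best = None
        current = sum(r * r for r in resid)
        for p in remaining:
            ab = fit_sinusoid(t, resid, p)
            if ab is None:
                continue
            a, b = ab
            w = 2.0 * math.pi / p
            trial = [r - a * math.cos(w * ti) - b * math.sin(w * ti)
                     for ti, r in zip(t, resid)]
            mser = 100.0 * (current - sum(r * r for r in trial)) / total
            if best is None or mser > best[0]:
                best = (mser, p, a, b, trial)
        if best is None or best[0] < cutoff_pct:
            break  # nothing left above the cutoff
        mser, p, a, b, resid = best
        remaining.remove(p)
        found.append({"period": p, "amplitude": math.hypot(a, b),
                      "phase": math.atan2(b, a), "mser_pct": mser})
    return mean, found


# Two sinusoids with non-harmonic periods, searched with a four-entry table.
t = [float(i) for i in range(200)]
y = [3.0 * math.cos(2 * math.pi * ti / 50.0) + math.sin(2 * math.pi * ti / 7.3)
     for ti in t]
mean, found = greedy_search(t, y, [100.0, 50.0, 20.0, 7.3])
```

In this example the two periods present are picked first, strongest first, and the search stops once the remaining candidates fall below the cutoff. Nothing in the fit requires evenly spaced t values or periods that are harmonics of one another, which is the property the text contrasts with the Fourier Transform.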
FOS is reasonably noise-tolerant because all data elements are used only in series-wide averages over the orthogonal basis functions. FOS is especially valuable in the presence of sporadic noise. FOS does not require that the data be recorded at constant intervals. In fact, some of the data can even be missing without noticeably degrading the results!

Note that FOS execution time is proportional not only to the length of the time series, but to the number of candidate periods in PERIOD.TBL and the number of spectral features to search for.

Preparing for a Frequency Search

The user is responsible for maintaining two types of external files: a table of candidate periods and a data file, or time series. The tables must have a .TBL extension. The data set files ordinarily have a .DAT extension, but that extension (or any extension) is optional. Period tables can be read into Spectre and edited, but they MUST BE SAVED (using File/Save or File/Save As) before use in a Frequency Search, as the table is read from disk each time it is used.

After selecting a Period Table for analysis, you are presented with a dialog to choose the number of candidate periods to select (the default is all the periods in the table) and an "MSE cutoff". The MSE cutoff (default value 2%) is basically the lowest "energy" sine wave Spectre will use in building a 'Reconstruction' data set (the yellow dotted curve which builds up during the Frequency Search). Typically, a couple of long-period sinusoids contain most of the energy in a data set and define its trends, while shorter period waves above the MSE cutoff build detail and structure into the reconstruction.

A new feature allows the automatic generation of a table of candidate periods from an initial analysis of the data set.
The technique, called AutoGen, uses a combination of FFT and a collection of techniques called Period/Amplitude analysis to determine a range of periods which contribute significant amounts of wave energy to the data set.

If you are interested in determining whether only a given set of periods is found in your data (and not others which may contribute significant energy), you can include only those numbers in the Period Table. The amplitudes, phases, %MSER, etc. will be correct for those periods identified, but the reconstruction won't necessarily look very convincing.

Some users load a larger number of candidate periods into the table, running from several times the length of the time series to be searched all the way down to a few times the interval between data points, and covering the region between in a quasi-logarithmic way: say 5 points above 1000.0, 5 points in the hundreds, 5 more from 10.0 to 100.0, etc. After a preliminary run (perhaps on an abridged data set if it is very long, or one interpolated to have fewer points; see Tools/Pre-Process/Interpolate), examine the OUT file for likely candidates to refine, and others to drop from further consideration. AutoGen is obviously a good place to start building a table.

Note that data set files are not directly editable as a Spectre menu option, but they (or copies of them, actually) may be manipulated in the pre-processor (menu selection Tools/Pre-Process). This allows outlier (artifact) removal, partitioning, smoothing, etc. to be applied to named data sets. ASCII data files can be opened as text for viewing only in an editable window by choosing "All Files (*.*)" from the file type list box in the File/Open Period Table or the Open Output File dialog. But do not try to save the data set from the text window: you will end up with a truncated data set file if its size was larger than the 32 kByte limit imposed for text windows, about 900 ASCII data points.
Editing data sets manually is something of a "No-No" anyway, like cheating at Solitaire.

Frequency Search Parameters

Spectre uses two user-specified criteria to decide when to conclude a Frequency Search. In the Search Parameters Dialog, the user enters the number of periods to search for, and the Mean Square Error Reduction (roughly, the "energy" in a given period) cutoff value, in percent of the total (100%) "energy" in the data set. The dialog initially contains the number of candidate periods in the TBL file selected by the user, and an MSER Cutoff value of 2%.

The MSER cutoff is simply the lowest "energy" sine wave Spectre will use in building a Reconstruction data set (the yellow dotted curve which builds up during the Frequency Search). Typically, a few long-period sinusoids contain most of the energy in a data set and define its trends, while shorter period waves above the MSER cutoff build detail and structure into the reconstruction.

You also can choose whether to display a Periodogram or a Spectrogram. The Periodogram displays percent Mean Square Error Reduction (relative "energy" in an identified sine wave) versus the selected periods. The Spectrogram displays the same results as a spectrum: the periods are inverted to frequencies (F = 1 / P). In either case, the graph consists of narrow lines at heights corresponding to the % MSER (as listed in the OUT file). The lines are narrow because the Frequency Search algorithm has very fine resolution: two periods very close together can easily be resolved. This is one of the many advantages of using Spectre to perform Frequency Search, rather than merely running an FFT spectrum on your data.

Selecting appropriate values to use in a period table depends greatly on the characteristics of your data set and the Period Table (see Period Table Format) you have created.
If your data set is long, you may want to practice with a lower-resolution version of the data: use Tools/Pre-Process/Interpolate and enter, for the number of data points, no more than (say) 1000. You'll get a "rough copy" of the data set to refine your analysis skills; save it to disk with File/Save As with an easily-recalled name, including the DAT extension so it appears in the File/Open Data File dialog when you repeat your tests. The AutoGen option may help in creating a first-cut period table you can refine as experience grows.

Interpreting Frequency Search Results

Also see Reading the OUT File and examining the Periodogram for more detail...

The Frequency Search option selects periods from your TBL table to satisfy its requirement to remove as much of the residual Mean Square Error from the data set as possible. Each period is tested sequentially. The one which removes the largest error (fits the data best) is selected and the process is repeated on the remaining periods until the completion criteria you selected are met.

To aid in interpreting results, you can look at the MSER, or Mean Square Error Reduction, as 1/2 times the squared amplitude of the wave (i.e. the "energy" in the series). This is strictly true only if either an integral number of cycles of the selected period sine waves fit in the data set, or if there are many such cycles. But it is a convenient way to visualize the ability of selected periods to "explain" your data.

Probably the most illuminating insight into your data can be achieved by watching the reconstruction form on your monitor screen as the search progresses. During a Frequency Search, you will see the yellow superposed sine curves begin to hug the data closer and closer, eventually lying on top of the data curve. The first wave drawn corresponds to the period in your table responsible for the most "energy".
Notice that it tends to fit the side-slopes, or "trends" in your data set, and pretty much ignores peaks and valleys unless they are very broad. The eye tends to see the peaks, but the energy resides in the slopes. Successive, usually shorter period selections do a better job of snaking their way up into the peaks and valleys.

If your Period Table contained a good enough portrayal of the periods actually in the data set, you will account for more than 50% of the energy. If not, think about possible reasons, enter new or different periods into the TBL file using the Spectre text editor or your own, and re-run the search. Periods that didn't show up in the initial passes probably are not found in the data and can be removed from the table; this will markedly improve run speed. Use the AutoGen capability to let Spectre estimate a range of periods to use based on a quick internal spectral/temporal estimate. The AutoGen option typically yields upwards of 60% of the energy in a data set.

You can use the period, amplitude and phase offset of each identified sine wave from a Frequency Search to make a new, 'synthetic' data set reconstruction, virtually automatically. The reconstruction may optionally be extended into the "future". The reconstruction may help assess the effect of noise in real data sets.

Ensure that the data set graph on which the Frequency Search was performed is active, by clicking the mouse anywhere over it or by hitting Ctrl-Tab repeatedly until the graph window is highlighted (see Activating Graph Windows). Then select the Tools/Reconstruct Search Results menu option. A new Graph Window appears, holding the data set reconstruction comprised only of sine waves selected by the Frequency Search. Try running an FFT on the Reconstruction and compare with an FFT on the actual data.
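As an illustration of what a reconstruction from search results amounts to, the sketch below sums sine waves from a list of (period, amplitude, phase) triples and optionally continues past the end of the original data. It is not Spectre's code. The function name reconstruct is hypothetical, and the convention used (amplitude times sin(2 pi x / period + phase), phase in radians, X starting at 0) is an assumption, since the manual does not state how Spectre defines the phase offset.

```python
import math


def reconstruct(components, n_points, n_future=0, dx=1.0, offset=0.0):
    """Sum sine waves at x = 0, dx, 2*dx, ... for n_points + n_future samples.

    components: iterable of (period, amplitude, phase_in_radians) triples.
    n_future:   extra samples beyond the original data (the "future").
    offset:     constant level added to every sample, e.g. the data mean.
    """
    components = list(components)
    series = []
    for i in range(n_points + n_future):
        x = i * dx
        value = offset
        for period, amplitude, phase in components:
            value += amplitude * math.sin(2.0 * math.pi * x / period + phase)
        series.append(value)
    return series


# Two identified waves, reconstructed over 100 points and extended by 25.
components = [(100.0, 2.0, 0.0), (25.0, 0.5, math.pi / 2)]
series = reconstruct(components, n_points=100, n_future=25)
```

Because each term is a pure sine wave, the extension simply repeats the identified cycles. It says nothing about components that were not in the table or that fell below the cutoff.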
If you subject this reconstructed data set to a Frequency Search using the same Period Table as for the original search, the yellow reconstruction curve will finally lie very nearly on the data plot; in fact you may not see any difference. A nice bit of closure...

Spectre is not a curve-fitter, nor is it an FFT. It searches the data using the supplied candidate periods in the Period Table, selecting those periods which account for the most "energy". The terminology for energy in use is Mean Square Error Reduction, but MSER is not necessarily intuitive to many researchers. Strictly, the MSER is wave energy = 1/2 times the squared amplitude only if many cycles or an integral number of cycles exist in the data set.

Spectre examines the table of candidate or search frequencies created by the investigator, determines the ability of each period to explain a significant portion of the total variance (mean square error, or MSE) and orthogonally removes the sinusoid explaining the largest percentage of the time series variance. This process is repeated on the residuals until there is no further significant error reduction or until a specified number of periods have been identified.

Frequency Search doesn't do much with spikes and sharp peaks in data. The sharper the peak, the less energy it contains (for finite amplitude) and Spectre seeks to explain where the energy is. Also, recall that a sharp spike contains a large number of frequencies over a very wide band. Spectre does best with rolling hills and valleys. It's country code.

Make a single sawtooth wave using the Tools/Generate Synthetic Data selection. Make a Period Table with periods extending from very large (much larger than the length of the data set) to very small values (on the order of the interval between data points). You can use AutoGen to automate this task. Run Analyze/Frequency Search on the sawtooth.
The code is likely to choose a single very long period, so that the slope of the wave coincides with the slope of the sawtooth. Not what one would expect from an FFT, which would say the sawtooth contains a large number of harmonics of the fundamental sawtooth frequency. Is Spectre wrong? No, it matched the data. Is it right? Not many researchers would be inclined to say they matched their data with the initial rising slope of a sine wave several times longer than their data set. But they could...

CoDebris will eventually modify the criterion for selecting periods from a reliance on MSER alone, to one which (optionally) weights specified period components preferentially, after most of the energy has been accounted for in the low-frequency components. This will allow replication of more high-frequency (low total energy) features. For now, you can set the MSE cutoff value (Settings/Frequency Search) to something like 0.1% to get more detail, with a commensurate increase in running time.

The derived amplitudes are roughly sqrt(2 x the energy in the associated period). Only periods in the table having the most energy are used. Researchers commonly 'shotgun' a wide range of candidate periods chosen from their experiments or theory, which works but can be time-intensive to compute.

Run time is proportional to the product of the total number of candidate periods (12 in BREATH.TBL), the number of data points (1000 in BREATH.DAT), and the number of periods actually selected. Selection stops at the number of periods to select (up to 12 in BREATH.TBL) or at the MSER cutoff (default value 2%, in percent of the total energy in the data), whichever occurs first. A lower cutoff will cause Spectre to spend more time finding low-energy data set components.

The Fast Orthogonal Search algorithm is capable of much greater time resolution than a Fourier transform, and is not limited to harmonics of a fundamental frequency.
The algorithm is also quite insensitive to noise, as all data elements are used only in series-wide averages over the orthogonal basis functions, especially compared to techniques requiring a very narrow band filter to isolate the signal. In many nonlinear or biological systems, the signal frequencies move, or breathe, as the system evolves, so specifying a narrow band filter is not adequate. Finally, FOS even tolerates missing data points, or irregularly-spaced data sets. An amazing piece of work by Michael Korenberg, I think you'll agree. See the References.

Periodogram and Spectrogram

As an auxiliary aid for depicting the results of a Frequency Search you can select to view a graph of the percentage of data set "energy" (actually MSER) versus periods P identified. Alternatively, the results can be viewed in the frequency domain, like an FFT, but as a line spectrum. The graph will then show relative energy in each identified frequency F, where F = 1 / P.

The analogy with a line spectrum is not accidental: the FOS algorithm used in Frequency Search has no intrinsic limitation on how close together two periods can be, either in a data set or in the period table (TBL file) you use. Try Tools/Generate Synthetic Data to make a data set of two sine waves very close together in period. Edit the periods you chose into a TBL file and run Frequency Search on the data set. Run an FFT on the data set for contrast. You will find that Spectre identified both periods with no ambiguity, even if one of the periods was longer than the data set. The FFT fails badly. Resolution does not seem to be a problem, even when two narrow, overlapping bands of frequencies are analyzed.

The Periodogram or Spectrogram option (or none) is selected in the Search Parameters Dialog before running a search.

Analyze/Partition Large Data Set

A new set of analysis procedures in Spectre allows processing data files too large to fit in RAM.
It is definitely not a good idea to use Windows Virtual Memory to handle large files: swapping the data set in and out of memory to the swap file on disk (access time in milliseconds) can be up to 100,000 times slower than accessing RAM (under 100 nanoseconds). The Help/About... window shows an approximation to the actual amount of RAM available, not only the Virtual Memory size which includes the size of the swap file. As a rough guide, each data point on-screen requires about twenty bytes of code and data memory to just sit there and look pretty.

The analysis of large files is enabled by reading into memory user-selectable sections of data called partitions, which (except for Continuous Filtering) are treated independently of data preceding or following in the file. Thus, systems which evolve ("breathe" or "cycle") as time progresses, such as physiologists and investment analysts concern themselves with, can be analyzed for changes to and recurrences of their dominant and minor modes.

To proceed with the Waveform Identification option, plan to prepare a Template data set and a corresponding table of candidate periods (TBL file). See Select Template for more information. The number of Scroll Points will be automatically selected to be the same as the number of points in the template data file. The template will first be analyzed for those periods which maximize their contribution to its waveform (it is suggested that the table AutoGen capability be used here). Each partition from the large data set will then be analyzed using the same candidate periods which were successfully identified for the waveform template. Those partitions which have a signal closely resembling, but not necessarily exactly matching, the template will be identified and extracted. A new time-stamped Graph Window is created from each successful identification.
The Waveform Identification option is thus a method to identify and isolate specific types of short waveforms embedded in large data sets. Note that the data set is scrolled by the number of points in the template, but each new scroll is from the (optionally) mid-point of the previous scroll. This helps prevent chopping a candidate waveform which may lie across a scroll boundary. Frequency Search saves results from the FOS analyses of each section to a file for display in a 3-dimensional plot (energy vs. selected periods vs. time of occurrence in the large file) or for interpretation by the researcher later. The plot file is ASCII, so may be edited or merged for display with other same-format plot (PLT) files. RMS Reduction replaces the points in a partition by a single value: the RMS value of all those points. The resulting Graph Window thus produced is a good measure of how the large data set energy varies from beginning to end. See Scroll Controller for a method to use the RMS Reduction Graph Window to enable visually-controlled, single-click access to the large data file. Continuous Filtering a large data set works the same as Tools/Filter Data Set and writes the filtered data to a file with the same name as the original data set and a FLT extension. It is written using an in-line filter algorithm so the multi-tap induced lag does not affect every partition: a continuously-filtered data set is produced and the original data file is NOT overwritten. An update window displays the progress of the partitioning. Continuous Filtering or RMS Reduction typically require less than 2 minutes per million data points on a fast Pentium CPU. Waveform Identification or Frequency Search requires more time (typically about an hour per million points), proportional to the number of data points in each partition and the number of candidate periods in the associated TBL file. 
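The scrolling scheme described above, where each new scroll starts from the mid-point of the previous one so that a waveform lying across a boundary is not chopped, can be sketched as an index generator. This is only an illustration: the function name `partitions` is hypothetical, and dropping a trailing partial partition is an assumption, not documented Spectre behavior.

```python
def partitions(n_points, size, from_midpoint=True):
    """Yield (start, stop) index pairs for successive partitions.

    With from_midpoint=True each partition starts at the mid-point of the
    previous one (50% overlap); otherwise partitions are laid end to end.
    A trailing partial partition is dropped (an assumption of this sketch).
    """
    if size < 2:
        raise ValueError("partition size must be at least 2 points")
    step = size // 2 if from_midpoint else size
    start = 0
    while start + size <= n_points:
        yield start, start + size
        start += step
```

For a 10-point file and a 4-point template this gives the windows (0, 4), (2, 6), (4, 8) and (6, 10), so a waveform straddling points 3 to 5 still falls wholly inside the second window.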
Waveform Identification

Under the menu selection Analyze/Partition Large Data Set are options to analyze large data files, including the identification in time and the extraction of a characteristic waveform for which you supply a "template" data set (see Select Template). Each occurrence of a section of the large data file which the program identifies as resembling the waveform is extracted into a new Graph Window. You can then visually decide for yourself if the waveform is sufficiently like the template you supplied. Each newly extracted Graph Window is named Txxxxxxx.WVF, where xxxxxxx is the X (usually time) value at the beginning of the extracted waveform, and of course may be saved to a small data file.

The option is also available as a scripted command in the Spectre configuration (.INI) file:

    extract, datafile, template_data, template_TBL

where "datafile" is a long binary data file and "template_data" is any data file (both must possess a Spectre header for use in scripted commands, as the data file will be read in without user interaction). Corresponding to the "template_data" file (the waveform being searched for in the large "datafile") is a "template_TBL", the file (table) of candidate periods to be searched for during the analysis. The template data file is assumed to have been analyzed with a regular Analyze/Frequency Search menu command.

If nothing is known about the template waveform, the AutoGen option can be used to generate a table of candidate periods based on the energy in the template over a broad range of periods. Rename AUTOGEN.TBL to whatever name you want; an edited version of this table will be the one you specify to be applied to the Waveform Identification process. You can edit AUTOGEN.TBL to include exact periods you know to be present in the template, or to delete periods which were not identified during the Frequency Search using AUTOGEN.TBL.

The algorithm uses a Frequency Search on each section ("partition") of the large data file.
The reconstruction (see Tools/Reconstruct Search Results) from that analysis is normalized and compared in a mean-square error sense to a normalized reconstruction of the template. In addition, the maximum signal amplitude must be at least 50% of the template signal amplitude. Each partition will be analyzed independently of the remaining data in the file. Select Template A Waveform Template is a Graph Window containing a relatively small data set, usually no more than a few hundred points, which is selected to be analyzed for its dominant frequency content and subsequently used to extract similar waveforms from a large data file. For example, if a large data file contained a few million data points and every now and then a particular "signature" waveform was embedded in the data, the "signature" can be selected as a template to automatically find and extract all similar occurrences of the signature in the large data file. Open the large data file into a scrolling Graph Window. Enter the number of points per scrolling interval to be somewhat more than the number of points in the "signature" waveform. Scroll through the Graph Window until a good example of the signature appears. Drag the mouse over the region containing the entire signature. Make sure the rubberband completely encloses the signature, in both the X and Y dimensions. Using the Tools/Pre-Process/Clip Outside Rubberband option, the signature will be copied to a new Graph Window. This is your template. Save it to a named data file in the script file [templatepath] directory, or simply to the local Spectre directory if [templatepath] is not specified. The Waveform Identification process is relatively insensitive to moderate noise, and the process does not greatly depend on the exact location of a signature in the large data file. If the signature varies somewhat in duration, however, you will want to make a few templates over the range of durations expected. 
RMS Reduction RMS Reduction produces from a large data file a new Graph Window composed of the summed root mean square value of every point in each scrolled partition: if the partition size is 2000 points, partitioning a 1,000,000 point large data file will produce a graph of 500 points, each of which is the RMS value of the corresponding 2000-point section from the large data file. The Graph Window produced is a good measure of how the large data set energy varies from beginning to end. After the run, it is now possible to quickly access any portion of the large data set by double-clicking on its Scroll Controller Graph. If the Controller is the RMS Reduction of the large data file, you actually have a "road map" of the large data file (which is too large to be viewed on-screen all at once) and you can instantly go to any interesting-looking location (approximately) by simply clicking on the Controller Graph Window. At a place on the Controller graph where the "RMS energy" suddenly experiences a sharp drop, you can instantly view the section of actual data that contributed that sharp drop. Now scrolling slowly back through the large data set from that section, it may be possible to see a "precursor" waveform or behavior which can be used to "predict" when that kind of sharp drop might be expected to re-occur. The RMS Reduction operation for a 1,000,000 point data file requires less than one minute on a fast Pentium, and a Controller/Slave pair quickly provides a very useful "gross behavior" view of the large data file while simultaneously viewing "fine-scale" details of interesting regions. See also Tools/Pre-Process/RMS Reduction for the corresponding option applied to a short data file, all of whose points are loaded into one Graph Window (non-scrolling). 
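As a rough illustration of the reduction described above, the sketch below replaces each consecutive partition by its RMS value. The function name `rms_reduce` is hypothetical. The `ddof` parameter is an assumption added so that the divisor can be either the partition size (`ddof=0`) or one less than the partition size (`ddof=1`), as the Pre-Process/RMS Reduction description states; which divisor the large-file option uses is not stated in the source.

```python
import math

def rms_reduce(values, size, ddof=0):
    """Replace each consecutive `size`-point partition by its RMS value.

    ddof=0 divides the summed squares by the partition size; ddof=1 divides
    by one less than the partition size. A trailing partial partition is
    ignored (an assumption of this sketch).
    """
    if size - ddof <= 0:
        raise ValueError("partition size too small for the chosen ddof")
    reduced = []
    for start in range(0, len(values) - size + 1, size):
        chunk = values[start:start + size]
        reduced.append(math.sqrt(sum(v * v for v in chunk) / (size - ddof)))
    return reduced
```

With a partition size of 2000, a 1,000,000-point file reduces to 1,000,000 / 2000 = 500 points, matching the example in the text.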
Continuous Filtering Continuous Filtering uses an in-line filter algorithm so the multi-tap induced lag does not discontinuously affect every partition: a continuously-filtered data set is produced and the original data file is NOT overwritten. The procedure is speeded up tremendously if the scrolling Graph Window is minimized, avoiding the overhead of graph repaints. Continuous Filtering writes the filtered data to a file with the same name as the original data set and a FLT extension. See also Tools/Filter Data Set. From the command script (INI) file, the command line syntax for "longfilter" is: longfilter, datafile, partition_points, taps, lowercutoff, uppercutoff where "partition_points" is the number of points to filter in each partition, "taps" is the number of filter taps (or delay lags, in analog parlance), "lowercutoff" and "uppercutoff" are the lower and upper roll-off frequencies in frequency units (not radian frequency, not period). The .FLT file is created in the [datapath] directory if [datapath] is specified in the script file. Scroll Points When running Analyze/Partition Large Data Set, select the number of data points to be on-screen in a scrolling Graph Window at any time. For Frequency Search, pick a number which will include at least the largest period you believe to be present in the data over at least most of one cycle. For example, if the largest such period is 100 days, and the interval between data points is 1/2 day, use at least 200 points: say about 250 or 300. If the large data file records phenomena which evolve between various states (e.g. "breathing" cycles during high and low activity intervals) keep in mind that if the partition is too long, this evolutionary behavior may be "washed out", or intermixed, in the analysis. For RMS Reduction, pick the number of points to be such as to yield a reasonably detailed graph of the partitioned data. 
For example, if the large data file has 1,000,000 points, using a partition size of 2000 points will result in a 500-point graph of the result, sufficient to show all the data on a VGA screen. Experiment to find the balance between resolution (detail) and comprehensiveness. For Continuous Filtering the choice is not too important, as you will probably want to minimize (iconize) the large data file's scrolling Graph Window to avoid time-consuming repaints of the window as the data scrolls through it. Minimizing the scrolling window during a Partition run is usually a good idea for any of the above options unless you want to look over the data as it is processed.

Script Commands

Some Spectre menu options are available as scripted commands under the [commands] token in an editable .INI file. The menu options under Analyze/Partition Large Data Set are available as scripted commands in the Spectre configuration (.INI) file:

    extract, datafile, template.dat, template.tbl
    longsearch, datafile, points, period.tbl
    rms, datafile, points
    longfilter, datafile, points, taps, lowcutoff, highcutoff

where "datafile" and "template.dat" must possess a Spectre header, as the data file will be read in without user interaction. A "datafile" which is already opened into a Graph Window on-screen won't be re-loaded; the command will be executed on the same-named Graph Window. Commands, datafiles, and other parameters may be in upper case, lower case, or mixed case. "points" is the number of data points in each partition (see Scroll Points), and is the number of points on-screen in the scrolling Graph Window. If "template.tbl" is AUTOGEN.TBL, the table of candidate periods will be automatically generated by Spectre. For either "period.tbl" or "template.tbl" the .TBL extension is optional. A series of commands may be entered in any order; they will be executed sequentially. A failure of any command to execute successfully (bad filename, incorrect directory, etc.)
will halt processing of any subsequent commands. This is done because often a command uses the results of a preceding command as its input "datafile". The diagnostics from a run are listed in the DIAG.TXT file generated during a Spectre session.

Analyze/Approximate Entropy

Approximate Entropy measures the "regularity" of a data set. A Graph Window is produced from the analysis; each point is a measure of the entropy of the corresponding data set partition segment. This entropy is a mathematically and statistically valid data set "signature", which has proven useful in detecting Sudden Infant Death Syndrome (SIDS) episodes from cardiac data and in other venues. Entropy is the statistical measure of disorder, or lack of regular behavior, in a system. For a data set, entropy can mean reasonably long runs of points that don't jump out of a pre-determined range of values. Researcher Steven M. Pincus (see References) has derived a measure of entropy for data sets which quantifies this concept in a manner consistent with the mathematical and physical definition of entropy.

Parameters

Enter the Partition Length D, which is the number of data set points (default 50) over which the entropy is to be calculated to yield one point on the ApEn graph. After computing ApEn in the first partition, the next D points are analyzed independently of the rest of the data set values and the process repeated until all partitions are so analyzed. You also enter the Run Length (number of data points, M) over which the entropy of the data set is to be estimated. Every batch of M consecutive points is checked against all other M-sized groups. The Run Length must be less than the Partition Length. The default is 3 points. Finally, enter the "regularity" filter criterion R, defined as a percentage of 1 standard deviation (s.d.). For R = 3% (a typical value, but play around), we check to see if during a run of M points any of them exceed the preceding value by more than R = 0.03 s.d.
Given a run of points which fall into a pre-determined band of values (usually stated as a percentage of the data set standard deviation, or RMS value), Spectre calculates the number of cases in each data set partition for which this criterion is achieved, and compares it with the corresponding number from all other run lengths in that partition. The logarithm of the ratio of these numbers is averaged over the partition.

Interpretation

The result is called Approximate Entropy, or ApEn, and is near zero for very regular data sets, increasing for progressively more disordered data sets. Pincus shows electrocardiogram (EKG) data for an infant who is experiencing great cardiac stress (Sudden Infant Death Syndrome, or SIDS) while in the cradle, but who survives the episode, a so-called "aborted SIDS event". Comparison of the EKG with that for a normal infant shows the SIDS infant heart is curiously devoid of great beat-to-beat variability. Other diseases can show the opposite relationship, in that normal behavior exhibits greater "disorder" than the pathological case. ApEn is able to quantify this variability, or lack of it.

ApEn is a function of two parameters, the run length M and the regularity filter criterion R. This is written ApEn (M, R) and the values of M and R are selected by the researcher to best illustrate the regularity difference between the normal and abnormal case. After suitable selections of M and R ranges, the researcher would, of course, seek to establish that the ApEn criterion applies to a number of normal vs abnormal test cases. More prosaically, ApEn can ascribe a "signature" to certain classes of data sets, by which they may be quantitatively differentiated, without having to examine the traces visually.

6. Tools Menu

Tools are provided to modify data, and to synthesize a data set from other data sets or from scratch.
Synthesis is the process of modifying or creating a data set from components, including components derived from analysis (see the Analyze menu) or from another data set. These options are available:

    Tools/Pre-Process Data Set
        Pre-Process/Clip
        Pre-Process/Differentiate
        Pre-Process/Discretize
        Pre-Process/Integrate
        Pre-Process/Interpolate
        Pre-Process/Remove Artifacts
        Pre-Process/Remove Trend
        Pre-Process/RMS Reduction
        Pre-Process/Scale X or Y
        Pre-Process/Segment
        Pre-Process/Smooth
        Pre-Process/Subtract Mean
    Tools/Combine Graphs
    Tools/Copy Data Set
    Tools/Fast Fourier Transform
    Tools/Inverse Fourier Transform
    Tools/Filter Data Set
    Tools/Generate Synthetic Data
    Tools/Reconstruct Search Results

Products from any of the synthesis operations below are a Graph Window plotting the data set, an optional data file of X,Y pairs in ASCII or binary format (use File/Save As to create the data file), and a reference text file DIAG.TXT containing summary information on every Tools and Analyze operation during a session: it's useful to review who did what and with which and to whom. Reasons for synthesizing a data set might include creating a noise-free test bed for modeling, "loading" a data set with noise to test extraction and filtering responses, or testing assumptions about the behavior of real data sets before they actually become available. In Spectre, the pre-processed or filtered data set is a modified copy of the original data set; the original is always left unaltered.

Tools/Pre-Process Data Set

A useful Swiss Army Knife of data set tools. On selection, the following pre-processing capabilities are listed in a sub-menu:

    Clip Outside Rubber-Band
    Differentiate
    Discretize
    Integrate
    Interpolate
    Offset Data X or Y
    Remove Trend
    Remove Artifacts
    RMS Reduction
    Scale X or Y Data
    Segment
    Smooth
    Subtract Mean

Pre-Process/Clip Outside Rubberband

Drag the mouse (holding down left button) over a data set Graph Window to select left and right, upper and lower clipping regions.
On selecting this option, points outside the upper and lower rubber-band lines are set to the values under the lines. A fast way to trim spiky data sets; or subtract the clipped data set from the original to retain only the spikes.

Pre-Process/Differentiate

The active data set is differentiated point-wise. Each data set point is used to calculate a new point

    Yd = (Yn2 - Yn1) / (Xn2 - Xn1)

where n2 represents a point in the X,Y data set and n1 is the point preceding (just to the left of) n2. A new data set is formed from the Yd values: the derivative of the original data set. In calculus, the points are infinitesimally close together, giving the derivative a geometrical interpretation as the "slope" of the curve at a point. In your data set the points are a finite distance apart, and the slope is the slope of the straight line connecting two adjacent points. Depending on the smoothness of your data set, the derivative can be very different according to whether the point n1 is to the right or left of point n2.

Pre-Process/Discretize

Change the number of discrete "levels" in a data set. For example, if your Y data consists of samples in the range 1 to 8, and samples are integers, you can change the range and number of levels to 1 to 256. If you have Y data in the range 0 to 65,535 you can change it so there are only 7 levels in the data, but those seven levels are uniformly distributed through the 0 to 65,535 range.

Pre-Process/Integrate

The active data set is integrated point-wise. Each pair of data set points is used to calculate a new value for the integral from the integral's preceding value and a tall, narrow trapezoid under the "upper limit":

    Yd2 = Yd1 + Yn1 * (Xn2 - Xn1) + (Yn2 - Yn1) * (Xn2 - Xn1) / 2
        = Yd1 + Yn1 * (Xn2 - Xn1) / 2 + Yn2 * (Xn2 - Xn1) / 2

where n2 represents a point in the data set and n1 is the point just to the left of n2. Geometrically, the integral is the summed area under a curve between two points.
In calculus, the points are infinitesimally close together, so that connecting points on the curve with straight-line segments becomes more reasonable. More accurate methods exist, of course, but for both very smooth and very spiky data this "trapezoidal rule" works reasonably well. Yd1 is initially zero when the integral is commenced. A new data set is formed from the summed Yd values; the process repeats until the end of the data set. Notice that the integral "builds" on the value calculated from preceding points. It relies on summing and multiplying, and the derivative uses only subtraction and division.

Pre-Process/Interpolate

Creates a new data set composed of all or part of an existing one. You can have more, fewer, or the same number of points as in the original. You choose the lower limit and upper limit X values to include in the interpolation from a dialog where the selected data set X limits and existing number of points are already in the edit control. Interpolation is a nice way to align two data sets so they have the same minimum and maximum X values, and the same number of points. Aligning two data sets provides a valid method for choosing a pair of data sets to be the abscissa and the ordinate of a new graph window for visualizing the inter-relationship between them (see Select Data Set as Abscissa and Select Data Set as Ordinate). If the sets are not aligned, part of the longer data set is not included in the visualization, and the X values may not even lie in the same ranges. Your responsibility to use caution here is emphasized. A future update to Spectre will automate the alignment process to a couple of keystrokes or clicks, but for now you have to do some thinking in advance.

Note that selecting the entire data set for interpolation is not exactly the same as making a copy unless you use exactly the same number of points: it is a linear interpolation of those values, and will be very close, but to get an exact copy use Tools/Copy Data Set.
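As an illustration of the linear interpolation described above, the sketch below resamples a data set onto evenly spaced X values between chosen lower and upper limits. The function name `interpolate` and its argument layout are hypothetical; it assumes X values in strictly increasing order and refuses limits outside the data, since that would be extrapolation.

```python
def interpolate(xs, ys, x_lo, x_hi, n):
    """Linearly resample (xs, ys) onto n evenly spaced points in [x_lo, x_hi].

    xs must be strictly increasing. Returns a list of (x, y) pairs.
    """
    if n < 2:
        raise ValueError("need at least two output points")
    if x_lo < xs[0] or x_hi > xs[-1] or x_lo >= x_hi:
        raise ValueError("limits must lie inside the data set X range")
    out = []
    j = 0
    for k in range(n):
        x = x_lo + (x_hi - x_lo) * k / (n - 1)
        # Advance to the segment [xs[j], xs[j+1]] that contains x.
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        frac = (x - xs[j]) / (xs[j + 1] - xs[j])
        out.append((x, ys[j] + (ys[j + 1] - ys[j]) * frac))
    return out
```

To align two data sets, both can be resampled with the same `x_lo`, `x_hi` and `n`, chosen inside the X range they share.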
The X lower and upper limit values you choose must of course be part of the selected data set, or you're trying to do extrapolation (prediction) and not interpolation. Extrapolation is a new Spectre option.

Pre-Process/Offset X or Y

The active data set is shifted horizontally or vertically (or both at once) to a new origin. You enter the amounts (+ or -) by which to translate the data set.

Pre-Process/Remove Artifacts

Artifacts, or "outliers", are sudden positive- or negative-going spikes in the data. You choose a relative level, in percent, above (or below) which artifacts will be "clipped" from the data set: artifacts extending more than, say, 200% above or below data values on either side are replaced by the average of the side values.

Pre-Process/Remove Trend

The "tilt", or average slope, of a data set is calculated and used to subtract the "least-squares" best-fit straight line from the data set Y values. This will not necessarily result in a zero mean. The existence of linear trends is often an indicator of non-stationarity, which means that statistical parameters such as the mean and standard deviation are not constant. Non-stationarity can imply a meaningful "transition" between two states, or it can mean an uncompensated "drift" in electronics or data collection equipment. You should attempt to understand the source of a trend in your data: an uncompensated drift may eventually bump all your data into a dynamic range limit, or a "transition" may have been caused by the system being measured non-linearly crossing into a new behavioral regime. In Spectre, the slope and Y-intercept of the line are calculated from a "least-squares" fit to the data:

    Slope = ({X Y} - {X} {Y}) / ({X X} - {X} {X})
    Intercept = {Y} - Slope * {X}

where {} is an average over the length of the data set.

Pre-Process/RMS Reduction

RMS means root mean square: in a partition whose size you select, sum the squared numbers, divide by one less than the partition size, take the square root.
A partition is a piece of the data set: a 1000-point data set may be partitioned into 200 segments, each containing 5 sequential points. The user is prompted for the number of points in each partition segment, 5 in this example. The data set being partitioned is elevated so that its lowest value is on the X axis: no negative values, as the process of squaring each data point would cause data tending to go negative to suddenly sweep upward. The RMS value of each (for example) 5 point segment is found, and the RMS values are used to make a new data set. The resulting data set is a somewhat "chunkier" distillation of the original. It finds increasing use among researchers attempting to capture large amounts of data over a long recording time. Pre-Process/Scale X or Y Select a value to multiply every point's X value, Y value, or both, in the active data set. Normally used to change the interval between successive X values, or to bring Y values into a range of interest. Useful hint: multiplying Y by -1 will "flip", or turn the data set upside down. For X, only positive values are allowed, so that the data set will remain in order of increasing X values. Pre-Process/Segment On selection, enter the X-value at which you want to cut the data set into two parts, each of which are written to separate, new Graph Windows. A good way to eliminate "anomalous" regions from a data set. Another method, much more flexible but not as precise for larger data sets, is to use the Pre-Process/Clip option, where you can select the segment (in both X and Y) graphically. Pre-Process/Smooth Applies a "boxcar window" moving averager to the selected data set. You choose the number of points over which to average. For a 5-point boxcar, a given point is averaged with the two preceding and the two following points. 
More subtle averagers exist, but most analysts use this technique for a first try at smoothing their data to remove pronounced high-frequency fluctuations, although the window "edges" introduce their own high-frequency components. The FFT of a boxcar (solitary square wave) is sin (X) / X, rich in aliasing harmonics; try using Tools/Generate Synthetic Data to verify this for yourself.

Pre-Process/Subtract Mean

The mean (averaged over the data set) value of Y is subtracted from each data set Y value.

Tools/Combine Graphs

Concatenate, Add, Subtract, Multiply, Divide, Convolve, or Correlate two data sets. Select two data sets for the Combine Graphs operation by opening the Graph System (Control) Menu from the square "dash" button at the upper left corner of each Graph Window and selecting one as Operand 1, the other as Operand 2. Then choose Tools/Combine Graphs from the main menu and select the specific action from a pop-up dialog.

Concatenate joins two data sets together at the end of the first one, with a "bridge" interval between them. No attempt is made to "smooth" the joint. Concatenation permits data sets to be "re-assembled" after having been separated (Segment or Clip) for independent processing. The data sets may have different X-interval values; the "bridge" value between the final element of the first data set and the initial element of the second data set is the X-interval from the second data set. Each data set may be independently scaled by constant multipliers (C1 and C2 as below) during the concatenation.

The four arithmetic operations are of the form

    C1 [Data Set 1] op C2 [Data Set 2]

where C1 and C2 are user-selectable constant multipliers, and "op" is one of +, -, x, or /. The data sets are checked for a region of overlap in their X values, and the operation is performed only on the overlap region. If the data sets share no common X values, no operation is performed (you can Offset or Scale the data to align them).
The number of points in the resultant data set is the larger of the two data sets' point counts within the overlap region. Note Data Set 1 is also called "Operand 1", Data Set 2 is "Operand 2". For division, if the denominator has zero values in the region, the quotient is assigned the Y value of zero. Also, for division or subtraction note that Operand 2 is divided into (or subtracted from) Operand 1.

Tools/Copy Data Set

A one-to-one exact copy of the selected data set is made and displayed as a graph window. It is assigned the data set name PREPxx, and is NOT automatically saved to disk. If a Frequency Search had been performed on the original data set, it will be included in the copy: the same yellow dotted line representing the reconstruction is drawn, using the same search results data.

Tools/Fast Fourier Transform

With an active data set graph window, an FFT will produce a plot of the "spectral energy", or magnitude of the transform, and a plot of the spectral phase. The FFT operates on a "complex" data set: the Y values are assumed to be complex numbers, of the form

    Y = Yre + i Yim

where i is the notorious square root of minus 1. Yre and Yim are the "real" and "imaginary" parts of the complex number Y. Yre and Yim also appear in a Graph Window; they are necessary to perform the Inverse FFT (IFFT) to transform back to the time domain. The important fact here is that for the data sets we normally use, all the Y values are "real", so Spectre loads an "imaginary" array with 0 values for the FFT. If concepts such as "complex", "real", and "imaginary" are foreign to you, learn more about them from a math textbook if you want to progress in real-world research and analysis, no matter what your field.

The N-point data set ordinate values Y = y (t) in the time domain become the real part of a complex array of values (Y + 0i), which when fed to the FFT are "rearranged" into a complex spectrum S (f) = (Sre + i Sim) in the frequency domain.
For y (t), read "y of t" and for S (f) read "S of f". Functions y and S are 'inverse' functions of one another, as the FFT and the inverse FFT map y into S and vice-versa. The spectrum magnitude is sqrt (Sre Sre + Sim Sim) and the spectrum phase is arctan (Sim / Sre). The magnitude and phase spectra are displayed by Spectre. The real and imaginary components are used by the inverse transform, but have not been re-ordered for display.

In addition to the difficulties interpreting the discrete Fourier transform, there are three problems with using an FFT to learn the characteristics of your data. First, the FFT algorithm requires that the data set contains X values which are in strict arithmetic order, e.g. 0, 1, 2, 3, ... or 0.25, 0.50, 0.75, ... This is a limitation of the technique, and irregularly spaced data sets must be somehow "filled in", or interpolated to yield (X,Y) points whose X values are precisely regular. The Frequency Search (FOS) has no such limitation. In fact, sizeable chunks of the data set can even be missing entirely before FOS begins to be affected. Weird, but true, and potentially very useful for analyzing hacked-up data.

A far worse limitation for analysis purposes is that the FFT requires the number of data set points to be a power of 2: ... 256, or 512, or 1024, ..., 65536, etc. Data sets are "zero-padded" up to the next power of 2 greater than or equal to the data set size, which means that the 'right end' of most padded data sets is not valid data. Pre-processing tricks to allow using the entire data set or to minimize spectral leakage involve such techniques as 'padding' or 'shaping' or 'windowing' the data set (Spectre pads the data to the next higher power of 2). All these tricks suffer from modifying the data under analysis. Elaborate arguments exist for justifying these techniques, and in the hands of very skilled analysts their effects on the data can be minimized, but the Frequency Search used by Spectre obviates the necessity for them.
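The padding step and the magnitude and phase calculations described above can be sketched with a small radix-2 FFT written against the Python standard library. This is an illustration only, not Spectre's code: the function names are hypothetical, and the phase is computed with the conventional two-argument arctangent of the imaginary part over the real part.

```python
import cmath
import math

def next_pow2(n):
    """Smallest power of 2 that is greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrum(y):
    """Zero-pad real data to a power of 2; return (magnitude, phase) lists."""
    n = next_pow2(len(y))
    # Real data: the imaginary array is loaded with zeros, as is the padding.
    padded = [complex(v, 0.0) for v in y] + [0j] * (n - len(y))
    s = fft(padded)
    magnitude = [abs(c) for c in s]                  # sqrt(Sre^2 + Sim^2)
    phase = [math.atan2(c.imag, c.real) for c in s]  # arctan(Sim / Sre)
    return magnitude, phase
```

A 1000-point data set is padded to 1024 points here, so the last 24 values fed to the transform are zeros rather than data.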
When a padded data set is FFT'ed and then inverted back (IFFT) to the time domain, you can see the zero-padded region at the end of the data. It's a good idea to feed the FFT a data set of length close to a power of two (but not larger, or it will be padded out to the next power of two).

A third limitation is that the frequencies used for the FFT must be in an arithmetic series, each a multiple of the fundamental frequency, 1 / Tmax, where Tmax is the difference between the first and last abscissa points in the data set: Tmax = TN - T0. Try taking the FFT of one of the sharp-peaked periodograms or spectrograms produced by Frequency Search. On inverting the FFT back to the time domain (the IFFT), you'll see that the very sharp spikes cannot be recovered: they have become broadened. You'd need a very, very long data set to even approximate the spikes in the IFFT.

The quid pro quo, of course, is that to use Spectre's Frequency Search, the researcher must pre-load the period table with periods known or suspected to be in the data set (but try the AutoGen option which lets Spectre independently determine which periods to look for). The usual approach is to use a large table initially, which contains periods from large to small. As Frequency Search runs are made, the periods in the .TBL file are changed until they reflect reality, but it is always a good idea to leave in a few periods over the entire possible range for your data set. As the Fast Orthogonal Search algorithm is exceptionally good at resolving two nearly-identical periods, consider clustering a number of periods about regions of interest. In this regard, the FFT may serve to determine the spectral width of those regions.

Note that for a "real" data set the FFT spectral magnitude is mathematically symmetric (and the spectral phase is anti-symmetric) about zero frequency. An asymmetric spectral magnitude indicates a non-zero imaginary component to the data.
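The symmetry property can be checked numerically. The sketch below is an illustration under stated assumptions, not Spectre's implementation: it uses a direct discrete Fourier transform, a hypothetical helper magnitude_asymmetry, and the usual convention that bin k holds frequency +k while bin n - k holds frequency -k.

```python
import cmath
import math


def dft(y):
    """Direct discrete Fourier transform of a (possibly complex) sequence."""
    n = len(y)
    return [
        sum(y[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_asymmetry(y):
    """Largest difference between |S(+f)| and |S(-f)| over all frequencies."""
    s = dft(y)
    n = len(s)
    return max(abs(abs(s[k]) - abs(s[n - k])) for k in range(1, n))


n = 16
real_part = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
imag_part = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

# A purely real data set: the imaginary array is loaded with zeros.
real_only = [complex(r, 0.0) for r in real_part]
# Two data sets "woven together" as real and imaginary parts.
woven = [complex(r, i) for r, i in zip(real_part, imag_part)]

asym_real = magnitude_asymmetry(real_only)   # essentially zero
asym_woven = magnitude_asymmetry(woven)      # clearly non-zero
```

For the real-only series the magnitudes at +f and -f agree to rounding error, while the woven series puts all of its energy on one side of zero frequency.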
You can see this for yourself, and maybe get some ideas how to "package" your data sets, if you try the Graph Window options Select Data Set as Real Part and Select Data Set as Imaginary Part, available on the System Menus, for two data sets that seem to have a lot in common, though they may not be from identical sources. In this case, the phase plot may be interesting as the two data sets are "woven together" in the FFT. Compare it with the phase plots from each data set FFT'ed separately.

Interpreting the Fourier Transform

The FFT produces phase and magnitude plots which are readily interpreted visually. Plots of the real and imaginary parts are displayed, but they are primarily for use in performing the Inverse Fourier Transform.

Two things emerge immediately when you first examine the FFT graphs for most data sets: first, there is usually a lot of action near the origin (low frequencies) and progressively less energy in the higher frequencies. Second, the Real Part plot is nearly exactly symmetric about the Y axis, and the Imaginary Part and Phase plots are anti-symmetric. As the Magnitude plot is also symmetric, only the positive components are shown to allow more detail on the screen.

As an experiment, use the Tools/Generate Synthetic Data option to create a time series with mixed sine waves. Give the series a name, choose 512 points, an interval of 1, and ask for 2 sine waves. Assign an amplitude of 80, a period of 4, and 0 phase offset to the first sine. Assign amplitude 50, period 32, phase 0 to the other sine. Apply the FFT using Tools/Fast Fourier Transform.

Look at the FFT magnitude plot, maximizing the graph window by clicking on the "up arrow" in the upper right corner. Because there may be more points in the FFT than pixels in the Graph Window, you may not see all four of the spikes. Re-size the graph by dragging a corner of the Graph Window until all four spikes are visible.
Their x-values should be plus and minus 128 for the first sine, and plus and minus 16 for the second sine. Notice that the energy in the second sine is less than in the first. Note that 128 is 512 / 4, and 16 is 512 / 32: this reciprocal relationship using the number of points in the series suggests an easy way to mentally move between the time and frequency domains.

Be aware that for time series of length not a power of two, some 0 points on the "right end" of the series will be added ("zero-padding"). That's why this experiment used 512 points. If you apply the option Tools/Inverse Fast Fourier Transform to a transformed time series you get back the original series, with zeroes for the portion "padded" to the next power of two.

Re-run this test, using a period of 7 for one of the sines. Note that the FFT magnitude plot now shows a pair of spikes at about frequencies plus and minus 73, but the spikes have a broadened base. The period 7 is "incommensurate" with the frequencies used in the discrete Fourier transform, which for this series are all whole-number multiples of 1 / 512: 512 / 7 is about 73.14, which is not a whole number. The transform redistributes the energy in the period 7 sine wave among a number of frequencies in the vicinity of 73: a period of 8 produces a pure spike in the FFT magnitude, but a period of 7 produces a spread ("leaky") spectrum.

Try the experiments with various amounts of noise, or other waveforms. Interpretation soon becomes more difficult, maybe very difficult. That's why Spectre was written: to go beyond the normal FFT and allow a more exact, readily interpreted, and noise-tolerant frequency analysis.

Tools/Inverse Fourier Transform

After Tools/Fast Fourier Transform has been applied to a data set, the Inverse Fourier Transform will return a pair of new graph windows. The real part graph is essentially the original data set Y values, usually to better than 0.1% accuracy.
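The two-sine experiment and the forward/inverse round trip can be reproduced outside Spectre with a short Python sketch. It is an illustration only: the names are invented here, a direct discrete Fourier transform is used in place of an FFT, and spike positions are reported as bin numbers (cycles per 512 points), which is assumed to match the x-values quoted above.

```python
import cmath
import math

N = 512  # number of points, with a sample interval of 1

# Precomputed complex exponentials keep the direct transform reasonably quick.
TWIDDLE = [cmath.exp(-2j * math.pi * m / N) for m in range(N)]


def two_sines(period_a, period_b):
    """Amplitude 80 at period_a plus amplitude 50 at period_b, zero phase."""
    return [
        80.0 * math.sin(2 * math.pi * t / period_a)
        + 50.0 * math.sin(2 * math.pi * t / period_b)
        for t in range(N)
    ]


def dft(y, inverse=False):
    """Direct discrete Fourier transform (or its inverse) of N points."""
    out = []
    for k in range(N):
        if inverse:
            acc = sum(y[t] * TWIDDLE[(k * t) % N].conjugate() for t in range(N))
            out.append(acc / N)
        else:
            out.append(sum(y[t] * TWIDDLE[(k * t) % N] for t in range(N)))
    return out


# Periods 4 and 32 fit the 512-point series exactly: two clean spikes.
series = two_sines(4, 32)
spectrum = dft(series)
mags = [abs(s) for s in spectrum]
peaks = [k for k in range(N // 2) if mags[k] > 0.5 * max(mags)]  # bins 16 and 128

# Period 7 does not fit: the spike near 512 / 7 leaks into its neighbours.
leaky = [abs(s) for s in dft(two_sines(7, 32))]
peak_bin = max(range(N // 2), key=lambda k: leaky[k])
neighbour_ratio = leaky[peak_bin + 1] / leaky[peak_bin]

# Round trip: the inverse transform returns the original series.
recovered = dft(spectrum, inverse=True)
max_real_error = max(abs(r.real - y) for r, y in zip(recovered, series))
max_imag = max(abs(r.imag) for r in recovered)
```

In this sketch the recovered imaginary part is zero to within rounding error, and the bins next to the period 7 spike carry a noticeable fraction of its magnitude, while the bins next to the period 4 spike carry essentially none.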
The imaginary graph wiggles around, but examination shows the values are on the order of less than a millionth of the Y values, or essentially 0. Once you verify this, the imaginary graph can be closed.

What has this round-trip process accomplished? The information of value to most researchers is in the spectrum magnitude, where the spectrum shape often indicates how energy migrates through a physical system, and the locations of peaks indicate the existence of periodic behavior in the time series. The inverse transform's congruence with the original data set verifies that the forward transform was valid.

The Graph Window System Menu options Select Data Set as Real Part and Select Data Set as Imaginary Part allow you to associate one data set with the real part, and another data set with the imaginary part, of a complex data set. If the two data sets are not identical, the FFT spectral magnitude plot will be more or less asymmetric about zero frequency. The inverse transform applied to this complex FFT will yield back the original data sets as the IFFT real and imaginary parts. The IFFT magnitude and phase are also plotted in new graph windows.

Tools/Filter Data Set

Spectre can apply an Ideal Low Pass, High Pass, Band Pass, or Band Stop (Notch) filter to a data set. Enter at the dialog the Filter Type, the Upper and/or Lower Cutoff Frequency as appropriate, and the Number of Taps, or points in the filter transfer function; historically, "taps" refers to the pick-off points along analog delay lines. Recall that Frequency = 1 / Period.

After applying the filter, new graphs display the filtered data set and the filter transfer function used. The Ideal filters use a Sinc function (= sin (x) / x) convolved with the data set in the time domain, equivalent to a rectangular window cutting out a region from the signal spectrum. These filters tend to introduce large phase shifts (delays) in the signal when a large number of taps is used.
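As a rough illustration of the Ideal Low Pass case, the sketch below builds a truncated Sinc filter and convolves it with a data set in the time domain. It is a minimal sketch under stated assumptions, not Spectre's filter: the function names, the normalisation of the taps, and the choice to keep only fully overlapped output points are all assumptions made for this example.

```python
import math


def ideal_low_pass_taps(cutoff, num_taps, interval=1.0):
    """Truncated-sinc taps for a low-pass filter.

    cutoff   -- cutoff frequency in cycles per X unit (frequency = 1 / period)
    num_taps -- odd number of points in the filter
    interval -- spacing between adjacent X values
    """
    fc = cutoff * interval            # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for i in range(num_taps):
        m = i - mid
        if m == 0:
            taps.append(2.0 * fc)     # limit of sin(2 pi fc m) / (pi m) at m = 0
        else:
            taps.append(math.sin(2.0 * math.pi * fc * m) / (math.pi * m))
    total = sum(taps)
    return [t / total for t in taps]  # normalise so a constant passes unchanged


def apply_filter(y, taps):
    """Convolve y with the taps, keeping only fully overlapped output points."""
    n_taps = len(taps)
    return [
        sum(taps[j] * y[i + n_taps - 1 - j] for j in range(n_taps))
        for i in range(len(y) - n_taps + 1)
    ]


# A slow sine (period 64) plus a fast sine (period 4), filtered at period 16.
n = 512
slow = [math.sin(2.0 * math.pi * t / 64.0) for t in range(n)]
fast = [math.sin(2.0 * math.pi * t / 4.0) for t in range(n)]
taps = ideal_low_pass_taps(cutoff=1.0 / 16.0, num_taps=101)
filtered = apply_filter([s + f for s, f in zip(slow, fast)], taps)

# Output point i lines up with input point i + delay: the delay grows with taps.
delay = (len(taps) - 1) // 2
worst = max(abs(filtered[i] - slow[i + delay]) for i in range(len(filtered)))
```

The delay of (number of taps - 1) / 2 points is one concrete form of the phase shift mentioned above: doubling the number of taps doubles the delay.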
Butterworth, Chebyshev, Bessel, and Cauer (elliptic) filter types will soon be included in Spectre, as will the option to do time-domain (convolutional) or spectral filtering.

Tools/Generate Synthetic Data

Spectre can generate a data set, or "time series", from superposed sine waves, square waves, sawtooth waves, and 'squirt' waves. You can add a constant (so-called 'DC', for Direct Current) offset, a linear trend (a sloping, linear rise or fall along the entire data set) and uniformly distributed noise.

These are the Generate Synthetic Data dialog entries:

Data set title: Free format, up to 12 characters.
# of Data Points (2-200000): Use 1025 if you want 1024 intervals.
Initial X value (abscissa, + or -): Usually 0, but any X offset is accepted.
Interval Between Points (> 0): The difference between adjacent abscissa X values. The inverse of "sample rate".
# of Sines to Generate (0-99): Specify amplitude, period, phase for each.
# of Square Waves (0-99): Specify amplitude, period, phase for each.
# of Sawtooth Waves (0-99): Specify amplitude, period, phase and relative positioning of peak.
# of Squirt Waves (0-99): Specify for "pump" wave: amplitude, period, intake stroke relative duration.
Noise Amplitude: Uniformly distributed.
Specify Constant Offset (+ or -): Also called DC offset.
Specify linear trend (+ or -): Choose value to use as right endpoint, the amount by which to "tilt" data set.
Subtract Mean: After generating data set, remove the mean.
Shuffle Data Set: Randomly mix the X,Y pairs: 13th point might become 211th. Used for statistical tests.

All amplitudes (for waves, noise, Y offsets) are assumed to be in the same units, your choice. Ditto for periods. Remember that although the terms 'time' and 'time series' are used, the X variable in a (X,Y) data set can be any quantity on which the Y values, in some sense, depend. However, be very careful over your choice of units (degrees, radians; or seconds, minutes, heartbeats)... it matters greatly in your analysis.
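For readers who want to experiment outside Spectre, the sine, offset, trend, and noise entries can be imitated with a small Python function. This is a sketch under assumptions, not a description of Spectre's generator: the function name and argument layout are invented, phase is taken to be in radians and measured from the initial X value, and the trend is assumed to grow linearly from 0 at the left endpoint to the given value at the right endpoint.

```python
import math
import random


def generate_synthetic(n_points, x0, interval, sines,
                       noise_amplitude=0.0, offset=0.0, trend=0.0,
                       subtract_mean=False, seed=None):
    """Build (x, y) lists from superposed sine waves.

    sines -- list of (amplitude, period, phase_radians) tuples
    trend -- value added at the right endpoint; the tilt grows linearly from 0
    """
    rng = random.Random(seed)
    xs = [x0 + i * interval for i in range(n_points)]
    span = xs[-1] - xs[0]
    ys = []
    for x in xs:
        y = offset
        for amplitude, period, phase in sines:
            y += amplitude * math.sin(2.0 * math.pi * (x - x0) / period + phase)
        if span > 0:
            y += trend * (x - x0) / span
        # Uniformly distributed noise between -noise_amplitude and +noise_amplitude.
        y += rng.uniform(-noise_amplitude, noise_amplitude)
        ys.append(y)
    if subtract_mean:
        mean = sum(ys) / len(ys)
        ys = [y - mean for y in ys]
    return xs, ys


# 1025 points give 1024 intervals, as noted in the dialog entry above.
xs, ys = generate_synthetic(1025, 0.0, 1.0, [(80.0, 4.0, 0.0), (50.0, 32.0, 0.0)])
```

Keeping every amplitude in the same units, and every period in the same units as the interval, is what makes a series like this comparable with real data.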
Strive for internal consistency during a Spectre session, and throughout your work. Try "walking through" a data processing session on paper, using printouts of computer input and output files for raw material: check that numerical answers are appropriate to your assumptions. Spectre can be of great assistance here, as you can design your own data set in the Tools/Generate Synthetic Data dialogs. Many aspects of your data can be simulated and run through a Frequency Search or Fast Fourier Transform. You can learn a lot this way about nuts and bolts stuff: units, intervals, amplitudes, bandwidths, sampling rates, etc.

Tools/Reconstruct Search Results

You can use the period, amplitude and phase offset of each identified sine wave from a Frequency Search to make a new, 'synthetic' data set, virtually automatically.

You can also use Reconstruct Search Results to run an Extrapolation beyond the end of the original data set. Simply select a number for the percentage of the original data set's length, for example 10: a data set of 1000 points, with 10% extrapolation, will yield a reconstruction of 1100 points, the last 100 points extending into the "future", beyond the end of the data. To the degree that the existing cycles maintain the same amplitude and phase relationships, the extrapolation will be valid. This is probably the most popular option in Spectre.

A long-term trend is represented in Spectre not as a straight line, but as the beginning ramp-up of a long period sine wave (sin (x) is approximately x for x much less than 1 radian, or about 57 degrees). "Long period" here means longer than the data set length. If the data set spanned a year, a long period might be two years, or ten. There is no way to know where the trend will (in actuality) eventually peak and turn down, so use with caution. The advantage in using Spectre for long-term trends is that an eventual cresting and down-turn is guaranteed, as the trend is based on a sinusoid and not a straight line. No trend lasts forever.
The potential error in the computed location in time of a long-term trend peak may be large, and is bound to be as sensitive as a linear trend to small variations in trend value: a small difference in slope between shallow trend lines can mean a lot at later times.

Reconstruction may also help assess the effect of noise in real data sets, by creating a data set with the same overall structure, but noise-free.

Make sure the Graph Window on which the Frequency Search was performed is active, by clicking the mouse anywhere over it or by hitting Ctrl-Tab repeatedly until the Graph Window is highlighted. Then select the Tools/Reconstruct Search Results menu option. Select a number for the percentage extrapolation, or use 0 for no extrapolation. A new graph window appears, holding a data set composed only of sine waves whose periods were selected from the Frequency Search.

Extrapolation

As an adjunct to Reconstruct Search Results, you can extrapolate the reconstruction beyond the end of the original data set. Select a number for the percentage of the original data set's length: a data set of 1000 points, with 25% extrapolation, will yield a reconstruction of 1250 points, the last 250 points extending into the "future", beyond the end of the data.

You may want to avoid using very long periods in the Period Table (TBL) used for the Frequency Search, as these long sines are identified with long-term trends, and these trends can extend many times the length of the original data. Of course, there are applications for this type of trend analysis, but leaven your interpretation of the results with some healthy skepticism until you can verify the trend, perhaps using a similar analysis on "historical" data to extrapolate toward the "known" present. Even long-term trends can die off, often accompanied by a region of lessened stability. The trends identified by Spectre are not linear: they are very long-period sinusoids, so they do possess a peak and roll-over.
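The arithmetic of reconstruction and extrapolation can be sketched as follows. This is an illustration only: the function name, the (period, amplitude, phase) tuple layout standing in for search results, the use of radians for phase, and the assumption of evenly spaced X values are all choices made for this example rather than details taken from Spectre.

```python
import math


def reconstruct(xs, components, extrapolate_percent=0.0):
    """Sum sine waves at the original X values, then continue past the end.

    components -- list of (period, amplitude, phase_radians) tuples, a
                  hypothetical stand-in for frequency search results
    """
    n = len(xs)
    interval = xs[1] - xs[0]                      # assumes evenly spaced X values
    extra = int(round(n * extrapolate_percent / 100.0))
    all_x = list(xs) + [xs[-1] + (i + 1) * interval for i in range(extra)]
    all_y = [
        sum(a * math.sin(2.0 * math.pi * x / p + ph) for p, a, ph in components)
        for x in all_x
    ]
    return all_x, all_y


# 1000 points, one short cycle plus one "long period" sine acting as a trend.
xs = [float(i) for i in range(1000)]
components = [(50.0, 3.0, 0.0), (4000.0, 20.0, 0.0)]
all_x, all_y = reconstruct(xs, components, extrapolate_percent=25.0)

# Early in the data the long sine is nearly a straight line: sin(x) ~ x.
x_small = 2.0 * math.pi * 10.0 / 4000.0
linear_error = abs(math.sin(x_small) - x_small)
```

With 25% extrapolation the 1000-point data set yields 1250 reconstructed points, and the period 4000 component crests at X = 1000, which shows how a sinusoidal "trend" eventually rolls over where a straight line would not.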
The trend is merely the linear approximation to a large-amplitude sinusoid of small phase angle (sin (x) is approximately x for x << 1 radian). If the trend is quadratic, Spectre will fit that with a sinusoid, and so on.

If you are conducting experiments in cyclical behavior, use the Analyze/Frequency Search option to identify the relevant periods, and let the extrapolation feature show the effect of extended application of the identified cycles.

Some registered Spectre users are investors who specifically requested this feature after trying Spectre on their data. The feature was added when the demand for extrapolation capability, caveats and all, became clear. Although characteristically close-mouthed about their specific application, these practitioners are largely engaged in index or commodities trading, and some routinely purchase mixed short- and long-term "straddle" positions in out-of-the-money and nearly-expired options contracts. In other words, they are true speculators, the market's most efficient "leveling force". We admire their nerve.

7. Settings Menu

Control is provided over internal algorithm settings for the AutoGen automatic period table generator, for Frequency Search Dialog cutoff values and display modes, and for the Waveform Identification selection criteria. 3-D Plot Window display modes are also controlled from this menu.

AutoGen Parameters Dialog

See AutoGen for background on automatic generation of period (TBL) tables.

Tune AutoGen Period Selection (setting, default value, description):

Spread Periods About Wave Length (%), default 1%: Include the width between two consecutive data set zero-crossings as a candidate period, and include as periods values 1% above and below that period.

Redundancy Filter Factor, default 2: Two periods within a factor of 2 times the data set sample interval are merged into one candidate period.
Frequency Search Parameters Dialog

Short Period Cutoff Multiplier, default 5: Examine data set for candidate periods due to zero-crossings and the extrema between zero-crossings (see References). Periods less than 5 times the data set sample interval are eliminated.

FFT Energy Threshold (%), default 4%: The FFT spectrum is examined for frequencies with substantial power, down to 4% of the peak amplitude.

FFT Base Frequency Multiplier, default 2, and # of Long Periods, default 10: From point zero in the spectrum at 0 Hz (infinite seconds) to point 2 at, for example, 0.004 * 2 = 0.008 Hz (125 seconds), add some periods to fill the range. That is, between 0 Hz and 2 * 0.004 = 0.008 Hz, 10 periods are interpolated.

This approach points out a major flaw in FFT analysis: the resolution is so poor that you seldom know exactly what is being depicted in a spectrum. The spectral magnitude at any frequency includes contributions from wave components other than a pure sinusoid at that frequency.

Waveform Identification Parameters Dialog

Adjust the criteria for extraction of a waveform from a long data set. Also see Waveform Identification. The waveform resembles a "template" waveform (a short sample data set you supply) in a mathematically definable procedure. These are the "tuning parameters" associated with the identification decision.

Waveform Identification Settings (setting, default value, description):

MSE Tolerance, default 20%: The Mean Square Error (MSE) ratio of the normalized, shifted wave reconstruction to the template reconstruction must be less than 20%. Raising MSE Tolerance too high will increase the likelihood of false positive identifications, or finding waveforms which are probably not like the template.

Minimum Energy Ratio, default 50%: The ratio of actual (not reconstructed) wave MSE to template MSE must exceed 50% or the wave energy is considered insignificant even if the wave reconstruction "fits" the template reconstruction by the first criterion.
Raising Minimum Energy Ratio too high will increase the likelihood of false negative identifications, or missing waveforms which are probably valid.

Overlap Previous Partition, default 50%: After analyzing a partition, the next partition overlaps the present partition by 50%. 0% means no overlap, and the maximum is 99% (which gives the algorithm the best shot at finding the correct waveform, but will run about 100 times slower). An overlap of 50% works well in most circumstances, catching on the next pass a waveform which may have been "cut" by the partitioning.

Period/Amplitude Analysis

Period/Amplitude Analysis (PAA) is a walk through a data set, recording zero-crossings, extrema, and approximate slopes to determine the wave content while entirely in the time domain. PAA can do some preliminary spadework for the Waveform Identification algorithm, for example discovering large excursions within prescribed time limits. The "large" and "prescribed" qualifiers are determined by a detailed PAA on the template data set you have prepared for use.

PAA can save lots of analysis time because the Waveform Identification algorithm only looks where PAA suggests instead of methodically combing the entire data set, and PAA is faster than a Fast Fourier Transform. The downside is that PAA usually "doesn't see the forest for the trees" and has poor results recognizing complex waveforms even when it knows all the "parts" are there. PAA is most useful if the template has some particularly distinguishing appearance, like a large negative excursion before a burst of elevated oscillations.
Waveforms which may exist on top of large-scale structure or have considerable variation in internal details are poor candidates for PAA. See References for recent work on Period/Amplitude Analysis.

Display Graph Info

A selection of data set information is displayed for the currently active Graph Window: data set title, full-path filename, number of data set points, X,Y extrema, Y mean and RMS (root mean square) values, first and last X,Y values, interval between the 1st and 2nd X points, and other items. While Display Graph Info is active, click over other Graph Windows to view their information; the info will update for the activated window.

Display Graph Coordinates

When selected, this option displays the data set index and X,Y coordinates corresponding to the cursor position whenever the cursor is over a graph window. Note that the keypad left and right cursor keys can also move the graph cursor one pixel at a time over the graph. Holding down the cursor key accelerates the motion. The Axis names and units, if chosen, are also shown, as well as the actual pixel coordinates under the cursor relative to the graph upper left corner.

When remaining physical memory (RAM) approaches zero, disk activity (swapping memory between disk and RAM) dominates and performance deteriorates. Worth monitoring occasionally. When system resources (Graphics and User) go below about 60%, performance suffers, and ultimately some activities will cease occurring (e.g., window repaints).

10. Graph Window System (Control) Menu

In addition to the Main Menu, each Graph Window contains a so-called System, or Control, Menu accessible by clicking the box in the window upper left corner, next to the window title bar. The normal entries Restore, Move, Size, Maximize, Close, Next Window are joined by some special options incorporated in Spectre to allow the user to associate two data sets with one another.
Notice the operations are paired, involving two Graph Windows:

Plot Graph With Square Corners
Select Data Set as Overlay Source
Select Data Set as Overlay Receiver
Select Data Set as Scroll Controller
Select Data Set as Scroll Slave
Select Data Set as Real Part for FFT
Select Data Set as Imaginary Part for FFT
Select Data Set as Operand 1
Select Data Set as Operand 2
Select Data Set as Abscissa
Select Data Set as Ordinate

Plot Graph With Square Corners

Normally, a Graph Window is plotted by simply connecting points directly with a line, which will usually lie at some angle. This does not accurately represent all data types. Square waves, for instance, should ideally be "squared away". This option continues plotting from the last point graphed until the next point is reached, and then plots vertically to reach the Y value at that point. Particularly useful for digital logic plots, or for bar graph data.

Select Data Set as Overlay Source
Select Data Set as Overlay Receiver

The overlay source Graph Window is drawn as a white dotted line over the overlay receiver Graph Window. Any two graphs may be associated in this manner. Note carefully that no attempt is made to guarantee that a Graph Window data set and its overlay actually correspond: they may have completely different X-values (e.g. 1 to 100 and 10 to 2000). Overlays are strictly for visual comparison.

Select Data Set as Scroll Controller
Select Data Set as Scroll Slave

To aid navigating through a large data set in a Scrolling Graph Window, another (non-scrolling) graph may be selected to be the Scroll Controller. Double-clicking anywhere on the Controller graph will scroll the Slave data set to the corresponding relative location. A circular cursor depicts the location on the Controller where the last Controller/Slave linkage was established.
Although the Controller/Slave capability will work using any non-scrolling Graph Window to guide the loading of any scrolling large data set, this feature is specifically designed to be used with the Analyze/Partition Large Data Set/RMS Reduction option. RMS Reduction produces from a large data file a new Graph Window in which each point is the root mean square (RMS) value of the points in one scrolled partition: if the partition size is 2000 points, partitioning a 1,000,000 point large data file will produce a graph of 500 points, each of which is the RMS value of the corresponding 2000-point section from the large data file.

As an example, assume as above that the Controller graph has 500 points and the Slave data set has 1,000,000 points. Double-clicking on the 300th point of the Controller will load that portion of the large Slave data set which is at 60% of the maximum X value in the file, or points 600,001 to 602,000. Similarly, double-clicking on the first point on the Controller graph will load a section of the large Slave data set from its first point, up to the size of its Scrolling Graph Window, or points 1 through 2000. Double-clicking on the last point in the Controller graph will load the last section of data from the large data set into its Scrolling Graph Window, or points 998,001 to 1,000,000.

It is now possible to go to any portion of the large data set by double-clicking on its Controller Graph. If the Controller is the RMS Reduction of the large data file, you have a "road map" of the large data file (too large to be viewed on-screen all at once) and can instantly go to any interesting location. At a place on the Controller graph where the "RMS energy" suddenly experiences a sharp drop, you can instantly view the section of actual data that contributed that sharp drop.
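The RMS Reduction idea, one RMS value per partition, can be sketched in Python as shown below. This is a minimal illustration under assumptions, not Spectre's routine: the function name is invented, partitions are taken to be consecutive and non-overlapping, and a much shorter series than 1,000,000 points is used to keep the example quick.

```python
import math


def rms_reduction(ys, partition_size):
    """Return one RMS value per consecutive block of `partition_size` points."""
    reduced = []
    for start in range(0, len(ys), partition_size):
        block = ys[start:start + partition_size]
        reduced.append(math.sqrt(sum(v * v for v in block) / len(block)))
    return reduced


# 20,000 points: a period 100 sine whose amplitude falls from 5 to 1 at point 12,000.
ys = [
    (5.0 if t < 12000 else 1.0) * math.sin(2.0 * math.pi * t / 100.0)
    for t in range(20000)
]
road_map = rms_reduction(ys, 2000)   # 10 points, one per 2000-point partition

# The first partition whose RMS is less than half of its predecessor's.
drop_at = next(i for i in range(1, len(road_map))
               if road_map[i] < 0.5 * road_map[i - 1])
```

Here `drop_at` identifies the partition where the "RMS energy" falls sharply, which is the kind of location one would double-click on the Controller graph.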
Now scrolling slowly back through the large data set from that section, it may be possible to see a "precursor" waveform or behavior which can be used to "predict" when that kind of sharp drop might be expected to re-occur.

The RMS Reduction operation for a 1,000,000 point data file requires less than one minute on a fast Pentium, and a Controller/Slave pair quickly provides a very useful "gross behavior" view of the large data file while simultaneously viewing "fine-scale" details of interesting regions.

Select Data Set as Real Part for FFT
Select Data Set as Imaginary Part for FFT

Use this option to make a "complex" data set (one having both real and imaginary parts) from two real data sets for use in running the Tools/Fast Fourier Transform. The FFT is run automatically as soon as a real, imaginary pair of data sets has been selected. Click the System Menu in the upper left corner of each data set Graph Window, then click again on "Select Data Set As Real Part" or "Select Data Set As Imaginary Part".

Select Data Set as Operand 1
Select Data Set as Operand 2

Before selecting Tools/Combine Graphs, select two data sets on which to perform these operations: Concatenation, Correlation, Convolution, and the point-wise operations Addition, Subtraction, Multiplication, Division.

These options from the Graph Window System Menu (button at upper left corner of a graph window) are used to perform a Tools/Combine Graphs operation on data sets: concatenation (join), point-wise addition, subtraction, multiplication, and division, convolution, cross-correlation. The data sets are interpolated and aligned to guarantee that the graph 1 and graph 2 points are at the same X (abscissa) values. As an example, Operand 1 / Operand 2 will divide each point in graph 1 by the corresponding point in graph 2.
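A point-wise division of two data sets with different X values might be sketched like this. It is an illustration under stated assumptions rather than Spectre's method: linear interpolation is assumed, the common grid is assumed to be evenly spaced over the overlap region with as many points as the denser operand has there, and a zero denominator is given a quotient of zero, as the manual describes for division. The function names are hypothetical.

```python
def interpolate(xs, ys, x):
    """Linear interpolation of (xs, ys) at x; xs must be increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            frac = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + frac * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the data set")


def divide_operands(x1, y1, x2, y2):
    """Operand 1 / Operand 2 on a common grid over the overlap region."""
    lo = max(x1[0], x2[0])
    hi = min(x1[-1], x2[-1])
    if lo > hi:
        raise ValueError("the data sets do not overlap")
    # Use as many points as the denser operand has inside the overlap.
    n1 = sum(1 for x in x1 if lo <= x <= hi)
    n2 = sum(1 for x in x2 if lo <= x <= hi)
    n = max(n1, n2, 2)
    # min() guards against rounding pushing the last grid point past hi.
    grid = [min(hi, lo + (hi - lo) * i / (n - 1)) for i in range(n)]
    quotient = []
    for x in grid:
        num = interpolate(x1, y1, x)
        den = interpolate(x2, y2, x)
        quotient.append(0.0 if den == 0.0 else num / den)
    return grid, quotient


# Operand 1 spans X = 0..4 in steps of 1; Operand 2 spans X = 1..5 in steps of 0.5.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
y1 = [0.0, 2.0, 4.0, 6.0, 8.0]
x2 = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y2 = [2.0, 2.0, 2.0, 0.0, 2.0, 2.0, 2.0, 2.0, 2.0]
grid, quotient = divide_operands(x1, y1, x2, y2)
```

The result covers only the overlap from X = 1 to X = 4, and the point at X = 2.5, where Operand 2 is zero, receives a quotient of zero.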
Select Data Set as Abscissa
Select Data Set as Ordinate

Sometimes you may have one data set in two different files: one file contains the abscissa values (time) while the other has Y-only (ordinate) values. Unite them with this feature.

This feature also enables you to associate two data sets, using one as the abscissa, or independent variable, and the other as the ordinate, or dependent variable. Click the Graph Window System Menu in the upper left corner of each data set and select Select Data Set As Abscissa or Select Data Set As Ordinate. As soon as a pair has been selected, a new window will be opened showing the relationship. In general, a positive (upward) sloping elliptical blob of green line segments implies a positive correlation between the two data sets. A downward-sloping trend indicates anti-correlated sets. A more or less uniform disk indicates a lack of correlation.

See Association and Visualization for more information, and see an interesting application in Lissajous Figures, with which anyone who has ever used an oscilloscope will be familiar.
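The visual reading of an abscissa/ordinate plot has a simple numerical counterpart in the Pearson correlation coefficient. The sketch below is offered only as a companion to the plot, with invented names; nothing here is taken from Spectre itself, and the manual does not say that Spectre computes this statistic.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Four full periods of a period 50 sine, used as the abscissa data set.
base = [math.sin(2.0 * math.pi * t / 50.0) for t in range(200)]

scaled = [2.0 * v + 3.0 for v in base]          # upward-sloping line
flipped = [-v for v in base]                    # downward-sloping line
quarter = [math.cos(2.0 * math.pi * t / 50.0) for t in range(200)]  # circle

r_up = pearson_r(base, scaled)
r_down = pearson_r(base, flipped)
r_none = pearson_r(base, quarter)
```

Plotting `quarter` against `base` traces a circle, the simplest Lissajous figure, and its correlation coefficient is essentially zero even though the two series are closely related, so the plot can reveal structure that the single number hides.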