Did the long-timescale character of the GCP data change
after 9/11?
Vertical bar marks Sept. 11, 2001.
The plot shows
the cumulative deviation of the daily values of the network variance [netvar]
from Oct 1, 1998 to Sept. 8, 2004. The netvar for each day is expressed as a
z-score. The parabola is the 5% probability envelope for the cumdev. There are
2166 plotted daily values. [There are no GCP data for the period Aug 5-8, 2002.]
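The cumdev and its envelope can be sketched as follows. This is a minimal illustration, not the GCP code: it assumes the cumdev is the running sum of the daily z-scores and that the 5% envelope is the one-sided curve 1.645 * sqrt(t), which is a parabola on its side. The input data are synthetic stand-ins.

```python
import math
import random

def cumulative_deviation(z_scores):
    """Running sum of daily z-scores (the 'cumdev')."""
    total = 0.0
    out = []
    for z in z_scores:
        total += z
        out.append(total)
    return out

def envelope(n_days, z_crit=1.645):
    """Upper envelope z_crit * sqrt(t) for t = 1..n_days.
    z_crit = 1.645 assumes a one-sided 5% level (an assumption)."""
    return [z_crit * math.sqrt(t) for t in range(1, n_days + 1)]

random.seed(0)
# Synthetic stand-in for the 2166 daily netvar z-scores (not GCP data).
daily_z = [random.gauss(0.0, 1.0) for _ in range(2166)]
cumdev = cumulative_deviation(daily_z)
upper = envelope(len(daily_z))
```

Under the null hypothesis the cumdev is a random walk, so its standard deviation after t days is sqrt(t); this is why the envelope widens as a square root rather than linearly.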
The random deviations seem qualitatively greater after 9/11.
Does
this suggest a non-random trend in the post 9/11 data?
This is not confirmed
by mean and variance tests on the pre- and post- 9/11 data.
A mean test for zero difference between the pre and post 9/11
data subsets gives a pval = 0.11.
A variance ratio test
for identical variance of the pre-/post- datasets yields a
pval = 0.38.
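The two tests above can be sketched as follows. The exact tests used are not stated, so this is an assumption: a large-sample z-test for the mean difference, and a normal approximation to the log variance ratio standing in for an F-test. Both are reasonable only for samples of many hundreds of days. The data and the split point are synthetic.

```python
import math
import random
from statistics import NormalDist, mean, variance

def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def mean_difference_test(pre, post):
    """Large-sample z-test for zero difference between the two means."""
    se = math.sqrt(variance(pre) / len(pre) + variance(post) / len(post))
    z = (mean(post) - mean(pre)) / se
    return z, two_sided_p(z)

def variance_ratio_test(pre, post):
    """Normal approximation to log(var_post / var_pre).
    For normal data Var(log s^2) is about 2 / (n - 1); this stands in
    for an F-test and is not necessarily the test used in the text."""
    log_ratio = math.log(variance(post) / variance(pre))
    se = math.sqrt(2.0 / (len(pre) - 1) + 2.0 / (len(post) - 1))
    z = log_ratio / se
    return z, two_sided_p(z)

random.seed(3)
# Synthetic daily z-scores with an illustrative split, not the GCP data.
pre = [random.gauss(0.0, 1.0) for _ in range(1000)]
post = [random.gauss(0.0, 1.0) for _ in range(1000)]
z_mean, p_mean = mean_difference_test(pre, post)
z_var, p_var = variance_ratio_test(pre, post)
```

Both statistics compare only the overall level and spread of the two subsets, so they cannot see where in time the deviations occur.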
Comment, RDN: While 9/11 is arguably a good point to
make a division (which necessarily is post facto and arbitrary),
another reasonable point might be the visually obvious inflection during
the Afghan war. If that point were used, the mean difference would
almost certainly be significant.
Using data at the minute level gives pvals of 0.08 and 0.24 for the mean and variance, respectively. These tests do not support the hypothesis of a non-random trend developing in the data after 9/11. However, visually, there appears to be pronounced long-timescale structure after 9/11, with both positive and negative slopes. The mean and variance tests are not sensitive to this.
Another approach is the following:
Calculate two netvar datasets, A and B, using alternating seconds for each
set. The interdigitated datasets are thus rigorously independent. If there is
strong, non-random, long-timescale structure in the cumdev after 9/11, it will
be present in both datasets. In that case correlations exist between A and B,
which is strong evidence for an anomalous effect.
A plot of the two sets is shown below:
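The alternating-seconds split can be sketched as follows. Two things here are assumptions: that the netvar for a second is the squared network Z-score for that second, and that the daily value is normalized as a chi-square sum, (sum - n) / sqrt(2n). The function names are illustrative and the data are synthetic.

```python
import math
import random

SECONDS_PER_DAY = 86400

def split_alternating(network_z):
    """Even-indexed seconds go to set A, odd-indexed seconds to set B."""
    return network_z[0::2], network_z[1::2]

def daily_netvar_z(network_z):
    """Normalize a day's sum of squared per-second network Z-scores.
    Under the null the sum is chi-square with n degrees of freedom
    (mean n, variance 2n), so this is close to a z-score for large n.
    The normalization is an assumption, not taken from the text."""
    n = len(network_z)
    chi2 = sum(z * z for z in network_z)
    return (chi2 - n) / math.sqrt(2.0 * n)

random.seed(1)
# One synthetic day of per-second network Z-scores (not GCP data).
one_day = [random.gauss(0.0, 1.0) for _ in range(SECONDS_PER_DAY)]
a_seconds, b_seconds = split_alternating(one_day)
z_a = daily_netvar_z(a_seconds)
z_b = daily_netvar_z(b_seconds)
```

No second contributes to both sets, so under the null hypothesis the two daily series share no data and any correspondence between them is not built in by construction.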
Visually, there is a correspondence between the red and blue curves after
9/11 AND both show structure similar to the full netvar curve (in grey, rescaled
and offset) AND there is little correspondence before 9/11.
Comment, RDN: The visual correspondence before 9/11 is pretty strong
except for the first few months. After 1999, there tends to be a lot of
parallel shifting of the trends on the order of weeks or longer, though
not as much as post-9/11.
This is the main qualitative result of the A-B data splitting.
Note that the
strong peak of the Iraq campaign (near day 1700) and the preceding steep
descent appear in all three curves. In order to test the correspondence
quantitatively, we want a test sensitive to structure on this scale. Standard
correlation coefficients are not sensitive to detailed structure since they only
test linear, or at best monotonic, correlations.
The z-scores for the Pearson
correlation for the full, pre- and post-9/11 segments of A and B
are:
Full 1.81
Pre 1.72
Post 0.81
These values
derive from the modest linear correlation between A and B.
The Pearson
coefficient tests the very long timescale correspondence. Here it is only
marginally significant. And it does not distinguish the pre- and post- 9/11
periods.
Comment, RDN: Not sure of your meaning here. The Z-scores indicate
the pre-9/11 data are significantly correlated as suggested earlier.
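For reference, one common way to turn a Pearson coefficient into a z-score is the Fisher transformation, z = atanh(r) * sqrt(n - 3). The text does not say how the z-scores above were obtained, so the sketch below is an assumption about the method, shown on toy numbers.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r, n):
    """Approximate z-score for r under the null of zero correlation."""
    return math.atanh(r) * math.sqrt(n - 3)

# Toy values, not GCP data.
r = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
z = fisher_z(r, 5)
```

If this transformation applies, a z-score of 1.81 over roughly 2165 daily values would correspond to a coefficient of about 0.04, which shows how weak the linear correlation is in absolute terms.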
One way to test for correlation in the structure is to fit the
curves and test correlations between the two sets of fitting parameters. Below
is a plot of fits to A and B using 51 cosine functions. The fit is done on the
cumdev because we want the structure to be prominent. The fitting parameters are
the cosine amplitudes, and the cosine wave vectors are 2nπ/L,
where L is the number of data points. The fits are done for n = [0,50]. Using n
up to 50 allows fitting of structure on timescales of 1 month and
longer.
The cosine expansion is most efficient for centro-symmetric
structures, so the cumdev is concatenated with its reflection before fitting.
The figure below shows the fits (grey) for the full 2165 data points (days) for
curves A and B. The centro-symmetric reflection doubles the number of points to
4330. The fit uses 100 cosine functions. [The center of the plot, which is the
last day of data, is the reflection point. It is marked by a vertical
bar.]
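The reflect-and-fit step can be sketched as follows. The fitting routine is not given in the text, so this is one possible reading: the amplitudes are obtained by projecting the reflected series onto cosines with wave vectors 2nπ/L, which is the least-squares solution because these cosines are mutually orthogonal on the doubled series. The half-sample offset (k + 0.5) is an added assumption that places the mirror point exactly between the last original point and its reflection.

```python
import math

def reflect(series):
    """Concatenate a series with its mirror image (doubles the length)."""
    return list(series) + list(reversed(series))

def cosine_amplitudes(series, n_max):
    """Least-squares amplitudes for cos(2*pi*n*(k + 0.5)/L), n = 0..n_max,
    where L is the length of the reflected series. Requires n_max < L/2."""
    y = reflect(series)
    L = len(y)
    amps = []
    for n in range(n_max + 1):
        proj = sum(y[k] * math.cos(2.0 * math.pi * n * (k + 0.5) / L)
                   for k in range(L))
        norm = L if n == 0 else L / 2.0  # sum of cos^2 over the doubled series
        amps.append(proj / norm)
    return amps

def cosine_fit(amps, length):
    """Evaluate the fitted curve on the original (unreflected) points."""
    L = 2 * length
    return [sum(a * math.cos(2.0 * math.pi * n * (k + 0.5) / L)
                for n, a in enumerate(amps))
            for k in range(length)]

# Toy check: a curve built from two cosine terms is recovered exactly.
toy = [1.0 + 3.0 * math.cos(2.0 * math.pi * 2 * (k + 0.5) / 80) for k in range(40)]
toy_amps = cosine_amplitudes(toy, 5)
toy_fit = cosine_fit(toy_amps, 40)
```

Reflecting the cumdev removes the jump between its last and first values, which a pure cosine series could otherwise represent only with many high-order terms.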
The fitting procedure gives a set of coefficients (cosine amplitudes) for
each curve A and B. To look at the difference
between pre- and post- 9/11, split the sets at that date and calculate
correlations for the periods separately. The number of points is halved for each
period and we need only 50 amplitudes. Let the coefficients be A[n] and B[n]
where n = [0,50] labels the cosine functions.
Then the correlation is the sum
of pair products of the coefficients:
Sum[ (n+1)^2*A[n]*B[n] ].
Note: The pair products are weighted by the squared cosine wave index
(written (n+1)^2 in the sum because n starts at 0), which compensates for the
average falloff in the cosine amplitudes (they decrease roughly as an inverse
power of n). We try to give equal weight to structure on all
timescales. The cosine index weighting is one way to do this. The
correlation thus measures relatively local (in time) structure, such as the
Iraq war "peak", and broad structure, such as the decline. The plot
below shows
the relation between wave index and the standard deviation of fitted cosine
amplitudes. A fit (blue) is stddev = (t/n)^2.16, where t = 1.618.
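The weighted sum of pair products, and its running total over the wave index, can be sketched as follows. The weighting follows the sum as written above; the function names and the amplitudes are illustrative.

```python
def weighted_pair_products(a, b):
    """Terms (n+1)^2 * A[n] * B[n] for n = 0, 1, 2, ..."""
    return [(n + 1) ** 2 * an * bn for n, (an, bn) in enumerate(zip(a, b))]

def cumulative_correlation(a, b):
    """Running sum of the weighted pair products over the wave index."""
    total = 0.0
    out = []
    for term in weighted_pair_products(a, b):
        total += term
        out.append(total)
    return out

# Toy amplitudes, not fitted values from the GCP data.
amps_a = [1.0, 0.5, -0.2]
amps_b = [2.0, 0.25, 0.1]
cumulative = cumulative_correlation(amps_a, amps_b)
```

The last entry of the running sum is the full correlation; the earlier entries show how much each range of wave indices contributes to it.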
Below is a preliminary result of the correlation calculation. The A & B datasets were each split into segments before and after 9/11. The cosine amplitude correlation was calculated for the A & B pre-9/11 data and again for the post-9/11 data. A plot of the cumulative correlation for the two data regions shows that the correlation for the post-9/11 period is significant whereas the pre-9/11 correlation is clearly insignificant.
Note: the horizontal axis is the cosine wave index; low-order indices contribute to the long-timescale features and high-order indices to short-timescale structure. Indices around number 50 correspond to structure in the netvar cumdev with half-widths of roughly a month.
Comment: Explain half-width -- is that like the half width of a distribution, e.g. like a standard deviation of a mean length?
Probability envelopes for the correlation are being calculated. Preliminary results suggest that the pval for the 51-amplitude fit is around 0.001 (z-score = 3). The cumulative correlation also shows that many wave indices between n=0 and n=50 contribute to the correlation. This is consistent with the netvar cumdev, which shows structure on the scale of months to years. Thus, at the 3-sigma level (and to be confirmed by further calculations on the amplitude correlation probability distribution), the post-9/11 data contain non-random structure on long timescales.
This is the main quantitative result of the A-B data splitting.
Below is the cumulative A/B correlation for
the post-9/11 data. Empirical envelopes show the probability of the correlation is
roughly 0.0025 for fits with resolution down to the month level (n up to 50). The
correlation for the pre-9/11 data (blue) is clearly not
significant.
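The text does not describe how the empirical envelopes were built. One possible approach, sketched below purely as an assumption, is a Monte Carlo null: simulate many pairs of independent random-walk cumdevs, compute the weighted amplitude correlation for each pair, and ask how often the null value reaches the observed one. The sizes are kept small so the pure-Python example runs quickly; the half-sample offset in the cosines is also an assumption.

```python
import math
import random

def weighted_correlation(a, b):
    """Sum of (n+1)^2 * A[n] * B[n] over the wave index."""
    return sum((n + 1) ** 2 * x * y for n, (x, y) in enumerate(zip(a, b)))

def cosine_basis(length, n_max):
    """Cosine table for a series of `length` points doubled by reflection."""
    L = 2 * length
    return [[math.cos(2.0 * math.pi * n * (k + 0.5) / L) for k in range(L)]
            for n in range(n_max + 1)]

def amplitudes(cumdev, basis):
    """Project the reflected cumdev onto each cosine in the table."""
    y = list(cumdev) + list(reversed(cumdev))
    L = len(y)
    return [sum(v * c for v, c in zip(y, row)) / (L if n == 0 else L / 2.0)
            for n, row in enumerate(basis)]

def random_cumdev(length, rng):
    """Cumulative sum of independent standard normal values."""
    total, out = 0.0, []
    for _ in range(length):
        total += rng.gauss(0.0, 1.0)
        out.append(total)
    return out

def null_statistics(length, n_max, n_sims, rng):
    """Weighted correlations for simulated independent A/B pairs."""
    basis = cosine_basis(length, n_max)
    stats = []
    for _ in range(n_sims):
        a = amplitudes(random_cumdev(length, rng), basis)
        b = amplitudes(random_cumdev(length, rng), basis)
        stats.append(weighted_correlation(a, b))
    return stats

def empirical_p_value(observed, null_stats):
    """One-sided p-value with the usual +1 correction."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)

rng = random.Random(2)
null_stats = null_statistics(length=200, n_max=10, n_sims=200, rng=rng)
```

With only 200 simulations the smallest p-value that can be resolved is about 0.005, so probabilities near 0.0025 would need several thousand simulated pairs at least.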
We can study the correlation by looking separately at different timescales.
Steps in the cumulative correlation identify the wave
indices where the correlation increases sharply, and hence the timescales that carry it. This information lets us
decompose the fits to see which features are contributing to the overall
correlation. The figure below shows the fits for the A and B datasets using cosine
amplitudes through n = 5, 25, 39 and 73. The right-hand panels show fits
with the preceding lower-order-n fits subtracted out. This isolates the
structure responsible for correlation in the timescale windows indicated by the
cumulative correlation.
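Subtracting the lower-order fit from a higher-order fit leaves just the cosine terms between the two cutoffs, so the decomposition can be sketched as a sum over bands of wave indices. The cutoffs are those quoted above; the amplitudes, the function names and the half-sample offset in the cosines are illustrative assumptions.

```python
import math

def band_windows(cutoffs):
    """Turn cutoffs such as (5, 25, 39, 73) into index windows
    (0, 5), (6, 25), (26, 39), (40, 73)."""
    windows = []
    lo = 0
    for hi in cutoffs:
        windows.append((lo, hi))
        lo = hi + 1
    return windows

def partial_fit(amps, length, n_lo, n_hi):
    """Sum of the cosine terms with wave index n_lo <= n <= n_hi,
    evaluated on `length` points of a series doubled by reflection."""
    L = 2 * length
    return [sum(amps[n] * math.cos(2.0 * math.pi * n * (k + 0.5) / L)
                for n in range(n_lo, n_hi + 1))
            for k in range(length)]

# Illustrative amplitudes only, not fitted values from the GCP data.
amps = [1.0 / (n + 1) for n in range(74)]
windows = band_windows((5, 25, 39, 73))
bands = [partial_fit(amps, 100, lo, hi) for lo, hi in windows]
full = partial_fit(amps, 100, 0, 73)
```

Because the fit is linear in the amplitudes, the band curves add up to the full fit, and each band shows only the structure in its own timescale window.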
The following plots repeat the A/B analysis for the device variance.
Visually, the datasets have very different cumdevs and no correlation is
obvious. In the plot below, the colors distinguish the two sets A and B.
Correlations for both the pre- and post-9/11 periods are negligible for the device
variance. In the plot below the colors distinguish the pre- and post-9/11
periods.
The following plot compares the full datasets A and B for the netvar and
the devvar. The cumulative A/B correlation for the full dataset from Oct
1998 to Sept. 2004 is marginally significant for the netvar (Z about
2.4) and insignificant for the devvar (Z less than 1.0).
Is there a change in network behavior associated with major
world events related to terrorism and terror politics?
Polls that ask
the question: "Do you approve or disapprove of the way the
president is handling his job?"
probe a general sense of political and societal well-being.
Does the
network variance grow when there are strong, persistent feelings of unity, rally
and common purpose?
Does the network variance decrease when there are strong,
persistent polarizing forces?
Figure caption: Red trace: US Presidential approval ratings from
6 US polling sources (AP, Harris, Gallup, ABC, Pew, NBC). Blue trace:
cumulative deviation of GCP network variance (variance of network mean at
one-second resolution).
Vertical bars mark major events:
Bush Inauguration,
Shaded region I: Terrorist attack and Afghan campaign (9/11 attacks,
Sept 11, 2001 to announcement of Taliban defeat, Dec 16, 2001),
Shaded region
II: Iraq campaign (official announcement of bombing, March 19, 2003 to
announcement of end of "major combat operations", May 1, 2003),
capture of
Saddam Hussein (Dec 13, 2003),
Madrid terrorist bombings (March 11,
2004),
Bush
re-election.
The poll results are for 556 separate polls from Aug 9, 1998 to Dec 15, 2004. Poll dates are taken to be the closing day of the polling period [most polls are conducted over 3-4 days]. Values are averaged when more than one poll closes on the same day. There are 506 data points representing 506 unique polling dates.
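The averaging of polls that close on the same day can be sketched as follows. The record layout (closing date, approval value) and the numbers are invented for illustration; they are not the poll data.

```python
from collections import defaultdict
from datetime import date

def average_by_closing_date(polls):
    """polls: iterable of (closing_date, approval_percent) pairs.
    Returns one (date, mean approval) pair per unique closing date,
    sorted by date."""
    by_day = defaultdict(list)
    for closing, value in polls:
        by_day[closing].append(value)
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(by_day.items())]

# Invented example records, not real poll results.
polls = [
    (date(2000, 1, 10), 62.0),
    (date(2000, 1, 3), 60.0),
    (date(2000, 1, 10), 58.0),
]
daily = average_by_closing_date(polls)
```

Applied to the 556 polls, this kind of grouping is what reduces the series to one point per unique closing date.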
Same as above, with 3-pt smoothing of poll results.
Same as above, with 8-pt smoothing of poll results and 20-pt smoothing of the network variance.
A view of the network variance cumdev and poll plots when both are normalized to unit variance.
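The smoothing and normalization used for these views are not specified. A minimal sketch, assuming a plain moving average over consecutive points and a rescaling to zero mean and unit sample variance (the centering is an added choice), is:

```python
import math

def moving_average(values, window):
    """Mean of each run of `window` consecutive points.
    Returns len(values) - window + 1 smoothed points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out

def to_unit_variance(values):
    """Center on the mean and divide by the sample standard deviation."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / sd for v in values]

smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], 3)
scaled = to_unit_variance([1.0, 2.0, 3.0])
```

Putting both traces on a unit-variance scale makes their shapes comparable on one axis, but it does not by itself say anything about whether the two are related.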