Version 260 (modified by tbretz, 6 weeks ago) (diff)

--

# Spectrum Analysis

## Theory

### Spectrum

The differential flux $$\phi(E)$$ per area, time and energy interval is defined as $\phi(E) = \frac{dN}{dA\cdot dt\cdot dE}$

Often $$\phi(E)$$ is also referred to as $$\frac{dN}{dE}$$ as observation time and effective collection area is a constant.

For an observation with an effective observation time $$\Delta T=\sum\delta t_i$$, this yields in a given Energy interval $$\Delta E$$: $\phi(\Delta E) = \frac{1}{A_0\cdot \Delta T}\frac{N(\Delta E)}{\epsilon(\Delta E)\cdot \Delta E}$

For simplicity, in the following, $$\Delta E$$ will be replaced by just $$E$$ but always refers to to a given energy interval. If data is binned in a histogram, the relation between the x-value for the bin $$E_\textrm{x}$$ and the corresponding interval $$\Delta E_\textrm{x}$$ is not well defined. Resonable definitions are the bin center (usually in logarithmic bins) or the average energy.

### Efficiency

The total area $$A_0$$ and the corresponding efficiency $$\epsilon(E)$$ are of course only available for simulated data. For simulated data, $$A_0$$ is the production area and $$\epsilon(E)$$ the corresponding energy dependent efficiency of the analysis chain. For a given energy bin, the efficiency is then defined as

$\epsilon(E) = \frac{N_\textrm{exc}(E)}{N_0(E)}$

where $$N_0$$ is the number of simulated events in this energy bin and $$N=N_{exc}$$ the number of *excess* events that are produced by the analysis chain.

Note that the exact calculation of the efficiency $$\epsilon(\Delta E)$$ depends on prior knowledge of the correct source spectrum $$N_0(E)$$. Therefore, it is strictly speaking only correct if the simulated spectrum and the real spectrum are identical. As the real spectrum is unknown, special care has to be taken of the systematic introduced by the assumption of $$N_0(E)$$.

### Effective Collection Area

The effective area is then defined as $$A_\textrm{eff}(E)=\epsilon(E)\cdot A_0$$. Note that at large distances $$R_0$$ the efficiency $$\epsilon(R_0)$$ vanishes, so that the effective area is an (energy dependent) constant while $$A_0=\pi R_0^2$$ and the efficiency $$\epsilon(E)$$ are mutually dependent.

### Excess and Error

The number of excess events, for data and simulations, is defined as

$N_\textrm{exc} = N_\textrm{sig} - \hat N_\textrm{bg}$

where $$N_\textrm{sig}$$ is the number of events identified as potential gammas from the source direction ('on-source') and $$N_\textrm{bg}$$ the number of gamma-like events measured 'off-source'. Note that for Simulations, $$\hat N_\textrm{bg}$$ is not necessarily zero for wobble-mode observations as an event can survive the analysis for on- and off-events, if this is not prevented by the analysis (cuts).

The average number of background events $$\hat N_\textrm{bg}$$ is the total number of background events $$N_\textrm{bg}$$ from all off-regions times the corresponding weight $$\frac{1}{5}$$ (often referred to as $$\alpha$$). For five off-regions, this yields

$\hat N_\textrm{bg} = \frac{N_\textrm{bg}}{5}$

Assuming Gaussian errors, the statistical error is thus

$\sigma^2(N_\textrm{exc}) = \left(\frac{dN_\textrm{exc}}{dN_\textrm{sig}}\right)^2\sigma^2(N_\textrm{sig}) + \left(\frac{dN_\textrm{exc}}{d\hat N_\textrm{bg}}\right)^2\sigma^2(\hat N_\textrm{bg})= \sigma^2(N_\textrm{sig}) + \frac{1}{5^2}\sigma^2(N_\textrm{bg})$

For data this immediately resolves to

$\sigma^2(N_\textrm{exc}) = N_\textrm{sig} + \frac{1}{5^2}N_\textrm{bg}$

with the Poisson (counting) error $$\sigma^2(N_\textrm{sig,bg}) = N_\textrm{sig,bg}$$.

### Weights and Error

In the following $$N$$ refers to a number of simulated events and $$M$$ to a number of measured (excess) events.

$$n(\delta E, \delta\theta)$$ is the number of produced events in the energy interval $$E\in\delta E=[e_\textrm{min};e_\textrm{max}]$$ and the zenith angle interval $$\theta\in\delta\theta=[\theta_\textrm{min};\theta_\textrm{max}]$$. The weighted number of events $$n'(\delta E, \delta\theta)$$ in that interval is then

$n'(\delta E, \delta\theta) = \sum_{i=0...n}^{E\in\delta E}\sum_{j=0...n}^{\theta\in\delta\theta} \omega(E_i, \theta_j)$

Since $$\omega(E, \theta) = \rho(E)\cdot \tau(\theta)$$ with $$\rho(E)$$ the spectral weight to adapt the spectral shape of the simulated spectrum to the real (measured) spectrum of the source and $$\tau(\theta)$$ the weight to adapt to oberservation time versus zenith angle.

$n'(\delta E, \delta\theta) = \sum_{i=0...n}^{E\in\delta E}\sum_{j=0...n}^{\theta\in\delta\theta} \rho(E_i)\cdot\tau(\theta_j) = \sum_{i=0...n}^{E\in\delta E}\rho(E_i)\cdot\sum_{j=0...n}^{\theta\in\delta\theta} \tau(\theta_j)$

The weighted number of produced events $$N'(\Delta E, \Delta\Theta)$$ in the total energy interval $$E\in\Delta E=[E_\textrm{min};E_\textrm{max}]$$ and the total zenith angle interval $$\theta\in\Delta\Theta=[\Theta_\textrm{min};\Theta_\textrm{max}]$$ is then

$N'(\Delta E, \Delta\Theta) = \sum_{i=0...N}^{E\in\Delta E}\sum_{j=0...N}^{\theta\in\Delta\Theta} \rho(E_i)\cdot\tau(\theta_j) = \sum_{i=0...N}^{E\in\Delta E}\rho(E_i)\cdot\sum_{j=0...N}^{\theta\in\Delta\Theta} \tau(\theta_j)$

with $$N(\Delta E, \Delta\Theta)$$ being the total number of produced events.

For a sum of weights, e.g. $$N' = \sum_N \rho_i\tau_i$$ the corresponding error is $\sigma^2(N')^2 = \sum_N\left[\rho_i\cdot\sigma(\tau_i)+\tau_i\cdot\sigma(\rho_i)\right]^2$

As the energy is well defined, $$\sigma(\rho_i)=0$$ and thus

$\sigma^2(N')^2 = \sum_N\rho_i^2\cdot\sigma^2(\tau_i)$

The weights are defined as follow:

$\rho(E) = \rho_0\frac{\phi_\textrm{src}(E)}{\phi_0(E)}$

where $$\phi_0(E)$$ is the simulated spectrum and $$\phi_\textrm{src}$$ the (unknown) real source spectrum. $$\rho_0$$ is a normalization constant. The zenith angle weights $$\tau(\delta\theta)$$ in the interval $$\delta\theta$$ are defined as

$\tau(\delta\theta) = \tau_0\frac{\Delta T(\delta\theta)}{N(\delta\theta)}$

where $$N(\delta\theta)$$ is the number of produced events in the interval $$\delta\theta$$ and $$\Delta T(\delta\theta)$$ is the total observation time in the same zenith angle interval. $$\tau_0$$ is the normalization constant. The error on the weight $$\tau=\tau(\delta\theta)$$ in each individual $$\theta$$-bin with $$\Delta T=\Delta T(\delta\theta)$$ and $$N=N(\delta\theta)$$is then

$\sigma^2(\tau) = \left[\frac{d\tau}{d\Delta T}\sigma(\Delta T)\right]^2+\left[\frac{d\tau}{dN}\sigma(N)\right]^2= \tau^2\cdot\left[\left(\frac{\sigma(\Delta T)}{\Delta T}\right)^2+\left(\frac{\sigma(N)}{N}\right)^2\right]$

While $$\sigma(\Delta T)/\Delta T \approx 1\textrm{s}/5\textrm{min}$$ is given by the data acquisition and 1s per 5min run, $$\sigma(N)/N=1/\sqrt{N}$$ is just the statistical error of the number of events.

As the efficiency $$\epsilon(\delta E)$$ for an energy interval $$\delta E$$ is calculated as

$\epsilon(\delta E) = \epsilon(\delta E, \Delta\Theta) = \frac{N'_\textrm{exc}(\delta E, \Delta\Theta)}{N'_\textrm{src}(\delta E,\Delta\Theta)}$

and $$N'_\textrm{exc}(\delta E,\Delta\Theta)$$ and $$N'_\textrm{src}(\delta E,\Delta\Theta)$$ are both expressed as the sum given above, the constants $$\rho_0$$ and $$\tau_0$$ cancel.

The differential flux in an energy interval $$\delta E$$ is then given as

$\phi(\delta E) = \phi(\delta E,\Delta\Theta) = \frac{1}{A_0\cdot \Delta T}\frac{M_\textrm{exc}(\delta E)}{\epsilon(\delta E)\cdot \delta E} = \frac{M_\textrm{exc}(\delta E)}{N'_\textrm{exc}(\delta E)}\cdot \frac{N'_\textrm{src}(\delta E)}{A_0\cdot\Delta T\cdot \delta E}$

Where $$A_0$$ is total area of production and $$\Delta T$$ the total observation time. The number of measured excess events is in that energy interval is $$M_\textrm{exc}(\delta E)=M_\textrm{exc}(\delta E, \Delta\Theta)$$.

Using Gaussian error propagation, the error in a given energy interval $$\delta$$ is then given by

$\sigma^2(\phi) = \left(\frac{1}{A_0\cdot \Delta T\cdot\delta E}\right)^2\cdot\left[\left(\frac{d\phi'}{dM_\textrm{exc}}\right)^2\sigma^2(M_\textrm{exc}) + \left(\frac{d\phi'}{dN'_\textrm{exc}}\right)^2\sigma^2(N'_\textrm{exc}) + \left(\frac{d\phi'}{dN'_\textrm{src}}\right)^2\sigma^2(N'_\textrm{src})\right]$

with $$\phi':=A_0\cdot\Delta T\cdot\delta E\cdot\phi$$.

$\rightarrow\quad\sigma^2(\phi) = \phi^2 \cdot\left[\left(\frac{\sigma(M_\textrm{exc})}{M_\textrm{exc} }\right)^2 + \left(\frac{\sigma(N'_\textrm{exc})}{N'_\textrm{exc}}\right)^2 + \left(\frac{\sigma(N'_\textrm{src})}{N'_\textrm{src}}\right)^2\right]$

## Conceptual Example

### Get Data File List

A list with file IDs containing the events to be analyed is required a.t.m. The following query retrieves such a list and fills a temporary table (DataFiles) with the IDs.

CREATE TEMPORARY TABLE DataFiles
(
FileId INT UNSIGNED NOT NULL,
PRIMARY KEY (FileId) USING HASH
) ENGINE=Memory
AS
(
SELECT
FileId
FROM
factdata.RunInfo
WHERE
%0:where
ORDER BY
FileId
)


%0:where is a placeholder, for example for

fZenithDistanceMean<30 AND fThresholdMinSet<350 AND
fSourceKEY=5 AND
fRunTypeKEY=1 AND
fNight>20161201 AND fNight<20170201 AND
fR750Cor>0.9e0*fR750Ref


### Get Observation Time

The following query bins the effective observation time of the runs listed above in zenith angle bins and stores the result in a temporary table (ObservationTime). Note that the result contains only those bins which have entries.

CREATE TEMPORARY TABLE ObservationTime
(
.theta SMALLINT UNSIGNED NOT NULL,
OnTime FLOAT NOT NULL,
PRIMARY KEY (.theta) USING HASH
) ENGINE=Memory
AS
(
SELECT
INTERVAL(fZenithDistanceMean, %0:bins) AS .theta,
SUM(TIME_TO_SEC(TIMEDIFF(fRunStop,fRunStart))*fEffectiveOn) AS OnTime
FROM
DataFiles
LEFT JOIN
factdata.RunInfo USING (FileId)
GROUP BY
.theta
ORDER BY
.theta
)


%0:bins is a placeholder for the bin boundaries, e.g. 5, 10, 15, 20, 25, 30 (five bins between 5° and 30° plus underflow and overflow).

### Get Monte Carlo File List

The next query obtains all Monte Carlo runs which have their ThetaMin or ThetaMax within one of the bins obtained in the previous query (so all MC runs that correspond to bins in which data is available). Strictly speaking, this step is not necessary, but it accelerats further processing. In addition (here as an example) only runs with even FileIDs are obtained as test-runs (assuming that odd runs were used for training). The resulting FileIDs are stored in a temporary table (MonteCarloFiles).

CREATE TEMPORARY TABLE MonteCarloFiles
(
FileId INT UNSIGNED NOT NULL,
PRIMARY KEY (FileId) USING HASH
) ENGINE=Memory
AS
(
SELECT
FileId
FROM
ObservationTime
LEFT JOIN
BinningTheta ON .theta=bin
LEFT JOIN
factmc.RunInfoMC
ON
(ThetaMin>=lo AND ThetaMin<hi) OR (ThetaMax>lo AND ThetaMax<=hi)
WHERE
PartId=1 AND
FileId%%2=0
ORDER BY
FileId
)


### Get Zenith Angle Histogram

The following table creates a temporaray table (EventCount) internally which bins the MonetCarlo files from the file list in MonteCarloFiles in zenith angle bins. This temporary table ois then joined with table containing the binning for the data files (EventCount) and for each bin, the ratio (ZdWeight) and the corresponding error (ErrZdWeight) is calculated (assuming an error on the on-time of 1s per 5min). To have also the in edges in the same table, the binning is joined as well.

CREATE TEMPORARY TABLE ThetaHist
(
.theta    SMALLINT UNSIGNED NOT NULL,
lo          DOUBLE            NOT NULL COMMENT 'Lower edge of zenith distance bin in degree',
hi          DOUBLE            NOT NULL COMMENT 'Upper edge of zenith distance bin in degree',
CountN      INT UNSIGNED      NOT NULL,
OnTime      FLOAT             NOT NULL,
ZdWeight    DOUBLE            NOT NULL COMMENT 'tau(delta theta)',
ErrZdWeight DOUBLE            NOT NULL COMMENT 'sigma(tau)',
PRIMARY KEY (.theta) USING HASH
) ENGINE=Memory
AS
(
WITH EventCount AS
(
SELECT
INTERVAL(DEGREES(Theta), %0:bins) AS .theta,
COUNT(*) AS CountN
FROM
MonteCarloFiles
LEFT JOIN
factmc.OriginalMC USING(FileId)
GROUP BY
.theta
)
SELECT
.theta, lo, hi,
CountN,
OnTime,
OnTime/CountN AS ZdWeight,
(OnTime/CountN)*SQRT(POW(1/300, 2) + 1/CountN) AS ErrZdWeight
FROM
ObservationTime
LEFT JOIN
EventCount USING(.theta)
LEFT JOIN
BinningTheta ON .theta=bin
ORDER BY
.theta
)


%0:bins is a placeholder for the bin boundaries, e.g. 5, 10, 15, 20, 25, 30 (five bins between 5° and 30° plus underflow and overflow). It should be identical to the binning used for data files in DataFiles.

### Analysis Query

The analysis of Data and MontaCarlo files must be done totally identical to produce reasonable results. Therefore, the exact same query (or code) should be used for the analysis of both. In this example, the query has to provide two columns, Weight and LogEnergyEst for all gamma-line events. The weight must be +1 for events in the on-region and -0.2 for an event in the off-region (corresponding to the number of five wobble positions). LogEnergyEst must contain $$\log_{10}$$ of the estimated energy for each event. Only events surviving background suppression and spatial (theta) cuts should be considered.

WITH Table0 AS
(
SELECT
%0:columns  -- this could be removed if we can join events via the same columns (without CorsikaNumResuse)
Weight,
Size,
NumUsedPixels,
NumIslands,
Leakage1,
MeanX,
MeanY,
CosDelta,
SinDelta,
M3Long,
SlopeLong,
Width/Length      AS WdivL,
PI()*Width*Length AS Area,
cosa*X - sina*Y   AS PX,
cosa*Y + sina*X   AS PY
FROM
%1:files
LEFT JOIN
%2:runinfo USING (FileId)
LEFT JOIN
%3:events USING (FileId)  -- This could be replaced by a user uploaded temporary table
LEFT JOIN
%4:positions USING (FileId, EvtNumber)
CROSS JOIN
Wobble
WHERE
NumUsedPixels>5.5
AND
NumIslands<3.5
AND
Leakage1<0.1
),

Table1 AS
(
SELECT
%0:columns
Weight,
Size, CosDelta, SinDelta, M3Long, SlopeLong, Leakage1, WdivL,
MeanX - PX/1.02e0 AS DX,
MeanY - PY/1.02e0 AS DY
FROM
Table0
WHERE
Area < LOG10(Size)*898e0 - 1535e0
),

Table2 AS
(
SELECT
%0:columns
Weight,
Size, CosDelta, SinDelta, DX, DY, M3Long, SlopeLong, Leakage1, WdivL,
SQRT(DX*DX + DY*DY) AS Norm
FROM
Table1
),

Table3 AS
(
SELECT
%0:columns
Weight,
Size, M3Long, SlopeLong, Leakage1, WdivL, Norm,
LEAST(GREATEST((CosDelta*DY - SinDelta*DX)/Norm, -1), 1) AS LX,
SIGN(CosDelta*DX + SinDelta*DY) AS Sign
FROM
Table2
),

Table5 AS
(
SELECT
%0:columns
Weight,
Size, Leakage1, WdivL, LX,
Norm          *0.0117193246260285378e0 AS Dist,
M3Long   *Sign*0.0117193246260285378e0 AS M3L,
SlopeLong*Sign/0.0117193246260285378e0 AS Slope
FROM
Table3
),

Table6 AS
(
SELECT
%0:columns
Weight,
Size, WdivL, Dist, LX, M3L, Slope,
1.39252e0 + 0.154247e0*Slope + 1.67972e0*(1-1/(1+4.86232e0*Leakage1)) AS Xi
FROM
Table5
),

Table7 AS
(
SELECT
%0:columns
Weight,
Size, Dist, LX,
IF (M3L<-0.07 OR (Dist-0.5e0)*7.2e0-Slope<0, -Xi, Xi) * (1-WdivL) AS Disp
FROM
Table6
)

SELECT
%0:columns
Weight,
(Disp*Disp + Dist*Dist - 2*Disp*Dist*SQRT(1-LX*LX)) AS ThetaSq,
%5:estimator AS LogEnergyEst
FROM
Table7
HAVING
ThetaSq<0.024


The placeholder %0:column is currently used to request MonteCarlo true values (such as Energy) in addition to the anaylsis values only for the analysis of simulated data. The must be no comma behind! In additon, %1:files is a placeholder for the table containing the FileIds to analyze, %2:runinfo for the table with the run info data, %3:events for the table with the image parameters and %4:positions for the table with the source positions in the camera.

### Analyze Data

The previous query is used to create a temporary table (Excess) with a common table expression (CTE). This table is referred to in the following query and produces a summary of the observation which is then stored in another temporary table (AnalysisData).

WITH Table0 AS
(
SELECT
INTERVAL(LogEnergyEst, %6:bins)  AS  .energy,
COUNT(IF(Weight>0, 1, NULL))     AS  Signal,
COUNT(IF(Weight<0, 1, NULL))/5   AS  Background
FROM
Excess
GROUP BY
.energy
ORDER BY
.energy
)
SELECT
*,
Signal - Background        AS Excess,
LiMa(Signal, Background)   AS Significance,
ExcErr(Signal, Background) AS ErrExcess
FROM
Table0


%6:bins is a placeholder for a binning if log-energy, e.g. 2.5, 3.0, 3.5, 4.0, 4.5, 5.0 (five bins between 2.5 and 5.0 plus underflow and overflow).

### Analyze Monte Carlo Data

Similarly to the analysis of Data, another query summarizes the MonteCarlo Analysis. The result is stored in a temporary table (AnalysisMC).

WITH Table0 AS
(
SELECT
Weight,
INTERVAL(Zd,            %6:theta)        AS .theta,
INTERVAL(LogEnergyEst,  %7:energyest)    AS .energyest,
INTERVAL(log10(Energy), %8:energysim)    AS .energysim,
(%9:spectrum)/pow(Energy, SpectralIndex) AS SpectralWeight,
LogEnergyEst - log10(Energy) AS Residual
FROM
Excess
),
Table1 AS
(
SELECT
.energyest,
.energysim,

-- Signal, Background, Excess
SUM(  IF(Weight>0,     ZdWeight*SpectralWeight,       0)) OVER EnergyEst  AS  SignalW,
SUM(  IF(Weight<0,     ZdWeight*SpectralWeight,       0)) OVER EnergyEst  AS  BackgroundW,
SUM(  IF(Weight>0, POW(ErrZdWeight*SpectralWeight,2), 0)) OVER EnergyEst  AS  SignalW2,
SUM(  IF(Weight<0, POW(ErrZdWeight*SpectralWeight,2), 0)) OVER EnergyEst  AS  BackgroundW2,
COUNT(IF(Weight>0, 1, NULL))                              OVER EnergyEst  AS  SignalN,
COUNT(IF(Weight<0, 1, NULL))                              OVER EnergyEst  AS  BackgroundN,

-- Threshold
SUM(    Weight *    ZdWeight*SpectralWeight   ) OVER EnergySim  AS  ThresholdW,
SUM(POW(Weight * ErrZdWeight*SpectralWeight,2)) OVER EnergySim  AS  ThresholdW2,
SUM(    Weight                                ) OVER EnergySim  AS  ThresholdN,

-- Estimators
SUM(IF(Weight>0,                 ZdWeight*SpectralWeight,  0)) OVER EnergySim  AS  SimW,
SUM(IF(Weight>0,     Residual*   ZdWeight*SpectralWeight,  0)) OVER EnergyEst  AS  EstSum,
SUM(IF(Weight>0,     Residual*   ZdWeight*SpectralWeight,  0)) OVER EnergySim  AS  SimSum,
SUM(IF(Weight>0, POW(Residual,2)*ZdWeight*SpectralWeight,  0)) OVER EnergyEst  AS  EstSum2,
SUM(IF(Weight>0, POW(Residual,2)*ZdWeight*SpectralWeight,  0)) OVER EnergySim  AS  SimSum2,

-- Migration
SUM(Weight * ZdWeight*SpectralWeight) OVER Migration  AS  MigrationW,
SUM(Weight                          ) OVER Migration  AS  MigrationN
FROM
Table0
INNER JOIN
ThetaHist USING(.theta)
WINDOW
EnergyEst AS (PARTITION BY .energyest),
EnergySim AS (PARTITION BY .energysim),
Migration AS (PARTITION BY .energysim,.energyest)
)
SELECT DISTINCT
*,
SignalW - BackgroundW/5 AS ExcessW,
SignalN - BackgroundN/5 AS ExcessN,
ExcErr(SignalW2, BackgroundW2/5) AS ErrExcessW,
ExcErr(SignalN,  BackgroundN /5) AS ErrExcessN,
IF(SignalW=0, 0, EstSum / SignalW) AS BiasEst,
IF(SimW   =0, 0, SimSum / SimW)    AS BiasSim,
IF(SignalW=0, 0, SQRT(EstSum2/SignalW - POW(EstSum/SignalW, 2))) AS ResolutionEst,
IF(SimW   =0, 0, SQRT(SimSum2/SimW    - POW(SimSum/SimW,    2))) AS ResolutionSim
FROM
Table1


The placeholders %6:theta, %7:energyest and %8:energysim are the binnings (as used previously) for the zenith angle and the logarithm (base 10) of the estimated and true energy. %9:spectrum is the (unknown) 'true' source spectrum, for example POW(Energy, -2.4).

### Summarize Corsika Production

The following queries produces a summary of the events simulated at first by Corsika. The result is stored in a temporary table (SimulatedSpectrum).

CREATE TEMPORARY TABLE SimulatedSpectrum
(
.energy SMALLINT UNSIGNED NOT NULL COMMENT 'Bin Index [MC Energy]',
CountN    DOUBLE            NOT NULL,
CountW    DOUBLE            NOT NULL,
CountW2   DOUBLE            NOT NULL,
PRIMARY KEY (.energy) USING HASH
) ENGINE=Memory
AS
(
SELECT
INTERVAL(LOG10(Energy), %0:energyest) AS .energy,
COUNT(*) AS CountN,
SUM(    (%1:spectrum)/pow(Energy, SpectralIndex)   ) AS CountW,
SUM(POW((%1:spectrum)/pow(Energy, SpectralIndex),2)) AS CountW2
FROM
MonteCarloFiles
LEFT JOIN
factmc.RunInfoMC USING (FileId)
LEFT JOIN
factmc.OriginalMC USING (FileId)
GROUP BY
.energy
ORDER BY
.energy
)


The placeholder %0:energyest is the binnings (as used previously) for the logarithm (base 10) of the estimated energy. %1:spectrum is the (unknown) 'true' source spectrum, for example POW(Energy, -2.4).

### Result (Spectrum)

This query combines the results from the data analysis (AnalysisData), the MonteCarlo analysis (!AnalysisMC) and the simulated data (SimulatedSpectrum) to calculate the final result. The result is stored in a temporary table (Spectrum). For convenience, bin edged are joined as well.

CREATE TEMPORARY TABLE Spectrum
(
.energy      SMALLINT UNSIGNED NOT NULL COMMENT 'Bin Index [Energy]' PRIMARY KEY,
lo             DOUBLE            NOT NULL COMMENT 'Lower edge of energy bin in lg(E/GeV)',
hi             DOUBLE            NOT NULL COMMENT 'Upper edge of energy bin in lg(E/GeV)',
Signal       DOUBLE            NOT NULL COMMENT 'Number of signal events',
Background   DOUBLE            NOT NULL COMMENT 'Average number of background events',
Excess       DOUBLE            NOT NULL COMMENT 'Number of excess events',
ErrSignal      DOUBLE            NOT NULL COMMENT 'Poisson error on number of signal events',
ErrBackground  DOUBLE            NOT NULL COMMENT 'Poisson error on number of background events',
ErrExcess    DOUBLE            NOT NULL COMMENT 'Error of excess events',
Significance DOUBLE            NOT NULL COMMENT 'Li/Ma sigficance',
ExcessN      DOUBLE            NOT NULL COMMENT 'Number of excess events in simulated data',
ExcessW      DOUBLE            NOT NULL COMMENT 'Weighted number of excess events in simulated data',
ErrExcessN   DOUBLE            NOT NULL COMMENT 'Error or number of excess events in simulated data',
ErrExcessW   DOUBLE            NOT NULL COMMENT 'Error of weighted number of excess events in simulated data',
SignalW        DOUBLE            NOT NULL COMMENT 'Weighted number of signal events in simulated data',
BackgroundW    DOUBLE            NOT NULL COMMENT 'Weighted number of background events in simulated data',
ErrSignalW     DOUBLE            NOT NULL COMMENT 'Error of weighted number of signal events in simulated data',
ErrBackgroundW DOUBLE            NOT NULL COMMENT 'Error of weighted number of background events in simulated data',
Flux           DOUBLE            NOT NULL COMMENT 'dN/dA/dt [m^-2 s-^1]',
ErrFlux        DOUBLE            NOT NULL COMMENT 'dN/dA/dt [m^-2 s-^1]',
Bias           DOUBLE            NOT NULL COMMENT 'Energy Bias, average residual in lg(E)',
Resolution     DOUBLE            NOT NULL COMMENT 'Energy resolution, standard divation of residual in lg(E)',
EfficiencyN    DOUBLE            NOT NULL COMMENT 'Simulated cut efficiency (weighted)',
EfficiencyW    DOUBLE            NOT NULL COMMENT 'Simulated cut efficiency (unweighted)',
ErrEfficiencyN DOUBLE            NOT NULL COMMENT 'Error of simulated cut efficiency (weighted)',
ErrEfficiencyW DOUBLE            NOT NULL COMMENT 'Error of simulated cut efficiency (unweighted)'
) ENGINE=Memory
AS
(
WITH ThetaSums AS
(
SELECT
SUM(CountN) AS CountSim,
SUM(OnTime) AS ObsTime
FROM
ThetaHist
),
ResultMC AS
(
SELECT
.energyest             AS .energy,
ANY_VALUE(SignalW)       AS SignalW,
ANY_VALUE(SignalW2)      AS SignalW2,
ANY_VALUE(BackgroundW)   AS BackgroundW,
ANY_VALUE(BackgroundW2)  AS BackgroundW2,
ANY_VALUE(SignalN)       AS SignalN,
ANY_VALUE(BackgroundN)   AS BackgroundN,
ANY_VALUE(ExcessW)       AS ExcessW,
ANY_VALUE(ExcessN)       AS ExcessN,
ANY_VALUE(ErrExcessW)    AS ErrExcessW,
ANY_VALUE(ErrExcessN)    AS ErrExcessN,
ANY_VALUE(BiasEst)       AS Bias,
ANY_VALUE(ResolutionEst) AS Resolution
FROM
AnalysisMC
GROUP BY
.energy
ORDER BY
.energy
)
SELECT
.energy, lo, hi,
Signal, Background/5 AS Background, Excess, ErrExcess, Significance,
SQRT(Signal)         AS ErrSignal,
SQRT(SignalW2)       AS ErrSignalW,
SQRT(Background)/5   AS ErrBackground,
SQRT(BackgroundW2)/5 AS ErrBackgroundW,
ExcessN, ExcessW, ErrExcessN, ErrExcessW, SignalW, BackgroundW,
AnalysisData.Excess/ResultMC.ExcessW*SimulatedSpectrum.CountW * 1000/(POW(10,hi)-POW(10,lo)) /(%0:area)/ObsTime / CountSim*ObsTime AS Flux,
AnalysisData.Excess/ResultMC.ExcessW*SimulatedSpectrum.CountW * 1000/(POW(10,hi)-POW(10,lo)) /(%0:area)/ObsTime / CountSim*ObsTime
* SQRT(
+ POW(AnalysisData.ErrExcess / AnalysisData.Excess, 2)
+ POW(ResultMC.ErrExcessW    / ResultMC.ExcessW,    2)
+ SimulatedSpectrum.CountW2  / POW(SimulatedSpectrum.CountW,2)
) AS ErrFlux,
Bias,
Resolution,
ResultMC.ExcessW/SimulatedSpectrum.CountW * CountSim/ObsTime AS EfficiencyW,
ResultMC.ExcessN/SimulatedSpectrum.CountN AS EfficiencyN,
( POW(ResultMC.ErrExcessW/ResultMC.ExcessW, 2) + POW(SQRT(SimulatedSpectrum.CountW2)/SimulatedSpectrum.CountW, 2) )
* POW(ResultMC.ExcessW/SimulatedSpectrum.CountW, 2) * CountSim/ObsTime AS ErrEfficiencyW,
( POW(ResultMC.ErrExcessN, 2) + POW(ResultMC.ExcessN, 2)/SimulatedSpectrum.CountN)/POW(SimulatedSpectrum.CountN, 2) AS ErrEfficiencyN
FROM
AnalysisData
INNER JOIN
ResultMC USING(.energy)
INNER JOIN
SimulatedSpectrum USING(.energy)
INNER JOIN
BinningEnergyEst ON .energy=bin
CROSS JOIN
ThetaSums
WHERE
AnalysisData.Excess>0
ORDER BY
.energy
)


%0:area is a placeholder for the maximum simulated area.

### Result (Threshold)

Similar to the previous query, the following query summarized results based on the simulated spectrum in bins of the simulated energy not the estimated energy.

CREATE TEMPORARY TABLE Threshold ENGINE=Memory AS
(
WITH
ThetaSums AS
(
SELECT
SUM(CountN) AS CountSim,
SUM(OnTime) AS ObsTime
FROM
ThetaHist
),
ResultMC AS
(
SELECT
.energysim             AS .energy,
ANY_VALUE(ThresholdW)    AS ThresholdW,
ANY_VALUE(ThresholdW2)   AS ThresholdW2,
ANY_VALUE(ThresholdN)    AS ThresholdN,
ANY_VALUE(BiasSim)       AS Bias,
ANY_VALUE(ResolutionSim) AS Resolution
FROM
AnalysisMC
GROUP BY
.energy
)
SELECT
.energy, lo, hi,
ThresholdW,
SQRT(ThresholdW2) AS ErrThresholdW,
ThresholdN,
SQRT(ThresholdN)  AS ErrThresholdN,
ThresholdW        * 1000/(POW(10,hi)-POW(10,lo)) / (%0:area) / CountSim*ObsTime AS Flux,
SQRT(ThresholdW2) * 1000/(POW(10,hi)-POW(10,lo)) / (%0:area) / CountSim*ObsTime AS ErrFlux,
Bias,
Resolution
FROM
ResultMC
INNER JOIN
BinningEnergySim ON .energy=bin
CROSS JOIN
ThetaSums
WHERE
ThresholdW>0 AND ThresholdW2>0
ORDER BY
.energy
)


%0:area is again a placeholder for the maximum simulated area.

### Result (Migration)

Similar to the previous queries, this one extracts what is called the 'Migration Matrix' in bins of simulated and estimated energy.

CREATE TEMPORARY TABLE Migration ENGINE=Memory AS
(
SELECT
.energyest,
.energysim,
BinningEnergySim.lo   AS EsimLo,
BinningEnergySim.hi   AS EsimHi,
BinningEnergyEst.lo   AS EestLo,
BinningEnergyEst.hi   AS EestHi,
ANY_VALUE(MigrationW) AS MigrationW,
ANY_VALUE(MigrationN) AS MigrationN
FROM
AnalysisMC
INNER JOIN
BinningEnergyEst ON .energyest=BinningEnergyEst.bin
INNER JOIN
BinningEnergySim ON .energysim=BinningEnergySim.bin
GROUP BY
.energyest, .energysim
ORDER BY
.energyest, .energysim
)