uncompressed FITS file:

human-readable ASCII HEADER.
80-character lines - no end-of-line character

a line which looks like:
END   <space chars fill up to 80>

marks the end of the header, but then
I still see more blank lines...

funnily enough, the first non-space characters
start at address 0x00004ec0 = 20160 bytes
(this was for file 20120117_016.fits.gz)
so it seems, after the 'END' card the header is padded with
spaces, and the data starts at the next multiple of the FITS
block size of 2880 bytes (20160 = 7 x 2880).
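
a minimal sketch of that scan, assuming only the 80-byte cards and the
2880-byte block size (plain Python; gzip because the example file is a
.fits.gz, so the offset refers to the uncompressed stream):

    import gzip

    BLOCK = 2880  # FITS logical record size in bytes

    def data_offset(path):
        """Byte offset (in the uncompressed stream) where the data starts."""
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rb") as f:
            pos = 0
            while True:
                card = f.read(80)          # header cards: 80 bytes, no newline
                pos += 80
                if card[:3] == b"END" and not card[3:].strip():
                    break
            # header is space-padded up to the next full block
            return ((pos + BLOCK - 1) // BLOCK) * BLOCK

    print(data_offset("20120117_016.fits.gz"))  # -> 20160 for the file above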

an event with ROI 1024 has the size:
2952510 bytes
this is made from the following fields:
event_num       4 byte
Trigger_num     4 byte
trigger_type    2 byte
numboards       4 byte
errors          4 x 1 byte
softtrig        4 byte
unixtime        2x 4byte
boardtime       40x 4byte 
startcells      1440x 2byte
startcellTIM    160x 2byte
data            1474560 x 2byte  = 1440 x 1024 x 2byte
timeMarker      0x 2byte ????

the sum is 2952510 bytes, so the timeMarker field is really zero bytes.
the header size is 3390 bytes, while the data size is 2880 kB,
so the header is about 1.15 permille of the data.
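
the sum can be reproduced with a struct format string (a sketch; the
field order is taken from the list above, little-endian is an assumption):

    import struct

    ROI, NPIX, NBOARDS, NTIM = 1024, 1440, 40, 160

    # event_num, Trigger_num, trigger_type, numboards, errors[4],
    # softtrig, unixtime[2], boardtime[40], startcells[1440], startcellTIM[160]
    HEAD_FMT = "<IIHI4BI2I%dI%dH%dH" % (NBOARDS, NPIX, NTIM)

    head = struct.calcsize(HEAD_FMT)   # 3390
    data = NPIX * ROI * 2              # 1440 x 1024 int16 samples = 2949120
    print(head + data)                 # 2952510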

in case of ROI = 300 this is about 4 permille.

For the example of a pedestal file: all the uncompressed
header information adds up to about 3329.5 kB for
a file size of 2.75 GB.

all the header information should be copied uncompressed
into the output file.
**BUT** since the compression algo delivers
unequally sized events, the size of each event needs to be stored
as well, so that we can still jump around in the file.

We have two choices:
1) we store a table of addresses in the file header.
   A file might hold something like 1,000,000 events,
   and a jump address is 4 bytes long, so this table
   might be as large as 4 MB or so.
2) we prepend each event with the address of its successor.
   This means that in order to find event 10045, one has to read 10044 events first :-(
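
a sketch of choice 1 (everything here is illustrative: the trailer
layout, the function names). One wrinkle: the offsets are only known
after writing, so this version appends the table at the end plus a small
trailer, rather than putting it in the header, which would need a seek-back:

    import struct

    def write_with_index(path, events):
        """events: iterable of compressed event blobs of unequal size.
        Writes all events, then a 4-byte-offset jump table and a trailer."""
        offsets = []
        with open(path, "wb") as f:
            for blob in events:
                offsets.append(f.tell())           # start address of this event
                f.write(blob)
            table_pos = f.tell()
            f.write(struct.pack("<%dI" % len(offsets), *offsets))
            f.write(struct.pack("<QI", table_pos, len(offsets)))

    def read_event(path, n):
        """Seek directly to event n via the jump table."""
        with open(path, "rb") as f:
            f.seek(-12, 2)                         # trailer: 8-byte pos + 4-byte count
            table_pos, count = struct.unpack("<QI", f.read(12))
            f.seek(table_pos + 4 * n)
            start, = struct.unpack("<I", f.read(4))
            end = table_pos if n == count - 1 else struct.unpack("<I", f.read(4))[0]
            f.seek(start)
            return f.read(end - start)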


Now the compression algorithm should work like this:
open a calibration file and retrieve only the mean offset
data; convert it back to ADC counts.
size of the offset data = 1440 x 1024 x 2 byte = 2880 kB

subtract offset from raw data.

calculate (signed) diffs of subsequent slices.
analyse diffs:
    find groups of |diffs| <= 127  (fit into signed 8 bits)
    find groups of |diffs| <= 31   (fit into signed 6 bits)
define which groups are ideal :-)
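
a sketch of the diff-and-group step for a single pixel (numpy assumed;
how exactly the mean offset comes out of the calibration file is left
open, so `offset` is just taken as a ready-made int16 array here):

    import numpy as np

    def group_widths(raw, offset):
        """raw, offset: int16 arrays of length ROI for one pixel.
        Returns the diffs and a per-sample bit width (4, 6, 8 or 16)."""
        cal = raw.astype(np.int32) - offset        # subtract mean offset
        diffs = np.diff(cal)                       # signed slice-to-slice diffs
        a = np.abs(diffs)
        width = np.full(a.shape, 16, dtype=np.int8)
        width[a <= 127] = 8                        # fits signed 8 bit
        width[a <= 31] = 6                         # fits signed 6 bit
        width[a <= 7] = 4                          # fits signed 4 bit
        return diffs, width

    def greedy_groups(width, min_run=8):
        """Naive grouping: runs of equal width; runs shorter than min_run
        are promoted to 16-bit groups so we don't pay a 5-byte header for
        every tiny spike. Returns a list of (bit_width, n_followers)."""
        groups = []
        start = 0
        for i in range(1, len(width) + 1):
            if i == len(width) or width[i] != width[start]:
                w = int(width[start]) if i - start >= min_run else 16
                if groups and groups[-1][0] == w:
                    groups[-1] = (w, groups[-1][1] + i - start)   # merge
                else:
                    groups.append((w, i - start))
                start = i
        return groups

a real version would also cap each group at 1023 followers (the 2-byte
count field below) and peel off the first value of each group as the
2-byte start value.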

store a compression group header:
start value: 2 byte
size of followers = 4, 6, 8, or 16 bits --> 1 byte
number of followers = 0...1023 --> 2 bytes

Maybe the last follower should be the start value
of the next group, so we can check for sanity.
But on the other hand, this is just redundant.

shift diff bits together.

copy diff bits of followers behind the group header.
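
a packing sketch under these assumptions: the 1-byte 'size of followers'
field stores the bit width directly, counts are little-endian, and the
diffs go in as two's complement, LSB-first:

    import struct

    def pack_group(start_value, diffs, width):
        """Serialize one group: 5-byte header + packed follower diffs."""
        header = struct.pack("<hBH", start_value, width, len(diffs))
        acc, nbits, out = 0, 0, bytearray()
        for d in diffs:
            acc |= (d & ((1 << width) - 1)) << nbits   # two's complement mask
            nbits += width
            while nbits >= 8:
                out.append(acc & 0xFF)
                acc >>= 8
                nbits -= 8
        if nbits:
            out.append(acc & 0xFF)                     # zero-pad the last byte
        return header + bytes(out)

with this packing, the 6-bit/600-follower group in the example below
indeed occupies 5 + 450 bytes.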

The grouping algo should be smart, so that there are not
a lot of small groups. There will always be some
spikes and jumps in the data, so we can't expect to get by
with only one group.

I guess it will look like this:
5-byte header: 16-bit group,   5 followers ->  10 bytes following
5-byte header:  6-bit group, 600 followers -> 450 bytes following
5-byte header: 16-bit group,   3 followers ->   6 bytes following   -- spike
5-byte header:  8-bit group,  80 followers ->  80 bytes following
5-byte header: 16-bit group,   7 followers ->  14 bytes following   -- jump
5-byte header:  6-bit group, 323 followers -> 243 bytes following

total size for this pixel: 833 bytes
this is 41% of the uncompressed size (1024 samples x 2 byte = 2048 bytes).
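
checking that arithmetic (5-byte header per group, follower bytes
rounded up to whole bytes):

    groups = [(16, 5), (6, 600), (16, 3), (8, 80), (16, 7), (6, 323)]
    assert len(groups) + sum(n for _, n in groups) == 1024  # starts + followers
    total = sum(5 + (w * n + 7) // 8 for w, n in groups)    # header + packed bytes
    print(total)  # 833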

thus one could end up at 1.2 GB compared to 1.7 GB,

but this has to be checked of course, and therefore
the grouping algo needs to be implemented.
