Uncompressed FITS file: human-readable ASCII header, 80-character lines with no end-of-line character. A line that reads "END" marks the end of the header, but after it there are still more blank lines. Interestingly, the first non-space data byte starts at address 0x00004EC0 = 20160 bytes (observed for the file 20120117_016.fits.gz). 20160 = 7 x 2880, so after the 'END' line the header is padded with spaces up to the next multiple of 2880 bytes (the FITS block size), and the data starts there.

An event with ROI 1024 has a size of 2,952,510 bytes, made up of the following fields:

  event_num       4 bytes
  trigger_num     4 bytes
  trigger_type    2 bytes
  numboards       4 bytes
  errors          4 x 1 byte
  softtrig        4 bytes
  unixtime        2 x 4 bytes
  boardtime       40 x 4 bytes
  startcells      1440 x 2 bytes
  startcellTIM    160 x 2 bytes
  data            1,474,560 x 2 bytes = 1440 x 1024 x 2 bytes
  timeMarker      0 x 2 bytes ????

The sum is 2,952,510 bytes, so the timeMarker field really is zero bytes. The per-event header is 3390 bytes, while the data is 2880 kB, so the header is about 1.15 per mille of the data; for ROI = 300 it is about 4 per mille. For the example of a pedestal file, all uncompressed header information is about 3329.5 kB for a file size of 2.75 GB. (A sketch of the record layout as a packed struct is given at the end of these notes.)

All the header information should be copied uncompressed into the output file. BUT: since the compression algorithm delivers events of unequal size, the size of each event needs to be stored as well, so that we can still jump around in the file. We have two choices:

1. Store a table of event addresses in the file header. A file might hold something like 1,000,000 events, and a jump address is 4 bytes long, so this table might be as large as 4 MB or so. (A sketch of such an index table is given at the end of these notes.)
2. Prepend each event with the address of its successor. This means that in order to find event 10045 one has to read 10044 events first :-(

The compression algorithm should work like this:

- Open a calibration file and retrieve only the mean offset data; convert it back to ADC counts. Size of the offset data: 1440 x 1024 x 2 bytes = 2880 kB.
- Subtract the offset from the raw data.
- Calculate the (signed) diffs of subsequent slices.
- Analyse the diffs:
  - find groups with |diff| < 128 (fits into 8 signed bits)
  - find groups with |diff| < 32 (fits into 6 signed bits)
  - decide which grouping is ideal :-)
- Store a compression group header:
  - start value: 2 bytes
  - size of the followers (4, 6, 8, or 16 bits): 1 byte
  - number of followers (0...1023): 2 bytes
  (The last follower could be the start value of the next group, so we can check for sanity -- but on the other hand, that is just redundant.)
- Shift the diff bits together and copy the diff bits of the followers behind the group header.

The grouping algorithm should be smart enough that we do not end up with a lot of small groups. There will always be some spikes or jumps in the data, so we cannot assume we will get away with only one group. I guess a typical pixel will look like this:

  5-byte header: 16-bit group,   5 followers ->  10 bytes following
  5-byte header:  6-bit group, 600 followers -> 450 bytes following
  5-byte header: 16-bit group,   3 followers (spike) ->   6 bytes following
  5-byte header:  8-bit group,  80 followers ->  80 bytes following
  5-byte header: 16-bit group,   7 followers (jump)  ->  14 bytes following
  5-byte header:  6-bit group, 323 followers -> 243 bytes following

Total size: 833 bytes, which is 41% of the uncompressed size (1024 slices x 2 bytes = 2048 bytes per pixel). Thus one could end up at about 1.2 GB compared to 1.7 GB, but this has to be checked of course, and for that the grouping algorithm needs to be implemented. (A naive greedy version is sketched at the end of these notes.)
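As a reference for the field list above, here is a sketch of the per-event record (ROI = 1024) as a packed C++ struct. The field names and sizes are taken from the list; the exact integer types, the meaning of the two unixtime words, and the 1-byte packing are assumptions.

```cpp
#include <cstdint>

// Per-event record for ROI = 1024, following the field list above.
// 1-byte packing is assumed so that sizeof(Event) matches the
// on-disk record size of 2,952,510 bytes.
#pragma pack(push, 1)
struct Event {
    uint32_t event_num;            //         4 bytes
    uint32_t trigger_num;          //         4 bytes
    uint16_t trigger_type;         //         2 bytes
    uint32_t numboards;            //         4 bytes
    uint8_t  errors[4];            //         4 bytes
    uint32_t softtrig;             //         4 bytes
    uint32_t unixtime[2];          //         8 bytes (seconds + sub-seconds, assumed)
    uint32_t boardtime[40];        //       160 bytes
    uint16_t startcells[1440];     //     2,880 bytes
    uint16_t startcellTIM[160];    //       320 bytes
    int16_t  data[1440 * 1024];    // 2,949,120 bytes (1440 pixels x 1024 slices)
    // timeMarker: 0 x 2 bytes -- not present for this ROI
};
#pragma pack(pop)

static_assert(sizeof(Event) == 2952510, "event size must match the on-disk record");
```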
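For indexing choice 1, a minimal sketch of a jump table; the names (EventIndex, seek_to_event) are made up for illustration. Note that a 4-byte offset only covers files up to 4 GB, so 8-byte entries might be the safer choice (8 MB instead of 4 MB for 10^6 events).

```cpp
#include <cstdint>
#include <fstream>
#include <vector>

// Choice 1: a table of absolute file offsets, one per event, stored in the
// output file's header (or a separate index block).  With it, any event can
// be reached with a single seek.
struct EventIndex {
    std::vector<uint64_t> offsets;   // offsets[i] = byte position of event i

    // Position the stream at the start of event i; false if i is out of range.
    bool seek_to_event(std::ifstream& file, std::size_t i) const {
        if (i >= offsets.size())
            return false;
        file.seekg(static_cast<std::streamoff>(offsets[i]), std::ios::beg);
        return static_cast<bool>(file);
    }
};

// While writing the compressed file, the table is filled event by event:
//   index.offsets.push_back(current_write_position);
//   write_compressed_event(...);
// With 10^6 events and 8 bytes per entry the table is 8 MB; with the 4-byte
// entries assumed in the notes it is 4 MB, but limited to 4 GB files.
```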
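The offset subtraction and diff steps for a single pixel could look like this minimal sketch. How the mean offsets are read from the calibration file is not shown, and whether the raw samples are signed or unsigned is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Offset subtraction for one pixel: raw ADC samples minus the mean offset
// (already converted back to ADC counts) from the calibration file.
std::vector<int16_t> calibrate(const std::vector<uint16_t>& raw,
                               const std::vector<int16_t>& offset) {
    std::vector<int16_t> out(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i)
        out[i] = static_cast<int16_t>(raw[i] - offset[i]);
    return out;
}

// Signed differences of subsequent slices; d[i-1] = samples[i] - samples[i-1].
std::vector<int16_t> slice_diffs(const std::vector<int16_t>& samples) {
    std::vector<int16_t> d(samples.empty() ? 0 : samples.size() - 1);
    for (std::size_t i = 1; i < samples.size(); ++i)
        d[i - 1] = static_cast<int16_t>(samples[i] - samples[i - 1]);
    return d;
}
```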
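Finally, a first, naive greedy version of the grouping algorithm, operating on one pixel's offset-subtracted samples. It only decides group boundaries and follower widths; the bit packing behind each 5-byte group header is left out. This is a starting point under the assumptions above, not the smart grouping still to be designed.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// A group: 2-byte start value, 1-byte follower bit width (4, 6, 8 or 16),
// 2-byte follower count (0..1023), followed by the packed follower diffs.
struct Group {
    int16_t  start;        // first sample of the group, stored uncompressed
    uint8_t  bits;         // 4, 6, 8 or 16 bits per follower diff
    uint16_t n_followers;  // number of diffs following the start value
};

// Smallest allowed width that can hold a signed diff.
static uint8_t width_for(int diff) {
    int a = std::abs(diff);
    if (a <= 7)   return 4;   // fits into a signed 4-bit diff
    if (a <= 31)  return 6;   // fits into a signed 6-bit diff
    if (a <= 127) return 8;   // fits into a signed 8-bit diff
    return 16;
}

// Greedy pass: extend the current group while the diff fits its width and the
// follower counter has room; otherwise start a new group.  Merging of short
// groups (to avoid many 5-byte headers) is left for later.
std::vector<Group> make_groups(const std::vector<int16_t>& samples) {
    std::vector<Group> groups;
    if (samples.empty()) return groups;

    Group g{samples[0], 4, 0};
    for (std::size_t i = 1; i < samples.size(); ++i) {
        int diff = samples[i] - samples[i - 1];
        uint8_t w = width_for(diff);
        if (w <= g.bits && g.n_followers < 1023) {
            ++g.n_followers;              // diff fits, extend the group
        } else if (g.n_followers == 0) {
            g.bits = w;                   // first follower fixes a wider width
            ++g.n_followers;
        } else {
            groups.push_back(g);          // close the group, start a new one
            g = Group{samples[i], 4, 0};
        }
    }
    groups.push_back(g);
    return groups;
}

// Compressed size in bytes: 5-byte header per group plus the packed diffs.
std::size_t compressed_size(const std::vector<Group>& groups) {
    std::size_t bytes = 0;
    for (const Group& g : groups)
        bytes += 5 + (static_cast<std::size_t>(g.bits) * g.n_followers + 7) / 8;
    return bytes;
}
```

For the example pixel above (groups of 5/600/3/80/7/323 followers at 16/6/16/8/16/6 bits), compressed_size reproduces the 833 bytes computed by hand.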