= Connecting to the Database = == Host == The database is hosted at '''ihp-pc45.ethz.ch''' == User == First, you need a user. For the moment, a user 'fact' is available with the standard password. The user 'fact' is allowed to connect from everywhere if an encrypted connection is used. Usually, all reasonably recent mysql clients are using encrypted connections. So in most cases, a simple should be enough {{{> mysql -C -h ihp-pc45.ethz.ch -u fact -p factdata}}} To enforce encryption, --ssl (oder clients) or --ssl-mode=REQUIRED can be used. If you have problems with the connection, you can also try --protocol=TCP. If you access the database from outside of ETH, it is wise to enable compression with the -C option. Inside ETH (in particular on ihp-pc45), enabling -C is certainly a performance drawback and should be avoided. Note that the mysql client libraries at ISDC are too old and do not allow for encrypted connections. Thus no connection from ISDC is possible without tunnel. How to tunnel your connection is explained in the following. Note that it requires an account on ihp-pc45 (which I think should not be generally available). Thus this is mainly meant as a solution for automatic processes running at ISDC, for example, to update the database. == Forward Tunnel == If you are logged in at ISDC as 'user' and you have an account 'ethz' at ihp-pc45, you can use a tunnel. To setup a tunnel use {{{ISDC> ssh -x -C -n -N -q -L 10000:localhost:3306 ethz@ihp-pc45.ethz.ch}}} (It is wise to enable compression of the connection with the -C option) Note that after log-in this process seems to stall (nothing happens anymore). This is correct. The tunnel is open. It will forward the local port 10000 from the ISDC machine to the port 3306 on a machine which is accessible as 'locahost' from ihp-pc45. The mysql call would now look like {{{> mysql -h 127.0.0.1 -P 10000 -u fact -p factdata}}} Note that you need to use the IP address instead of localhost, otherwise the mysql client tries to use a socket connection (which will fail). You could also use --protocol=TCP. As the mysql connection now comes via the loopback interface and not via the external IP, the connection of the mysql client is allowed to be unencrypted. == Backward Tunnel == Assume that you are already logged into ihp-pc45.ethz.ch and want to execute a mysql at ISDC accessing ihp-pc45, a backward tunnel can be used: {{{ihp-pc45> ssh -x -C -n -N -q -R 10000:localhost:3306 user@isdc-nx.isdc.unige.ch}}} (It is wise to enable compression of the connection with the -C option) This command will log you into isdc-nx and (in parallel) create a tunnel from port 10000 at isdc-nx to port 3306 of a machine which is called 'localhost' from where you started the ssh connection (ihp-pc45). You can now do {{{ISDC> mysql -h localhost -P 10000 -u fact -p factdata}}} Note that you need to use the IP address instead of localhost, otherwise the mysql client tries to use a socket connection (which will fail). You could also use --protocol=TCP. As the mysql connection now comes via the loopback interface and not via the external IP, the connection of the mysql client is allowed to be unencrypted. == rootifysql == A convenient way to retrieve data is the ''rootifysql'' tool which is part of the FACT++ package. More details can be found either calling it with the --help option or at https://www.fact-project.org/logbook/showthread.php?tid=4192. The same access rules as if the native mysql client is used apply. To simplify the usage, it is wise to write a rootifysql.rc file with the following contents: {{{uri=fact:password@iph-pc45.ethz.ch/factdata}}} The following tutorial assumes that such a file exists. == Other alternatives == Many possibilities exist to access a mysql database as a C API, MySQL++, Python (MySQL.Connector), PHP and others. You are free to use whatever tool you like. In the following, an analysis will be outlined using the ''rootifysql'' client and because it is most convenient. == PhpMyAdmin == To get a fast glimpse on the accessible databases and tables, you can log-in to PhpMyAdmin at http://iph-pc45.ethz.ch/phpmyadmin = The Analyis = == Data Selection == For data selection only run-wise information should be relevant which are stored in the table '''RunInfo'''. The reason is that if you select data on a more fine grained level (e.g. event-wise zenith angle), right now there is no easy method to determine the corresponding observation time. So whenever data is selected event-wise make sure that you do not cut the data in a variable which cuts out events systematically and not randomly. For example, an event-wise cut in zenith angel usually keeps or discards two consecutive events because their zenith angle is correlated. For a cut in any image parameter (Width, Length, Size, ...), the result on two consecutive events is random because their image parameters are not correlated. As an example we analyse the Crab data from our public sample (01/11/2013 - 06/11/2013). Let's first have a look at the total observation time of all Crab data in this period: {{{ SELECT COUNT(*), SUM(TIME_TO_SEC(TIMEDIFF(fRunStop,fRunStart))*fEffectiveOn/3600) AS EffOnTime, MIN(fZenithDistanceMin) AS MinZd, MAX(fZenithDistanceMax) AS MaxZd, MIN(fR750Cor/fR750Ref) AS MinQ, MAX(fR750Cor/fR750Ref) AS MaxQ FROM RunInfo WHERE fSourceKey=5 AND fRunTypeKey=1 AND FileId BETWEEN 131101000 AND 131107000 }}} The result (in mysql) is {{{ +----------+-------------+-------+-------+---------+---------+ | COUNT(*) | EffOnTime | MinZd | MaxZd | MinQ | MaxQ | +----------+-------------+-------+-------+---------+---------+ | 435 | 32.53992354 | 6.36 | 67.89 | 0.01477 | 1.10584 | +----------+-------------+-------+-------+---------+---------+ 1 row in set (0.01 sec) }}} So we have 435 data runs from Crab with a total effective observation time of 32.5 hours in a zenith angle range between 6° and 68° and a bad weather factor between 0.01 (really bad) to 1.1 (extremely good). Taking only good data by adding "{{{AND fR750Cor/fR750Ref>0.9}}}" to the WHERE-clause gives us {{{ +----------+-------------+-------+-------+---------+---------+ | COUNT(*) | EffOnTime | ZdMin | ZdMax | MinQ | MaxQ | +----------+-------------+-------+-------+---------+---------+ | 328 | 24.58955887 | 6.36 | 67.86 | 0.90084 | 1.10584 | +----------+-------------+-------+-------+---------+---------+ 1 row in set (0.00 sec) }}} But we also want to restrict ourselves to "good" zenith angles (zenith angles at which there is no significnat efficiency loss). So we add "{{{AND fZenithDistanceMax<35}}}" to the WHERE-clause which yields {{{ +----------+-------------+-------+-------+---------+---------+ | COUNT(*) | EffOnTime | ZdMin | ZdMax | MinQ | MaxQ | +----------+-------------+-------+-------+---------+---------+ | 244 | 19.06608557 | 6.36 | 34.90 | 0.90084 | 1.10584 | +----------+-------------+-------+-------+---------+---------+ 1 row in set (0.00 sec) }}} Now we need to get a list of these runs with {{{#!sql SELECT FileId FROM RunInfo WHERE fSourceKey=5 AND fRunTypeKey=1 AND FileId BETWEEN 131101000 AND 131107000 AND fR750Ref/fR750Cor>0.9 AND fZenithDistanceMax<35 }}} This can later be '''JOIN'''ed with the following queries. Let's write the list of runs into a file. There are plenty of options. Here are a few (assuming the query is in a file query.sql) {{{ mysql -N [...] < query.sql > Crab.txt cat query.sql | mysql -N [...] > Crab.txt rootifysql [...] -n -v0 -d query.sql > Crab.txt rootifysql [...] -n -w Crab.txt query.sql }}} The [...] is placeholder for additional options, in particular the login credentials (ideally they are kept in a file which can not be read by everyone). An alternative is to put {{{ #!/path/to/rootifysql --config=/path/to/resources.rc }}} in the first line of your query.sql. Make it executable "{{{chmod u+x query.sql}}}" and put your credentials (uri=) into resources.rc. Now you can call it directly {{{ ./query.sql -n -w Crab.txt }}} == Data statistics == Now we want to get some statistics about the Crab data between 01/11/2013 and 06/11/2013 and see if we can do simple plots. For this we write the table into a root file. {{{ rootifysql [...] --query " \ SELECT \ * \ FROM \ RunInfo \ WHERE \ fSourceKey=5 \ AND \ fRunTypeKey=1 \ AND \ FileId BETWEEN 131101000 AND 131107000 \ AND \ fR750Ref/fR750Cor>0.9 \ AND \ fZenithDistanceMax<35 \ " }}} If the file already exists, either give it a different name (see --help for details) or overwrite it with --force. The output should look similar to this {{{ Reading global options from 'fact++.rc'. Reading default options from 'rootifysql.rc'. ------------------------ Rootify SQL ------------------------- Connecting to database... Client Version: 5.7.23 Server Version: 5.7.23-0ubuntu0.18.04.1 Requesting data... Opening file 'rootify.root' [compression=1]... Trying to setup 120 branches... Configured 115 branches. Filling branches... 317 rows fetched. 317 rows skipped due to NULL field. 0 rows filled into tree. 10 kB written to disk. File closed. Execution time: 0.0537751s (169.6 us/row) -------------------------------------------------------------- }}} Per default rows which contain any NULL are not written to the file because all values are converted to a DOUBLE and there is no representation for a NULL-value in double. So, we need to force the output with {{{--ignore-null}}} and will get something like: {{{ Reading global options from 'fact++.rc'. Reading default options from 'rootifysql.rc'. ------------------------ Rootify SQL ------------------------- Connecting to database... Client Version: 5.7.23 Server Version: 5.7.23-0ubuntu0.18.04.1 Requesting data... Opening file 'rootify.root' [compression=1]... Trying to setup 120 branches... Configured 115 branches. Filling branches... 317 rows fetched. 317 rows filled into tree. 86 kB written to disk. File closed. Execution time: 0.072247s (227.9 us/row) -------------------------------------------------------------- }}} Now we can open the file in root and do plots. The easiest ist to use the tree viewer: {{{ root rootify.root root [0] Attaching file rootify.root as _file0... root [1] TTree *T = (TTree*)_file0->Get("Result"); root [2] T->StartViewer(); root [3] }}} Or plot the zenith angle distribution directly: {{{ root rootify.root root [0] Attaching file rootify.root as _file0... root [1] TTree *T = (TTree*)_file0->Get("Result"); root [2] T->Draw("fZenithDistanceMean"); root [3] }}} or zenith angle versus time (Hint: DATETIME columns are converted to Unix-time in seconds): {{{ root rootify.root root [0] Attaching file rootify.root as _file0... root [1] TTree *T = (TTree*)_file0->Get("Result"); root [2] T->Draw("fZenithDistanceMean:fRunStart"); root [3] }}} == Data retrieval == The events themselves are stored in a table named '''Events'''. The position of the source in camera coordinates is stored in '''Position''. To get them, you can run the following query {{{ SELECT Events.*, Position.X, Position.Y FROM RunInfo LEFT JOIN Events USING (FileId) LEFT JOIN Position USING (FileId, EvtNumber) WHERE fSourceKey=5 AND fRunTypeKey=1 AND FileId BETWEEN 131101000 AND 131107000 AND fZenithDistanceMax<35 AND fR750Ref/fR750Cor>0.9 }}} or with the list you wrote before {{{ SELECT Events.*, Position.X, Position.Y LEFT JOIN Events USING (FileId) LEFT JOIN Position USING (FileId, EvtNumber) WHERE FileId IN ($MyList) }}} using {{{--list.MyList=Crab.txt}}} as command-line option to rootifysql. Both queries are similar in execution time. Let's assume the output file is ''crab-data-only.root'' ({{{rootifysql --out=crab-data-only.root}}}). Requesting the data and writing the file took me about 60s. To run an analysis on the data you can use the following root macro "''ganymed.C''". It produces a theta-square plot. Its execution took about 5s ({{{root ganymed.C++}}}) {{{ #include #include #include #include #include void ganymed() { // Create chain for the tree Result // This is just easier than using TFile/TTree TChain c("Result"); // Add the input file to the c.AddFile("crab-data-only.root"); // Define variables for all leaves to be accessed // By definition rootifysql writes only doubles double X, Y, MeanX, MeanY, Width, Length, CosDelta, SinDelta, M3Long, SlopeLong, Leakage1, SlopeSpreadWeighted, Size, ConcCore, ConcCOG, NumIslands, NumUsedPixels; // Connect the variables to the cordesponding leaves c.SetBranchAddress("X", &X); c.SetBranchAddress("Y", &Y); c.SetBranchAddress("MeanX", &MeanX); c.SetBranchAddress("MeanY", &MeanY); c.SetBranchAddress("Width", &Width); c.SetBranchAddress("Length", &Length); c.SetBranchAddress("CosDelta", &CosDelta); c.SetBranchAddress("SinDelta", &SinDelta); c.SetBranchAddress("M3Long", &M3Long); c.SetBranchAddress("SlopeLong", &SlopeLong); c.SetBranchAddress("Leakage1", &Leakage1); c.SetBranchAddress("NumIslands", &NumIslands); c.SetBranchAddress("NumUsedPixels", &NumUsedPixels); c.SetBranchAddress("SlopeSpreadWeighted", &SlopeSpreadWeighted); c.SetBranchAddress("Size", &Size); c.SetBranchAddress("ConcCOG", &ConcCOG); c.SetBranchAddress("ConcCore", &ConcCore); // Set some constants (they could be included in the database // in the future) double mm2deg = +0.0117193246260285378; double abberation = 1.02; // -------------------- Source dependent parameter calculation ------------------- // Create a histogram for on- and off-data TH1F hon("on", "", 55, 0, 1); TH1F hoff("off", "", 55, 0, 1); // Loop over all events TStopwatch clock; for (int i=0; i5.5 && Leakage1<0.1; bool cut0 = SlopeSpreadWeighted<0.68 && SlopeSpreadWeighted>0.18 && log10(area*mm2deg*mm2deg)<(log10(Size)-2)*1.1-1.55 && ConcCore>0.13 && ConcCOG >0.15; if (!cutq || !cut0) continue; // Loop over all wobble positions in the camera for (int angle=0; angle<360; angle+=60) { // -------------------- Source dependent parameter calculation ------------------- double cr = cos(angle*TMath::DegToRad()); double sr = sin(angle*TMath::DegToRad()); double px = cr*X-sr*Y; double py = cr*Y+sr*X; double dx = MeanX - px/abberation; double dy = MeanY - py/abberation; double dist = sqrt(dx*dx + dy*dy); double alpha = asin((CosDelta*dy - SinDelta*dx)/dist); double sgn = TMath::Sign(1., (CosDelta*dx + SinDelta*dy)/dist); // ------------------------------- Application ---------------------------------- double m3l = M3Long*sgn*mm2deg; double slope = SlopeLong*sgn/mm2deg; // --------------------------------- Aanlysis ----------------------------------- double xi = 1.39 + 0.154*slope + 1.679*(1-1/(1+4.86*Leakage1)); double sign1 = m3l+0.07; double sign2 = (dist*mm2deg-0.5)*7.2-slope; double disp = (sign1<0 || sign2<0 ? -xi : xi)*(1-Width/Length)/mm2deg; double thetasq = (disp*disp + dist*dist - 2*disp*dist*cos(alpha))*mm2deg*mm2deg; // Fill the on- and off-histograms if (angle==0) hon.Fill(thetasq); else hoff.Fill(thetasq, 1./5); } } clock.Print(); // Plot the result hon.SetMinimum(0); hon.DrawCopy(); hoff.DrawCopy("same"); } }}} Some statistics: == Simple Analysis ==