wiki:DatabaseBasedAnalysis/rootifysql

Version 9 (modified by tbretz, 6 years ago) ( diff )

--

First get the man page of rootifysql either with rootifysql --help or (if installed) with man rootifysql.

Purpose

The idea of rootifysql is to conveniently request data from a MySQL database and write the contents returned into a tree in a root-file. Root trees offer a lot of nice features how to access the data fast and how to create plots easily.

For details, please refer either to the Draw-function of TTree (e.g. https://root.cern.ch/doc/master/classTTree.html#a73450649dc6e54b5b94516c468523e45) or checkout the root user manual (https://root.cern.ch/guides/users-guide), in particular the section about Trees.

In the following basic knowledge on C++, root and root trees is assumed.

Connection

How a connection to the database is established is outlined in: DatabaseBasedAnalysis/Connection. Per default, your connection will be compressed if the the host-address of the mysql-server is neither localhost nor 127.0.0.1. To force compression use

uri fact:password@ihp-pc45.ethz.ch/factdata?compress=1

Generally, compression should be enable if connecting from the world. To test compression, the "Hello World''" example can be used.

In the following it is assumed that the database connection is working and tested, so that the resources (uri=...) are omitted.

Hello World!

To test if everything is working correctly, we run a "Hello World''" example. Credentials should be in the rootifysql.rc which should be write protected.

user@machine> rootifysql -d -n --query="SELECT 'Hello World\!'"
Reading global  options from 'fact++.rc'.
Reading default options from 'rootifysql.rc'.

------------------------ Rootify SQL -------------------------
Connecting to database...
Client Version: 5.7.23
Server Version: 5.7.23-0ubuntu0.18.04.1
Requesting data...
Opening file '/dev/null' [compression=1]...
Writing data to tree 'Result'
Trying to setup 1 branches...
Configured 0 branches.
Filling branches...

# Hello World!
Hello World!


1 rows fetched.
1 rows filled into tree.
0 B written to disk.
File closed.
Execution time: 0.058136s (58.1 ms/row)
--------------------------------------------------------------

Voila!

The -d option (a shortcut for --display) prints the result on standard out, the -n option (a shortcut to --null) redirects all output to /dev/null instead to a file. For benchmarking your query, you can also add --no-fill which does not only write to a root tree at /dev/null but does not fill the root tree at all.

Query in the file

Instead of typing your query on the command line, you can store in a text-file, for example

user@machine> echo "SELECT 'Hello World\!'" > query.sql
user@machine> rootifysql -d -n query.sql
[...]
------------------------ Rootify SQL -------------------------
Reading query from file 'query.sql'.
[...]

# Hello World!
Hello World!

[...]
--------------------------------------------------------------

A nice feature of bash is that you can make the query.sql file executable. First add the executable in the first line preceded by #!. It might then look like:

#!/path/to/rootifysql
SELECT 'Hello World!'

Now make it executable for yourself

user@machine> chmod u+x query.sql

and you can call your query like

user@machine> ./query.sql [options]

with the [options] from rootifysql. If you prefix your query file with

#!/path/to/rootifysql -C /path/to/resources.rc

you can even have a dedicated resource file.

Connection information

To get more information on the connection to the database, use --print-connection, for example

user@machine> user@machine> rootifysql -d -n --print-connection query.sql
[...]
------------------------ Rootify SQL -------------------------
[...]
Compression of database connection is ON
Connection to databases is ENCRYPTED (DHE-RSA-AES256-SHA)
[...]
--------------------------------------------------------------

Bytes_sent       398 
Bytes_received   546 

Note that this data is retrieved through additional requests, which in principle slow down the execution time and add additional traffic to the connection.

Writing to an ASCII file

We can redirect the result of the query to an ASCII file by

rootifysql -v 0 -d -n query.sql > output.txt

A verbosity level of 0 (-v 0) is required to turn off all the other output to standard out.

If you want to keep it you can directly write to a file by

rootifysql -n query.sql -w output.txt

Here -w is a shortcut to --write.

Writing to a root-file

Basics

Let's try our previous example

fact@ihp-pc45:~/FACT++/build> rootifysql -f -v 2 --query="SELECT 'Hello World\!'"
[...]
------------------------ Rootify SQL -------------------------
[...]
Opening file 'rootify.root' [compression=1]...
Writing data to tree 'Result'
Trying to setup 1 branches...

 - Hello World! [VARCHAR NOT NULL] {V}

Configured 0 branches.
Filling branches...
1 rows fetched.
1 rows filled into tree.
4 kB written to disk.
File closed.
Execution time: 0.0775619s (77.6 ms/row)
--------------------------------------------------------------

The option -f (which is a shortcut for --force) actually overwrites an existing root file (RECREATE). The verbosity option -v 2 prints some information on how the table columns are translated. As the output suggests, 1 branch could have been setup (one column was returned), but no branch was configured. Our "Hello World''" string is returned as a VARCHAR (or CHAR), but only values are supported. The leading - tells us that the column was skipped, the V is an internal abbreviation for the action (and C for CHAR). So second try:

fact@ihp-pc45:~/FACT++/build> rootifysql -f -v 2 --query="SELECT 'Hello World\!', 1"
[...]
------------------------ Rootify SQL -------------------------
[...]
Opening file 'rootify.root' [compression=1]...
Writing data to tree 'Result'
Trying to setup 2 branches...

 - Hello World! [VARCHAR NOT NULL] {V}
 + 1 [BIGINT NOT NULL] {n}

Configured 1 branches.
Filling branches...
1 rows fetched.
1 rows filled into tree.
5 kB written to disk.
File closed.
Execution time: 0.064065s (64.1 ms/row)
--------------------------------------------------------------

Now one branch was setup successfully (indicated by the +) because it of a basic (numerical) type (indicated by the n).

All columns of a numerical type (n) get converted to a branch of type double in the root-file. This keeps reading the root-file simple. All SQL columns of types DATE, DATETIME and TIME are converted to UNIX time and also written as a double-branch.

To have another way to skip columns, all variables defined in a query are skipped:

fact@ihp-pc45:~/FACT++/build> rootifysql -f -v 2 --query="SELECT 'Hello World\!', 1, @a:=5"
[...]
------------------------ Rootify SQL -------------------------
[...]
Trying to setup 3 branches...

 - Hello World! [VARCHAR NOT NULL] {V}
 + 1 [BIGINT NOT NULL] {n}
 - @a:=5 [BIGINT NOT NULL] {@}

1 branches skipped due to name starting with @.
Configured 1 branches.
[...]
--------------------------------------------------------------

Some general root file options

The output file name can be changed with --out and the tree name with --tree.

fact@ihp-pc45:~/FACT++/build> rootifysql -f --query="SELECT 1" --out newfile.root --tree MyTree
[...]
Opening file 'newfile.root' [compression=1]...
Writing data to tree 'MyTree'
[...]

To update en existing file use --update.

fact@ihp-pc45:~/FACT++/build> rootifysql --update --query="SELECT 1" --out newfile.root --tree MyNewTree
[...]
Opening file 'newfile.root' [compression=1]...
Writing data to tree 'MyTree'
[...]

Note that this will always create a new tree in an existing file. You can not update an existing tree.

The compression can be changed with -c (short for --compression)

fact@ihp-pc45:~/FACT++/build> rootifysql --update --query="SELECT 1"
[...]
Opening file 'rootifysql.root' [compression=9]...
[...]

Changing the branch name

This can be circumvented by giving it a name

fact@ihp-pc45:~/FACT++/build> rootifysql -f -v 2 --query="SELECT 'Hello World\!', 1, @a:=5 AS MyName"
[...]
------------------------ Rootify SQL -------------------------
[...]
Trying to setup 3 branches...

 - Hello World! [VARCHAR NOT NULL] {V}
 + 1 [BIGINT NOT NULL] {n}
 + MyName [BIGINT NOT NULL] {n}

Configured 2 branches.
[...]
--------------------------------------------------------------

Usually the name of the root-branch coincides with the name of the column, but it can easily be overwritten in the query with the AS directive as illustrated above.

Ignoring columns

Now assuming you want to create a very general query but then skip some column in the root-file (usually, it is more efficient to just don't query them, because they are transferred from the server nevertheless). In this case you can do:

fact@ihp-pc45:~/FACT++/build> rootifysql -f -v 2 --query="SELECT 'Hello World\!', 1, @a:=5 AS MyName" --ignore MyName
[...]
------------------------ Rootify SQL -------------------------
[...]
Trying to setup 3 branches...

 - Hello World! [VARCHAR NOT NULL] {V}
 + 1 [BIGINT NOT NULL] {n}
 - MyName [BIGINT NOT NULL] {-}

1 branches skipped due to ignore list.
Configured 1 branches.
[...]
--------------------------------------------------------------

The --ignore option can be specified more than once and it takes regular expression. So --ignore M.* would skip all columns starting with My.

A simple request

rootifysql - Converts the result of a mysql query into a root file

For convenience, this documentation uses the extended version of the options, refer to the output below to get the abbreviations.

Writes the result of a mysql query into a root file. For each column, a branch is created of type double with the field name as name. This is usually the column name if not specified otherwise by the AS mysql directive.

Columns with CHAR or VARCHAR as field type are ignored. DATETIME, DATE and TIME columns are converted to unix time (time_t). Rows containing any file which is NULL are skipped if not suppressed by the --ignore-null option. Ideally, the query is compiled in a way that no NULL field is returned. With the --display option the result of the request is printed on the screen (NULL skipping still in action). This can be useful to create an ascii file or to show results as 'SHOW DATABASES' or 'EXPLAIN table'. To redirect the contents into an ascii file, the option -v0 is useful. To suppredd writing to an output file --null can be used.

The default is to read the query from a file called rootify.sql. Except if a different filename is specified by the --file option or a query is given with --query.

As a trick, the rootify.sql file can be made excutable (chmod u+x rootify.sql). If the first line contains '#!rootifysql', the script can be executed directly.

Columns whose name start with @ are skipped. If you want them in your output file give them a name using AS, e.g. 'SELECT @A:=5 AS A'.

You can use variables in your sql query like @MyVar and define them on the command line. In this example with --var.MyVar=5

You can use environment definitions for substitutions in your SQL query. For example --env.TEST=5 would replace $TEST or ${TEST} in your query by 5.If you specifiy one environmentvariable more than once, a list is created. For example --env.TEST=1 --env.TEST=2 --env.TEST=3 would substitute $TEST or ${TEST} by '1, 2, 3'. This is useful for the SQL IN keyword. You can also read the values for an enviroment substitution from a file (one element per line), e.g. --env.TEST=file.txt. Empty lines and lines starting with a # are skipped.

Comments in the query-file can be placed according to the SQL standard inline /*comment*/ or introduced with # (shell script style) or -- (SQL style).

In case of succes, 0 is returned, a value>0 otherwise.

Note: See TracWiki for help on using the wiki.