Changes between Version 1 and Version 2 of DatabaseBasedAnalysis/table2sql


Ignore:
Timestamp:
08/07/18 16:32:51 (6 years ago)
Author:
tbretz
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • DatabaseBasedAnalysis/table2sql

    v1 v2  
    133133Sometimes it might also be convenient to skip a leaf. This can be done with the `--ignore regex` option. If the given regular expression yields a match, the leaf will be ignored.
    134134
    135 Note that the regular expression works on the raw-name of the leaf not the readily mapped SQL column names. Example:
    136    --ignore=ThetaSq\..*
    137 will skip all leaved which start with 'ThetaSq.'. This option can be usedmore than once.
     135Note that the regular expression works on the raw-name of the leaf not the readily mapped SQL column names.
     136{{{
     137--ignore=ThetaSq\..*
     138}}}
     139will skip all leaves which name contains with 'ThetaSq.'. This option can be given more than once.
     140
     141The corresponding C++ call is `boost::regex_match(leafname, regex)` for `--ignore regex`.
    138142
    139143=== Column Data Types ===
    140144
    141 The data type of each column is kept as close as possible to the leaves' basic data types. If for some reason this is not wanted, the data type of the SQL column can be overwritten with `--sql-type sql-column/sql-type`. This works like the following:
    142 
    143    --sql-type=FileId/UNSIGNED INT
    144 while the first argument of the name of the SQL column to which the data type should be applied. The second column is the basic SQL data type. The option can be given more than once.
    145 
    146 
    147 
    148 
    149 == Changing column names ==
    150 
    151 Sometimes the leaves' names might be quite inconvenient like MTime.fTime.fMilliSec or just MHillas.fWidth. To allow to simplify column names, regular expressions (using boost's regex) can be defined to change the names. Note, that these regular expressions are applied one by one on each leaf's name. A valid expression could be:
    152    --map=MHillas\.f/
    153 which would remove all occurances of 'MHillas.f'. This option can be used more than once. They are applied in sequence. A single match does not stop the sequence.
    154 
    155 Sometimes it might also be convenient to skip a leaf. This can be done with the --ignore resource. If the given regular expresion yields a match, the leaf will be ignored. Note that the regular expression works on the raw-name of the leaf not the readily mapped SQL column names. Example:
    156    --ignore=ThetaSq\..*
    157 will skip all leaved which start with 'ThetaSq.'. This option can be usedmore than once.
    158 
    159 The data type of each column is kept as close as possible to the leaves' data types. If for some reason this is not wanted, the data type of the SQL column can be overwritten with --sql-type sql-column/sql-ytpe, for example:
    160    --sql-type=FileId/UNSIGNED INT
    161 while the first argument of the name of the SQL column to which the data type should be applied. The second column is the basic SQL data type. The option can be given more than once.
    162 
    163 Database interaction:
    164 
    165 To drop an existing table, --drop can be used.
    166 
    167 To create a table according to theSQL  column names and data types, --create can be used. The query used can be printed with --print-create even --create has not been specified.
    168 
    169 To choose the columns which should become primary keys, use --primary, for exmaple:
    170    --primary=col1
    171 To define more than one column as primary key, the option can be given more than once. Note that the combination of these columns must be unique.
    172 
    173 All columns are created as NOT NULL as default. To force a database engine and/or a storage format, use --engine and --rot-format.
    174 
    175 Usually, the INSERT query would fail if the PRIMARY key exists already. This can be avoided using the 'ON DUPLICATE KEY UPDATE' directive. With the --duplicate, you can specify what should be updated in case of a duplicate key. To keep the row untouched, you can just update the primary key with the identical primary key, e.g. --duplicate='MyPrimary=VALUES(MyPrimary)'. The --duplicate resource can be specified more than once to add more expressions to the assignment_list. For more details, see the MySQL manual.
    176 
    177 For debugging purpose, or to just create or drop a table, the final insert query can be skipped using --no-insert. Note that for performance reason, all data is collected in memory and a single INSERT query is issued at the end.
    178 
    179 Another possibility is to add the IGNORE keyword to the INSERT query by --ignore-errors, which essentially ignores all errors and turns them into warnings which are printed after the query succeeded.
     145The data type of each column is kept as close as possible to the leaves' basic data types. If for some reason this is not wanted, the data type of the SQL column can be overwritten with `--sql-type column/type`. This works like the following:
     146{{{
     147--sql-type=FileId/UNSIGNED INT
     148}}}
     149
     150In this example, ''FileId'' is the name of the SQL column for which the data type ''UNSIGNED INT'' should be used. The second column (UNSIGNED INT) is a basic SQL data type. The option can be given more than once.
     151
     152== Table Creation ==
     153
     154It is not very handy if the SQL table has to be created manually beforehand. Therefore, `root2sql` can automatically create the table. To do this, specify `--create`. The query which would be used to create the table can be printed with `--print-create` even when `--create` has not been specified. Tables are created only if the do not yet exist (`CREATE TABLE IF NOT EXISTS`). To ensure recreation of a table, the old table can be dropped with `--drop`. As any insert operation would fail if the required table is dropped and no new one is created, `--drop` always implies `--create`.
     155
     156All columns of the table are created as `NOT NULL` as default.
     157
     158=== Database Engine ===
     159
     160The default database engine is to create the table with the default of the database. If a database engine (`ENGINE=`) should be forced for the table the `--engine ENGINE` option can be used. Similarly, the row-format (`ROW_FORMAT=`) can be defined using `--row-format FORMAT`. For example:
     161
     162{{{
     163--engine InnoDB --row-format=COMPRESSED
     164}}}
     165
     166
     167=== Primary Keys ===
     168
     169For convenience, a primary key can be defined by the the columns which belong to the primary key.
     170
     171To choose the columns which should become primary keys, use `--primary`, for example:
     172{{{
     173--primary=column1 --primary=column2
     174}}}
     175
     176Note that the combination of these columns must be unique.
     177
     178== Ignore duplicate entries ==
     179
     180Usually, the `INSERT` query would fail if another row with the same primary key exists already.  This can be avoided adding the `IGNORE` keyword to the `INSERT` query by `--ignore-errors`, which essentially ignores all errors and turns them into warnings which are printed after the query succeeded.
     181
     182== Update duplicate entries ==
     183
     184Another option is to update duplicate entries. This can be done using the `ON DUPLICATE KEY UPDATE` directive. Giving `-duplicate`, you can specify what should be updated in case of a duplicate key. To keep the row untouched, you can just update the primary key with the identical primary key, e.g.
     185{{{
     186--duplicate='MyPrimary=VALUES(MyPrimary)'
     187}}}.
     188
     189The `--duplicate` resource can be specified more than once to add more expressions to the assignment_list. For more details, see the MySQL manual.
     190
     191== Debugging ==
     192
     193For debugging purpose, or to just create or drop a table, the final insert query can be skipped using `--no-insert`. All operation which would alter anything in the database are skipped using `--dry-run` (note that this might create consequent errors).
     194
     195Additional information about the connection are printed using `--print-connection`. Note that this implies additional queries and more traffic and is therefore slightly slower.
    180196
    181197Using a higher verbosity level (-v), an overview of the written columns or all processed leaves is printed depending on the verbosity level. The output looks like the following
    182    Leaf name [root data type] (SQL name)
     198{{{
     199Leaf name [root data type] (SQL name)
     200}}}
    183201for example
    184    MTime.fTime.fMilliSec [Long64_t] (MilliSec)
    185 which means that the leaf MTime.fTime.fMilliSec is detected to be a Long64_t which is filled into a column called MilliSec. Leaves with non basic data types are ignored automatically and are marked as (-n/a-). User ignored columnd are marked as (-ignored-).
     202{{{
     203MTime.fTime.fMilliSec [Long64_t] (MilliSec)
     204}}}
     205which means that the leaf MTime.fTime.fMilliSec is detected to be a Long64_t which is filled into a column called MilliSec. Leaves with non basic data types are ignored automatically and are marked as (-n/a-). User ignored columns are marked as (-ignored-).
     206
     207The complete `INSERT` query can be printed with `--print-insert`. Note that this can be a quite long output.
     208
     209== Performance ==
     210
     211Note that for performance reason, all data is collected in memory and a single INSERT query is issued at the end.
     212
     213== In case of failure ==
    186214
    187215If a query failed, the query is printed to stderr together with the error message. For the main INSERT query, this is only true if the verbosity level is at least 2 or the query has less than 80*25 bytes.
    188216
    189 In case of succes, 0 is returned, a value>0 otherwise.
    190 
    191 
    192 
    193 
    194 
     217== Returns ==
     218
     219In case of success, 0 is returned, a value>0 otherwise.
     220
     221
     222
     223
     224