hosted by CEDAR HepForge

Backends

Abstract Writer

This is the main abstract interface that every concrete harvest::writer has to implement.

The AbstractWriter was introduced to allow for a technology-agnostic dataharvester::writer, one of the design goals stated in the DesignGoals section. Each concrete implementation of this class is specific to one technology.
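The split between one abstract interface and technology-specific implementations can be sketched as follows. This is only an illustration in Python, not the library's actual API: the method name `save`, the classes `TextWriter` and `MemoryWriter`, and the helper `harvest` are all hypothetical.

```python
import io
from abc import ABC, abstractmethod


class AbstractWriter(ABC):
    """Hypothetical stand-in for the abstract writer interface."""

    @abstractmethod
    def save(self, table_name, row):
        """Store one row (a dict of column name -> value) under table_name."""


class TextWriter(AbstractWriter):
    """Toy text backend: writes one 'table: key=value, ...' line per row."""

    def __init__(self, stream):
        self.stream = stream

    def save(self, table_name, row):
        fields = ", ".join(f"{key}={value}" for key, value in row.items())
        self.stream.write(f"{table_name}: {fields}\n")


class MemoryWriter(AbstractWriter):
    """Toy in-memory backend that just collects the rows."""

    def __init__(self):
        self.rows = []

    def save(self, table_name, row):
        self.rows.append((table_name, dict(row)))


def harvest(writer, table_name, rows):
    """Client code depends only on AbstractWriter, never on a backend."""
    for row in rows:
        writer.save(table_name, row)


if __name__ == "__main__":
    buffer = io.StringIO()
    harvest(TextWriter(buffer), "event", [{"id": 1, "mass": 0.5}])
    print(buffer.getvalue(), end="")
```

Because `harvest` only sees the abstract type, swapping the storage technology means passing a different writer object and nothing else.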

Currently there exist implementations for:

  • Text file (almost CSV) [hierarchical]
  • HDF5 [flat]
  • dbf [flat] (removed in later versions)
  • ROOT file formats [flat]
  • XML
  • Gzip
  • Sqlite
  • HBook [flat,read-only]

Concrete implementations are written not only for file formats, but also for transport protocols. Currently there are:

  • file://
  • dhtp:// (data harvester transport protocol - currently discontinued)
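Selecting a transport from the prefix of a target string could look roughly like the sketch below. The function names and the rule that a bare filename falls back to `file` are assumptions made for illustration; the page does not describe how the library actually resolves targets.

```python
# Schemes named on this page. 'dhtp' is listed as discontinued, so this
# sketch recognises it but refuses to use it.
KNOWN_SCHEMES = {"file": True, "dhtp": False}  # scheme -> still usable?


def split_target(target, default_scheme="file"):
    """Split 'scheme://rest' into (scheme, rest).

    Assumption: a target without '://' is treated as a plain file.
    """
    scheme, separator, rest = target.partition("://")
    if not separator:
        return default_scheme, target
    return scheme.lower(), rest


def check_target(target):
    """Return (scheme, rest) or raise ValueError for unusable schemes."""
    scheme, rest = split_target(target)
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unknown transport: {scheme}")
    if not KNOWN_SCHEMES[scheme]:
        raise ValueError(f"transport is discontinued: {scheme}")
    return scheme, rest


if __name__ == "__main__":
    print(check_target("file://results.txt"))
```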

Other desirable backends are:

  • AIDA
  • ODBC / (My|Postgre)SQL / Frontier / Oracle
  • Special-purpose sequential binary format
  • .xls | .ods

Backends: Status

What features do the different backends implement?

The writers:

Backend   Respects file modes   Hierarchic data
Text      yes                   yes
Hdf       yes                   no, desirable
Root      yes                   no, desirable
Xml       yes                   yes
Tnt       yes                   no
Tcp       yes                   yes?
Sqlite    no?                   no
Gzip      no, desirable         yes

The readers:

Backend   Hierarchic data
Text      yes
Sqlite    no
Hdf       no, desirable
Root      no, desirable
Xml       yes
Dbf       no
Gzip      yes
HBook     no
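Some backends without hierarchic support (Hdf and Root, according to the subtleties section) currently flatten nested data. One way such flattening could work is sketched below. The function name, the use of nested dicts to represent hierarchic data, and the '_' separator in the flattened column names are all assumptions; the page does not say which naming scheme the library uses.

```python
def flatten(row, separator="_", prefix=""):
    """Turn a nested dict into a flat dict of column name -> value.

    Nested keys are joined with `separator`, e.g. {'vertex': {'x': 1}}
    becomes {'vertex_x': 1}. The separator is an arbitrary choice.
    """
    flat = {}
    for key, value in row.items():
        name = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into the sub-table, carrying the joined name along.
            flat.update(flatten(value, separator, name))
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    event = {"id": 1, "vertex": {"x": 0.1, "position": {"z": 2}}}
    print(flatten(event))
```

Note that this naive scheme loses the nesting information: a flat backend cannot tell afterwards whether `vertex_x` was a nested value or an ordinary column with that name.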

Backends: Subtleties

Several backends have idiosyncrasies that stem from the specifics of their file formats.

Backend   Idiosyncrasy
Python    The Python backend has no bools; ints are used instead.
Dbf       DBase strings have a fixed length. The length is 60, but it can
          also be given with the column name ("column@length") or set with
          'Helper:MaxStringLength' (?).
Xml       '@' is replaced by '_' in column names.
Xml       If the filename contains '[]' brackets, the content of the brackets
          is used as the 'doctype'. Default: Harvest.
          Example: filename[doctype].xml
Xml       Filename '--.xml' refers to stdout.
Tnt       Filename '--.tnt' refers to stdout.
Text      Filenames '--.txt', '--', 'cout', and 'stdout' refer to stdout.
Text      Filename 'color_out' refers to stdout, in a colored version.
Hdf       Hierarchic data are currently 'made flat'.
Hdf       Hdf files are pytables-compatible.
Root      Hierarchic data are currently 'made flat'.
Root      Root branch names may have at most 64 characters.
HBook     Uses RootReader, so all Root configurables are valid.
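The naming conventions listed above can be illustrated with a few small helpers. These are hypothetical functions written for this page, not part of the library, and they only mimic the documented rules. In particular, removing the bracket part to obtain the file name on disk is an assumption.

```python
import re

DEFAULT_DBF_STRING_LENGTH = 60    # Dbf: default fixed string length
DEFAULT_XML_DOCTYPE = "Harvest"   # Xml: doctype when no brackets are given
TEXT_STDOUT_NAMES = {"--.txt", "--", "cout", "stdout", "color_out"}


def parse_dbf_column(spec):
    """Split 'column@length' into (column, length); default length is 60."""
    name, separator, length = spec.partition("@")
    if not separator:
        return name, DEFAULT_DBF_STRING_LENGTH
    return name, int(length)


def xml_column_name(name):
    """Xml: every '@' in a column name is replaced by '_'."""
    return name.replace("@", "_")


def parse_xml_filename(filename):
    """Split 'file[doctype].xml' into ('file.xml', 'doctype').

    Assumption: the bracket part is removed from the name used on disk.
    """
    match = re.fullmatch(r"(.*)\[(.*)\](.*)", filename)
    if match is None:
        return filename, DEFAULT_XML_DOCTYPE
    stem, doctype, suffix = match.groups()
    return stem + suffix, doctype


def text_goes_to_stdout(filename):
    """Text: these special names mean stdout ('color_out' is colored)."""
    return filename in TEXT_STDOUT_NAMES


if __name__ == "__main__":
    print(parse_dbf_column("name@20"))
    print(parse_xml_filename("filename[doctype].xml"))
```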