Backends
Abstract Writer
AbstractWriter is the main abstract interface that every concrete harvest::writer has to implement.
The AbstractWriter was introduced to keep the dataharvester::writer technology-agnostic, one of the design goals stated in the DesignGoals section. Each concrete implementation of this class is specific to one technology.
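The separation between the abstract interface and a technology-specific backend can be sketched as follows. This is a minimal illustration in Python, not the library's actual code: the method name `save`, the toy `TextWriter`, and the `harvest` helper are all assumptions made for this example.

```python
from abc import ABC, abstractmethod
import io


class AbstractWriter(ABC):
    """Hypothetical stand-in for the abstract writer interface."""

    @abstractmethod
    def save(self, row: dict) -> None:
        """Persist one row (a mapping of column name to value)."""


class TextWriter(AbstractWriter):
    """Toy concrete backend: writes 'key=value' pairs, one row per line."""

    def __init__(self, stream):
        self.stream = stream

    def save(self, row: dict) -> None:
        line = "; ".join(f"{key}={value}" for key, value in row.items())
        self.stream.write(line + "\n")


def harvest(writer: AbstractWriter, rows) -> None:
    # Client code depends only on the abstract interface,
    # so any backend can be swapped in without changes here.
    for row in rows:
        writer.save(row)


buffer = io.StringIO()
harvest(TextWriter(buffer), [{"pt": 1.5, "id": 7}])
```

Because `harvest` only knows about `AbstractWriter`, adding a new technology means adding one new subclass and nothing else.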
Implementations currently exist for:
- Text file (almost CSV) [hierarchical]
- HDF5 [flat]
- dbf [flat] (removed in later versions)
- ROOT file formats [flat]
- XML
- Gzip
- Sqlite
- HBook [flat,read-only]
Concrete implementations are written not only for file formats, but also for transport protocols. Currently there are:
- Tcp (listed among the writers in the status table below)
Other desirable backends are:
- AIDA
- ODBC / (My|Postgre)SQL / Frontier / Oracle
- Special-purpose sequential binary format
- .xls | .ods
Backends: Status
What features do the different backends implement?
The writers:
| | respect file modes | hierarchic data |
| Text | yes | yes |
| Hdf | yes | no, desirable |
| Root | yes | no, desirable |
| Xml | yes | yes |
| Tnt | yes | no |
| Tcp | yes | yes? |
| Sqlite | no? | no |
| Gzip | no, desirable | yes |
The readers:
| | hierarchic data |
| Text | yes |
| Sqlite | no |
| Hdf | no, desirable |
| Root | no, desirable |
| Xml | yes |
| Dbf | no |
| Gzip | yes |
| HBook | no |
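For backends without support for hierarchic data, nested data has to be mapped onto flat columns before it can be stored. The sketch below shows one way such a flattening could work. It is an illustration only: the function name `flatten` and the `_` separator are assumptions, and the actual backends may name flattened columns differently.

```python
def flatten(data: dict, prefix: str = "", sep: str = "_") -> dict:
    """Turn nested dicts into one flat dict with joined column names.

    The separator is a hypothetical choice for this example.
    """
    flat = {}
    for key, value in data.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the sub-structure, carrying the joined name along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


event = {"run": 42, "track": {"pt": 1.5, "vertex": {"x": 0.1}}}
flat_event = flatten(event)
```

Here `flat_event` contains the columns `run`, `track_pt`, and `track_vertex_x`. Long joined names are one reason a length limit on column or branch names can matter for flat backends.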
Backends: Subtleties
The backends have various idiosyncrasies that stem from the specifics of the different file formats.
| | idiosyncrasy |
| Python | Python backend has no bools. Ints are used instead. |
| Dbf | DBase strings have fixed length. The length is 60, but can also be given with the column name: "column@length", or set with 'Helper:MaxStringLength' (?) |
| Xml | '@' are replaced by '_' in column names |
| Xml | If the filename contains '[]' brackets, the content of the brackets is used as the 'doctype'. Default: Harvest. Example: filename[doctype].xml |
| Xml | Filename '--.xml' refers to stdout. |
| Tnt | Filename '--.tnt' refers to stdout. |
| Text | Filenames '--.txt', '--', 'cout', and 'stdout' refer to stdout. |
| Text | Filename 'color_out' refers to stdout, colored version. |
| Hdf | Hierarchic data are 'made flat' currently |
| Hdf | Hdf files are PyTables-compatible. |
| Root | Hierarchic data are 'made flat' currently |
| Root | Root branch names may have at most 64 characters |
| HBook | Uses RootReader - all root configurables are valid |
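The naming conventions in the table above can be sketched in a few helper functions. These helpers are hypothetical and written only to illustrate the rules; they are not part of the library. In particular, whether the bracketed doctype is stripped from the file name that is actually written is an assumption.

```python
import re

# Special filenames that refer to stdout, per backend (taken from the table).
STDOUT_NAMES = {
    "Text": {"--.txt", "--", "cout", "stdout", "color_out"},  # 'color_out' is colored
    "Xml": {"--.xml"},
    "Tnt": {"--.tnt"},
}


def is_stdout(backend: str, filename: str) -> bool:
    """Return True if the filename refers to stdout for the given backend."""
    return filename in STDOUT_NAMES.get(backend, set())


def split_doctype(filename: str, default: str = "Harvest"):
    """'filename[doctype].xml' -> ('filename.xml', 'doctype').

    Assumption: the brackets are removed from the resulting file name.
    """
    match = re.fullmatch(r"(.*)\[(.*)\](.*)", filename)
    if match is None:
        return filename, default
    return match.group(1) + match.group(3), match.group(2)


def split_column_length(column: str, default: int = 60):
    """Dbf: 'column@length' -> (column, length); default length is 60."""
    name, sep, length = column.partition("@")
    return name, (int(length) if sep else default)


def xml_column_name(column: str) -> str:
    """Xml: '@' is replaced by '_' in column names."""
    return column.replace("@", "_")
```

For example, `split_doctype("filename[doctype].xml")` yields the name `filename.xml` with doctype `doctype`, while a name without brackets falls back to the default doctype `Harvest`.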