Digital Database of Microbial Phenotypes. Like an online Bergey's Manual.
rec3141/ddmp
This database is composed of phenotypic information on Bacteria and Archaea, as listed in the primary literature and Bergey's Manual of Systematic Bacteriology.
Contributors to date: R. Eric Collins
FILES:
volume2-tables.txt: Bergey's Manual Volume 2, The Proteobacteria
volume3-tables.txt: Bergey's Manual Volume 3, The Firmicutes
volume4-tables.txt: Bergey's Manual Volume 4, The Bacteroidetes, Spirochaetes, Tenericutes (Mollicutes), Acidobacteria, Fibrobacteres, Fusobacteria, Dictyoglomi, Gemmatimonadetes, Lentisphaerae, Verrucomicrobia, Chlamydiae, and Planctomycetes
NOTES:
Volume 1, the Archaea and Deeply-Branching Bacteria, is not available in digital format
Volume 5, the Actinobacteria, will be released in early 2012
PIPELINE:
- get text out of PDFs
-- pdftotext -layout -nopgbrk -enc UTF-8 <file>
- get tables out of text
-- grep -h -A 100 -B 1 -e 'TABLE' <file>
- decode garbled Unicode in Volume 2
-- unicode2not.pl <file>
- fix whitespace
-- tabit.pl <file>
- fix formatting
-- manually in Kate using Block Selection mode
-- examples include cleaning up 1-column tables and multi-page tables
-- multi-line captions were replaced using regex:
--- Find: (TABLE.*)\n([a-z]{2,}.*)\s*
--- Replace: \1 \2
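
The caption fix in the last step can also be scripted instead of run by hand in Kate. The following is a minimal Python sketch of that idea, not part of the original pipeline. The function name join_captions and the sample text are made up for illustration. One deliberate change from the Find pattern above: in Python a trailing \s* would also swallow the line break after the continuation line, so the sketch strips only trailing spaces and tabs and keeps that line break.

```python
import re

# Adapted from the Kate pattern "(TABLE.*)\n([a-z]{2,}.*)\s*".
# Assumption: only trailing spaces/tabs are stripped, so the newline after
# the continuation line is preserved (a literal port of \s* would eat it).
CAPTION_BREAK = re.compile(r"(TABLE.*)\n([a-z]{2,}.*?)[ \t]*$", re.MULTILINE)


def join_captions(text: str) -> str:
    """Join a caption line containing 'TABLE' with a following line
    that starts with at least two lowercase letters."""
    return CAPTION_BREAK.sub(r"\1 \2", text)


if __name__ == "__main__":
    # Made-up sample, not taken from the database files.
    sample = (
        "TABLE 3. Characteristics differentiating the species\n"
        "of the genus Example   \n"
        "Characteristic    1    2\n"
    )
    print(join_captions(sample))
```

Like the Find/Replace pair, a single pass joins only one continuation line per caption, so a caption spread over three lines would need a second pass. Any table row that follows a caption and starts with two lowercase letters would be joined as well, so the output is worth checking by eye.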