This is a simple script that
- analyses httpd log files as a daily batch and pushes the results into a database,
- displays the results as a summary via a web interface.
The system consists of two modules and a database:
- a log file analysis tool, which pushes its results into the DB (run as a daily batch),
- a web-based display tool, which only reads from the database.
The database stores the raw log lines and daily summary counts of
- page views,
- page and referrer pairs,
- page and browser ID pairs.
See dbdef.sql for the SQL definitions.
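As a rough illustration of the three daily summaries listed above, the sketch below counts them from records that have already been parsed. The record keys (page, referrer, browser) and the function name summarize_day are assumptions made for this example only; the real table layout is whatever dbdef.sql defines.

```python
from collections import Counter


def summarize_day(records):
    """Count page views, (page, referrer) pairs and (page, browser) pairs.

    `records` is an iterable of dicts with the hypothetical keys
    "page", "referrer" and "browser" (assumed names, not from the project).
    """
    page_views = Counter()
    page_referrer = Counter()
    page_browser = Counter()
    for rec in records:
        page_views[rec["page"]] += 1
        page_referrer[(rec["page"], rec["referrer"])] += 1
        page_browser[(rec["page"], rec["browser"])] += 1
    return page_views, page_referrer, page_browser
```

Each Counter maps a key (a page, or a pair) to its count for the day, which is the shape a daily summary row would need.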
To save space, every string value (accessed page, referrer, and browser ID) is stored by reference ID.
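The reference-ID idea can be sketched as a lookup table that hands out one integer per distinct string. This is only a minimal sketch: it assumes SQLite and a single hypothetical table named strings, whereas the actual tables and database engine are defined by dbdef.sql and may differ.

```python
import sqlite3


def get_ref_id(conn, value):
    """Return the reference ID for `value`, inserting it on first use."""
    row = conn.execute(
        "SELECT id FROM strings WHERE value = ?", (value,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO strings (value) VALUES (?)", (value,))
    return cur.lastrowid


# Hypothetical lookup table; the name and columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE strings (id INTEGER PRIMARY KEY, value TEXT UNIQUE)"
)

page_id = get_ref_id(conn, "/index.html")
same_id = get_ref_id(conn, "/index.html")   # repeated string reuses the ID
other_id = get_ref_id(conn, "/about.html")  # new string gets a new ID
```

Raw lines and summary rows can then store the small integer instead of repeating a long URL or user-agent string.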
LogBase
- __init__(c_db): c_db is the database object instance.
- ParseFile(fname): fname is the target filename; parses every line with ParseLine and registers the results in the database.
- ParseLine(line): line is a single log line; returns a hash of the line's contents. Overridden by child classes and called by ParseFile.
LogApache
- Implementation of LogBase for the Apache combined log format.
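The class layout described above could be sketched roughly as follows. This is not the project's code: the regular expression, the returned key names, and the register method on the database object are all assumptions for illustration, and the pattern does not handle escaped quotes inside quoted fields.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


class LogBase:
    def __init__(self, c_db):
        self.c_db = c_db  # database object; its interface is assumed here

    def ParseFile(self, fname):
        """Parse every line of fname and register each parsed line."""
        with open(fname, encoding="utf-8", errors="replace") as f:
            for line in f:
                parsed = self.ParseLine(line.rstrip("\n"))
                if parsed is not None:
                    self.c_db.register(parsed)  # hypothetical method name

    def ParseLine(self, line):
        raise NotImplementedError("child classes implement ParseLine")


class LogApache(LogBase):
    def ParseLine(self, line):
        """Return a dict of fields, or None if the line does not match."""
        m = COMBINED_RE.match(line)
        return m.groupdict() if m else None
```

Keeping file iteration in the base class means a new log format only needs its own ParseLine.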
Several log loading commands are provided. (For now, only Apache 'combined' log lines are supported.)
- parse_daily.py: analyzes the log files (siteconfig log_fname) in the directories listed in the configuration (common/sitelist.json); intended to be run from cron.
- parse_init.py: analyzes all log files matching siteconfig loghead in the directories listed in the configuration (common/sitelist.json); intended for bulk initialization.
- parse_file.py: usage is parse_file.py <log-file-name> <sitename>