Wednesday, April 16, 2008

Python: Parse Apache log to sqlite database

This Python script was written for a friend in Australia as part of his Ph.D. project. It parses an Apache server log into a sqlite3 database.

An Apache server log entry like this:
 - - [26/Sep/2007:21:20:36 +0800]
"GET /forum/Themes/BlueStory/images/bbc/ftp://ftp.gif HTTP/1.1" 200 191
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:
Gecko/20070914 Firefox/
can be parsed into a sqlite database with the class below.
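import sqlite3

import wx


class Parser:
    def __init__(self, serverLog, db):
        # Fall back to a default database file when no name is given.
        if db.strip() == "":
            db = "log.db"
        try:
            conn = sqlite3.connect(db)
            cursor = conn.cursor()
            cursor.execute("create table if not exists log "
                           "(ip, date, time, gmt, request, "
                           "errorcode, bytes, referel, osa)")
        except sqlite3.Error as error:
            wx.MessageBox(str(error), "Info")
            return

        # Total number of entries in the log (not used further here).
        with open(serverLog) as f:
            numLog = len(f.readlines())

        for line in open(serverLog, "r"):
            data = []
            a = line.split('"')  # quoted parts: request, referer, user agent
            line = line.split()  # whitespace-separated fields

            # Fields marked "assumed" are inferred from the table columns and
            # Apache's combined log layout.
            data.append(line[0])                           # ip (assumed)
            data.append(line[3][1:line[3].index(":")])     # date (assumed)
            data.append(line[3][line[3].index(":") + 1:])  # time
            data.append(line[4][:-1])                      # gmt (assumed)
            data.append(line[5][1:] + " " + line[6])       # request
            data.append(line[8])                           # errorcode (assumed)
            data.append(line[9])                           # bytes
            data.append(line[10][1:-1])                    # referel
            data.append(a[5])                              # osa (assumed)

            try:
                cursor.execute("insert into log values("
                               "?, ?, ?, ?, ?, ?, ?, ?, ?)",
                               (data[0], data[1], data[2], data[3],
                                data[4], data[5], data[6],
                                data[7], data[8]))
            except sqlite3.Error as error:
                wx.MessageBox(str(error), 'Info')

        # Save the inserted rows and release the database.
        conn.commit()
        conn.close()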
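
For anyone who wants to try the same idea without wxPython, here is a minimal sketch in Python 3 that uses a regular expression instead of split(). It is not the script above, just one possible variation. It assumes the log is in Apache's "combined" format. The names LOG_PATTERN, COLUMNS, parse_line and load_lines are made up for this example, and so is the sample entry (192.0.2.1 is a documentation-only address). Unlike the class above, it stores the whole request string, including the protocol.

```python
import re
import sqlite3

# Pattern for one line in Apache "combined" format (an assumption about the
# log layout; adjust it if the server uses a custom LogFormat).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<date>[^:]+):(?P<time>\S+) (?P<gmt>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<errorcode>\d{3}) (?P<bytes>\S+)'
    r'(?: "(?P<referel>[^"]*)" "(?P<osa>[^"]*)")?'
)

# Same column names as the log table in the post.
COLUMNS = ("ip", "date", "time", "gmt", "request",
           "errorcode", "bytes", "referel", "osa")


def parse_line(line):
    """Return a 9-tuple matching the log table, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return tuple(match.group(name) for name in COLUMNS)


def load_lines(lines, conn):
    """Insert every parseable line into the log table; return how many were inserted."""
    conn.execute("create table if not exists log (%s)" % ", ".join(COLUMNS))
    rows = [row for row in map(parse_line, lines) if row is not None]
    conn.executemany("insert into log values (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical sample entry, not taken from a real log.
sample = ('192.0.2.1 - - [26/Sep/2007:21:20:36 +0800] '
          '"GET /index.html HTTP/1.1" 200 191 "-" "Mozilla/5.0"')

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
inserted = load_lines([sample, "not a log line"], conn)
rows = conn.execute("select ip, date, time, errorcode from log").fetchall()
```

Lines that do not match the pattern are skipped rather than raising an error, which may or may not be what you want for a real log. To load a file, passing an open file object as the lines argument should work, since it yields one line at a time.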