Tuesday, January 6, 2009

Writing webbots using Python

If you have ever wanted to write a webbot but didn't know how, the mechanize module makes it easy.

By default, mechanize pays attention to each site's robots.txt file. That doesn't always suit my needs, so I just disable it with set_handle_robots(False).

Mechanize allows you to easily browse, extract data and submit forms.

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
br.open("http://www.google.com")

# Print every link found on the page
for link in br.links():
    print(link)
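
For context on what set_handle_robots(False) switches off, here is a minimal sketch of a robots.txt check. It does not use mechanize; it uses Python 3's standard urllib.robotparser instead so that it runs offline. The robots.txt contents, the "mybot" user agent, the example.com URLs and the is_allowed helper are all made up for illustration, and mechanize's own handling may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "mybot" and the URLs below are illustrative placeholders.
allowed_home = is_allowed(ROBOTS_TXT, "mybot", "http://www.example.com/")
allowed_search = is_allowed(
    ROBOTS_TXT, "mybot", "http://www.example.com/search?q=python"
)

print(allowed_home)
print(allowed_search)
```

A bot that honours robots.txt would skip the second URL. With the check disabled, the decision about which pages to fetch is entirely yours.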

Using Python and MySQL

I use the MySQL-python module (imported as MySQLdb), which is very easy to use.
By default each row comes back as a tuple. To get each row as a dictionary keyed by column name instead, we pass DictCursor as the cursor class.

import MySQLdb
import MySQLdb.cursors

conn = MySQLdb.connect(
    host = "localhost",
    user = "xxx",
    passwd = "xxx",
    db = "xxx",
    cursorclass = MySQLdb.cursors.DictCursor)

cur = conn.cursor()

cur.execute("select id, address from phonebook where city_id = 0")

data = cur.fetchall()
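
To show what the tuple versus dictionary choice means for the rows you get back, here is a small sketch that runs without a MySQL server. It swaps in the standard library's sqlite3 module with an in-memory database, so it is an analogy for DictCursor rather than MySQLdb itself. The sample rows, the dict_factory helper and the fetch_phonebook function are invented for the example.

```python
import sqlite3


def dict_factory(cursor, row):
    """Build a dict for one row, keyed by column name."""
    # cursor.description has one entry per column; item 0 is the column name.
    return {col[0]: value for col, value in zip(cursor.description, row)}


def fetch_phonebook(as_dict):
    """Run the query against a throwaway table and return all rows."""
    conn = sqlite3.connect(":memory:")
    if as_dict:
        # Roughly the role DictCursor plays in MySQLdb.
        conn.row_factory = dict_factory
    cur = conn.cursor()
    cur.execute(
        "create table phonebook (id integer, address text, city_id integer)"
    )
    # Made-up sample data.
    cur.executemany(
        "insert into phonebook values (?, ?, ?)",
        [(1, "1 Main St", 0), (2, "2 Oak Ave", 5)],
    )
    cur.execute("select id, address from phonebook where city_id = 0")
    rows = cur.fetchall()
    conn.close()
    return rows


tuple_rows = fetch_phonebook(as_dict=False)
dict_rows = fetch_phonebook(as_dict=True)

# Tuple rows are indexed by position, dict rows by column name.
print(tuple_rows[0][1])
print(dict_rows[0]["address"])
```

Dictionary rows make the code that reads them easier to follow, and they keep working if you later reorder the columns in the select.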