Tuesday, January 6, 2009

Writing webbots using Python

If you have ever wanted to write a webbot but didn't know how, the mechanize module for Python makes it easy.

One thing mechanize does by default doesn't always suit my needs: it obeys each site's robots.txt file. So I just disable that handling.
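To make it clearer what gets switched off, here is a minimal sketch of the kind of check a robots.txt-aware client performs. It does not use mechanize at all; it uses the standard library's robots.txt parser, and it assumes Python 3 (in Python 2 the module was simply called robotparser). The robots.txt content, the "my-webbot" user agent and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from memory so no network is needed.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Return a RobotFileParser loaded with the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# "my-webbot" is an invented user-agent name; the URLs are placeholders.
allowed_public = parser.can_fetch("my-webbot", "http://example.com/index.html")
allowed_private = parser.can_fetch("my-webbot", "http://example.com/private/data.html")

print(allowed_public, allowed_private)
```

A bot that respects robots.txt would skip the second URL. Turning the handling off means requests like that go through anyway, so it is worth doing only when you know the site owner doesn't mind.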

Mechanize lets you browse pages, extract data and submit forms with very little code. The example below opens a page and prints every link found on it.

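import mechanize

# Create a browser object and tell it to ignore robots.txt
br = mechanize.Browser()
br.set_handle_robots(False)

# Fetch the page
br.open("http://www.google.com")

# Print every link found on the page
for link in br.links():
    print(link)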
