Scraping the entire internet is a chore, I’d imagine. I hear Google has a couple of PIIs running FreeBSD that are doing this very task. With that said, a spider that detects security threats before they happen is brilliant. The paper’s available here.
Speaking of Google, it’s probably one of the most dangerous tools in a hacker’s hands, simply because it builds a searchable database of web content. The number of junk servers that fail to implement even the simplest precautions (like an .htaccess file) is astounding. Just searching for a vulnerable plugin’s files will often lead you to a plethora of soon-to-be-infected machines.