Friday, September 2, 2005

Screen scraping targets

We want to get as many missing persons records into a single database as possible.

(1) Get techies to build software to "scrape" info from structured sites into a standard datamodel.

(2) Get volunteers to read forums and postings and manually enter that data into a standard datamodel.
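For the techies, here is a rough sketch of what step (1) could look like for a site that publishes its list as an HTML table. Everything specific in it is an assumption made for illustration: the `PersonRecord` fields, the three-column layout, the `scrape_records` helper, and the made-up sample rows. The real standard datamodel and each site's real markup will differ.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class PersonRecord:
    # Hypothetical fields; the real standard datamodel is not defined here.
    full_name: str
    status: str
    location: str
    source: str


class TableRowParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows use <th>, so they end up empty
                self.rows.append(self._row)
            self._row = None


def scrape_records(html, source):
    """Turn a name/status/location table into PersonRecord objects."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [
        PersonRecord(full_name=r[0], status=r[1], location=r[2], source=source)
        for r in parser.rows
        if len(r) == 3  # skip rows that do not match the assumed layout
    ]


# Made-up sample page; no real person or site is represented.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Status</th><th>Location</th></tr>
  <tr><td>Jane Example</td><td>OK</td><td>Baton Rouge, LA</td></tr>
  <tr><td>John Sample</td><td>Missing</td><td>Biloxi, MS</td></tr>
</table>
"""

records = scrape_records(SAMPLE_HTML, source="example-site")
for record in records:
    print(record)
```

Each site would need its own small parser like this, but they would all emit the same record type, which is the point of agreeing on a datamodel first.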

Here are some of the places we need to have scraped or entered into our standard datamodel. We'll of course be syndicating the unified database.

Lists of lists:
http://www.packrat-pro.com/katrina.htm
http://www.blessingsontheweb.com/

The Red Cross has its database up, and it is huge, so that is another good scraping opportunity.
http://www.familylinks.icrc.org/katrina/people

The big list so far has been the Gulf Coast News site.
http://wx.gulfcoastnews.com/katrina/status.aspx

Our goal, again, is to combine all these databases into a single data model, offering a unified view of who is OK so refugees don't have to spend their time searching message boards and databases.
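To make the "single data model" idea concrete, here is a minimal sketch of merging lists from several sources into one view. It assumes each source has already been converted into simple dictionaries with `name`, `status`, and `source` keys (hypothetical names), and it matches people on a normalized name alone. That matching rule is deliberately naive: real data would need more than a name to tell two people apart. The sample entries are made up.

```python
def normalize_name(name):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def merge_sources(*sources):
    """Combine record lists into one dict keyed by normalized name.

    Every report is kept with its source, so conflicting statuses stay
    visible instead of being silently overwritten.
    """
    unified = {}
    for records in sources:
        for rec in records:
            key = normalize_name(rec["name"])
            entry = unified.setdefault(
                key,
                {"name": " ".join(rec["name"].split()), "reports": []},
            )
            entry["reports"].append((rec["source"], rec["status"]))
    return unified


# Made-up sample data; no real person or site is represented.
list_a = [
    {"name": "Jane Example", "status": "OK", "source": "site-a"},
    {"name": "John Sample", "status": "Missing", "source": "site-a"},
]
list_b = [
    {"name": "  jane   EXAMPLE ", "status": "OK", "source": "site-b"},
]

merged = merge_sources(list_a, list_b)
for person in merged.values():
    print(person["name"], person["reports"])
```

Keeping every report rather than picking a winner means someone searching the unified view can see where each piece of information came from.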

3 comments:

mrscake said...

Hi, we're interested in doing something similar. Can we join forces?

https://www.chosenspotdesign.com/mamaloca/bb/viewforum.php?f=5

mrscake said...

BTW, as far as the forums and message boards go, it might be better to just automatically cache the relevant pages and write scripts to search them. I'm concerned that depending on volunteers to manually enter the information might become problematic.

marnie webb said...

David,

Is there a place people can go to volunteer?