Tuesday, September 6, 2005

Overall PeopleFinder Project Goals

The Katrina PeopleFinder Project is a massively parallel volunteer effort to solve the problem of disparate refugee/missing persons databases and forums.

  1. Enter unstructured data on refugees from forums across the web to the highest data quality standards possible, with volunteers giving as little as one hour of their time.
  2. Enter data from databases across the web into the central database via the PeopleFinder Interchange Format
  3. Minimize duplicate records
  4. Support other organizations in implementing the PeopleFinder Interchange Format
  5. Make the central database available to be searched
  6. Use the Salesforce API to implement innovative technology solutions to the missing persons problem

(1) Is currently implemented by dividing forums like Craigslist into "chunks" of about 25 records through software and/or volunteers. Volunteer data entry people "claim" a "chunk" and enter it into the central database. They follow instructions on how to enter the data to maximize the data quality. This effort entered over 68,000 records in less than 36 hours.
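The chunk-and-claim process described above can be sketched roughly as follows. This is only an illustration, not the project's actual software: the names `Chunk`, `split_into_chunks`, and `claim_next` are hypothetical, and the fixed size of 25 is an assumption taken from the "about 25 records" figure.

```python
from dataclasses import dataclass
from typing import List, Optional

CHUNK_SIZE = 25  # assumption: "about 25 records" per chunk


@dataclass
class Chunk:
    """A block of forum posts that one volunteer can claim (hypothetical type)."""
    chunk_id: int
    posts: List[str]
    claimed_by: Optional[str] = None


def split_into_chunks(posts: List[str], size: int = CHUNK_SIZE) -> List[Chunk]:
    """Divide a list of forum posts into consecutive chunks of at most `size`."""
    return [
        Chunk(chunk_id=start // size, posts=posts[start:start + size])
        for start in range(0, len(posts), size)
    ]


def claim_next(chunks: List[Chunk], volunteer: str) -> Optional[Chunk]:
    """Give the volunteer the first unclaimed chunk, or None if all are taken."""
    for chunk in chunks:
        if chunk.claimed_by is None:
            chunk.claimed_by = volunteer
            return chunk
    return None
```

Because each chunk can be claimed by only one person, two volunteers never enter the same posts, which is what keeps duplicate records down.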

(2) Is currently implemented in one of two ways. Software engineers either scrape or transform existing databases into the PeopleFinder Interchange Format (PFIF) and then load that data into the central database, OR missing persons database owners implement the full PFIF themselves, which allows an RSS feed of refugee data to be passed from database to database.

(3) Is handled by trying to coordinate among all the different teams and volunteers so that duplication of effort is minimized. The "chunking" and record claiming process is also critical.

(4) Is handled by contacting organizations to make them aware of PFIF and helping them decide whether to implement it. We might also try to provide some volunteer assistance to sites in implementing PFIF.

(5) Is being handled by a search interface into our main Salesforce.com data repository.

(6) Once data is in the repository, it should be processed to try to match missing people to found people, facilitate communication, and just generally help refugees out in any way possible via data and technology.
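As one possible starting point for matching missing people to found people, here is a minimal sketch that pairs records whose names agree after normalizing case and whitespace. It does not use the Salesforce API, and the field names `first_name` and `last_name` are assumptions for illustration, not the repository's actual schema. Exact name matching will miss misspellings and nicknames, so the results should be treated as candidates for a person to review.

```python
from collections import defaultdict


def name_key(first, last):
    """Normalize a name: lowercase it and collapse runs of whitespace."""
    return (" ".join(first.lower().split()), " ".join(last.lower().split()))


def match_records(missing, found):
    """Return (missing, found) pairs whose normalized names are identical.

    Both arguments are lists of dicts with hypothetical 'first_name' and
    'last_name' keys.
    """
    # Index the found records by normalized name for quick lookup.
    found_by_name = defaultdict(list)
    for rec in found:
        found_by_name[name_key(rec["first_name"], rec["last_name"])].append(rec)

    matches = []
    for rec in missing:
        key = name_key(rec["first_name"], rec["last_name"])
        for candidate in found_by_name.get(key, []):
            matches.append((rec, candidate))
    return matches
```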

This is a massively parallel volunteer effort. Please figure out a small part to play in these very large goals. Enlist some people to help, make sure other people know what you are doing and just go to it.
