Of these 412 cases, 83 were missing latitude-longitude data. Although many of the 412 cases were listed with location information down to the village level, a large proportion were listed only down to the location or sublocation level. A cursory analysis also revealed an irregular mosaic of spatial information in the data set, in accordance with the notations detailed in the metadata. For instance, the same latitude-longitude was assigned to several individuals who had a heterogeneous set of division and sublocation data, or an individual could be listed with a village name but no latitude-longitude. A more detailed analysis revealed individuals assigned to villages but with the wrong higher-level division assignment, and showed that some latitude-longitude data, when plotted in a Geographic Information System (GIS), placed the individual in a location incongruous with the assigned division.
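The check for coordinates that fall outside the assigned division can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on plotting in a GIS; it assumes each division is available as a simple polygon of (longitude, latitude) vertices and uses a standard ray-casting test. The record fields (`id`, `lat`, `lon`, `division`) and the helper names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def flag_incongruous(cases: List[dict], divisions: Dict[str, List[Point]]) -> List[int]:
    """Return ids of cases whose coordinates lie outside their assigned division.

    Cases without coordinates are skipped; cases whose division has no
    polygon (hypothetical lookup table) are flagged for manual review.
    """
    flagged = []
    for case in cases:
        lat: Optional[float] = case["lat"]
        lon: Optional[float] = case["lon"]
        if lat is None or lon is None:
            continue
        poly = divisions.get(case["division"])
        if poly is None or not point_in_polygon((lon, lat), poly):
            flagged.append(case["id"])
    return flagged
```

Flagged records would still need manual verification against gazetteer data, since a mismatch can stem from either a wrong coordinate or a wrong division label.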
As far as possible, we corrected these irregularities and verified all locations with a detailed, cross-checked survey of digital/GIS gazetteer data obtained from the International Livestock Research Institute (ILRI), NASA WorldWind 1.4, and Google Earth 4.2. Detailed data regarding the ultimate source(s) of information for all cases were recorded in the spreadsheet throughout the process. Ultimately, many cases could not be spatially resolved below the centroid of the division or location/sublocation. Of the 412 cases requiring spatial data, 61 could not be spatially resolved with confidence and were excluded, leaving 339 case locations.
When the 339 locations were collated by latitude-longitude, we produced a data set with 94 points, each representing one to 35 cases, distributed among the seven Kenyan districts. A similar set of human RVF case data was obtained from the WHO in May 2007, describing 64 points, each representing one to 25 cases, from early December 2006 to late April 2007 across Somalia, Kenya, and Tanzania. This data set was a compilation of the epidemiological data that were shared by national authorities with WHO during field investigation and comprised only confirmed RVF cases or probable cases with no laboratory results; cases with negative laboratory results were not included in the data set provided by WHO. In this data set, relatively more information was provided to the village level than in the MoH-K/CDC-K data set, but in some instances no latitude-longitude data were present.
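Collating case locations by latitude-longitude, as described above for the MoH-K/CDC-K data, amounts to grouping records that share a coordinate pair and counting them. The following is a minimal sketch rather than the procedure actually used. The function name, the rounding precision, and the example coordinates are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def collate_by_location(cases: Iterable[LatLon], ndigits: int = 4) -> Dict[LatLon, int]:
    """Group cases sharing a coordinate pair and count the cases at each point.

    Coordinates are rounded to `ndigits` decimal places (an assumed
    tolerance) so that trivially different float values collapse together.
    """
    counts: Counter = Counter()
    for lat, lon in cases:
        counts[(round(lat, ndigits), round(lon, ndigits))] += 1
    return dict(counts)


# Hypothetical coordinates, for illustration only.
example_cases = [(-0.4569, 39.6583), (-0.4569, 39.6583), (-1.75, 40.05)]
points = collate_by_location(example_cases)
```

Each key in the result is one mapped point and each value is the number of cases it represents, so the number of keys corresponds to the point count and the values sum to the number of case locations.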
We carried out an iterative geocoding search similar to that used for the CDC data set, using several digital/Global Positioning System (GPS) gazetteers, and were able to assign latitude-longitude data to all cases. The two data sets were integrated for post-outbreak evaluation. The Sudan Ministry of Health and WHO data set for Sudan, the IPM data set for Madagascar, and the NCID data for South Africa were evaluated for geocoding accuracy in a similar manner.