Geocoding is the process of linking data to a geographic location by
comparing data representations contained in two or more files. In the
INDEA geocoding application developed for EBC, the aim is to identify
the Electoral District (ED) and Voter Area (VA) corresponding to an
input address, by associating this address with a road segment in the
INDEA.
Two types of geocoding are implemented, namely address geocoding and
point geocoding. This was necessary to handle the two types of road
layers (standard and source) in the INDEA. These differ in that each
road segment in the standard layer has an explicit ED/VA code as well
as address ranges associated with each side, while those in the source
layer have address ranges but no ED/VA codes.
For roads in the standard layer, address geocoding is realized by
first identifying the INDEA coverage road segment whose address range
contains the input address. Since each road in the standard layer has
an associated ED/VA for each side, address geocoding for roads in this
layer then simply involves returning the ED/VA code corresponding to
the appropriate side of the identified segment. For roads in the source
layer, the road segment containing the address is first identified, and
interpolation is then used to estimate the location (i.e., latitude and
longitude) corresponding to the address. A point-in-polygon function is
then used to determine the ED/VA polygon containing this location.
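
The two approaches above can be illustrated with a minimal Python sketch.
It is not the INDEA implementation: the Segment record, its field names,
the ED/VA code format, and the odd/even parity test for choosing a side
are all assumptions made for illustration. The interpolation places the
point on the segment itself, with no offset toward the side of the road,
and the point-in-polygon test uses the standard ray-casting method.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass
class Segment:
    """Hypothetical road segment record; all field names are illustrative."""
    street: str
    start: Point
    end: Point
    left_range: Tuple[int, int]       # (from, to) house numbers, left side
    right_range: Tuple[int, int]      # (from, to) house numbers, right side
    left_edva: Optional[str] = None   # present only in the standard layer
    right_edva: Optional[str] = None  # present only in the standard layer


def side_of(seg: Segment, number: int) -> Optional[str]:
    """Return the side whose range contains the number (assumes parity rule)."""
    for side, (lo, hi) in (("left", seg.left_range), ("right", seg.right_range)):
        if min(lo, hi) <= number <= max(lo, hi) and number % 2 == lo % 2:
            return side
    return None


def find_segment(segments: Sequence[Segment], street: str, number: int):
    """Find the first segment (and side) whose address range contains the number."""
    for seg in segments:
        if seg.street == street:
            side = side_of(seg, number)
            if side is not None:
                return seg, side
    return None


def address_geocode(seg: Segment, side: str) -> Optional[str]:
    """Standard layer: return the ED/VA code stored for the matched side."""
    return seg.left_edva if side == "left" else seg.right_edva


def interpolate(seg: Segment, side: str, number: int) -> Point:
    """Source layer: estimate a location by linear interpolation along the segment."""
    lo, hi = seg.left_range if side == "left" else seg.right_range
    t = 0.5 if hi == lo else (number - lo) / (hi - lo)
    return (seg.start[0] + t * (seg.end[0] - seg.start[0]),
            seg.start[1] + t * (seg.end[1] - seg.start[1]))


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_geocode(seg: Segment, side: str, number: int,
                  edva_polygons: Dict[str, List[Point]]) -> Optional[str]:
    """Source layer: interpolate a point, then find the ED/VA polygon containing it."""
    pt = interpolate(seg, side, number)
    for code, polygon in edva_polygons.items():
        if point_in_polygon(pt, polygon):
            return code
    return None
```

In this sketch a standard-layer lookup stops at address_geocode, while a
source-layer lookup passes the matched segment to point_geocode together
with a table of ED/VA polygons.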
Before an input address can be geocoded, it must first be matched with
its counterpart in the database. A major problem here is that street
names, street directions, etc., may not always have exactly the same
spelling, and so the process for finding matches must allow for "fuzzy"
matching. The INDEA geocoding application achieves a high degree of
success in the matching, even when the address data is inconsistent or
inexact, by using the commercial address-matching product
MatchWare/PACE (MatchWare Technologies, Inc., recently acquired by
Vality Technology Inc.) to perform address standardization and matching
based on probabilistic schemes. (Integrating MatchWare/PACE into the
geocoding application required that two MatchWare databases, each
containing standardized addresses, be generated beforehand.) The
matching also relies on special-purpose algorithms that drive
MatchWare/PACE under various conditions and use its output to identify
optimal and/or reliable matches.
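
The general idea of standardizing addresses and then accepting only
sufficiently close matches can be sketched as follows. This does not
reproduce MatchWare/PACE, whose probabilistic scheme is not described
here; as a stand-in it uses the similarity ratio from Python's standard
difflib module. The abbreviation table, the function names, and the 0.85
threshold are assumptions chosen for illustration.

```python
import difflib
import re
from typing import Iterable, Optional, Tuple

# Illustrative abbreviation table; a real standardizer would be far larger.
ABBREVIATIONS = {
    "STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "DRIVE": "DR",
    "NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W",
}


def standardize(street: str) -> str:
    """Uppercase, strip punctuation, and abbreviate common street words."""
    tokens = re.sub(r"[^A-Za-z0-9 ]", " ", street).upper().split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def best_match(street: str, candidates: Iterable[str],
               threshold: float = 0.85) -> Optional[Tuple[str, float]]:
    """Return the closest candidate and its score, or None if below threshold."""
    target = standardize(street)
    scored = [
        (difflib.SequenceMatcher(None, target, standardize(c)).ratio(), c)
        for c in candidates
    ]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= threshold else None
```

Here a misspelled input such as "Mian Street North" would still be
matched to a database entry "MAIN ST N", while an unrelated street name
falls below the threshold and is reported as unmatched rather than
forced onto a poor candidate.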
The use of MatchWare/PACE, in conjunction with the strategies for
matching and the techniques for address and point geocoding, provides
an effective system for satisfying the geocoding requirements of EBC.