
IPND audit

The ACMA commissions periodic audits of the IPND. The 2009–10 IPND audit is the fourth in the series of audits conducted, under contract, by Gibson Quai-AAS and Data Analysis Australia. The previous three audits were conducted in 2004, 2005 and 2006 under the same contract.

The following provides an overview of the aggregate industry-wide findings of the 2009–10 audit, and compares the results with previous audits.

Quantity of records

As of November 2009, the IPND contained 54,768,428 records, an increase of 8,131,425 records from November 2006 (see Table 1).

Table 1: Number of fixed and mobile services for 2004, 2005, 2006 and 2009–10

                     2004               2005               2006               2009–10
Fixed services       22,928,297  57.6%  23,818,875  54.9%  25,058,824  53.7%  28,144,989  51.4%
Mobile services      16,971,703  42.4%  19,524,840  45.1%  21,578,179  46.3%  26,623,439  48.6%
All services         39,900,000  100%   43,343,715  100%   46,637,003  100%   54,768,428  100%

Overall address matching to Geo-coded National Address File (G-NAF) database

A scoring algorithm was developed with Gibson Quai-AAS and Data Analysis Australia. Using a search-and-match process, each IPND record is compared against all G-NAF records and the best-matching G-NAF record is identified. This address matching is run on the service address only.

The score for a particular IPND record is determined by how well each address component of that IPND record matches the corresponding G-NAF record. Points are given for various matching fields; for example, exactly matching street names accrue 110 points while street names that merely sound alike (using the 'Soundex' coding system) accrue only 40 points. The Soundex coding system attempts to give the same value to words that have similar or readily confused sounds. For example, the misspelling of the street name Elizabeth as Elizabth would not score an exact match (110 points) but may score 40 points for a Soundex match on Elizabeth.

A high score indicates an accurate match to G-NAF for address fields. A low score indicates an inaccurate address, perhaps where a match could only be made at the suburb level or little address information was available in the IPND.
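
To make the scoring idea concrete, here is a minimal Python sketch of per-field scoring of the kind described above. The point values (110 for an exact street-name match, 40 for a Soundex match) come from the text; the simplified Soundex routine and the function names are illustrative assumptions, not the auditors' actual implementation.

```python
# Illustrative sketch only: per-field scoring with an exact-match test and a
# Soundex ('sounds like') fallback, using the point values quoted above.

def soundex(word: str) -> str:
    """Return a simplified Soundex code: first letter plus three digits."""
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    word = word.upper()
    result = word[0]
    last = codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != last:
            result += code
        last = code
    return (result + "000")[:4]

def street_name_points(ipnd_name: str, gnaf_name: str) -> int:
    """Score one address component of an IPND record against a G-NAF record."""
    if ipnd_name.upper() == gnaf_name.upper():
        return 110   # exact street-name match
    if soundex(ipnd_name) == soundex(gnaf_name):
        return 40    # sound-alike match, e.g. ELIZABTH vs ELIZABETH
    return 0         # no match on this field

print(street_name_points("Elizabeth", "Elizabeth"))  # 110
print(street_name_points("Elizabth", "Elizabeth"))   # 40
```

In the full audit process, scores of this kind are accumulated across the address components of a record, and the best-scoring G-NAF record is retained as the match.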

For ease of presentation, address matching scores have been grouped into four categories, explained below (a brief code sketch of this banding follows the list):

  1. 300+ = highly accurate useability. Scores above 300 use most of the address fields and have perfect matches in those fields. Variations in scores at the top of the scale are due to additional useful address information, such as a street number suffix, being available and matching accurately.

  2. 200–299 = high/good useability. Addresses obtaining a score between 200 and 299 would generally be considered to have sufficient information to adequately or uniquely describe a location. Key address fields like street name and locality might have a Soundex match. Soundex matching is used when a perfect match fails, usually because of spelling errors.

  3. 100–199 = reasonable useability. Important fields for identifying the exact location of an address start to fail when a match is attempted against G-NAF.

  4. 0–99 = very poor useability/unusable. Addresses with scores in this range have poor matches and would have insufficient information to adequately or uniquely describe a location. For example, locality and street name do not match a G-NAF address in this score range.
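
As a simple illustration of the banding above, the helper below maps a matching score to its category label. The thresholds and names are exactly those listed; the function itself is an assumption for presentation only, not part of the audit methodology.

```python
# Illustrative helper: label an address-matching score using the four
# bands defined above.

def useability_band(score: int) -> str:
    if score >= 300:
        return "Highly accurate"
    if score >= 200:
        return "High/good useability"
    if score >= 100:
        return "Reasonable useability"
    return "Very poor useability/unusable"

assert useability_band(315) == "Highly accurate"
assert useability_band(240) == "High/good useability"
assert useability_band(95) == "Very poor useability/unusable"
```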

Modifications to audit processing

Prior to the 2006 audit, the ACMA found that in previous years the scoring algorithm had been applying a score for 'Street Number 1' and 'Street Number 2' in circumstances where the field was blank or zero. As a result, following consultation between the ACMA and industry, the following changes were made to the 'service address' full address matching scoring algorithm used for the 2006 audit, and subsequently for the 2009–10 audit:

  1. Points are no longer scored by the full address matching algorithm for missing 'Street Numbers' or where 'Street Numbers' are zero (applicable to both 'Street Number 1' and 'Street Number 2').

  2. Different uses of Mt/Mount and St/Saint, and 'City' extensions, in the 'Locality' field are accepted, and whitespace and punctuation are removed before comparison (for example, OCONNOR compared with O'CONNOR); see the normalisation sketch after this list.

  3. Validly missing 'street types' are scored.
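
The locality handling in point 2 can be pictured with the hedged sketch below: expand the common Mt/St abbreviations, strip punctuation and whitespace, and drop a trailing 'City' extension before comparing. The exact normalisation rules used in the audit are not reproduced here; this is a simplified assumption for illustration.

```python
# Illustrative sketch of locality normalisation before comparison.
import re

ABBREVIATIONS = {"MT": "MOUNT", "ST": "SAINT"}

def normalise_locality(locality: str) -> str:
    # Remove punctuation and digits, e.g. O'CONNOR -> OCONNOR.
    cleaned = re.sub(r"[^A-Z ]", "", locality.upper())
    words = cleaned.split()
    # Drop a trailing 'CITY' extension, e.g. MELBOURNE CITY -> MELBOURNE.
    if words and words[-1] == "CITY":
        words = words[:-1]
    # Expand Mt/St abbreviations.
    words = [ABBREVIATIONS.get(w, w) for w in words]
    # Remove remaining whitespace so the comparison is purely on letters.
    return "".join(words)

assert normalise_locality("O'Connor") == normalise_locality("OCONNOR")
assert normalise_locality("Mt Isa") == normalise_locality("Mount Isa")
```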

Results presented for 2004, 2005 and 2006B are reported in accordance with the original methodology and are directly comparable with each other. Results for 2006 and 2009–10 use the revised methodology and are directly comparable with each other.

Complete address match accuracy – service address

The following tables show the aggregate industry-wide audit results for full address matching of 'service address' fields for 2009–10, 2006, 2006B, 2005 and 2004. The results demonstrate an overall improvement for both fixed and mobile services since the audits began in 2004.

The major finding of the 2009–10 audit is a significant improvement in the quality of records, evidenced by the improvement in the 'service address' full address matching score. The result of 96% of records scoring 200+ compared with 89% in 2006 is an improvement of 7 percentage points since the last audit.

Table 2: Industry-wide 2009–10 audit results

Service Address Complete Address match to G-NAF (matching score), observed 2009–10

                                        Fixed services  Mobile services  Total services
Highly accurate (>=300)                 70.1%           69.7%            69.9%
High/good useability (>=200)            96.1%           94.8%            95.5%
Reasonable useability (>=100)           99.7%           99.4%            99.6%
Very poor useability/unusable (<100)    0.3%            0.6%             0.4%

Table 3: Industry-wide 2006 audit results

Service Address Complete Address match to G-NAF (matching score), observed 2006

                                        Fixed services  Mobile services  Total services
Highly accurate (>=300)                 67.5%           57.9%            63.1%
High/good useability (>=200)            92.4%           85.4%            89.2%
Reasonable useability (>=100)           99.3%           98.6%            99.0%
Very poor useability/unusable (<100)    0.7%            1.4%             1.0%

Note: The following 2006B, 2005 and 2004 results are directly comparable with each other, but not with those above, due to changes in audit methodology.

Table 4: Industry-wide 2006B audit results

Service Address Complete Address match to G-NAF (matching score), observed 2006B

                                        Fixed services  Mobile services  Total services
Highly accurate (>=300)                 82.1%           61.9%            72.8%
High/good useability (>=200)            96.8%           91.8%            94.5%
Reasonable useability (>=100)           99.7%           99.6%            99.6%
Very poor useability/unusable (<100)    0.1%            0.1%             0.1%

Table 5: Industry-wide 2005 audit results

Service Address Complete Address match to G-NAF (matching score), observed 2005

                                        Fixed services  Mobile services  Total services
Highly accurate (>=300)                 79.3%           35.2%            59.1%
High/good useability (>=200)            96.8%           85.2%            91.5%
Reasonable useability (>=100)           99.8%           98.9%            99.4%
Very poor useability/unusable (<100)    0.2%            1.1%             0.6%

Table 6: Industry-wide 2004 audit results

Service Address Complete Address match to G-NAF (matching score), observed 2004

                                        Fixed services  Mobile services  Total services
Highly accurate (>=300)                 79.6%           34.6%            60.2%
High/good useability (>=200)            84.6%           83.2%            89.7%
Reasonable useability (>=100)           99.6%           98.8%            99.3%
Very poor useability/unusable (<100)    0.4%            0.8%             0.6%

Addresses for fixed-line services generally continue to obtain better address matching results than mobile services, but the gap is closing. In 2009–10, 70.1% (2006: 67.5%) of fixed service addresses obtained a score of at least 300 (suggesting a highly accurate address), whereas 69.7% (2006: 57.9%) of mobile service addresses obtained the same score.

Testing of individual address fields

In addition to full address matching, validity tests were conducted on the individual components of the address. While the validity tests are run independently of the scoring algorithm's matching process, they can be used to highlight areas where data is of poor quality and would therefore be unlikely to achieve a good match with G-NAF. Validity tests are run on service addresses, and some are also run on directory addresses.

Validity tests check that the address components (for example, locality and state) of each IPND record:

  1. are in the correct format and structure

  2. exist in the list of valid addresses stored in G-NAF (see the sketch below).
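
A hedged sketch of this kind of per-field validity test is shown below: format checks that need no reference data, plus an existence check against a G-NAF lookup. The field names and the shape of the reference data are assumptions for illustration, not the audit's actual test suite.

```python
# Illustrative validity tests for a single IPND record.

VALID_STATES = {"ACT", "NSW", "NT", "QLD", "SA", "TAS", "VIC", "WA"}

def check_record(record: dict, gnaf_localities: set) -> list:
    """Return a list of validity failures for one IPND record."""
    failures = []
    # Format/structure checks.
    street_number = record.get("street_number_1", "")
    if street_number and not street_number.isdigit():
        failures.append("invalid 'Street Number 1'")
    if record.get("state", "") not in VALID_STATES:
        failures.append("invalid 'State'")
    # Existence check against reference data derived from G-NAF.
    locality_state = (record.get("locality", ""), record.get("state", ""))
    if locality_state not in gnaf_localities:
        failures.append("locality/state pair not found in G-NAF")
    return failures

gnaf_localities = {("BRADDON", "ACT")}
record = {"street_number_1": "12", "locality": "BRADDON", "state": "ACT"}
print(check_record(record, gnaf_localities))  # []
```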

Non-address fields

The 2009–10 audit has identified that non-address fields continue to be generally correctly populated, particularly where a prescribed code list is provided. The exception to this remains, as with previous audits, the 'Type of Service' field, where an example code list is available but not necessarily adopted by the data providers.

These non-mandatory fields also include those with fewer restrictions on their contents. In general, the audit cannot test the validity of the contents, only that a field is populated in accordance with the IPND code and guidelines. The 'Service Type' field (Business, Govt, Residential, etc.) is an example where the field can be checked to ensure it is populated with the agreed codes; however, one of the most frequently occurring values (in 30.7% of records) is 'N', for 'not known'.
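
As a small, purely illustrative example of checking a code list of this kind, the snippet below tallies 'Service Type' values to see how often 'N' (not known) occurs. The codes shown are assumptions based on the examples in the text, not the actual IPND code list.

```python
# Illustrative only: summarise how often each 'Service Type' code occurs.
from collections import Counter

def service_type_shares(records):
    counts = Counter(r.get("service_type", "") for r in records)
    total = sum(counts.values()) or 1
    return {code: count / total for code, count in counts.items()}

sample = [{"service_type": "R"}, {"service_type": "N"}, {"service_type": "B"}]
print(service_type_shares(sample))  # roughly one third each for R, N and B
```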

Key findings of 2009–10 audit

Solid overall improvement

A number of carriage service providers have improved the quality of their data over that observed in the 2006 audit, leading to an overall improvement in the scores achieved.

Table 7 compares the results of key fields that are either missing or invalid over the 2005, 2006 and 2009–10 audits, showing clear improvement over time.

Table 7: Comparison of key missing/invalid fields for 2005, 2006 and 2009–10 audits

Audit test                     2009–10    2006       2005
Invalid 'Street Number 1'      5,344      367,424    1,423,031
Missing 'Street Type'          3.6%       12.6%      26.1%
Invalid 'Locality'             1.5%       2.7%       8.2%
Invalid 'State'                0.1%       5.4%       24.3%

Street name and locality pair

An improvement has been made to the most important information, the 'Street Name' and 'Locality' pair: the audit measurement of this pair showed a reduction in the percentage scored as invalid, from 24.1% in 2006 to 11.6% in 2009–10.

Service address data is more often achieving a direct match

Overall, the quality of IPND service address data is much improved, with the Soundex matching process required in far fewer instances to record a score.



