V.2 Address Numbers (‘House’ Numbers), Normalization and Formats: HNI, HNS and HND
Address numbers identify buildings, and are combined with street names and addressable place names (see Chapter III.6) or with street codes (as surrogates of street names or place names) to form addresses. Address numbers are commonly called ‘house’ numbers (although this term is a misnomer, since many addresses refer to buildings other than houses). To be consistent with common parlance and with other Geosupport documentation, the term ‘house number’ will be used instead of ‘address number’ in the remainder of this document, except in literal citations of Geosupport reject messages, since those messages use the term ‘address number’.
Applications can pass a house number to any of the address-processing functions in character form, using the 12-byte WA1 input House Number field for MSW or the 16-byte WA1 input House Number field for COW. A house number passed in this manner need not be in any particular format; it can be a ‘raw’, unformatted house number. Alternatively, house numbers can be passed in a 6-byte WA1 input field in a special Geosupport format called the House Number in Internal format (HNI); this presumes that the application has obtained the HNI from a previous Geosupport call. HNIs are used only with MSW. A newer Geosupport format, the House Number in Sort format (HNS), is used for COW.
When a house number is passed to Geosupport in the WA1 input House Number field (12 bytes for MSW, 16 bytes for COW), Geosupport normalizes it. The house number normalization algorithm is complex, and a full description of it is beyond the scope of this document, but some aspects are discussed below. If normalization is successful, the normalized house number is returned to the application in WA1 in two standard formats: for MSW, the 12-byte output House Number in Display format (HND) and the 6-byte output House Number in Internal format (HNI); for COW, the 16-byte output HND and the 11-byte output House Number in Sort format (HNS). The HND is in character form and is suitable for display, for example, on application screens, reports and mailing labels. Although the HNS format contains character data, it is intended for Geosupport internal use. To conserve space, however, users may store the HNS in their files.
The HNI format contains packed decimal data, and is the format that Geosupport uses internally to perform its address-matching routines. The HNI is not documented in detail herein, and is of little direct relevance to most users. However, to conserve disk space in application files in which house numbers must be stored, users can store the 6-byte HNI rather than the 12-byte HND (for MSW), or the 11-byte HNS rather than the 16-byte HND (for COW), and then use any of the display functions (Functions D, DG and DN) to obtain the house number in HND format for display, as described below.
Processing of HNIs or HNSs by the Display Functions
The processing of an input HNI or HNS by a display function consists only of forming and outputting the HND. The successful processing of an input HNI or HNS by a display function implies that the HNI or HNS conforms to Geosupport’s format requirements for HNIs or HNSs, but does not imply that the HNI or the HNS forms part of a valid address.
The display functions can process up to two input HNI or HNS values in a single call, using the two input HNI or HNS fields and two output HND fields in WA1. If two input HNIs or HNSs are supplied, they are processed independently of each other and are not treated as forming an address range. If only one input HNI or HNS is supplied, it may be passed in either of the input HNI or HNS fields.
The display functions return one output HND for each validly formatted input HNI or HNS. For each input HNI or HNS that is invalid, the display functions return all question marks (the character ‘?’) in the corresponding output HND field. In addition, if at least one input HNI or HNS is invalid, the GRC value ‘13’, Reason Code value ‘9’ and corresponding Message are issued.
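As an illustration of the rule above, the following minimal Python sketch shows how an application might inspect the results of a display function call. It assumes the application has already extracted the two output HND fields, the GRC and the Reason Code from WA1 as strings; the function names and the sample field contents are hypothetical and are not part of Geosupport.

```python
from typing import List


def invalid_hnd_flags(hnd_fields: List[str]) -> List[bool]:
    """Flag each output HND field that consists entirely of '?' characters."""
    return [bool(f) and set(f) == {"?"} for f in hnd_fields]


def is_invalid_hni_or_hns_condition(grc: str, reason_code: str) -> bool:
    """True for the GRC '13' / Reason Code '9' combination described above."""
    return grc == "13" and reason_code == "9"


# Made-up sample output fields (12 bytes each, as for MSW): the first input
# was invalid, the second was validly formatted.
hnd_out = ["?" * 12, "22-28".ljust(12)]
flags = invalid_hnd_flags(hnd_out)
```

Because the two inputs are processed independently, one field can be flagged while the other still carries a usable HND.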
The display functions can also be used to obtain street names corresponding to input street codes. (The processing of street codes by the display functions is discussed in detail in Chapter IV.6.) In a single call, the display functions can process input HNIs or HNSs without input street codes, input street codes without HNIs or HNSs or both types of input. If both HNIs or HNSs and street codes are provided as input data to a display function call, they are processed independently of each other and are not treated as forming an address. In particular, the display functions perform no address validation.
HNIs or HNSs as Input to the Address-Processing Functions
The user has the option of providing input house numbers to the address-processing functions in the form of an HNI or HNS instead of a ‘raw’ unprocessed house number. This feature is useful for processing an application file that already contains house numbers in HNI or HNS format from a previous pass through Geosupport. The use of this feature slightly improves execution efficiency by allowing Geosupport to circumvent the house number normalization routine.
House Number Format Standards
‘Raw’ (un-normalized) input house numbers must conform to certain Geosupport standards, which are based on the characteristics of New York City’s addresses. If an input house number does not satisfy these standards, Geosupport is unable to normalize it and rejects the call. The house number standards include the following, among others:
• Conformance to a set of allowable characters
• A limitation on the total length of the ‘basic house number’ (this term and the term ‘house number suffix’ are defined below)
• Limitations on the number of digits and maximum numeric values of the basic house number, if it does not contain a hyphen; or such limitations on the portions of the basic house number preceding and following the hyphen, if a hyphen is present
• Validity of the house number suffix (discussed below), if one is present
Every valid New York City house number conforms to the above standards.
The ability of Geosupport to normalize an input house number does not by itself signify that the house number, together with the input borough and street, forms a valid New York City address. Successful normalization signifies only that the input house number conforms to Geosupport’s house number format criteria. Only the successful completion of a two-work-area call to one of the address-processing functions has significance with respect to the geographic validity of the input address. (See Chapter II.4 for a discussion of the distinction between the validations performed by one- and two-work-area calls.)
New York City house numbers consist of a ‘basic house number’, possibly followed by a ‘house number suffix’. (Note: the basic house number and house number suffix are not to be confused with the digits to the left and right of the hyphen in a hyphenated house number. For example, in the Queens address ‘240-55 1/3 DEPEW AVENUE’, ‘240-55’ is the basic house number, and is hyphenated; ‘1/3’ is the house number suffix.) A dash character may appear in the input house number field between the basic house number and the house number suffix, e.g. 22-GARAGE. Geosupport replaces the dash with a blank and processing continues. No message is generated for this situation.
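The replacement of a dash between the basic house number and the suffix can be sketched as follows. This is only an approximation for illustration: it assumes the suffix begins with a letter, and it does not check the suffix against Geosupport’s list of valid suffixes (which is not given here). The function name is hypothetical.

```python
import re

# Basic house number (optionally hyphenated), then a dash, then a suffix
# that starts with a letter. Fractional suffixes such as 1/3 are not covered.
_DASH_BEFORE_SUFFIX = re.compile(r"^(\d+(?:-\d+)?)-([A-Za-z].*)$")


def blank_out_suffix_dash(raw: str) -> str:
    """Replace a dash that precedes an alphabetic suffix with a blank."""
    text = raw.strip()
    match = _DASH_BEFORE_SUFFIX.match(text)
    if match:
        return f"{match.group(1)} {match.group(2)}"
    return text
```

With this sketch, ‘22-GARAGE’ becomes ‘22 GARAGE’, while ‘240-55 1/3’ is left unchanged because its only dash lies inside the basic house number.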
Only a small percentage of New York City addresses have house number suffixes. The following are some examples of valid New York City addresses containing house number suffixes (in order: Front, Rear, 1/2, 1/3, 1/4, A, C and Garage):
519 Front East 12th Street (Manhattan)
625 Rear Smith Street (Brooklyn)
120 1/2 First Avenue (Manhattan)
240-55 1/3 Depew Avenue (Queens)
469 1/4 Father Capodanno Boulevard (Staten Island)
470 A West 43rd Street (Manhattan)
171C Auburn Avenue (Staten Island)
20-29 Garage 120th Street (Queens)
Input basic house numbers may contain a dash (the character ‘-’), which can serve either as a hyphen, as with most house numbers in Queens and some house numbers in other boroughs, or as a range separator.
• House Number Ranges: Addresses in New York City are often expressed in ranges, using a dash to separate the low and high house numbers of the range. For example, 22-28 Reade Street in Manhattan represents the range of even addresses consisting of 22 Reade Street, 24 Reade Street, 26 Reade Street and 28 Reade Street, all of which are valid individual addresses for the same building. In other words, in this example, the character string ‘22-28’ is not an individual house number, but represents a range of house numbers, in which the dash serves as a range separator, and the number to the left of the dash, 22, as well as that to the right of the dash, 28, constitute by themselves valid individual house numbers for Reade Street.
• Hyphenated House Numbers: Consider the Queens address 22-28 36th Street. The house number portion of the address, 22-28, consists of the same character string as the above Reade Street example, but it has a very different meaning in the two cases. In the Reade Street case, 22-28 represents a range of even house numbers; in the 36th Street case, 22-28 is a single hyphenated house number, not a range of several unhyphenated house numbers. In a hyphenated house number, the digits to the left and to the right of the hyphen in combination form a single house number; the digits on one side of the hyphen are not by themselves geographically meaningful. For example, 22 36th Street and 28 36th Street are not valid Queens addresses. In addition, the position of the hyphen within a hyphenated house number is significant. For example, consider the addresses 13-103 41st Avenue and 131-03 41st Avenue. These are two distinct addresses on the same Queens street, even though the house numbers consist of the same sequence of digits and differ only in the position of the hyphen.
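The meaning of a house number range such as 22-28 Reade Street can be made concrete with a short Python sketch. It assumes, as in the Reade Street example, that both ends of the range have the same parity, and it is only meaningful when the dash is known to be a range separator; applying it to a hyphenated house number such as 22-28 36th Street in Queens would produce house numbers that are not valid addresses. The function name is hypothetical.

```python
from typing import List


def expand_house_number_range(text: str) -> List[int]:
    """Expand 'low-high' into the individual same-parity house numbers."""
    low_text, high_text = text.split("-")
    low, high = int(low_text), int(high_text)
    if low > high or low % 2 != high % 2:
        raise ValueError(f"not a same-parity ascending range: {text!r}")
    # Step by 2 so that an even range yields only even house numbers.
    return list(range(low, high + 1, 2))


reade_street = expand_house_number_range("22-28")
```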
Geosupport’s house number normalization algorithm interprets a dash encountered in an input house number either as a hyphen or as a range separation character, depending on the borough, the street (some streets do not conform to the norm for their borough with respect to house number hyphenation) and other criteria.
• When Geosupport interprets the dash as a range separation character: In normalizing the input house number, both the dash itself and the portion of the basic house number to the right of the dash are deleted. As one consequence of this, when the input to a two-work-area call is an address range, only the address formed from the house number to the left of the dash is validated; the house number to the right of the dash is ignored and no conclusion can be drawn about its validity from the success or failure of the call. For example, 22-28 Reade St in Manhattan is normalized as 22 READE STREET; the ‘28’ is ignored during normalization, and is not validated as an individual house number in a two-work-area call.
• When Geosupport interprets the dash as a hyphen: In normalizing the input house number, the digits on both sides of the hyphen are retained, as is the hyphen itself.
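The two cases above can be sketched in Python as follows. The decision between hyphen and range separator depends on the borough, the street and other criteria that are internal to Geosupport, so this sketch simply takes that decision as a parameter supplied by the caller. It handles the basic house number only (no suffix), and the function name is hypothetical.

```python
def normalize_basic_house_number(basic: str, dash_is_hyphen: bool) -> str:
    """Apply the range-separator or hyphen treatment to a basic house number."""
    left, dash, right = basic.partition("-")
    if not dash:
        # No dash present: nothing to interpret.
        return basic
    if dash_is_hyphen:
        # Hyphen: both sides and the hyphen itself are retained.
        return f"{left}-{right}"
    # Range separator: the dash and everything to its right are dropped.
    return left
```

For example, ‘22-28’ is reduced to ‘22’ under the range-separator interpretation (the Reade Street case) and is kept as ‘22-28’ under the hyphen interpretation (the 36th Street case).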
If Geosupport determines that an input house number in character form has a missing or inappropriately present dash, then whenever it is feasible, Geosupport modifies the house number to correct the error before normalizing it. (Geosupport never modifies input HNIs or HNSs.) Geosupport will make such a modification automatically (without user request), but only if the intended address is clear and unambiguous and is valid for the function being called, and a valid address could not be formed by normalizing the input house number in a different fashion. Two types of such dash-related modifications are as follows:
• When an input house number does not contain a dash, but Geosupport determines that the house number should be hyphenated: Geosupport inserts a hyphen, provided it can determine the proper position of the hyphen unambiguously so that a valid address results. For example, the input address 6603 Booth Street in Queens is normalized as 66-03 BOOTH STREET; the input address 63101 Alderton Street in Queens is normalized as 63-101 ALDERTON STREET.
• When an input house number contains a dash, but Geosupport determines that the presence of the dash is erroneous (i.e., the house number is invalid whether the dash is interpreted as a hyphen or as a range separator): Geosupport concatenates the digits to the left and right of the dash without retaining the dash itself, provided that this results in a valid address. For example, 10-22 38th Street in Brooklyn is normalized as 1022 38 STREET.
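The two dash-related modifications above can be illustrated with a small Python sketch. It is not Geosupport’s algorithm: address validity is represented by a caller-supplied `is_valid` predicate (a hypothetical stand-in for Geosupport’s internal address data), and the sketch does not itself verify that the dash is invalid both as a hyphen and as a range separator before removing it.

```python
from typing import Callable, Optional


def insert_missing_hyphen(digits: str, is_valid: Callable[[str], bool]) -> Optional[str]:
    """Try every hyphen position; succeed only if exactly one is valid."""
    candidates = [f"{digits[:i]}-{digits[i:]}" for i in range(1, len(digits))]
    valid = [c for c in candidates if is_valid(c)]
    return valid[0] if len(valid) == 1 else None


def remove_erroneous_dash(house_number: str, is_valid: Callable[[str], bool]) -> Optional[str]:
    """Concatenate the digits around the dash if that yields a valid number."""
    joined = house_number.replace("-", "", 1)
    return joined if is_valid(joined) else None


# Toy validity sets standing in for the streets used in the examples above.
booth_street = {"66-03"}.__contains__
alderton_street = {"63-101"}.__contains__
brooklyn_38_street = {"1022"}.__contains__
```

Requiring exactly one valid hyphen position mirrors the condition that the proper position must be determinable unambiguously.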
Whenever the house number normalizer makes an assumption about, or a dash-related modification to, an input house number, Geosupport informs the calling application by issuing a warning condition. A warning is issued, for example, when Geosupport assumes that an input dash is a range separator and then normalizes the house number by deleting the dash and digits following it, or when it assumes that a required hyphen is missing and inserts one.
When Geosupport cannot normalize an input house number into a valid address without making a dash-related modification, and more than one type of dash-related modification would result in a valid address, the input is considered ambiguous. For a rejection of this kind, the Message lists the possible valid forms of the input address, which helps the user determine how the input house number should be modified to make it valid. For example, consider the input 10-14 Lexington Avenue in Manhattan. Lexington Avenue has unhyphenated addresses only. There are two reasonable interpretations of the user’s intended input in this example: 10 Lexington Avenue, which assumes the input is an address range, and 1014 Lexington Avenue, which assumes the dash is an inappropriately present hyphen. All of the address-processing functions consider both of these to be valid addresses. Originally, 10-14 Lexington Avenue in Manhattan was rejected as ambiguous; at user request, however, Geosupport now accepts the first successful house number, i.e., 10 Lexington Avenue in Manhattan.
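The Lexington Avenue example can be sketched in Python as follows. The order in which the interpretations are tried (range first, then concatenation) is an assumption inferred from the example, in which 10 Lexington Avenue is the accepted result; the `is_valid` predicate and the function name are hypothetical stand-ins for Geosupport’s internal validation.

```python
from typing import Callable, List, Optional, Tuple


def resolve_dash_on_unhyphenated_street(
    house_number: str, is_valid: Callable[[str], bool]
) -> Tuple[Optional[str], List[str]]:
    """Return the first valid interpretation and all valid interpretations."""
    left, _, right = house_number.partition("-")
    # Assumed order: range interpretation first, then dash removal.
    candidates = [left, left + right]
    valid = [c for c in candidates if is_valid(c)]
    accepted = valid[0] if valid else None
    return accepted, valid


# Toy validity set standing in for Lexington Avenue in Manhattan.
lexington_avenue = {"10", "1014"}.__contains__
accepted, alternatives = resolve_dash_on_unhyphenated_street("10-14", lexington_avenue)
```

Returning the full list of valid interpretations alongside the accepted one lets an application show the user the alternative forms of the address.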
In the borough of Queens, the great majority of streets have hyphenated house numbers only; a few streets have unhyphenated house numbers only, and a few streets have ‘mixed hyphenation’ (i.e., both hyphenated and unhyphenated house numbers). In the other four boroughs, all but a few streets have unhyphenated house numbers only, a few streets have hyphenated house numbers only, and a few streets have mixed hyphenation. Riverside Drive in Manhattan is an example of a mixed-hyphenation street. A small stretch of Riverside Drive running north from West 156th Street has hyphenated even addresses ranging from 156-00 to 159-34 (with some gaps). The remainder of Riverside Drive has unhyphenated addresses only.
Information on the address hyphenation status of each of the city’s streets is maintained internally within Geosupport. The house number normalizer makes use of this information when analyzing an input house number that contains a dash character. Dash analysis is particularly complex for mixed-hyphenation streets, for which a dash could be either a hyphen or a range separator. For example, 156-158 Riverside Drive is a valid range of unhyphenated addresses assigned to a building located near West 88th Street, while 156-10 Riverside Drive is a valid single hyphenated address assigned to a building located near West 156th Street.
When there are more than 3 digits following the dash in an input house number on a street having unhyphenated or mixed-hyphenation house numbers, Geosupport treats the dash as a range separation character and issues a warning message that the address number has been altered (GRC 01 / Reason 1). When this input occurs on a street having only hyphenated house numbers, the call is rejected and Geosupport issues an error message (GRC 13 / Reason 2).
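The rule for more than 3 digits after the dash can be summarized in a short Python sketch. The street’s hyphenation status is maintained internally by Geosupport, so here it is passed in by the caller as one of three hypothetical labels; the function name, the outcome labels and the sample house number in the usage note are likewise made up for illustration.

```python
import re
from typing import Optional, Tuple

STREET_STATUSES = {"hyphenated", "unhyphenated", "mixed"}


def long_right_side_rule(house_number: str, street_status: str) -> Optional[Tuple[str, str, str]]:
    """Return (outcome, GRC, Reason Code), or None if the rule does not apply."""
    if street_status not in STREET_STATUSES:
        raise ValueError(f"unknown street status: {street_status!r}")
    _, dash, right = house_number.partition("-")
    leading_digits = re.match(r"\d*", right).group(0)
    if not dash or len(leading_digits) <= 3:
        return None
    if street_status == "hyphenated":
        # Hyphenated-only street: the call is rejected.
        return ("rejected", "13", "2")
    # Unhyphenated or mixed street: dash treated as a range separator, with a warning.
    return ("range_separator", "01", "1")
```

For instance, a made-up input such as ‘100-1234’ yields the warning outcome on an unhyphenated or mixed-hyphenation street and the rejection outcome on a hyphenated-only street, while ‘22-28’ is not affected by this rule.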