III.2 Street Name Normalizing and the SNL Parameter
Street name normalizing is governed by a user-controllable parameter called the Street Name Normalization Length Limit (SNL), which sets an upper limit to the lengths of output normalized street names. The SNL feature is particularly useful in applications that have a restricted amount of space for the display of street names, such as when addresses must be visible through transparent envelope windows, or when a screen display or printed report line is crowded.
The user specifies an SNL value using the two-byte WA1 input SNL field. The permissible range of SNL values is 4 through 32, inclusive. The setting of an SNL value is optional. If the user specifies no SNL value, the default value of 32 is in effect for that call to Geosupport. Every call to Geosupport is an independent event, even within a single execution of a user program, so if an SNL value other than 32 is desired in a particular call, it must be explicitly specified in that call; Geosupport does not ‘remember’ an SNL value specified in a previous call.
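The default and range rules above can be sketched in a few lines. This is an illustration only, not part of Geosupport: the function name `effective_snl` is hypothetical, and raising an error for an out-of-range value is an assumption, since this section does not say how Geosupport responds to an invalid SNL.

```python
DEFAULT_SNL = 32           # in effect when the caller supplies no SNL value
MIN_SNL, MAX_SNL = 4, 32   # permissible range, inclusive


def effective_snl(requested=None):
    """Return the SNL that applies to a single call (hypothetical helper)."""
    if requested is None:
        return DEFAULT_SNL
    if not MIN_SNL <= requested <= MAX_SNL:
        # Assumption: the sketch rejects out-of-range values outright.
        raise ValueError(f"SNL must be between {MIN_SNL} and {MAX_SNL}")
    return requested
```

Because no SNL value carries over from one call to the next, a caller that wants a value other than 32 would pass it on every call, for example `effective_snl(20)` each time rather than once at program start.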
Geosupport attempts to normalize each input street name in such a way that the result has a length in bytes that does not exceed the SNL value in effect. The SNL also governs the length of the normalized street name output returned by the display functions (Functions D, DG and DN). However, the SNL does not limit the lengths of input street names. Regardless of the SNL value, the maximum length of an input street name is 32 bytes, which is the length of the WA1 street name input fields.
The smaller the SNL value the user specifies, the more difficult it is for Geosupport to normalize input street names within that length limit, and therefore the greater the proportion of input street names that are likely to be rejected as not normalizable. Consequently, users who must limit the lengths of normalized street names should specify the largest SNL value that satisfies the needs of their application. An SNL value of 32 (the default) ensures that virtually all New York City street names can be normalized. It is recommended that in the design of new applications, 32 bytes be allocated for street name fields in files, programs, screens, reports and manual forms whenever possible.
The following is a simplified description of the street name normalizing algorithm:
-
Parsing the input name: The normalizing algorithm logically separates the input name into ‘words’ delimited by blanks. Any sequences of consecutive blanks are consolidated to single blanks. If any numeric characters (the digits ‘0’ through ‘9’) and non-numeric characters are adjacent to each other, they are separated by the insertion of blanks. For example, W2PLACE becomes W 2 PLACE.
To improve readability, normalization processing deletes any blanks that appear before and/or after a slash (/) or a dash (-) in a street name. The normalization process also does not generate any such blanks. In the case where there is a numeric before or after the slash or dash, the numeric is treated as alphabetic. For example, ‘I - 25’ becomes ‘I-25’ and ABC / DEF becomes ABC/DEF. See Chapter III.3 for a discussion of Street Name Sorting and how a numeric is normalized in a street name.
-
Deleting ordinal suffixes: Numeric words in input street names are often expressed as ordinal numbers (integers formatted to specify order, consisting of numeric digits followed by ordinal suffixes, such as ‘1st’, ‘2nd’, ‘3rd’, ‘4th’). The normalizing algorithm deletes the ordinal suffixes (the endings ‘st’, ‘nd’, ‘rd’ and ‘th’) from such words. For example, WEST 3RD STREET is converted to WEST 3 STREET. Note, however, that numeric words that are expressed alphabetically (such as WEST THIRD STREET) are not modified.
-
Handling special characters: The normalizing algorithm deletes any periods (the character ‘.’) at the ends of words. For example, ST. MARKS PLACE becomes ST MARKS PLACE. Any periods not at the ends of words are replaced by blanks, which will usually cause rejection. Special characters other than periods are left unaltered, and will cause rejection unless those special character(s) are specifically valid for the given street name. Currently, the only special characters that appear in specific street names accepted by Geosupport are: ’ (apostrophe), ( (open parenthesis), ) (closed parenthesis), & (ampersand), / (forward slash) and - (dash or hyphen). In general, if Geosupport accepts a street name with a special character, it will also accept that street name without the special character. For example, in Manhattan, both SAINT MARK’S PLACE and SAINT MARKS PLACE are accepted. In the Bronx, O’BRIEN AVENUE, OBRIEN AVENUE and O BRIEN AVENUE are all accepted. In Manhattan, BEN-GURION PLACE, BEN GURION PLACE and BENGURION PLACE are all accepted.
-
Expanding and abbreviating standard words under SNL constraint: There are certain standard words that appear frequently in street names, either fully spelled out, such as EAST, AVENUE and BOULEVARD, or in the form of standard abbreviations, such as E, AV or AVE, and BL or BLVD, respectively. If the input name is shorter than the SNL value in effect, then to the extent permitted by that SNL value, the normalizing algorithm expands standard abbreviations to their full spellings. Conversely, if the input name is longer than the SNL value in effect, then the normalizing algorithm attempts to shorten the name to the extent required by that SNL value, by replacing fully spelled out standard words with standard abbreviations.
-
Suppressing expansion in special cases: The normalizing algorithm recognizes certain special cases in which a character string normally treated as a standard abbreviation is not to be so treated, that is, is not to be expanded under any circumstances. For example, ST is expanded to STREET only when it occurs as the last word of the input name; this prevents the conversion, for example, of ST MARKS PLACE into STREET MARKS PLACE. Certain character strings that are treated as standard abbreviations in most street names are not so treated in specific street names; for example, the ‘S’ in the Brooklyn street name AVENUE S and in the Bronx street name S STREET is not expanded into SOUTH; the ‘E’ in the Manhattan street name ABRAHAM E KAZAN STREET is not expanded into EAST; the ‘DR’ in the Manhattan street name DR MARTIN L KING JR BOULEVARD is not expanded into DRIVE.
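The steps in the simplified description above can be sketched roughly as follows. This is an illustration under stated assumptions, not Geosupport's implementation: the names `split_words` and `fit_to_snl` are hypothetical, the word tables hold only the examples mentioned in the text, the choice of the shortest abbreviation and the left-to-right order of expansion and abbreviation are assumptions, and the street-specific exceptions (such as AVENUE S) and validation of other special characters are not modeled. Names are assumed to be ASCII, so character length equals byte length.

```python
import re

# Illustrative tables only; the full list of standard words is not given here.
EXPANSIONS = {"E": "EAST", "AV": "AVENUE", "AVE": "AVENUE",
              "BL": "BOULEVARD", "BLVD": "BOULEVARD", "ST": "STREET"}
# Assumption: abbreviate to the shortest form mentioned in the text.
ABBREVIATIONS = {"EAST": "E", "AVENUE": "AV", "BOULEVARD": "BL", "STREET": "ST"}

ORDINAL = re.compile(r"(\d+)(?:ST|ND|RD|TH)")


def split_words(name):
    """Parse a raw name into words (blanks, slash/dash, periods, ordinals)."""
    text = name.upper()  # assumption: names are handled in upper case
    text = re.sub(r"\s*([/-])\s*", r"\1", text)  # no blanks around / or -
    words = []
    for raw in text.split():  # split() also consolidates runs of blanks
        # Trailing periods are deleted; interior periods become blanks.
        for piece in raw.rstrip(".").replace(".", " ").split():
            ordinal = ORDINAL.fullmatch(piece)
            if ordinal:
                words.append(ordinal.group(1))   # 3RD -> 3
            elif "/" in piece or "-" in piece:
                words.append(piece)              # I-25 stays one word
            else:
                # Separate adjacent digits and non-digits: W2PLACE -> W 2 PLACE
                words.extend(re.findall(r"\d+|\D+", piece))
    return words


def fit_to_snl(words, snl=32):
    """Expand abbreviations while the name fits; abbreviate when it does not."""
    words = list(words)
    for i, word in enumerate(words):
        full = EXPANSIONS.get(word)
        if full is None or (word == "ST" and i != len(words) - 1):
            continue  # ST is expanded only as the last word
        if len(" ".join(words)) + len(full) - len(word) <= snl:
            words[i] = full
    for i, word in enumerate(words):
        if len(" ".join(words)) <= snl:
            break
        words[i] = ABBREVIATIONS.get(word, word)
    name = " ".join(words)
    return name if len(name) <= snl else None  # None stands for "rejected"
```

With the default SNL, `fit_to_snl(split_words("E 3RD ST"))` gives `EAST 3 STREET`; with `snl=10` the same input gives `EAST 3 ST`, because expanding the final ST would exceed the limit.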