Many geographic databases come as boundary files with longitude and latitude. It is then up to the user to project these suitably into rectangular coordinates with "northing" and "easting". But this is rather cumbersome when all one wants to do is measure the size of areas. I encountered this problem when I needed to measure the size of postal Forward Sortation Areas (FSAs) in Canada. Statistics Canada provides a digital boundary file free of charge on their web site in a number of different formats.

Canadian postal codes are made up of six alphanumeric characters, such as V6T 1Z2. The first three characters are the Forward Sortation Area (FSA). FSAs vary hugely in size but tend to cover a comparable number of people. For Canada Post, FSAs are useful delineations of service areas.
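To make the postal code structure concrete, here is a minimal sketch in Python (it is not part of the SAS code discussed in this post). The function name `fsa_of` is hypothetical, and the pattern only checks the letter-digit alternation; it does not check which letters Canada Post actually assigns.

```python
import re

# Letter-digit-letter, optional space, digit-letter-digit.
# Assumption: only the alternation is validated, not the set of letters in use.
_POSTAL_CODE = re.compile(r"^([A-Z][0-9][A-Z]) ?([0-9][A-Z][0-9])$")


def fsa_of(postal_code: str) -> str:
    """Return the three-character FSA of a Canadian postal code."""
    match = _POSTAL_CODE.match(postal_code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid postal code: {postal_code!r}")
    return match.group(1)


print(fsa_of("V6T 1Z2"))  # prints V6T
```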

With SAS, it is quite easy to read in the ArcGIS version of the boundary file and extract the polygon sequences, which sometimes consist of several segments for any given FSA. The SAS code below shows how to process the data, which is identified in the "geofsa" output data set by the FSA code (CFSAUID), the province number (PRUID) and the province name (PRNAME). I find it more convenient to work with two-letter province codes (e.g., ON for Ontario), and thus the first step in processing the data is to convert province numbers to province codes and to rename some of the variables more conveniently.
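The recoding step can be sketched in a few lines of Python (an illustration only, not the SAS code). The mapping uses Statistics Canada's two-digit province and territory codes. The function name `tidy_row` and the coordinate column names `X` and `Y` are assumptions; only CFSAUID, PRUID and PRNAME are named in the text.

```python
# Statistics Canada province/territory codes (PRUID) and two-letter abbreviations.
PRUID_TO_CODE = {
    "10": "NL", "11": "PE", "12": "NS", "13": "NB", "24": "QC",
    "35": "ON", "46": "MB", "47": "SK", "48": "AB", "59": "BC",
    "60": "YT", "61": "NT", "62": "NU",
}


def tidy_row(row: dict) -> dict:
    """Rename boundary-file fields and attach a two-letter province code."""
    return {
        "fsa": row["CFSAUID"],
        "prov": PRUID_TO_CODE[str(row["PRUID"])],
        "lon": float(row["X"]),  # assumed name of the longitude column
        "lat": float(row["Y"]),  # assumed name of the latitude column
    }


# Example row with illustrative coordinates.
example = {"CFSAUID": "V6T", "PRUID": 59, "PRNAME": "British Columbia",
           "X": "-123.25", "Y": "49.26"}
print(tidy_row(example))
```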

The critical processing step is the fsa_area data step. Here, we need to be careful to process one FSA segment at a time. For that, some housekeeping is necessary to determine the boundaries of segments. We also need to remember the previous longitude and latitude so that we can calculate partial areas, each of which combines two points, (x0, y0) and (x1, y1). This necessitates a little bit of juggling. When we encounter the first data row in a new segment, we first check whether we have to finish up a previous segment. If so, we close that segment by using the saved location of its first point, calculate the size of the area, and output it. Then we initialize the memory-keeping variables and save the first point of the new segment as (x9, y9). Next we run through all points in the segment and calculate the partial areas, and after each one we save the current data point (x1, y1) to (x0, y0). SAS applies trigonometric functions to values expressed in radians, so all longitudes and latitudes must be converted from degrees to radians.
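The bookkeeping described above can be sketched in Python as follows. This is not the SAS data step: it uses `itertools.groupby` in place of retained variables, and it assumes the rows arrive sorted so that all points of one segment are contiguous. The row layout (FSA, segment number, longitude, latitude) and the function name `segment_edges` are hypothetical; the names x0, y0, x1, y1, x9 and y9 follow the text.

```python
import math
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, int, float, float]        # (fsa, segment, lon_deg, lat_deg)
Edge = Tuple[float, float, float, float]   # (x0, y0, x1, y1) in radians


def segment_edges(rows: Iterable[Row]) -> Iterator[Tuple[Tuple[str, int], List[Edge]]]:
    """Yield ((fsa, segment), edges), closing each polygon at its first point."""
    for key, group in groupby(rows, key=lambda r: (r[0], r[1])):
        # Trigonometric functions expect radians, so convert once up front.
        points = [(math.radians(lon), math.radians(lat)) for _, _, lon, lat in group]
        x9, y9 = points[0]          # first point, remembered to close the segment
        x0, y0 = x9, y9             # previous point
        edges: List[Edge] = []
        for x1, y1 in points[1:]:   # the first row only initializes the memory
            edges.append((x0, y0, x1, y1))
            x0, y0 = x1, y1         # current point becomes the previous point
        if (x0, y0) != (x9, y9):    # add the closing edge unless already closed
            edges.append((x0, y0, x9, y9))
        yield key, edges
```

Each edge produced this way would then be handed to the partial-area calculation, and the partial areas summed per segment.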

The heart of the calculation happens in the macro "NEXTPOINT". It is useful to put the code in a macro because it is used twice in the data step. This is because we have to skip the first data row in the calculation and we have to use NEXTPOINT twice in the last data row. The area calculation is based on a spherical polygon, which is a generalization of the spherical triangle. There are several implementations of suitable algorithms available, and the one shown here makes use of haversine functions.
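For readers who want to see the idea without SAS, here is a minimal Python sketch of one common haversine-based formulation. It may differ in detail from the NEXTPOINT macro. Each edge forms a spherical triangle with the north pole, the triangle's spherical excess is obtained with L'Huilier's formula, and the signed excesses are summed. The function names are hypothetical, the mean Earth radius of 6371 km is an assumption, and the sketch assumes that polygons neither cross the 180° meridian nor enclose a pole, which holds for Canadian FSAs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius


def hav(x: float) -> float:
    """Haversine function."""
    return (1.0 - math.cos(x)) / 2.0


def next_point(x0: float, y0: float, x1: float, y1: float) -> float:
    """Signed spherical excess (radians) of the triangle pole-(x0,y0)-(x1,y1)."""
    if x0 == x1:
        return 0.0  # edge along a meridian: degenerate triangle, no area
    h = hav(y1 - y0) + math.cos(y0) * math.cos(y1) * hav(x1 - x0)
    a = 2.0 * math.asin(math.sqrt(min(1.0, h)))  # great-circle length of the edge
    b = math.pi / 2.0 - y1                       # colatitude of point 1
    c = math.pi / 2.0 - y0                       # colatitude of point 0
    s = 0.5 * (a + b + c)
    t = (math.tan(s / 2.0) * math.tan((s - a) / 2.0)
         * math.tan((s - b) / 2.0) * math.tan((s - c) / 2.0))
    excess = 4.0 * math.atan(math.sqrt(abs(t)))  # L'Huilier's formula
    return excess if x1 > x0 else -excess


def polygon_area_km2(points_deg) -> float:
    """Area of a spherical polygon given as (lon, lat) pairs in degrees."""
    pts = [(math.radians(lon), math.radians(lat)) for lon, lat in points_deg]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):  # includes closing edge
        total += next_point(x0, y0, x1, y1)
    return abs(total) * EARTH_RADIUS_KM ** 2
```

A quick sanity check is one octant of the sphere, with corners at (0°, 0°), (90°, 0°) and the north pole, whose area should be one eighth of the sphere's surface, that is, π/2 times the squared radius.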

The last data step takes the sorted output from fsa_area and tries to make sense of the data by classifying segments as either urban or rural. Without looking at a map or using data on population densities, this is actually quite tricky and a bit of a judgment call. I treat all FSAs as rural if they have a zero in the middle, which identifies FSAs with rural delivery by Canada Post. That is easy. However, sometimes FSAs contain both urban and rural segments. For example, V6T encompasses the University of British Columbia but also rural areas that are many kilometers away. For that reason, large segments (over 50 square kilometers) are assumed to be rural in nature. The underlying assumption is that segment size is inversely related to population density. Some FSAs are huge, such as Y0B in the Yukon: more than 312,000 square kilometers. On the other hand, many urban FSAs are less than one square kilometer in size.
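The classification rule is simple enough to sketch in a few lines of Python (again an illustration, not the SAS data step). The function names and the input layout of (FSA, segment area) pairs are hypothetical; the two rules and the 50 square kilometer cut-off come from the paragraph above.

```python
from collections import defaultdict

RURAL_AREA_THRESHOLD_KM2 = 50.0  # cut-off for treating a segment as rural


def is_rural(fsa: str, area_km2: float) -> bool:
    """Rural if the FSA's middle character is zero or the segment is large."""
    return fsa[1] == "0" or area_km2 > RURAL_AREA_THRESHOLD_KM2


def summarize(segments):
    """Aggregate (fsa, area_km2) pairs into total, urban and rural area per FSA."""
    totals = defaultdict(lambda: {"total": 0.0, "urban": 0.0, "rural": 0.0})
    for fsa, area_km2 in segments:
        kind = "rural" if is_rural(fsa, area_km2) else "urban"
        totals[fsa]["total"] += area_km2
        totals[fsa][kind] += area_km2
    return dict(totals)


# Made-up segment areas, for illustration only.
print(summarize([("V6T", 12.5), ("V6T", 80.0), ("Y0B", 100.0)]))
```

Because the rule is applied per segment and not per FSA, a single FSA such as V6T can contribute to both the urban and the rural column.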

Perhaps you are not interested at all in how to calculate areas and are just keen on seeing the output or using it for your own research. In that case I have a Stata dictionary file for you: fsa_area_2011.dct. Please feel free to download and use this file. The file has columns for the two-letter province code and the three-character FSA code, and numeric columns for the total FSA area, urban area, and rural area. If you have suggestions on how to improve the urban/rural designation, I encourage you to contact me.