Zip Codes
From PikaDocs
By Andrew Cameron
Legal Services Lawline of Vermont (http://www.lawlinevt.org/) and Vermont Legal Aid (http://www.vtlegalaid.org/)
| Table of contents |
Introduction
Pika can have a lookup, where if you type in the zip code of a person on the address screen, it will fill in the city and state fields.
How to install zip codes for a site that doesn't have them
First you have to download a file of zip codes. (Where do you download it? Links to different versions are on the Community Edits page. But I suggest you finish reading here before you go download). It's usually a zipped file. You unzip and you then have probably a text file or a .csv file.
Method #1 of importing the file: According to Aaron: the zip codes file is a CSV file and can be loaded using the mysqlimport tool that comes with mysql (link to mysqlimport tool (http://dev.mysql.com/doc/mysql/en/mysqlimport.html))
Method #2 (what I actually did): I used phpMyAdmin. The steps were:
click on your pika database click on the zip_codes table click on the Browse button click on "Insert data from a text file into the table" (near bottom of screen) specify where the text file is with the zip codes check the box for "Replace" fields terminated by \t fields enclosed by nothing leave everything else as is (may need to tweak the end of line characters for windows/linux differences) the sql looked something like this: LOAD DATA LOCAL INFILE 'C:\\DOCUME~1\\APACHE~1.000\\LOCALS~1\\Temp\\php23B.tmp' REPLACE INTO TABLE `zip_codes` FIELDS TERMINATED BY '\t' ESCAPED BY '\\' LINES TERMINATED BY '\r\n' Run the import. When it's done, check on the zip_codes table to make sure it looks right.
Problems
Originally, Pika software had a zip code file available with 281908 records, most of which were duplicates.
A newer version cleaned up by Aaron, available since October 2005, has only 42721 records with duplicate records removed.
However, there are still problems. The relationship of towns to zip codes is many to one. For example, in Vermont, parts of Burlington, South Burlington, Queen City, and Queen City Park have the zip code 05403. However, a given zip code can only resolve to one town name. If you type in 05403, which town will Pika pick?
In the original list, as Aaron explains: Pika will take the matching city that appears first in the database (the one at the top of the list if you display them unordered.) The original database was weighted with the most-likely cities listed first.
But in Aaron's database as of October 2005, he got rid of the extra entries, and instead made a "best guess" for which town is most likely to match the zip code.
Here in Vermont, we found that many of the best guesses were wrong, perhaps one third.
Making Corrections
If you are going to use zip code lookups, no matter which zip code list you have, you are probably going to need to make corrections. How can you do the corrections? There are several ways:
- Bryan from Iowa: I usually use phpMyAdmin and do a query that brings up all the entries for a given ZIP, then delete the entries that are wrong.
- Once the zip codes are loaded on your site, if you have the zip code editor, you can fix broken ones. If you have the zip code editor, it will be at http://yoursite/zipcode.php (An extra note from Aaron about how this zip code editor actually functions: When you enter a ZIP code or edit a ZIP code through the Pika ZIP code screen, however, it will automatically delete any other records with the same ZIP code so you only have one record per ZIP code, to make things unambiguous.)
- In Vermont, we had the luxury of checking out the list before we started using Pika. I imported the file into Access (make sure to import all fields at text, not numeric). We used these two websites to look up the zip codes and see what town (I'm using "town" and "city" here interchangeably) the post office would return: [[1] (http://zip4.usps.com/zip4/citytown_zip.jsp)] [[2] (http://acg.media.mit.edu/people/fry/zipdecode/)] (this latter one is quicker if you are doing lots of lookups)
After doing this our zip code list was correct for the Vermont zip codes at least, which is what we truly care about. If you are a whiz with perl or some other linux tool you can probably do this much more smoothly than actually looking up the zip codes one by one.
Keeping Current
Pika software isn't willing to guarantee the zip code file is correct and up to date, and that seems fine to me. What if the community were to undertake an effort to keep the data up to date? Click here for some ideas on the topic.
Thought for the Future?
What if we could keep the full zip code list with the many cities matching each zip code, and alter the Pika zip code lookup. If more than 1 city matched that zip, some type of popup or pulldown would let the user see all the possibilities and pick the right one.
