I am creating a tool which depends on addresses. For the purposes of testing, I'd like to create a large number of valid US addresses. I have the GeoNames postal code data and I would like to generate some number of real addresses for each of the ~41,000 zip codes in the United States.
I've found sites like FakeAddressGenerator and FakeName which claim to generate random, valid US addresses. How do these sites work? How can I do the same thing without relying on scraping these websites?
Ideally, I'd like to be able to do this in Python; utilizing a web service is fine (it doesn't seem that either FakeAddressGenerator or FakeName provide such a web service).
Thanks!
Place the recipient's name on the first line. On the second line, write the building number and street name. Include the city, state and ZIP code on the final line.
Googling your issue, I found two links of interest:
I don't recommend scraping the fake address generators, as they do not guarantee that the addresses exist. I would not go sampling in Google Maps either, as you will surely get blacklisted.
Extracting the data from the files downloaded from link 2 is easy: they are zip archives containing CSV files with the full address, ZIP code, lat, lon, etc.
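As a rough sketch of that extraction step, the snippet below walks a zip archive and yields every CSV row as a dictionary, using only the standard library. The function name `iter_zip_csv_rows`, the UTF-8 encoding, and the presence of a header row are assumptions; adjust them to whatever the downloaded files actually contain.

```python
import csv
import io
import zipfile


def iter_zip_csv_rows(zip_source):
    """Yield one dict per row from every .csv member of a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumes (hypothetically) UTF-8 files whose first line is a header.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            # Skip READMEs, licence files and anything else that is not CSV.
            if not member.lower().endswith(".csv"):
                continue
            with archive.open(member) as raw:
                # ZipFile.open returns bytes, so wrap it for the csv module.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)
```

Grouping the rows by whichever column holds the ZIP code then gives a pool of real addresses to sample from for each code.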
The two data sets above "guarantee" that the addresses exist. I don't know how strict your other condition is, namely having at least one valid address for each of the 41k ZIP codes. If it is a hard constraint, I doubt you will find an open-source data set that satisfies it.
EDIT:
If you have a list of all postcodes in the US, a fully automatable solution is to use Nominatim, OpenStreetMap's geocoding service (subject to their terms of use!):
1) Get the lat, lon (centre point or default address) of each postcode:
https://nominatim.openstreetmap.org/search/?format=xml&addressdetails=1&limit=1&country_codes=us&postalcode=35051
2) Reverse-geocode that lat, lon to get an address:
https://nominatim.openstreetmap.org/reverse?format=xml&lat=33.178764&lon=-86.619038&zoom=18&addressdetails=1
Trying this example for Columbiana, Alabama (postcode 35051) yields 397 West College Street.
Nominatim documentation is at: https://wiki.openstreetmap.org/wiki/Nominatim
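The two steps above could be chained in Python roughly as follows. This is only a sketch: it requests `format=json` instead of XML for easier parsing, uses `countrycodes` (which, as far as I can tell, is the documented spelling of the country filter), and the helper names and the `USER_AGENT` string are placeholders of my own. The public server's usage policy asks for an identifying User-Agent and roughly one request per second at most, hence the pause between calls.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://nominatim.openstreetmap.org"
# Placeholder: replace with something that identifies your own application.
USER_AGENT = "address-test-data-generator/0.1 (you@example.com)"


def build_search_url(postal_code):
    """Step 1 URL: look up the coordinates of a US postcode."""
    params = {"postalcode": postal_code, "countrycodes": "us",
              "format": "json", "addressdetails": 1, "limit": 1}
    return f"{BASE_URL}/search?{urllib.parse.urlencode(params)}"


def build_reverse_url(lat, lon):
    """Step 2 URL: reverse-geocode a coordinate at building level."""
    params = {"lat": lat, "lon": lon, "zoom": 18,
              "format": "json", "addressdetails": 1}
    return f"{BASE_URL}/reverse?{urllib.parse.urlencode(params)}"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def address_for_postal_code(postal_code, fetch=fetch_json, pause_seconds=1.0):
    """Return the address dict near the postcode's centre, or None."""
    hits = fetch(build_search_url(postal_code))
    if not hits:
        return None  # Nominatim does not know this postcode
    time.sleep(pause_seconds)  # stay under the public rate limit
    place = fetch(build_reverse_url(hits[0]["lat"], hits[0]["lon"]))
    return place.get("address")
```

Calling `address_for_postal_code("35051")` performs both requests for the Columbiana example. For all ~41,000 postcodes that is over 80,000 requests, so running your own Nominatim instance or caching the results is probably wiser than hitting the public server.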
You can install random-address:
pip install random-address
And then use random_address.real_random_address_by_postal_code:
>>> import random_address
>>> random_address.real_random_address_by_postal_code('32409')
{'address1': '711 Tashanna Lane', 'address2': '', 'city': 'Southport', 'state': 'FL', 'postalCode': '32409', 'coordinates': {'lat': 30.41437699999999, 'lng': -85.676568}}
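To cover a whole list of ZIP codes, the call above can be wrapped in a loop. The sketch below is an illustration, not part of the package: `collect_addresses` is a hypothetical helper, and it assumes the lookup returns something falsy (for example an empty dict) when it has no address for a code, which I have not verified for every case. Expect gaps, since the package's bundled data may not cover all ZIP codes.

```python
def collect_addresses(postal_codes, lookup=None):
    """Map each postal code to one address; also return the codes with none.

    lookup is any callable taking a postal code string. By default it is
    random_address.real_random_address_by_postal_code.
    """
    if lookup is None:
        # Imported lazily so the helper can be tried without the package.
        import random_address
        lookup = random_address.real_random_address_by_postal_code

    found, missing = {}, []
    for code in postal_codes:
        address = lookup(code)
        # Assumption: a falsy result means "no address for this code".
        if address:
            found[code] = address
        else:
            missing.append(code)
    return found, missing
```

The `missing` list tells you which ZIP codes still need another source, such as the Nominatim approach or one of the downloadable data sets.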