Rake task to download and unzip

I would like to update a cities table every week to reflect changes in cities across the world. I am creating a Rake task for the purpose. If possible, I would like to do this without adding another gem dependency.

The file is a publicly available zip archive at download.geonames.org/export/dump/cities15000.zip.

My attempt:

require 'net/http'
require 'zip'

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do

    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri) 

    Zip::File.open(zipped_folder) do |unzipped_folder| #erroring here
      unzipped_folder.each do |file|
        Rails.root.join("", "list_of_cities.txt").write(file)
      end
    end
  end
end

The output of rake geocities:fetch:

rake aborted!
ArgumentError: string contains null byte

As detailed, I'm trying to unzip the file and save it to a list_of_cities.txt file. Once I have the methodology down for accomplishing this, I believe I can figure out how to update my db based on the file. (If you have opinions on how best to handle the actual db update, other than my planned way, I'd love to hear them, but that seems like a different post entirely.)

Cole Bittel asked Sep 19 '15 17:09



2 Answers

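Zip::File.open expects the path of a file on disk, but Net::HTTP.get returns the archive's raw bytes, which is why the call fails with "string contains null byte". This will save zipped_folder to disk, then unzip it and save its contents: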

require 'net/http'
require 'zip' # rubyzip gem

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri)

    # Write the downloaded bytes to disk in binary mode.
    File.open('cities.zip', 'wb') do |file|
      file.write(zipped_folder)
    end

    # The block form closes the archive automatically.
    Zip::File.open('cities.zip') do |zip_file|
      zip_file.each do |file|
        # By default, extract raises if the destination file already exists.
        file.extract
      end
    end
  end
end

This will extract all files inside the zip file, in this case cities15000.txt.
You can then read the contents of cities15000.txt and update your database.

If you want to extract to a different file name, you can pass it to file.extract like this:

zip_file.each do |file|
  file.extract('list_of_cities.txt')
end
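
For the follow-up step of reading the extracted text file, here is a minimal sketch using only the standard library. It assumes the tab-separated column layout described in the GeoNames export readme (geonameid first, name second, latitude and longitude in the fifth and sixth columns, country code in the ninth, population in the fifteenth); check this against your copy of the dump. The names GEONAMES_COLUMNS, parse_city_line and each_city are hypothetical helpers, not part of any library.

```ruby
# Zero-based column positions, ASSUMED from the GeoNames export readme.
GEONAMES_COLUMNS = {
  geoname_id:   0,
  name:         1,
  latitude:     4,
  longitude:    5,
  country_code: 8,
  population:   14
}.freeze

# Hypothetical helper: converts one tab-separated line into a Hash.
def parse_city_line(line)
  # The -1 limit keeps trailing empty columns so indexes stay aligned.
  fields = line.chomp.split("\t", -1)
  {
    geoname_id:   Integer(fields[GEONAMES_COLUMNS[:geoname_id]], 10),
    name:         fields[GEONAMES_COLUMNS[:name]],
    latitude:     Float(fields[GEONAMES_COLUMNS[:latitude]]),
    longitude:    Float(fields[GEONAMES_COLUMNS[:longitude]]),
    country_code: fields[GEONAMES_COLUMNS[:country_code]],
    population:   Integer(fields[GEONAMES_COLUMNS[:population]], 10)
  }
end

# Hypothetical helper: yields one parsed city per line, reading the file
# line by line instead of loading it all into memory.
def each_city(path)
  return enum_for(:each_city, path) unless block_given?

  File.foreach(path, encoding: "UTF-8") do |line|
    yield parse_city_line(line)
  end
end
```

Inside the Rake task, each_city('cities15000.txt') could then feed whatever update logic the cities table needs, for example looking up a record by geoname_id and updating its attributes.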
Omid Kamangar answered Nov 07 '22 07:11



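I think it can be done more easily without Ruby, just using wget and unzip. Note that unzip cannot read an archive from a pipe, so the download is written to a file first and then unpacked:

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
     `wget --tries=10 -O cities15000.zip http://download.geonames.org/export/dump/cities15000.zip && unzip -o cities15000.zip`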
  end
end
Alexey Shein answered Nov 07 '22 08:11
