Java download all files and folders in a directory

I am trying to download all the files from this directory. However, I can only get it to download the URL as one file. What can I do? I searched for this problem, but the results were confusing and people started suggesting HTTP clients instead. It has been suggested that I use an input stream to obtain all the files in the directory. Would those then go into an array? I tried the tutorial at http://docs.oracle.com/javase/tutorial/networking/urls/ but it didn't help me understand. Thanks for your help; this is my code so far.

//ProgressBar/Install
            String URL_LOCATION = "http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/";
            String LOCAL_FILE = filelocation.getText() + "\\ProfessorPhys\\";
            try {
                java.net.URL url = new URL(URL_LOCATION);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection(); 
                connection.addRequestProperty("User-Agent", "Mozilla/4.76"); 
                //URLConnection connection = url.openConnection();
                BufferedInputStream stream = new BufferedInputStream(connection.getInputStream());
                int available = stream.available();
                byte b[]= new byte[available];
                stream.read(b);
                File file = new File(LOCAL_FILE);
                OutputStream out  = new FileOutputStream(file);
                out.write(b);
            } catch (Exception e) {
                System.err.println(e);
            }

I also found this code, which returns a List of the files in a local directory. Can someone help me combine the two pieces of code?

public class GetAllFilesInDirectory {

public static void main(String[] args) throws IOException {

    File dir = new File("dir");

    System.out.println("Getting all files in " + dir.getCanonicalPath() + " including those in subdirectories");
    List<File> files = (List<File>) FileUtils.listFiles(dir, TrueFileFilter.INSTANCE, TrueFileFilter.INSTANCE);
    for (File file : files) {
        System.out.println("file: " + file.getCanonicalPath());
    }

}

}

Kyle asked Jun 14 '13 04:06



1 Answer

You need to download the page, which is the directory listing, parse it, and then download the individual files linked in the page...

You could do something like...

import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

URL url = new URL("http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/");
StringBuilder page = new StringBuilder(1024);
// try-with-resources closes the stream even if reading fails.
// UTF-8 is an assumption about the encoding of the listing page.
try (Reader reader = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)) {
    char[] buffer = new char[1024];
    int charsRead;
    while ((charsRead = reader.read(buffer)) != -1) {
        page.append(buffer, 0, charsRead);
    }
    // Spend the rest of your life using String methods
    // to parse the result...
} catch (IOException ex) {
    ex.printStackTrace();
}

Or, you can download Jsoup and use it to do all the hard work...

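import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

try {
    Document doc = Jsoup.connect("http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/").get();
    // Every <a> element in the listing is a candidate file or folder
    Elements links = doc.getElementsByTag("a");
    for (Element link : links) {
        System.out.println(link.attr("href") + " - " + link.text());
    }
} catch (IOException ex) {
    ex.printStackTrace();
}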

Which printed...

?C=N;O=D - Name
?C=M;O=A - Last modified
?C=S;O=A - Size
?C=D;O=A - Description
/gamefiles/ - Parent Directory
Assembly-CSharp-Editor-firstpass-vs.csproj - Assembly-CSharp-Edit..>
Assembly-CSharp-Editor-firstpass.csproj - Assembly-CSharp-Edit..>
Assembly-CSharp-Editor-firstpass.pidb - Assembly-CSharp-Edit..>
Assembly-CSharp-firstpass-vs.csproj - Assembly-CSharp-firs..>
Assembly-CSharp-firstpass.csproj - Assembly-CSharp-firs..>
Assembly-CSharp-firstpass.pidb - Assembly-CSharp-firs..>
Assembly-CSharp-vs.csproj - Assembly-CSharp-vs.c..>
Assembly-CSharp.csproj - Assembly-CSharp.csproj
Assembly-CSharp.pidb - Assembly-CSharp.pidb
Assembly-UnityScript-Editor-firstpass-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-Editor-firstpass.pidb - Assembly-UnityScript..>
Assembly-UnityScript-Editor-firstpass.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-firstpass-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-firstpass.pidb - Assembly-UnityScript..>
Assembly-UnityScript-firstpass.unityproj - Assembly-UnityScript..>
Assembly-UnityScript-vs.unityproj - Assembly-UnityScript..>
Assembly-UnityScript.pidb - Assembly-UnityScript..>
Assembly-UnityScript.unityproj - Assembly-UnityScript..>
Assets/ - Assets/
Library/ - Library/
Professor%20Phys-csharp.sln - Professor Phys-cshar..>
Professor%20Phys.exe - Professor Phys.exe
Professor%20Phys.sln - Professor Phys.sln
Professor%20Phys.userprefs - Professor Phys.userp..>
Professor%20Phys_Data/ - Professor Phys_Data/
Script.doc - Script.doc
~$Script.doc - ~$Script.doc
~WRL0392.tmp - ~WRL0392.tmp
~WRL1966.tmp - ~WRL1966.tmp
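
The output above mixes column-sort links (?C=N;O=D and friends), the parent directory (/gamefiles/), subdirectories (ending in /) and plain files. A minimal sketch of telling them apart by their href, assuming the listing keeps this Apache-style layout; the class and method names (ListingLinks, isEntry, isDirectory) are made up for illustration:

```java
final class ListingLinks {
    /** True for hrefs that name a file or subdirectory inside the listed directory. */
    static boolean isEntry(String href) {
        return !href.isEmpty()
                && !href.startsWith("?")    // column-sort links such as ?C=N;O=D
                && !href.startsWith("/")    // server-absolute links such as /gamefiles/ (parent directory)
                && !href.startsWith("..")   // relative links that climb out of the directory
                && !href.contains("://");   // links to other sites
    }

    /** In this style of listing, subdirectories are the hrefs that end with a slash. */
    static boolean isDirectory(String href) {
        return href.endsWith("/");
    }

    public static void main(String[] args) {
        String[] hrefs = {"?C=N;O=D", "/gamefiles/", "Assets/", "Professor%20Phys.exe", "Script.doc"};
        for (String href : hrefs) {
            if (!isEntry(href)) {
                continue; // skip sort links and the parent directory
            }
            System.out.println((isDirectory(href) ? "dir:  " : "file: ") + href);
        }
    }
}
```

Run against the five sample hrefs, this keeps only Assets/ (as a directory) and the two files. A different server may format its listing differently, so the rules would need adjusting.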

You would then need to build a new URL for each file and read it as you have already done...

For example, the href for Assembly-CSharp-Edit..> is Assembly-CSharp-Editor-firstpass-vs.csproj, which appears to be a relative link, so you would need to prefix it with http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/ to make a new URL of http://www.futureretrogaming.tk/gamefiles/ProfessorPhys/Assembly-CSharp-Editor-firstpass-vs.csproj
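
Rather than concatenating strings by hand, the prefixing can be left to java.net.URI. This is only a sketch: HrefResolver and its resolve method are hypothetical names, and the trailing-slash handling is my assumption about what you want, since resolving against a base that lacks the final / would replace the ProfessorPhys segment instead of appending to it.

```java
import java.net.URI;

final class HrefResolver {
    /** Resolves an href from the listing against the URL of the directory it came from. */
    static URI resolve(String directoryUrl, String href) {
        // Without the trailing slash, the last path segment of the base would be dropped
        String base = directoryUrl.endsWith("/") ? directoryUrl : directoryUrl + "/";
        return URI.create(base).resolve(href);
    }

    public static void main(String[] args) {
        String dir = "http://www.futureretrogaming.tk/gamefiles/ProfessorPhys";
        System.out.println(resolve(dir, "Assembly-CSharp-Editor-firstpass-vs.csproj"));
        System.out.println(resolve(dir, "Professor%20Phys.exe"));
    }
}
```

The hrefs are already percent-encoded (Professor%20Phys.exe), so they can be passed through as they are. Calling toURL() on the result gives a java.net.URL you can open a connection on.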

You would need to do this for each element you want to grab.
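
As a rough sketch of grabbing one element, the helper below streams a single file straight to disk with Files.copy, which avoids sizing a buffer from available(). FileDownloader and download are hypothetical names, it assumes the base URI ends with a slash, and it handles plain files only; hrefs ending in / are subdirectories, which you would have to list and walk in the same way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class FileDownloader {
    /** Downloads the file named by href (relative to baseUri) into targetDir. */
    static Path download(URI baseUri, String href, Path targetDir) throws IOException {
        if (href.endsWith("/")) {
            throw new IllegalArgumentException("Not a file: " + href);
        }
        URI fileUri = baseUri.resolve(href);
        // getPath() decodes escapes, so "Professor%20Phys.exe" becomes "Professor Phys.exe"
        String localName = URI.create(href).getPath();
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(localName);
        // try-with-resources closes the connection's stream when the copy finishes or fails
        try (InputStream in = fileUri.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: FileDownloader <directory-url> <href> <target-dir>");
            return;
        }
        Path saved = download(URI.create(args[0]), args[1], Paths.get(args[2]));
        System.out.println("Saved " + saved);
    }
}
```

Calling download once per file href from the listing would fetch the whole directory. If the server rejects the default Java user agent, you may need to open an HttpURLConnection and set the User-Agent header as in the question's code.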

MadProgrammer answered Sep 28 '22 09:09
