
How can I fetch the last-modified value of a remote file?

I'd like to know the last-modified date of a remote file (given by URL), and only download it if it's newer than my locally stored copy.

I managed to do that for local files, but I can't find a way to do it for remote files without downloading them.

working:

Dim infoReader As System.IO.FileInfo = My.Computer.FileSystem.GetFileInfo("C:/test.txt")
MsgBox("File was last modified on " & infoReader.LastWriteTime)  

not working:

Dim infoReader As System.IO.FileInfo = My.Computer.FileSystem.GetFileInfo("http://google.com/robots.txt")
MsgBox("File was last modified on " & infoReader.LastWriteTime)

I'd love a solution that only has to download the file's headers.

Wurstbro asked Jul 14 '14 18:07


2 Answers

You can use the System.Net.Http.HttpClient class to fetch the last modified date from the server. Because it's sending a HEAD request, it will not fetch the file contents:

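' Requires: Imports System.Net.Http
Dim client = New HttpClient()
' HEAD asks for the headers only; no body is transferred
Dim msg = New HttpRequestMessage(HttpMethod.Head, "http://google.com/robots.txt")
Dim resp = client.SendAsync(msg).Result
' Nullable: Nothing if the server sent no Last-Modified header
Dim lastMod = resp.Content.Headers.LastModified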
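To tie this back to the question, the header value can be compared with the local file's timestamp before deciding whether to download. The following is a minimal sketch, not a definitive implementation: `GetRemoteLastModifiedAsync` and `IsRemoteNewer` are hypothetical helper names (not framework APIs), and treating a missing header as "download anyway" is an assumption you may want to change.

```vbnet
Imports System
Imports System.Net.Http
Imports System.Threading.Tasks

Module RemoteTimestamp
    ' One shared client; HttpClient is intended to be reused.
    Private ReadOnly client As New HttpClient()

    ' Sends a HEAD request and returns the Last-Modified header,
    ' or Nothing when the server does not send one.
    Public Async Function GetRemoteLastModifiedAsync(url As String) As Task(Of DateTimeOffset?)
        Using msg As New HttpRequestMessage(HttpMethod.Head, url)
            Using resp = Await client.SendAsync(msg)
                resp.EnsureSuccessStatusCode()
                Return resp.Content.Headers.LastModified
            End Using
        End Using
    End Function

    ' Hypothetical helper: True when a download looks warranted.
    ' Assumption: a missing header means "unknown", so we download.
    Public Function IsRemoteNewer(remote As DateTimeOffset?, localWriteTimeUtc As DateTime) As Boolean
        If Not remote.HasValue Then Return True
        Return remote.Value.UtcDateTime > localWriteTimeUtc
    End Function
End Module
```

From an Async method it could be called as `IsRemoteNewer(Await GetRemoteLastModifiedAsync("http://google.com/robots.txt"), File.GetLastWriteTimeUtc("C:\test.txt"))`, which needs `Imports System.IO`. Keeping the comparison separate from the network call makes it easy to check the decision logic on its own.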

You could also use the If-Modified-Since request header with a GET request. This way the response should be 304 - Not Modified if the file has not been changed (no file content sent), or 200 - OK if the file has been changed (and the contents of the file will be sent in the response), although the server is not required to honor this header.

' Requires: Imports System.IO, System.Net, System.Net.Http
Dim client = New HttpClient()
Dim msg = New HttpRequestMessage(HttpMethod.Get, "http://google.com/robots.txt")
msg.Headers.IfModifiedSince = DateTimeOffset.UtcNow.AddDays(-1) ' use the date of your copy of the file
Dim resp = client.SendAsync(msg).Result
Select Case resp.StatusCode
    Case HttpStatusCode.NotModified
        ' Your copy is up-to-date
    Case HttpStatusCode.OK
        ' Your copy is out of date, so save it
        File.WriteAllBytes("C:\robots.txt", resp.Content.ReadAsByteArrayAsync().Result)
End Select

Note the use of .Result, since I was testing in a console application - you should probably await instead.
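For reference, an awaited version of the conditional GET might look like the sketch below. It takes the If-Modified-Since date from the local file's last write time instead of a fixed offset. `BuildConditionalGet` and `DownloadIfModifiedAsync` are hypothetical names introduced for illustration, and the sketch assumes the local file's timestamp is a sensible stand-in for when your copy was fetched.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Net.Http
Imports System.Threading.Tasks

Module ConditionalDownload
    Private ReadOnly client As New HttpClient()

    ' Builds the GET request; lastWriteUtc is Nothing when there is no local copy,
    ' in which case no condition is attached.
    Public Function BuildConditionalGet(url As String, lastWriteUtc As DateTime?) As HttpRequestMessage
        Dim msg As New HttpRequestMessage(HttpMethod.Get, url)
        If lastWriteUtc.HasValue Then
            msg.Headers.IfModifiedSince = New DateTimeOffset(lastWriteUtc.Value)
        End If
        Return msg
    End Function

    ' Downloads url to localPath unless the server answers 304 Not Modified.
    ' Returns True when the local copy was (re)written.
    Public Async Function DownloadIfModifiedAsync(url As String, localPath As String) As Task(Of Boolean)
        Dim localStamp As DateTime? = Nothing
        If File.Exists(localPath) Then localStamp = File.GetLastWriteTimeUtc(localPath)

        Using msg = BuildConditionalGet(url, localStamp)
            Using resp = Await client.SendAsync(msg)
                If resp.StatusCode = HttpStatusCode.NotModified Then Return False
                resp.EnsureSuccessStatusCode()
                Dim bytes = Await resp.Content.ReadAsByteArrayAsync()
                File.WriteAllBytes(localPath, bytes)
                Return True
            End Using
        End Using
    End Function
End Module
```

A caller would write something like `Dim updated = Await DownloadIfModifiedAsync("http://google.com/robots.txt", "C:\robots.txt")`. Because a server may ignore the header and always answer 200, the function still works in that case; it just downloads every time.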

Mark answered Sep 18 '22 19:09


If the server offers it, you can get it through the Last-Modified HTTP header. But with a plain GET request you're still stuck downloading the full file.

You could get it through FTP, if the server offers that.
See if the server allows you to see the list of files in a folder.
If the website shows the date somewhere, you could pull it through screen scraping.
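For the FTP route, a minimal sketch could use `FtpWebRequest` with the `GetDateTimestamp` method, which asks the server for the file's timestamp only. The URL in the usage note is a placeholder, the helper names are hypothetical, and the sketch assumes the server allows anonymous access and supports the timestamp command. Note that `WebRequest.Create` is flagged as obsolete on recent .NET versions, so this mainly suits .NET Framework projects.

```vbnet
Imports System
Imports System.Net

Module FtpTimestamp
    ' Builds a request that asks only for the file's timestamp (FTP MDTM command).
    Public Function CreateTimestampRequest(ftpUrl As String) As FtpWebRequest
        Dim req = DirectCast(WebRequest.Create(ftpUrl), FtpWebRequest)
        req.Method = WebRequestMethods.Ftp.GetDateTimestamp
        Return req
    End Function

    ' Returns the modification time reported by the server; no file data is transferred.
    Public Function GetFtpLastModified(ftpUrl As String) As DateTime
        Using resp = DirectCast(CreateTimestampRequest(ftpUrl).GetResponse(), FtpWebResponse)
            Return resp.LastModified
        End Using
    End Function
End Module
```

It could be called as `GetFtpLastModified("ftp://example.com/test.txt")`. If the server needs a login, set `req.Credentials` to a `NetworkCredential` before calling `GetResponse`.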

the_lotus answered Sep 17 '22 19:09