
ArgumentException when instantiating bitmap from stream

Tags: c#, .net, c#-4.0

I've written some code to import content from my Blogger blog. Once I've downloaded all of the HTML content, I go through the image tags and download the corresponding images. In a significant number of cases, System.Drawing.Bitmap.FromStream is throwing an ArgumentException. The URL I'm downloading from looks good and it serves up an image as expected (here's the URL for one of the problem images: http://4.bp.blogspot.com/_tSWCyhtOc38/SgIPcctWRZI/AAAAAAAAAGg/2LLnVPxsogI/s1600-h/IMG_3590.jpg).

    private static System.Drawing.Image DownloadImage(string source)
    {
        System.Drawing.Image image = null;

        // used to fetch content
        var client = new HttpClient();

        // used to store image data
        var memoryStream = new MemoryStream();

        try
        {
            // fetch the image
            var imageStream = client.GetStreamAsync(source).Result;

            // instantiate a system.drawing.image from the data
            image = System.Drawing.Bitmap.FromStream(imageStream, false, false);

            // save the image data to a memory stream
            image.Save(memoryStream, image.RawFormat);
        }
        catch (IOException exception)
        {
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (ArgumentException exception)
        {
            // sometimes, an image will link to a web page, resulting in this exception
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (AggregateException exception)
        {
            // sometimes, an image src will throw a 404
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        finally
        {
            // clean up our disposable resources
            client.Dispose();
            memoryStream.Dispose();
        }

        return image;
    }

Any idea why an ArgumentException is getting thrown here?

EDIT: It occurred to me that it could be a proxy issue, so I added the following to my web.config:

    <system.net>
      <defaultProxy enabled="true" useDefaultCredentials="true">
        <proxy usesystemdefault="True" />
      </defaultProxy>
    </system.net>

Adding that section hasn't made any difference, however.

EDIT: This code is called from the context of an EF database initializer. Here's a stack trace:

    Web.dll!Web.Models.Initializer.DownloadImage(string source) Line 234 C#
    Web.dll!Web.Models.Initializer.DownloadImagesForPost.AnonymousMethod__5(HtmlAgilityPack.HtmlNode tag) Line 126 + 0x8 bytes C#
    [External Code]
    Web.dll!Web.Models.Initializer.DownloadImagesForPost(Web.Models.Post post) Line 119 + 0x34 bytes C#
    Web.dll!Web.Models.Initializer.Seed(Web.Models.FarmersMarketContext context) Line 320 + 0xb bytes C#
    [External Code]
    App_Web_l2h4tcej.dll!ASP._Page_Views_Home_Index_cshtml.Execute() Line 28 + 0x15 bytes C#
    [External Code]

Jim Lamb asked Mar 30 '26 19:03


2 Answers

OK, I found the issue. It turns out that, in some cases, Blogger references an HTML page that renders an image rather than referencing the image itself. So, the response in that case isn't a valid image. I've added code to check the response headers before attempting to save the image data and that's fixed the problem. For the benefit of anyone else who hits this issue, here's the updated code:

    private static System.Drawing.Image DownloadImage(string source)
    {
        System.Drawing.Image image = null;

        // used to fetch content
        var client = new HttpClient();

        // used to store image data
        var memoryStream = new MemoryStream();

        try
        {
            // Blogger tacks on a -h to an image Url to link to an HTML page instead
            if (source.Contains("-h/"))
                source = source.Replace("-h/", "/");

            // fetch the image
            var response = client.GetAsync(source).Result;
            response.EnsureSuccessStatusCode();

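            // ContentType is null when the server omits the Content-Type header
            var contentTypeHeader = response.Content.Headers.ContentType;
            var contentType = contentTypeHeader != null ? contentTypeHeader.MediaType : string.Empty;

            if (!contentType.StartsWith("image/", StringComparison.OrdinalIgnoreCase))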
            {
                Debug.WriteLine(contentType);
                throw new ArgumentException("Specified source did not return an image");
            }

            var imageStream = response.Content.ReadAsStreamAsync().Result;

            // instantiate a system.drawing.image from the data
            image = System.Drawing.Bitmap.FromStream(imageStream, true, true);

            // save the image data to a memory stream
            image.Save(memoryStream, image.RawFormat);
        }
        catch (HttpRequestException exception)
        {
            // sometimes, we'll get a 404 or other unexpected response
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (IOException exception)
        {
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        catch (ArgumentException exception)
        {
            // sometimes, an image will link to a web page, resulting in this exception
            Debug.WriteLine("{0} {1}", exception.Message, source);
        }
        finally
        {
            // clean up our disposable resources
            client.Dispose();
            memoryStream.Dispose();
        }

        return image;
    }

Jim Lamb answered Apr 02 '26 02:04


You are actually dealing with a different issue, and I think you fixed it by accident. Unfortunately, GDI+ exceptions are not very informative and often don't tell you what the real problem is.

One of the obscure details of the Image.FromStream() implementation is that GDI+ uses the stream's Seek() method while loading the bitmap. That only works when the stream permits seeking, meaning its CanSeek property returns true. This is generally not the case for network streams, which do not buffer enough data to allow arbitrary seeks.

That is a problem with HttpClient.GetStreamAsync(), whose MSDN Library documentation says:

This method does not buffer the stream

The working version you wrote uses HttpContent.ReadAsStreamAsync() instead, whose MSDN Library documentation says:

The returned Task object will complete after all of the content has been written as a byte array

So your first version doesn't work because the stream's CanSeek property is false, while the second version works because the entire response is read into a byte array, which permits seeking. The universal solution is to copy the stream into a MemoryStream first.
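
As a rough illustration of that last point, here is a minimal sketch of buffering a forward-only stream before handing it to GDI+. The class name StreamBuffering and the method name ToSeekableStream are made up for this example and do not come from the question's code. The commented usage assumes the HttpClient and System.Drawing calls already shown in the question.

```csharp
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper (not part of the original code): copies a readable,
    // possibly non-seekable stream into a MemoryStream and rewinds it.
    public static MemoryStream ToSeekableStream(Stream source)
    {
        var buffer = new MemoryStream();
        source.CopyTo(buffer);  // reads the source through to its end
        buffer.Position = 0;    // rewind so the next reader starts at byte 0
        return buffer;          // MemoryStream.CanSeek is true
    }
}

// Possible use inside DownloadImage (assumes System.Net.Http and System.Drawing):
//
//     using (var networkStream = client.GetStreamAsync(source).Result)
//     {
//         var seekable = StreamBuffering.ToSeekableStream(networkStream);
//         image = System.Drawing.Image.FromStream(seekable);
//     }
```

Note that the helper deliberately does not dispose the MemoryStream it returns: the Image.FromStream documentation states that the stream must stay open for the lifetime of the Image, so the caller should dispose the buffer only after it is done with the image.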

Hans Passant answered Apr 02 '26 03:04


