I would like to create a function in C# that takes a specific webpage and converts it to a JPG image from within ASP.NET. I don't want to do this via a third party or thumbnail service, as I need the full image. I assume I would need to somehow leverage the WebBrowser control from within ASP.NET, but I just can't see where to get started. Does anyone have examples?
Ok, this was rather easy when I combined several different solutions:
These solutions gave me a thread-safe way to use the WebBrowser from ASP.NET:
http://www.beansoftware.com/ASP.NET-Tutorials/Get-Web-Site-Thumbnail-Image.aspx
http://www.eggheadcafe.com/tutorials/aspnet/b7cce396-e2b3-42d7-9571-cdc4eb38f3c1/build-a-selfcaching-asp.aspx
This solution gave me a way to convert BMP to JPG:
Bmp to jpg/png in C#
I simply adapted the code and put the following into a .cs:
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Threading;
using System.Windows.Forms;
public class WebsiteToImage
{
private Bitmap m_Bitmap;
private string m_Url;
private string m_FileName = string.Empty;
public WebsiteToImage(string url)
{
// Without file
m_Url = url;
}
public WebsiteToImage(string url, string fileName)
{
// With file
m_Url = url;
m_FileName = fileName;
}
public Bitmap Generate()
{
// Thread
var m_thread = new Thread(_Generate);
m_thread.SetApartmentState(ApartmentState.STA);
m_thread.Start();
m_thread.Join();
return m_Bitmap;
}
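private void _Generate()
{
var browser = new WebBrowser { ScrollBarsEnabled = false };
// Attach the handler before navigating so the completion event cannot be missed
browser.DocumentCompleted += WebBrowser_DocumentCompleted;
browser.Navigate(m_Url);
// Pump the message loop on this STA thread until the page has finished loading
while (browser.ReadyState != WebBrowserReadyState.Complete)
{
Application.DoEvents();
}
browser.Dispose();
}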
private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
// Capture
var browser = (WebBrowser)sender;
browser.ClientSize = new Size(browser.Document.Body.ScrollRectangle.Width, browser.Document.Body.ScrollRectangle.Bottom);
browser.ScrollBarsEnabled = false;
m_Bitmap = new Bitmap(browser.Document.Body.ScrollRectangle.Width, browser.Document.Body.ScrollRectangle.Bottom);
browser.BringToFront();
browser.DrawToBitmap(m_Bitmap, browser.Bounds);
// Save as file?
if (m_FileName.Length > 0)
{
// Save
m_Bitmap.SaveJPG100(m_FileName);
}
}
}
public static class BitmapExtensions
{
public static void SaveJPG100(this Bitmap bmp, string filename)
{
var encoderParameters = new EncoderParameters(1);
encoderParameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L);
bmp.Save(filename, GetEncoder(ImageFormat.Jpeg), encoderParameters);
}
public static void SaveJPG100(this Bitmap bmp, Stream stream)
{
var encoderParameters = new EncoderParameters(1);
encoderParameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L);
bmp.Save(stream, GetEncoder(ImageFormat.Jpeg), encoderParameters);
}
public static ImageCodecInfo GetEncoder(ImageFormat format)
{
// Look the codec up among the encoders, since it is used for saving
var codecs = ImageCodecInfo.GetImageEncoders();
foreach (var codec in codecs)
{
if (codec.FormatID == format.Guid)
{
return codec;
}
}
// No encoder registered for this format
return null;
}
}
You can call it as follows:
WebsiteToImage websiteToImage = new WebsiteToImage("http://www.cnn.com", @"C:\Some Folder\Test.jpg");
websiteToImage.Generate();
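It works with both a file and a stream: pass a file name to the constructor to save the JPG directly, or call SaveJPG100 with a Stream on the Bitmap that Generate() returns. Make sure you add a reference to System.Windows.Forms to your ASP.NET project. I hope this helps.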
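For the stream case, here is a minimal sketch. It assumes the WebsiteToImage class and the SaveJPG100 extension method above are compiled into the same project; the WebsiteToImageExample class and CaptureAsJpegBytes method names are hypothetical and not part of the original answer.

```csharp
using System;
using System.Drawing;
using System.IO;

// Hypothetical helper, shown only to illustrate the stream overload.
public static class WebsiteToImageExample
{
    // Renders the page at the given URL and returns it as JPEG bytes,
    // without writing anything to disk.
    public static byte[] CaptureAsJpegBytes(string url)
    {
        // Single-argument constructor: no file name, so no file is saved.
        var websiteToImage = new WebsiteToImage(url);

        using (Bitmap bitmap = websiteToImage.Generate())
        {
            // Generate() returns null if the capture handler never ran.
            if (bitmap == null)
            {
                throw new InvalidOperationException("The page could not be captured.");
            }

            using (var stream = new MemoryStream())
            {
                // Uses the Stream overload of the SaveJPG100 extension method.
                bitmap.SaveJPG100(stream);
                return stream.ToArray();
            }
        }
    }
}
```

The returned byte array could then be written to the HTTP response or stored wherever it is needed.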
UPDATE: I've updated the code to include the ability to capture the full page and not require any special settings to capture only a part of it.
Good solution by Mr Cat Man Do.
I needed to add one line to suppress script errors that came up on some webpages (with the help of an awesome colleague of mine):
private void _Generate()
{
var browser = new WebBrowser { ScrollBarsEnabled = false };
browser.ScriptErrorsSuppressed = true; // <--
browser.DocumentCompleted += WebBrowser_DocumentCompleted;
browser.Navigate(m_Url);
}
...
Thanks Mr Do
Here is my implementation, using extension methods and a task factory instead of a thread:
/// <summary>
/// Convert url to bitmap byte array
/// </summary>
/// <param name="url">Url to browse</param>
/// <param name="width">Width of the page (if the page contains frames, you need to pass this parameter)</param>
/// <param name="height">Height of the page (if the page contains frames, you need to pass this parameter)</param>
/// <param name="htmlToManipulate">function to manipulate dom</param>
/// <param name="timeout">in milliseconds, how long can you wait for page response?</param>
/// <returns>bitmap byte[]</returns>
/// <example>
/// byte[] img = new Uri("http://www.uol.com.br").ToImage();
/// </example>
public static byte[] ToImage(this Uri url, int? width = null, int? height = null, Action<HtmlDocument> htmlToManipulate = null, int timeout = -1)
{
byte[] toReturn = null;
Task tsk = Task.Factory.StartNew(() =>
{
WebBrowser browser = new WebBrowser() { ScrollBarsEnabled = false };
// Attach the handler before navigating so the completion event cannot be missed
browser.DocumentCompleted += (s, e) =>
{
var browserSender = (WebBrowser)s;
if (browserSender.ReadyState == WebBrowserReadyState.Complete)
{
if (htmlToManipulate != null) htmlToManipulate(browserSender.Document);
// Use the caller-supplied size when given, otherwise the document's full scroll size
var body = browserSender.Document.Body;
int pageWidth = width ?? body.ScrollRectangle.Width;
int pageHeight = height ?? body.ScrollRectangle.Bottom;
browserSender.ClientSize = new Size(pageWidth, pageHeight);
browserSender.ScrollBarsEnabled = false;
browserSender.BringToFront();
using (Bitmap bmp = new Bitmap(pageWidth, pageHeight))
{
browserSender.DrawToBitmap(bmp, browserSender.Bounds);
toReturn = (byte[])new ImageConverter().ConvertTo(bmp, typeof(byte[]));
}
}
};
browser.Navigate(url);
while (browser.ReadyState != WebBrowserReadyState.Complete)
{
Application.DoEvents();
}
browser.Dispose();
}, CancellationToken.None, TaskCreationOptions.None, TaskScheduler.FromCurrentSynchronizationContext());
tsk.Wait(timeout);
return toReturn;
}
There is a good article by Peter Bromberg on this subject here. His solution seems to do what you need...