I have a web application, and I want it to pull the h1 tag, the first image, and the first few lines of text from an external web page. I have never done this before and think it is best done with jQuery, but I'm not sure. Can you please point me in the right direction or give a coding example in .NET and jQuery? Thanks.
I am thinking of something like the way Facebook pulls out a picture and a few lines of text when you type a URL into the post box for a new post.
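You generally cannot fetch an arbitrary URL's markup using AJAX, because the browser's same-origin policy blocks cross-origin requests unless the target site opts in through CORS (cross-origin resource sharing) headers, and most sites on the web won't permit just anyone to use their content. What you should do in your case is use a proxy method on your server.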
Create an action on your server that receives a URL and fetches its markup, then use AJAX to request the page's HTML through that new action.
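Here is a minimal client-side sketch of that request. It assumes a hypothetical server action at `/preview/fetch` that takes the external address in a `url` query parameter and answers with JSON. The route, the parameter name and the response shape are all assumptions to adapt to your own server. The `fetchImpl` parameter is only there so the function can be tried out without a network.

```javascript
// Hypothetical endpoint: replace with the route of your own server action.
const PROXY_ENDPOINT = '/preview/fetch';

// Build the same-origin URL the browser will call. The external address is
// percent-encoded so its own "?" and "&" do not break our query string.
function buildProxyUrl(targetUrl, endpoint = PROXY_ENDPOINT) {
  return endpoint + '?url=' + encodeURIComponent(targetUrl);
}

// Ask the server to fetch the external page on our behalf.
// fetchImpl defaults to the standard fetch(); pass a stub to test offline.
async function requestPage(targetUrl, fetchImpl = fetch) {
  const response = await fetchImpl(buildProxyUrl(targetUrl));
  if (!response.ok) {
    throw new Error('Proxy request failed with status ' + response.status);
  }
  return response.json(); // assumed response shape: { html: "..." }
}
```

Because the browser only ever talks to your own origin here, the cross-origin restriction never comes into play. The server makes the external request instead.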
From there you have two options: either parse the HTML on the server, extract the data you need, and send only that back to the client, or send all of the HTML back to the client and parse it there. I highly recommend doing the parsing on the server. It uses less bandwidth, since only a few fields travel to the browser instead of the whole page, and your server can probably do the work faster than most of your users' browsers.
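To make the "extract, then send only the data" option concrete, here is a rough sketch of the extraction step in plain JavaScript. It uses regular expressions, which is a simplification: they are fragile on real-world markup, and a proper HTML parser on your server platform would be more robust. The function names and the `{ title, image, snippet }` result shape are made up for illustration.

```javascript
// Decode the handful of entities that commonly appear in headings and text.
function decodeBasicEntities(text) {
  return text
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&amp;/g, '&'); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// Strip tags and collapse whitespace to get readable text.
function toPlainText(fragment) {
  const withoutTags = fragment
    .replace(/<br\s*\/?>/gi, ' ') // line breaks become spaces
    .replace(/<[^>]*>/g, '');     // every other tag is simply removed
  return decodeBasicEntities(withoutTags).replace(/\s+/g, ' ').trim();
}

function extractPreview(html, maxLength = 200) {
  // Drop script/style bodies so their contents never leak into the result.
  const cleaned = html.replace(/<(script|style)\b[\s\S]*?<\/\1\s*>/gi, '');

  const h1 = cleaned.match(/<h1\b[^>]*>([\s\S]*?)<\/h1\s*>/i);
  const img = cleaned.match(/<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i);
  const p = cleaned.match(/<p\b[^>]*>([\s\S]*?)<\/p\s*>/i);

  const text = p ? toPlainText(p[1]) : '';
  return {
    title: h1 ? toPlainText(h1[1]) : null,
    image: img ? (img[1] ?? img[2] ?? img[3]) : null, // raw src, may be relative
    snippet: text.length > maxLength ? text.slice(0, maxLength).trimEnd() + '...' : text,
  };
}
```

The returned object is a few hundred bytes at most, which is the bandwidth saving mentioned above. Note that a relative `src` would still need to be resolved against the external page's URL before the image can be displayed.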
If you decide to parse the markup on the client instead, the simplest way is to load the HTML into a detached root element and then query it with the usual jQuery methods.
For example:
var $root = $('<div>').html(response.html);
console.log($root.find('h1')); // all h1 tags in response's html
The downside here is that as soon as the browser parses the markup, it automatically starts loading any resources the markup references, such as images.
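One way around that downside, offered as a sketch rather than a jQuery-specific recipe, is to parse the markup with the browser's standard `DOMParser` instead of inserting it into an element of the live page. The document it returns is never rendered, so the browser does not fetch its images or run its scripts while you query it. The function names and the result shape below are illustrative only.

```javascript
// Parse markup into a separate, inert document instead of the live page.
function parseInert(html) {
  return new DOMParser().parseFromString(html, 'text/html');
}

function previewFromHtml(html) {
  const doc = parseInert(html);
  const h1 = doc.querySelector('h1');
  const img = doc.querySelector('img');
  const p = doc.querySelector('p');
  return {
    title: h1 ? h1.textContent.trim() : null,
    // getAttribute returns the value exactly as written in the markup,
    // which may be relative to the external site rather than to your page.
    image: img ? img.getAttribute('src') : null,
    snippet: p ? p.textContent.trim() : '',
  };
}
```

Once you have picked out the single image you actually want, you can create that one `<img>` in your own page, so only that resource is downloaded.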
I don't use .NET, so I can't point you to the exact tools you may need, but I suggest you look up ways to accomplish these two tasks (fetching a URL's markup and parsing HTML) on the server.