How to detect the main article tag like Evernote Clipper does

Tags: javascript, html, evernote

When I tried the Evernote Clipper extension, I noticed a very useful feature: when I click "Article", it accurately picks out the main content of the page. For example, this is the result when I used Evernote Clipper on the page https://developer.chrome.com/extensions/api_index (screenshot: extract article in a page).

I looked at the main article that Evernote picked out on several pages, and the article is in fact extracted from the first article tag. However, Evernote Clipper still works well on pages that don't use that tag.

How does Evernote Clipper do that? Is there a JS library that can detect the element containing the main content of a page? Could you give me some advice on how to do it?

Thank you in advance!

256

asked Jul 21 '14 04:07

yelliver

1 Answer

As far as I know, there is no universal JS library for this. Evernote Clipper uses its own method to extract the "interesting" content from a web page. You can look at the Clipper's code to try to understand the process.

On my Mac, the path to the Chrome extension is:

~/Library/Application Support/Google/Chrome/Default/Extensions/pioclpoplcdbaefihamjohnefbikjilc/6.2_0/

Here's another tool that works in much the same way: https://www.readability.com/

You can also check this thread: What algorithm does Readability use for extracting text from URLs?

Or search Google for terms like 'content extraction js lib'. (I found this one: https://github.com/hatena/extract-content-javascript)
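To give an idea of what such a heuristic can look like, here is a minimal sketch loosely modeled on the paragraph-scoring approach commonly described for Readability-style extractors. It is not Evernote's actual algorithm. The function names (`findMainContent`, `linkDensity`), the `MIN_ARTICLE_TEXT` threshold and the scoring weights are all my own assumptions for illustration. It first tries the first `article` tag, as observed in the question, and otherwise lets every paragraph "vote" for its parent and grandparent, penalising blocks that are mostly links (menus, sidebars).

```javascript
// Hypothetical threshold: an <article> with less text than this is ignored.
const MIN_ARTICLE_TEXT = 200;

// Fraction of an element's text that sits inside <a> tags (0 = none, 1 = all).
function linkDensity(el) {
  const textLength = (el.textContent || '').trim().length;
  if (textLength === 0) return 1;
  let linkLength = 0;
  for (const a of el.querySelectorAll('a')) {
    linkLength += (a.textContent || '').trim().length;
  }
  return Math.min(1, linkLength / textLength);
}

// Returns the element most likely to hold the main content, or null.
function findMainContent(root) {
  // 1. Semantic shortcut: use the first <article> if it has enough text.
  const article = root.querySelector('article');
  if (article && (article.textContent || '').trim().length >= MIN_ARTICLE_TEXT) {
    return article;
  }

  // 2. Fallback: each paragraph gives points to its parent (full)
  //    and its grandparent (half).
  const scores = new Map();
  for (const p of root.querySelectorAll('p')) {
    const text = (p.textContent || '').trim();
    if (text.length < 25) continue; // too short to be meaningful

    // Assumed weights: 1 base point, 1 per comma, up to 3 for length.
    const commas = text.split(',').length - 1;
    const points = 1 + commas + Math.min(3, Math.floor(text.length / 100));

    const parent = p.parentNode;
    if (!parent) continue;
    scores.set(parent, (scores.get(parent) || 0) + points);

    const grandparent = parent.parentNode;
    if (grandparent) {
      scores.set(grandparent, (scores.get(grandparent) || 0) + points / 2);
    }
  }

  // 3. Pick the best candidate after scaling down link-heavy blocks.
  let best = null;
  let bestScore = 0;
  for (const [el, raw] of scores) {
    const score = raw * (1 - linkDensity(el));
    if (score > bestScore) {
      best = el;
      bestScore = score;
    }
  }
  return best;
}
```

In a page or a content script you would call `findMainContent(document)` and then read the returned element's `innerHTML` or `textContent`. Real pages need more care than this (class/id hints, `br`-separated text, comments sections), so treat it as a starting point only.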

Hope this helps

118

answered Sep 28 '22 21:09

Laurent Sarrazin
