
How do I cut HTML so that the closing tags are preserved?

How can I create a preview of a blog post stored as HTML? In other words, how can I "cut" HTML while making sure the tags close properly? Currently, I'm rendering the whole thing on the frontend (with React's dangerouslySetInnerHTML) and then setting overflow: hidden and height: 150px. I would much prefer to cut the HTML directly, so that I don't need to send the entire HTML to the frontend; with 10 blog post previews, that would be a lot of HTML sent that the visitor would never see.

If I had this HTML (say this was the entire blog post):

<body>
   <h1>Test</h1>
   <p>This is a long string of text that I may want to cut.. blah blah blah foo bar bar foo bar bar</p>
</body>

Trying to slice it (to make a preview) wouldn't work because the tags would become unmatched:

<body>
   <h1>Test</h1>
   <p>This is a long string of text <!-- Oops! unclosed tags -->

Really what I want is this:

<body>
   <h1>Test</h1>
   <p>This is a long string of text</p>
</body>

I'm using Next.js, so any Node.js solution should work fine. Is there a way I can do this (e.g. with a library on the Next.js server side)? Or will I just have to parse the HTML myself (server-side) and then fix the unclosed tags?

Monolith asked Dec 26 '20 16:12




1 Answer

post-preview


This was a challenging task: it took me about two days and led me to publish my first NPM package, post-preview, which can solve your problem. Everything is described in its readme, but here is how to use it for your specific problem:

First, install the package using NPM or download its source code from GitHub.

Then run it in the browser before the user submits the blog post, and send the result (the preview) to the backend together with the full post. On the backend, validate the preview's length, sanitize its HTML, and save it to your storage (DB etc.). Whenever you want to show a blog post preview instead of the full post, send the saved preview back to users.

example:

The following code accepts the .blogPostContainer HTMLElement as input and returns a summarized HTML string version of it with a maximum length of 200 characters.

You can see the preview in the .preview element (previewContainer):

js:

import postPreview from "post-preview";
const postContainer = document.querySelector(".blogPostContainer");
const previewContainer = document.querySelector(".preview");
previewContainer.innerHTML = postPreview(postContainer, 200);

html (complete blog post):

<div class="blogPostContainer">
  <div>
    <h2>Lorem ipsum</h2>
    <p>
      Lorem ipsum, dolor sit amet consectetur adipisicing elit. Neque, fugit hic! Quas similique
      cupiditate illum vitae eligendi harum. Magnam quam ex dolor nihil natus dolore voluptates
      accusantium. Reprehenderit, explicabo blanditiis?
    </p>
  </div>
  <p>
    Lorem ipsum dolor sit amet consectetur adipisicing elit. Ipsam non incidunt, corporis debitis
    ducimus eum iure sed ab. Impedit, doloribus! Quos accusamus eos, incidunt enim amet maiores
    doloribus placeat explicabo.Eaque dolores tempore, quia temporibus placeat, consequuntur hic
    ullam quasi rem eveniet cupiditate est aliquam nisi aut suscipit fugit maiores ad neque sunt
    atque explicabo unde! Explicabo quae quia voluptatem.
  </p>
</div>

<div class="preview"></div>

result (blog post preview):

<div class="preview">
  <div class="blogPostContainer">
    <div>
      <h2>Lorem ipsum</h2>
      <p>
        Lorem ipsum, dolor sit amet consectetur adipisicing elit. Neque, fugit hic! Quas similique
        cupiditate illum vitae eligendi ha
      </p>
    </div>
  </div>
</div>

It's a synchronous task, so if you want to run it against multiple posts at once, you'd better run it in a worker for better performance.
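
If you would rather cut the HTML string on the Node.js side, where no DOM element is available, the same idea can be sketched without any library: walk the tags, keep a stack of the elements that are open, stop once the text budget is used up, and close whatever is still on the stack. This is only a rough sketch and not how post-preview works internally; truncateHtml is a hypothetical helper name, and it assumes well-formed input without script/style bodies or ">" inside attribute values.

js:

```javascript
// Void elements never get a closing tag, so they are not pushed on the stack.
const VOID_TAGS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img", "input",
  "link", "meta", "source", "track", "wbr",
]);

// Hypothetical helper: keeps at most `maxChars` characters of text content
// and closes every element that is still open at the cut point.
// Assumptions: well-formed input; whitespace between tags counts toward the
// budget; an entity such as &amp; may be cut in the middle.
function truncateHtml(html, maxChars) {
  // Split the string into tags ("<...>") and runs of text.
  const tokens = html.match(/<[^>]+>|[^<]+/g) || [];
  const openTags = []; // stack of currently open element names
  let out = "";
  let remaining = maxChars;

  for (const token of tokens) {
    if (token[0] !== "<") {
      // Text: copy as much as the remaining budget allows, then stop.
      if (token.length >= remaining) {
        out += token.slice(0, remaining);
        break;
      }
      out += token;
      remaining -= token.length;
      continue;
    }
    // Tag: track opening and closing tags on the stack.
    const m = /^<\s*(\/?)\s*([a-zA-Z][\w-]*)/.exec(token);
    if (m) {
      const name = m[2].toLowerCase();
      if (m[1]) {
        if (openTags[openTags.length - 1] === name) openTags.pop();
      } else if (!VOID_TAGS.has(name) && !token.endsWith("/>")) {
        openTags.push(name);
      }
    }
    out += token;
  }

  // Close whatever is still open, innermost element first.
  while (openTags.length > 0) out += `</${openTags.pop()}>`;
  return out;
}

const post =
  "<body><h1>Test</h1><p>This is a long string of text that I may want to cut</p></body>";
console.log(truncateHtml(post, 33));
```

The output still has to be sanitized before it is rendered, exactly like the full post. For real-world HTML, a proper parser is a safer base than the regular expression used here.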

Thank you for making me do some research!

Good luck!

Abbas Hosseini answered Oct 16 '22 14:10