
Is there a shorter way or a pythonic way to generate custom html that follows a pattern using BeautifulSoup?

I am constructing HTML as part of a bigger project. The construction works, no issues with that. However I fear that the code is too verbose or that I am not using the full power of BeautifulSoup.

For example, I am generating a div of class editorial that wraps four divs of class editorial-title, editorial-image, editorial-subtitle, and editorial-article, in that order.

Sample HTML:

<div class="editorial">
    <div class="editorial-title">Hello</div>
    <div class="editorial-image"><img src="https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg"></div>
    <div class="editorial-subtitle">world</div>
    <div class="editorial-article">Yeah. But Parasite? It should have been Gone with the Wind!</div>
</div>

Here is the long code that works for a small demo version of what I am trying to do:

from bs4 import BeautifulSoup

title = "Hello"
subtitle = "world"
image_url = "https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg"
article = "Yeah. But Parasite? It should have been Gone with the Wind!"

editorial_container = BeautifulSoup('', 'html.parser')
editorial_container_soup = editorial_container.new_tag('div', attrs={"class": "editorial"})

editorial_soup = BeautifulSoup('', 'html.parser')

editorial_title = editorial_soup.new_tag('div', attrs={"class": "editorial-title"})
editorial_image = editorial_soup.new_tag('div', attrs={"class": "editorial-image"})
image = editorial_soup.new_tag('img', src=image_url)
editorial_subtitle = editorial_soup.new_tag('div', attrs={"class": "editorial-subtitle"})
editorial_article = editorial_soup.new_tag('div', attrs={"class": "editorial-article"})

editorial_title.append(title)
editorial_image.append(image)
editorial_subtitle.append(subtitle)
editorial_article.append(article)

editorial_soup.append(editorial_title)
editorial_soup.append(editorial_image)
editorial_soup.append(editorial_subtitle)
editorial_soup.append(editorial_article)

editorial_container_soup.append(editorial_soup)
editorial_container.append(editorial_container_soup)
print(editorial_container.prettify())

It does the job, but I feel it's too long. Is there a more elegant way to achieve this?

jar asked Oct 15 '22 05:10


1 Answer

For the task you are doing, I would strongly consider using a Jinja template instead of BeautifulSoup.

If you use Jinja, you only need to pass a dictionary with the editorial information to an editorial.html template, which could look like this:

<!-- reusable editorial.html -->
<div class="editorial">
    <div class="editorial-title">{{ title }}</div>
    <div class="editorial-image"><img src="{{ image }}"></div>
    <div class="editorial-subtitle">{{ subtitle }}</div>
    <div class="editorial-article">{{ article }}</div>
</div>

Include editorial.html in the following HTML file, which will be loaded by Flask. It serves as the base template in this example. By default, Flask's render_template looks for both files in a templates folder next to the app.

<!-- template.html -->
<html>
    <head>
        <title>Jinja Sample</title>
    </head>
<body>
    {% include "editorial.html" %} 
</body>
</html>
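
To try the include without creating any files, the two templates can be held in memory. This is only a minimal sketch, assuming jinja2 is installed: DictLoader, select_autoescape and the render_page helper are not part of the setup described here, and template.html is trimmed to its body for brevity. It illustrates that variables passed to the outer template are also visible inside the included one.

```python
from jinja2 import DictLoader, Environment, select_autoescape

# In-memory copies of the two templates (template.html trimmed to its body).
templates = {
    "editorial.html": (
        '<div class="editorial">\n'
        '    <div class="editorial-title">{{ title }}</div>\n'
        '    <div class="editorial-image"><img src="{{ image }}"></div>\n'
        '    <div class="editorial-subtitle">{{ subtitle }}</div>\n'
        '    <div class="editorial-article">{{ article }}</div>\n'
        '</div>'
    ),
    "template.html": '<body>\n{% include "editorial.html" %}\n</body>',
}

# Autoescaping is an assumption of this sketch; it escapes markup in the values.
env = Environment(
    loader=DictLoader(templates),
    autoescape=select_autoescape(["html"]),
)


def render_page(editorial_info):
    """Hypothetical helper: render template.html with the dict's keys as variables."""
    return env.get_template("template.html").render(**editorial_info)


page = render_page({
    "title": "Hello",
    "image": "https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg",
    "subtitle": "world",
    "article": "Yeah. But Parasite? It should have been Gone with the Wind!",
})
print(page)
```

With autoescaping switched on, characters such as & or < in a title or article are escaped rather than inserted as raw HTML, which is usually what you want when the text comes from elsewhere.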

Using Flask

Start a Flask app like the following:

from flask import Flask, render_template
app = Flask(__name__)


@app.route("/")
def editorial_test():
    editorial_info = {
        "title" : "Hello",
        "image" : "https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg",
        "subtitle" : "world",
        "article" : "Yeah. But Parasite? It should have been Gone with the Wind!"
    }

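    # Unpack the dict so that title, image, subtitle and article are
    # top-level variables, which is what editorial.html expects.
    return render_template('template.html', **editorial_info)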


if __name__ == '__main__':
    app.run(debug=True)

I haven't tested the code above. Have a look at this excellent tutorial for further clarification.

Render files directly

If you do not want to use Flask, you can render the page directly like this (I'm assuming all files are in the same directory):

import jinja2

# Same data as in the Flask example
editorial_info = {
    "title": "Hello",
    "image": "https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg",
    "subtitle": "world",
    "article": "Yeah. But Parasite? It should have been Gone with the Wind!",
}

# Look for template.html and editorial.html in the current directory
templateLoader = jinja2.FileSystemLoader(searchpath="./")
templateEnv = jinja2.Environment(loader=templateLoader)
TEMPLATE_FILE = "template.html"
template = templateEnv.get_template(TEMPLATE_FILE)
# The dictionary's keys become the template variables
outputText = template.render(editorial_info)

print(outputText)

Output

<html>
    <head>
        <title>Jinja Sample</title>
    </head>
<body>
    <div class="editorial">
    <div class="editorial-title">Hello</div>
    <div class="editorial-image"><img src="https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg"></div>
    <div class="editorial-subtitle">world</div>
    <div class="editorial-article">Yeah. But Parasite? It should have been Gone with the Wind!</div>
</div>
</body>
</html>
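
If you would rather stay with BeautifulSoup, the repeated new_tag/append pairs in the question can be folded into a loop over (class, content) pairs. This is a rough sketch rather than a recommended pattern, assuming bs4 is installed; build_editorial is a hypothetical helper name, and a single soup object is used instead of the two in the question.

```python
from bs4 import BeautifulSoup


def build_editorial(title, image_url, subtitle, article):
    """Hypothetical helper: build the editorial div with one loop."""
    soup = BeautifulSoup("", "html.parser")
    container = soup.new_tag("div", attrs={"class": "editorial"})
    image = soup.new_tag("img", src=image_url)

    # Each pair is (CSS class of the wrapper div, content placed inside it).
    parts = [
        ("editorial-title", title),
        ("editorial-image", image),
        ("editorial-subtitle", subtitle),
        ("editorial-article", article),
    ]
    for css_class, content in parts:
        child = soup.new_tag("div", attrs={"class": css_class})
        child.append(content)  # accepts either a string or a Tag
        container.append(child)

    soup.append(container)
    return soup


editorial = build_editorial(
    "Hello",
    "https://images.dog.ceo/breeds/collie-border/n02106166_2595.jpg",
    "world",
    "Yeah. But Parasite? It should have been Gone with the Wind!",
)
print(editorial.prettify())
```

The order of the list fixes the order of the child divs, so adding another section is a one-line change.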
Philip answered Nov 02 '22 04:11