
Go: How would you "Pretty Print"/"Prettify" HTML?

In Python, PHP, and many other languages, it is possible to take an HTML document and "prettify" it. In Go, this is easily done for JSON and XML (from a struct/interface) using the MarshalIndent function.

Example for XML in Go:

http://play.golang.org/p/aBNfNxTEG1

package main

import (
    "encoding/xml"
    "fmt"
    "os"
)

func main() {
    type Address struct {
        City, State string
    }
    type Person struct {
        XMLName   xml.Name `xml:"person"`
        Id        int      `xml:"id,attr"`
        FirstName string   `xml:"name>first"`
        LastName  string   `xml:"name>last"`
        Age       int      `xml:"age"`
        Height    float32  `xml:"height,omitempty"`
        Married   bool
        Address
        Comment string `xml:",comment"`
    }

    v := &Person{Id: 13, FirstName: "John", LastName: "Doe", Age: 42}
    v.Comment = " Need more details. "
    v.Address = Address{"Hanga Roa", "Easter Island"}

    output, err := xml.MarshalIndent(v, "  ", "    ")
    if err != nil {
        fmt.Printf("error: %v\n", err)
    }

    os.Stdout.Write(output)
}

However, this only works for converting a struct/interface into a []byte. What I want is to take a string of HTML code and indent it automatically. Example:

Raw HTML

<!doctype html><html><head>
<title>Website Title</title>
</head><body>
<div class="random-class">
<h1>I like pie</h1><p>It's true!</p></div>
</body></html>

Prettified HTML

<!doctype html>
<html>
    <head>
        <title>Website Title</title>
    </head>
    <body>
        <div class="random-class">
            <h1>I like pie</h1>
            <p>It's true!</p>
        </div>
    </body>
</html>

How would this be done just using a string?

Xplane asked Jan 14 '14 15:01



2 Answers

EDIT: Found a great way using the XML parser:

package main

import (
    "encoding/xml"
    "fmt"
)

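func main() {
    html := "<html><head><title>Website Title</title></head><body><div class=\"random-class\"><h1>I like pie</h1><p>It's true!</p></div></body></html>"
    // node is a generic element: its name, child elements, and text.
    type node struct {
        // Attr has no struct tag, so element attributes are not decoded into it.
        Attr     []xml.Attr
        XMLName  xml.Name
        Children []node `xml:",any"`
        Text     string `xml:",chardata"`
    }
    x := node{}
    // The input is parsed strictly as XML, so it must be well-formed.
    if err := xml.Unmarshal([]byte(html), &x); err != nil {
        fmt.Println("error:", err)
        return
    }
    buf, err := xml.MarshalIndent(x, "", "\t")
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(string(buf))
}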

This will output the following. Note that the class attribute on the div is lost and the apostrophe is escaped as &#39;:

<html>
    <head>
        <title>Website Title</title>
    </head>
    <body>
        <div>
            <h1>I like pie</h1>
            <p>It&#39;s true!</p>
        </div>
    </body>
</html>
Not_a_Golfer answered Sep 28 '22 22:09


I faced the same problem and solved it by writing an HTML formatting package in Go myself.

Here it is:

GoHTML - HTML formatter for Go

Please check this package out.

Thanks,

Keiji

Keiji Yoshida answered Sep 28 '22 22:09