As per the title: is it possible, and if so how, to produce "clean" HTML code from pandas.DataFrame.to_html()?
I have found that the border=... and justify=... parameters control what is displayed there, but apparently, no matter what value you pass, you always seem to get the corresponding attributes in the output.
Here is a minimal working example:
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.arange(3 * 4).reshape(3, 4))
df.to_html(border=0, justify='inherit')
which produces:
<table border="0" class="dataframe">
<thead>
<tr style="text-align: inherit;">
...
However, I would have been expecting that:
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.arange(3 * 4).reshape(3, 4))
df.to_html(classes=None, border=None, justify=None)
would / should produce:
<table class="dataframe">
<thead>
<tr>
...
instead of:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
...
since both border and style are presentational rather than structural attributes and should be handled through CSS.
So, is there any way to get rid of border from table and style from the tr inside the thead?
As you've already observed, df.to_html(classes=None, border=None, justify=None) ignores values of None and inserts the defaults regardless. There are open requests to change this, but they are not yet in place. As it stands, the only way to remove these hard-coded styles is to manipulate the output string, like so:
import re
html = re.sub(r'<tr[^>]*>', '<tr>', df.to_html().replace('border="1" ', ''))
Removing class="dataframe" can be accomplished the same way, though leaving it in place shouldn't affect most CSS.
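For reuse, the same string manipulation could be wrapped in a small helper. The sketch below is only one way to do it: clean_html is a hypothetical name, not part of pandas, and it assumes the markup that to_html() emits by default (a border attribute on the table tag and an inline style on the header tr). It removes only those two attributes, so any other attributes on the tags are kept.

```python
import re

import numpy as np
import pandas as pd


def clean_html(df: pd.DataFrame) -> str:
    """Hypothetical helper: render df and strip presentational attributes."""
    html = df.to_html()
    # Drop the border attribute from the opening <table> tag, whatever its value.
    html = re.sub(r'(<table\b[^>]*?)\s+border="[^"]*"', r'\1', html, count=1)
    # Drop the inline style attribute from any <tr> tag (the header row by default).
    html = re.sub(r'(<tr\b[^>]*?)\s+style="[^"]*"', r'\1', html)
    return html


df = pd.DataFrame(data=np.arange(3 * 4).reshape(3, 4))
html = clean_html(df)
print(html)
```

Because this relies on the exact text pandas generates, it may need adjusting if a future version changes the default output.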