I have converted a Jupyter/IPython notebook to HTML format and subsequently lost the original ipynb file.
Is there a simple way to generate the original notebook file from the converted HTML file?
Notebooks may be exported to a range of static formats, including HTML (for example, for blog posts), reStructuredText, LaTeX, PDF, and slide shows, via the nbconvert command. Furthermore, any . ipynb notebook document available from a public URL can be shared via the Jupyter Notebook Viewer <nbviewer>.
I recently used BeautifulSoup and JSON to convert html notebook to ipynb. the trick is to look at the JSON schema of a notebook and emulate that. The code selects only input code cells and markdown cells
here is my code
from bs4 import BeautifulSoup import json import urllib.request url = 'http://nbviewer.jupyter.org/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb' response = urllib.request.urlopen(url) # for local html file # response = open("/Users/note/jupyter/notebook.html") text = response.read() soup = BeautifulSoup(text, 'lxml') # see some of the html print(soup.div) dictionary = {'nbformat': 4, 'nbformat_minor': 1, 'cells': [], 'metadata': {}} for d in soup.findAll("div"): if 'class' in d.attrs.keys(): for clas in d.attrs["class"]: if clas in ["text_cell_render", "input_area"]: # code cell if clas == "input_area": cell = {} cell['metadata'] = {} cell['outputs'] = [] cell['source'] = [d.get_text()] cell['execution_count'] = None cell['cell_type'] = 'code' dictionary['cells'].append(cell) else: cell = {} cell['metadata'] = {} cell['source'] = [d.decode_contents()] cell['cell_type'] = 'markdown' dictionary['cells'].append(cell) open('notebook.ipynb', 'w').write(json.dumps(dictionary))
here is part of print(soup.div)
output
div class="container"> <div class="navbar-header"> <button class="navbar-toggle collapsed" data-target=".navbar-collapse" data-toggle="collapse" type="button"> <span class="sr-only">Toggle navigation</span> <i class="fa fa-bars"></i> </button> <a class="navbar-brand" href="/"> <img src="/static/img/nav_logo.svg?v=479cefe8d932fb14a67b93911b97d70f" width="159"/> </a> </div> <div class="collapse navbar-collapse"> <ul class="nav navbar-nav navbar-right"> <li> <a class="active" href="http://jupyter.org">JUPYTER</a> </li> <li> <a href="/faq" title="FAQ"> <span>FAQ</span>
A screen shot of the resulting ipynb file, loaded on my local jupyter and after running all the cells
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With