Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Jupyter (IPython) notebook: Convert an HTML notebook to ipynb

I have converted a Jupyter/IPython notebook to HTML format and subsequently lost the original ipynb file.

Is there a simple way to generate the original notebook file from the converted HTML file?

like image 359
foglerit Avatar asked Mar 10 '15 19:03

foglerit


People also ask

Can we use HTML in Jupyter Notebook?

Notebooks may be exported to a range of static formats, including HTML (for example, for blog posts), reStructuredText, LaTeX, PDF, and slide shows, via the nbconvert command. Furthermore, any . ipynb notebook document available from a public URL can be shared via the Jupyter Notebook Viewer <nbviewer>.


1 Answers

I recently used BeautifulSoup and JSON to convert html notebook to ipynb. the trick is to look at the JSON schema of a notebook and emulate that. The code selects only input code cells and markdown cells

here is my code

from bs4 import BeautifulSoup import json import urllib.request url = 'http://nbviewer.jupyter.org/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb' response = urllib.request.urlopen(url) #  for local html file # response = open("/Users/note/jupyter/notebook.html") text = response.read()  soup = BeautifulSoup(text, 'lxml') # see some of the html print(soup.div) dictionary = {'nbformat': 4, 'nbformat_minor': 1, 'cells': [], 'metadata': {}} for d in soup.findAll("div"):     if 'class' in d.attrs.keys():         for clas in d.attrs["class"]:             if clas in ["text_cell_render", "input_area"]:                 # code cell                 if clas == "input_area":                     cell = {}                     cell['metadata'] = {}                     cell['outputs'] = []                     cell['source'] = [d.get_text()]                     cell['execution_count'] = None                     cell['cell_type'] = 'code'                     dictionary['cells'].append(cell)                  else:                     cell = {}                     cell['metadata'] = {}                      cell['source'] = [d.decode_contents()]                     cell['cell_type'] = 'markdown'                     dictionary['cells'].append(cell) open('notebook.ipynb', 'w').write(json.dumps(dictionary)) 

here is part of print(soup.div) output

div class="container"> <div class="navbar-header"> <button class="navbar-toggle collapsed" data-target=".navbar-collapse" data-toggle="collapse" type="button"> <span class="sr-only">Toggle navigation</span> <i class="fa fa-bars"></i> </button> <a class="navbar-brand" href="/"> <img src="/static/img/nav_logo.svg?v=479cefe8d932fb14a67b93911b97d70f" width="159"/> </a> </div> <div class="collapse navbar-collapse"> <ul class="nav navbar-nav navbar-right"> <li> <a class="active" href="http://jupyter.org">JUPYTER</a> </li> <li> <a href="/faq" title="FAQ"> <span>FAQ</span> 

A screen shot of the resulting ipynb file, loaded on my local jupyter and after running all the cells

enter image description here

like image 193
sgDysregulation Avatar answered Sep 23 '22 18:09

sgDysregulation