
Search Jupyter notebook markdown cells from command line

I use ag to search through my notes. The notes are kept in Markdown files and in Markdown cells inside Jupyter notebooks.

I can search the Markdown files conveniently with ag --markdown .... It would be very handy to do the same with the Jupyter notebook files, but that would require ag to understand the notebook format.

My question: is there a way to search only the Markdown cells of a Jupyter notebook file for a given string? Any pattern matcher is acceptable to me (ag, grep, ack, ...).

P.S. The notebooks are stored as JSON. Here's a sample:

$ head notebook.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "THIS IS A MARKDOWN STRING"
   ]
  },
  {
marnix asked May 09 '18 14:05


2 Answers

I'd use jq to extract the Markdown cells of a Jupyter notebook. For instance, to print the source of every Markdown cell, you could use the following:

$ jq '.cells[] | select(.cell_type == "markdown") | .source[]' notebook.ipynb

jq is fast, and it is also used in far more elaborate setups, for example when cleaning IPython notebooks before committing them to git; see "Using IPython notebooks under version control".
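For readers without jq, the same filter can be sketched with Python's standard json module. This is only an illustration of the selection logic, not part of the answer above: the function name markdown_sources and the inline sample notebook are made up for the example, and it assumes the notebook uses the top-level "cells" layout shown in the question.

```python
import json


def markdown_sources(notebook_text):
    """Yield the source of each Markdown cell as a single string.

    Hypothetical helper: mirrors the jq filter
    .cells[] | select(.cell_type == "markdown") | .source
    """
    nb = json.loads(notebook_text)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "markdown":
            source = cell.get("source", "")
            # "source" may be one string or a list of line strings
            yield source if isinstance(source, str) else "".join(source)


# Made-up sample modelled on the notebook excerpt in the question
sample = """{"cells": [
  {"cell_type": "markdown", "metadata": {},
   "source": ["THIS IS A MARKDOWN STRING"]},
  {"cell_type": "code", "metadata": {}, "source": ["print(1)"],
   "outputs": [], "execution_count": null}
]}"""

print(list(markdown_sources(sample)))
```

The code cell is skipped, so only the Markdown string is printed. The extracted text can then be piped to grep or ag like any other text.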

gregory answered Oct 11 '22 13:10


I don't know if ag can be interfaced with a filter, but to get the Markdown out of a notebook file the following Python code will suffice:

import nbformat
from sys import argv
nb = nbformat.read(argv[1], nbformat.NO_CONVERT)
for cell in nb.cells:
    if cell.cell_type == 'markdown':
        print(cell.source)
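
To go from extracting the Markdown to actually searching it, one possible extension is a small grep-like function that reports the cell index and line number of each match. The sketch below is an assumption-laden illustration, not the answerer's code: search_markdown_cells and the throwaway demo notebook are hypothetical, and it reads the file with the standard json module so that it runs without nbformat installed (the nbformat.read call above could be swapped in instead).

```python
import json
import re
import tempfile
from pathlib import Path


def search_markdown_cells(path, pattern, flags=0):
    """Return (cell_index, line_number, line) for every Markdown line
    in the notebook at `path` that matches the regex `pattern`."""
    regex = re.compile(pattern, flags)
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    hits = []
    for cell_index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue  # ignore code and raw cells
        source = cell.get("source", "")
        # "source" may be one string or a list of line strings
        text = source if isinstance(source, str) else "".join(source)
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((cell_index, line_number, line))
    return hits


# Demo on a made-up notebook written to a temporary directory
demo = {
    "cells": [
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["# a markdown string inside a code cell"]},
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Notes\n", "THIS IS A MARKDOWN STRING"]},
    ],
    "metadata": {}, "nbformat": 4, "nbformat_minor": 5,
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.ipynb"
    demo_path.write_text(json.dumps(demo), encoding="utf-8")
    demo_hits = search_markdown_cells(demo_path, r"markdown string", re.IGNORECASE)

for cell_index, line_number, line in demo_hits:
    print(f"demo.ipynb:cell {cell_index}:{line_number}:{line}")
```

Only the match in the Markdown cell is reported; the identical text in the code cell is ignored, which is the behaviour the question asks for.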
gboffi answered Oct 11 '22 12:10