
How to clear Jupyter Notebook's output and metadata when using git commit?

Outputs and metadata are not useful for code review, and committing them only adds noise to diffs. How can I clear a Jupyter Notebook's output and metadata when committing with git?

ocean11 asked Sep 02 '25 10:09



1 Answer

This answer is based on these 2 posts:

  • Gist by 33eyes
  • Similar approach in Stack Overflow by dirkjot

My approach includes cleaning metadata at the same time.

Add this to your repository's local .git/config:

[filter "strip-notebook-output"]
clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"

Create a .gitattributes file in the directory that contains your notebooks, with this content:

*.ipynb filter=strip-notebook-output
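
To make concrete what the clean filter does to each staged notebook, here is a minimal Python sketch that uses only the standard library. It is not nbconvert's implementation: the function name strip_notebook is made up for illustration, and nbconvert's preprocessors may keep some fields (for example parts of the notebook-level metadata) that this sketch drops.

```python
import json


def strip_notebook(nb_json: str) -> str:
    """Return notebook JSON with outputs, execution counts, and metadata removed.

    Simplified approximation of the two nbconvert preprocessors used above;
    hypothetical helper, not part of nbconvert.
    """
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        # Drop per-cell metadata (tags, collapsed state, timing info, ...).
        cell["metadata"] = {}
        if cell.get("cell_type") == "code":
            # Code cells carry the rendered outputs and the run counter.
            cell["outputs"] = []
            cell["execution_count"] = None
    # Drop notebook-level metadata (kernelspec, language_info, ...).
    nb["metadata"] = {}
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"
```

A git clean filter receives the file on stdin and writes the cleaned version to stdout, which is why the nbconvert command is called with --stdin --stdout. The filter runs when a file is staged, so the notebook in your working directory keeps its outputs.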

Further updates worth reading (updated on 2025-02-19)

The previous .git/config setting requires jupyter nbconvert to be available in your default Python environment.

The following setting (recommended) bypasses the default environment by pointing the filter at an isolated one.

  1. Create an isolated conda env named git and install nbconvert in it:
    conda create -n git --override-channels --strict-channel-priority -c conda-forge --yes python=3.12
    conda activate git
    pip install nbconvert
    
  2. Suppose your conda envs directory is D:/conda/win/envs/. Then set the config file as follows:
    [filter "strip-notebook-output"]
    clean = "D:/conda/win/envs/git/Scripts/jupyter-nbconvert.exe --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"
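
The path above is specific to Windows. As a rough sketch for other setups, the helper below builds the same clean command for a given env directory. It assumes the usual conda layout, where entry points live in Scripts/ on Windows and in bin/ on Linux and macOS; the function name clean_command and the example path /opt/conda/envs/git are hypothetical.

```python
from pathlib import PurePosixPath

# Same nbconvert flags as in the config above.
FLAGS = (
    "--ClearOutputPreprocessor.enabled=True "
    "--ClearMetadataPreprocessor.enabled=True "
    "--to=notebook --stdin --stdout --log-level=ERROR"
)


def clean_command(env_dir: str, windows: bool) -> str:
    """Build the value for `clean = "..."` from an env directory.

    Assumes the common conda layout; check where jupyter-nbconvert
    actually lives in your env before relying on the result.
    """
    sub = "Scripts/jupyter-nbconvert.exe" if windows else "bin/jupyter-nbconvert"
    # Forward slashes are used on both platforms, as in the config above.
    exe = PurePosixPath(env_dir) / sub
    return f"{exe} {FLAGS}"


if __name__ == "__main__":
    print(clean_command("D:/conda/win/envs/git", windows=True))
    print(clean_command("/opt/conda/envs/git", windows=False))
```

Paste the printed string between the quotes after clean = in the [filter "strip-notebook-output"] section.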
    
ocean11 answered Sep 04 '25 23:09
