
Which options exist for defining a Python package with Node.js dependencies?

Currently, I have a few (unpublished) Python packages in local use, which I install (for development purposes) with a Bash script on Linux into an activated (otherwise "empty") virtual environment in the following manner:

cd /root/of/python/package
pip install -r requirements_python.txt # includes "nodeenv"
nodeenv -p # pulls node.js and integrates it into my virtual environment
npm i -g npm # update npm ...
cat requirements_node.txt | xargs npm install -g
pip install -e .

The background is that I have a number of node.js dependencies, JavaScript CLI scripts, which are called by my Python code.

Pros of current approach:

  • dead simple: relies on nodeenv for all required plumbing
  • can theoretically be implemented within setup.py with subprocess.Popen etc.
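
As a rough illustration of that second point, the shell steps above could be expressed with subprocess from Python, which removes the dependency on Bash. This is only a sketch under assumptions: the function names are made up for illustration, it assumes nodeenv can be invoked as `python -m nodeenv`, and it assumes an already activated virtual environment.

```python
import subprocess
import sys


def parse_requirements(text):
    """Return the package specs listed in requirements-style text.

    Blank lines and lines starting with '#' are skipped.
    """
    specs = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            specs.append(line)
    return specs


def build_commands(node_packages, python_exe=sys.executable, npm="npm"):
    """Translate the shell steps into argument lists, so no shell is needed."""
    commands = [
        [python_exe, "-m", "pip", "install", "-r", "requirements_python.txt"],
        [python_exe, "-m", "nodeenv", "-p"],  # counterpart of the `nodeenv -p` step
        [npm, "install", "-g", "npm"],
    ]
    if node_packages:  # avoid calling `npm install -g` without package names
        commands.append([npm, "install", "-g", *node_packages])
    commands.append([python_exe, "-m", "pip", "install", "-e", "."])
    return commands


def run_all(commands, cwd="."):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all(build_commands(parse_requirements(text)))` with the contents of requirements_node.txt as `text` would then replace the script. On Windows, npm is usually a `.cmd` wrapper, so the `npm` argument would probably have to be resolved with `shutil.which("npm")` after the nodeenv step.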

Cons of current approach:

  • Unix-like platforms with Bash only
  • "hard" to distribute my packages, say on PyPI
  • requires a virtual environment
  • has potentially "interesting" side effects if a package is installed globally
  • potentially interferes with a pre-existing configuration / "deployment" of nodeenv in the current virtual environment

What is the canonical (if there is one), or at least a sane and potentially cross-platform, approach to defining Node.js dependencies for a Python package so that it becomes publishable?

Why is this question even relevant? JavaScript is not just for web development (any more). There are also interesting (relevant) data processing tools out there. If you do not want to miss / ignore them, well, welcome to this particular form of hell.


I recently came across calmjs, which appears to be what I am looking for. I have not experimented much with it yet and it also appears to be a relatively young project.

I started an issue there asking a similar question.


EDIT (1): Interesting resource: JavaScript versus Research Computing - A Brief Guide for Those Who Regret That This Has Become Necessary


EDIT (2): I started an issue against nodeenv, asking how I could make a project depend on it.

s-m-e asked Apr 21 '18 08:04




2 Answers

(Disclaimer: I am the author of calmjs)

After mulling over this particular issue for a few more days, I think this question actually encapsulates multiple problems, which may or may not be orthogonal to each other depending on one's point of view. Some of them (the list is not exhaustive):

  1. How can a developer ensure that they have all the information required to install the package when given one?
  2. How does a project ensure that the ground it is standing on is solid (i.e. has all the dependencies required)?
  3. How easy is it for the user to install the given project?
  4. How easy is it to reproduce a given build?

For a single-language, single-platform project, the first question is trivially answered: just use whatever package management solution is implemented for that language (e.g. Python - PyPI, Node.js - npm). The other questions generally fall into place.

For a multi-language, multi-platform project, this is where it completely falls apart. Long story short, this is why projects generally have multiple sets of installation instructions for whatever version of Windows, Mac or Linux (of various mainstream distros) they support, especially for binary distributions. These address the third question, so that installation is easy for the end user (which usually ends up being doable, but not necessarily easy).

Developers and system integrators, who are definitely more interested in questions 2 and 4, likely want an automation script for whatever platform they are on. This is more or less what you already have, except that it only works on Linux, or wherever Bash is available. This also raises the question: how does one ensure Bash is available on the system? Some system administrators may prefer some other shell, so we are back to the same problem again, but instead of asking whether Node.js is there, we have to ask whether Bash is there. So this problem is basically unsolvable unless a line is drawn somewhere.

The first question hasn't really been addressed yet, and I am going to make this fun by asking it in this manner: given a package from npm that requires a Python package, how does one specify a dependency on PyPI? It turns out such a project exists: nopy. I have not used it before, but at a casual glance it provides a specific way to record dependency information in the package.json file, which is the standard way for Node.js packages to convey information about themselves. Do note that it has a non-standard way of managing Python packages; however, given that it uses whatever Python is available, it will probably do the right thing if a Python virtual environment is activated. Doing it this way also means that dependants of a Node.js package may have a way to figure out the Python dependencies declared by their Node.js dependencies. But note that without something else on top of it (or some other ground/line), there is no way to assert from within the environment that it is guaranteed to do what needs to be done.

Naturally, coming back to Python, this question has been asked before (but not necessarily in a useful way specifically to you as the contexts are all different):

  • javascript dependencies in python project
  • How to install npm package from python script?
  • Django, recommended way to declare and solve JavaScript dependencies in blocks
  • pip: dependency on javascript library

Anyway, calmjs only solves problem 1, i.e. it lets developers figure out the Node.js packages they need from a given Python package, and to a lesser extent it assists with problem 4. Without the guarantees of 2 and 3, however, the overall problem is not exactly solved.
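
To make problem 1 concrete without tying it to any particular tool, here is a small library-free sketch of the underlying idea. It is not calmjs's API; the function names and the shape of the input mapping are hypothetical. Each Python package declares the npm packages it needs, and the declarations are merged into one package.json that npm can install from.

```python
import json


def merge_node_dependencies(declarations):
    """Merge per-package Node.js dependency mappings into one mapping.

    `declarations` maps a Python package name to a dict of
    {npm package name: version range}. Conflicting ranges raise an error
    instead of being resolved silently.
    """
    merged = {}
    origin = {}
    for py_pkg, deps in declarations.items():
        for npm_pkg, version in deps.items():
            if npm_pkg in merged and merged[npm_pkg] != version:
                raise ValueError(
                    f"{npm_pkg}: {origin[npm_pkg]} wants {merged[npm_pkg]}, "
                    f"{py_pkg} wants {version}"
                )
            merged[npm_pkg] = version
            origin[npm_pkg] = py_pkg
    return merged


def render_package_json(name, dependencies):
    """Return package.json text listing the merged dependencies."""
    document = {"name": name, "dependencies": dependencies}
    return json.dumps(document, indent=2, sort_keys=True)
```

Note that this only records what is needed. Whether npm is actually present to act on the generated file is exactly the part that remains unsolved.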

From the point of view of Python dependency management, there is no way to guarantee that the required external tools are available until their use is attempted (it will either work or not work; the same holds from the Node.js side, as explained earlier. Thank you for your question on the issue tracker, by the way). If this particular guarantee is required, many system integrators would make use of their favorite operating-system-level package manager (e.g. dpkg/apt, rpm/yum, or whatever else on Linux, Homebrew on OS X, perhaps Chocolatey on Windows), but again this requires further dependencies to be installed. Hence, if multiple platforms are to be supported, there is no general solution unless one reduces the scope, or has some kind of standard continuous integration that generates working installation images, which one then deploys onto whatever virtualisation services the organisation uses (just an example).
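
Since availability can only be checked at the time of use, about the best a Python package can do on its own is to fail early with a clear message. A minimal sketch, assuming the tools are expected on PATH (the `require_tool` helper and the exception class are hypothetical names):

```python
import shutil


class MissingToolError(RuntimeError):
    """Raised when a required external command is not on PATH."""


def require_tool(name, hint=""):
    """Return the full path of `name`, or raise with an actionable message."""
    path = shutil.which(name)
    if path is None:
        message = f"required external tool {name!r} was not found on PATH"
        if hint:
            message += f" ({hint})"
        raise MissingToolError(message)
    return path
```

Calling something like `require_tool("node", hint="install Node.js or run nodeenv -p")` before the first subprocess call does not install anything. It only turns an obscure failure into an explicit one.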

Without all the specific baselines, it is very difficult to provide an answer to this question that is satisfactory for all parties involved.

metatoaster answered Oct 13 '22 18:10


What you describe is certainly not the simplest problem. For Python alone, companies have come up with all kinds of packaging methods (e.g. Twitter's pex, Spotify's dh-virtualenv, or even grocker, which shifts Python deployments into container space). (Plug: I did a presentation at PyCon Balkan '18 on packaging Python applications.)

That said, one very hacky way I could think of would be:

  • Find a way to compile your Node apps into a single binary. There is pkg (a blogpost about it), which

[...] enables you to package your Node.js project into an executable that can be run even on devices without Node.js installed.

This way the Node tools would be taken care of.

  • Next, take these binary blobs and add them (somehow) as scripts to your Python package, so that they are distributed along with it and end up somewhere your actual Python code can pick them up and execute them.
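
For the "pick them up" part, a sketch of how the Python side might locate the right binary at runtime. The `bin` directory inside the package and the `<tool>-linux`, `<tool>-macos`, `<tool>-win.exe` file names are assumptions made for illustration; adjust them to whatever your build step actually produces.

```python
import platform
from pathlib import Path

# Assumed file name suffix per operating system (hypothetical layout).
_SUFFIX = {"Linux": "linux", "Darwin": "macos", "Windows": "win.exe"}


def bundled_binary_path(tool, package_dir, system=None):
    """Return the expected path of the bundled binary for this platform.

    `system` defaults to platform.system(); passing it explicitly makes
    the function easy to test.
    """
    system = system or platform.system()
    try:
        suffix = _SUFFIX[system]
    except KeyError:
        raise RuntimeError(
            f"no bundled {tool!r} binary for platform {system!r}"
        ) from None
    return Path(package_dir) / "bin" / f"{tool}-{suffix}"
```

Inside the package, this could be called as `bundled_binary_path("mytool", Path(__file__).parent)` and the result passed to `subprocess.run`. The binaries themselves would still have to be declared as package data so that they end up in the built distribution.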

Upsides:

  • Users do not need Node.js on their machine (which is probably what they expect when they just want to pip install something).
  • Your package gets more self-contained by including binaries.

Downsides:

  • Your Python package will include binaries, which is less common.
  • Containing binaries means that you will have to prepare versions for all platforms. Not impossible, but more work.
  • You will have to expand your package creation pipeline (Makefile, setup.py, or other) a bit to make this simple and repeatable.
  • Your package gets significantly larger (which is probably the least of the problems today).

miku answered Oct 13 '22 19:10