How to get all remote repositories of a Maven project hierarchy?



I'm redirecting all Maven repository access to an Artifactory instance with the following ~/.m2/settings.xml:

<?xml version="1.0" encoding="UTF-8"?>
<settings xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.1.0 http://maven.apache.org/xsd/settings-1.1.0.xsd" xmlns="http://maven.apache.org/SETTINGS/1.1.0"
<!-- unclear what version changes -> use 1.1.0 because it's higher -->
          <snapshots />
and thus have to add any extra remote repositories that a project (and its child projects) specifies to the Artifactory instance. I currently use

find . -name pom.xml -exec grep -B 5 -C 5 '<repository>' {} +

which isn't very handy when a URL is a variable declared elsewhere, and it doesn't skip duplicates. It's not the worst thing in the world, but maybe there's an improvement available.
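To make the two shortcomings of the find/grep approach concrete, here is a minimal Python sketch that scans every pom.xml, skips duplicates, and substitutes `${...}` variables. The function names `repositories_in_pom` and `repositories_in_tree` are made up for illustration. It assumes the POMs use the standard `http://maven.apache.org/POM/4.0.0` namespace, and it only resolves properties declared in the same POM, so inherited properties stay unresolved. It also picks up the `<repository>` element under `distributionManagement`, which may or may not be wanted.

```python
import os
import re
import xml.etree.ElementTree as ET

NS = "{http://maven.apache.org/POM/4.0.0}"  # namespace used by Maven POM files


def repositories_in_pom(pom_xml):
    """Return the set of (id, url) pairs for <repository> elements in one POM."""
    root = ET.fromstring(pom_xml)
    # Properties declared in this POM only; inherited ones are NOT resolved.
    props = {p.tag[len(NS):]: (p.text or "").strip()
             for p in root.findall(NS + "properties/*")}
    found = set()
    for repo in root.iter(NS + "repository"):
        repo_id = (repo.findtext(NS + "id") or "").strip()
        url = (repo.findtext(NS + "url") or "").strip()
        # Substitute ${name} where the property is known, else leave it as-is.
        url = re.sub(r"\$\{([^}]+)\}",
                     lambda m: props.get(m.group(1), m.group(0)), url)
        found.add((repo_id, url))
    return found


def repositories_in_tree(top="."):
    """Collect unique (id, url) pairs from every pom.xml below `top`."""
    found = set()
    for dirpath, _dirs, files in os.walk(top):
        if "pom.xml" in files:
            # Read bytes so the XML parser honours the declared encoding.
            with open(os.path.join(dirpath, "pom.xml"), "rb") as fh:
                found |= repositories_in_pom(fh.read())
    return found
```

Calling `sorted(repositories_in_tree("."))` from the root of the project hierarchy gives a de-duplicated list; because it is a set of (id, url) pairs, the same URL declared under two different ids still shows up twice.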

The following doesn't work:

  • mvn versions:display-dependency-updates doesn't display remote repositories
  • mvn dependency:list-repositories only works until the first dependency can't be fetched while the proxy is enabled. I then have to figure out where to get that dependency from and either add the researched remote repository to Artifactory or move ~/.m2/settings.xml aside, which is less handy than the find command above

The solution should work recursively, i.e. include all repositories in all child projects, the children's child projects, etc.

Ideally, a solution wouldn't require downloading the dependencies directly from the remote repositories first, bypassing the proxy, since I'd like to transfer them through the Maven proxy immediately if possible. It's not a requirement, though.

A somewhat hacky approach could be those two steps:

  1. Get the effective POMs. Note that the goal below generates a single XML file containing all POMs at once, and that variables will already be resolved in it.

    mvn help:effective-pom -Doutput="effective-pom.xml"
  2. Parse the resulting XML file and gather the repositories, e.g., using a Python script gather-repos.py.

    #!/usr/bin/env python3
    import sys
    import xml.etree.ElementTree as ET

    NS = 'http://maven.apache.org/POM/4.0.0'
    root = ET.parse('effective-pom.xml').getroot()
    repositories = {}  # keyed by repository id, so duplicates are dropped
    for node in root.iter('{%s}repository' % NS):
        repo_id = node.findtext('{%s}id' % NS)
        repositories[repo_id] = node
    for node in repositories.values():
        # encoding='unicode' makes ElementTree write str, which sys.stdout expects
        ET.ElementTree(node).write(sys.stdout, encoding='unicode', default_namespace=NS)

Of course, the script can then be run via

chmod +x gather-repos.py
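If only the URLs are needed for entering them into Artifactory, a variation on the script above could look like the following sketch. The function name `repository_urls` is hypothetical, and the choice to also include `<pluginRepository>` entries is an assumption about what is useful here, not something the script above does. It groups by URL instead of by id, so the same repository declared under several ids is listed once.

```python
import xml.etree.ElementTree as ET

NS = "{http://maven.apache.org/POM/4.0.0}"  # namespace of the effective POM elements


def repository_urls(effective_pom_xml):
    """Map each distinct repository URL to the sorted list of ids that use it."""
    root = ET.fromstring(effective_pom_xml)
    urls = {}
    # Walk the whole tree, so every project in the effective POM is covered.
    for tag in ("repository", "pluginRepository"):
        for repo in root.iter(NS + tag):
            url = (repo.findtext(NS + "url") or "").strip()
            repo_id = (repo.findtext(NS + "id") or "").strip()
            if url:  # skip entries without a URL
                urls.setdefault(url, set()).add(repo_id)
    return {url: sorted(ids) for url, ids in urls.items()}
```

It can be fed the file from step 1, for example by reading effective-pom.xml in binary mode and passing the bytes to `repository_urls`, then printing the keys of the result.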
