
GIT: finding a list of files (e.g. using git ls-files) including submodules

Tags: git, github

I've been trying to figure out how to get a list of all files in a git repo, including those contained within submodules. Currently, git ls-files lists the top-level submodule directory but not the files inside the submodule. On further investigation, I found that git submodule can recursively visit all of the submodules and run git ls-files in each:

git submodule --quiet foreach --recursive "git ls-files"

The only problem is that the resulting paths are relative to each submodule, but I need paths relative to the top-level repo. For example, given:

/some/path/to/gitrepo/source/submodule/[file1, file2]

What I see is:

file1
file2

What I would like to see is:

source/submodule/file1
source/submodule/file2

Is there a way to do this? From the documentation, there are some pre-defined variables ($name, $path, $sha1 and $toplevel) but I'm not sure how to use these to get the desired results.

Sheldon asked Apr 25 '16 18:04




2 Answers

Another approach is possible with Git 2.11+ (Q4 2016):

git ls-files --recurse-submodules

See commit 75a6315, commit 07c01b9, commit e77aa33, commit 74866d7 (07 Oct 2016) by Brandon Williams (mbrandonw).
(Merged by Junio C Hamano -- gitster -- in commit 1c2b1f7, 26 Oct 2016)

ls-files: optionally recurse into submodules

"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files).

This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject.

As shown in this test, the output would include the full path of the file, starting from the main parent repo.
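As a rough sketch of the xargs pipeline described above: the helper below runs an arbitrary command over every tracked file, submodules included. The name for_each_tracked_file is made up for this example, and pairing -z with xargs -0 is my own addition to keep paths with spaces intact (it assumes your Git version passes -z through when recursing).

```shell
# Hypothetical helper (not a Git command): run a command over every tracked
# file in the superproject and its submodules.
# Call it from the top level of the superproject.
for_each_tracked_file() {
    # -z makes git emit NUL-terminated paths; xargs -0 reads them back,
    # so paths containing spaces or newlines are passed through intact.
    git ls-files --recurse-submodules -z | xargs -0 "$@"
}
```

For instance, for_each_tracked_file wc -l would count lines in every tracked file; wc -l is only a placeholder for whatever you actually want to run.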

The git ls-files documentation now includes:

--recurse-submodules

Recursively calls ls-files on each submodule in the repository.
Currently there is only support for the --cached mode.


Git 2.13 (Q2 2017) adds to the ls-files --recurse-submodules robustness:

See commit 2cfe66a, commit 2e5d650 (13 Apr 2017) by Jacob Keller (jacob-keller).
(Merged by Junio C Hamano -- gitster -- in commit 2d646e3, 24 Apr 2017)

ls-files: fix recurse-submodules with nested submodules

Since commit e77aa33 ("ls-files: optionally recurse into submodules", 2016-10-07, git 2.11) ls-files has known how to recurse into submodules when displaying files.

Unfortunately this fails for certain cases, including when nesting more than one submodule, called from within a submodule that itself has submodules, or when the GIT_DIR environment variable is set.

Prior to commit b58a68c ("setup: allow for prefix to be passed to git commands", 2017-03-17, git 2.13-rc0) this resulted in an error indicating that --prefix and --super-prefix were incompatible.

After this commit, instead, the process loops forever with a GIT_DIR set to the parent and continuously reads the parent submodule files and recursing forever.

Fix this by preparing the environment properly for submodules when setting up the child process. This is similar to how other commands such as grep behave.


As noted with Git 2.29 (Q4 2020), the submodule.recurse config setting has no effect on git ls-files.

See commit 7d15fdb (04 Oct 2020) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit 9d19e17, 05 Oct 2020)

gitsubmodules doc: invoke 'ls-files' with '--recurse-submodules'

Signed-off-by: Philippe Blain

git ls-files(man) was never taught to respect the submodule.recurse configuration variable, and it is too late now to change that, but still the command is mentioned in 'gitsubmodules(7)' as if it does respect that config.

Adjust the call in 'gitsubmodules(7)' by calling 'ls-files' with the '--recurse-submodules' option.

gitsubmodules now includes in its man page:

git ls-files --recurse-submodules

[NOTE]
git ls-files also requires its own --recurse-submodules flag.
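Since the config variable is ignored here, one possible workaround is a Git alias that always passes the flag. This is only a sketch: the alias name lsr and the helper add_lsr_alias are arbitrary names chosen for illustration.

```shell
# Hypothetical helper: install a repository-local alias named "lsr" that
# always recurses into submodules. Run it once from inside the repository.
add_lsr_alias() {
    # Stored in .git/config; use "git config --global" instead if you
    # want the alias available in every repository.
    git config alias.lsr 'ls-files --recurse-submodules'
}
```

After calling add_lsr_alias, running git lsr should behave like git ls-files --recurse-submodules.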


With Git 2.36 (Q2 2022), git ls-files --stage --recurse-submodules is also supported.

VonC answered Oct 12 '22 23:10


Take a look at the git submodule documentation, which says:

foreach

Evaluates an arbitrary shell command in each checked out submodule. The command has access to the variables $name, $path, $sha1 and $toplevel: $name is the name of the relevant submodule section in .gitmodules, $path is the name of the submodule directory relative to the superproject, $sha1 is the commit as recorded in the superproject, and $toplevel is the absolute path to the top-level of the superproject.

Given the above information, you can do something like:

git submodule foreach 'git ls-files | sed "s|^|$path/|"'

In this example, we're simply taking the output from git ls-files in a submodule and using sed to prepend the value of $path, which is the path of the submodule relative to the parent project's toplevel directory.
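A possible variant for nested submodules, sketched under some assumptions: $path is relative to the immediate superproject, so with --recursive the prefix would be wrong for submodules inside submodules. Newer Git versions also set $displaypath, which as far as I know is relative to the directory you ran the command from. The helper name list_submodule_files is hypothetical, and --quiet is added to drop the "Entering ..." lines.

```shell
# Hypothetical helper: list the files of every checked-out submodule
# (nested ones included), prefixed with the submodule's path.
# Assumes a Git version that provides $displaypath to foreach.
list_submodule_files() {
    # Single quotes keep $displaypath from being expanded by the outer
    # shell; git evaluates the command inside each submodule.
    # Note: the sed expression would break on paths containing "|" or "&".
    git submodule --quiet foreach --recursive \
        'git ls-files | sed "s|^|$displaypath/|"'
}
```

This only prints files that live inside submodules; the superproject's own files would still need a separate git ls-files.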

larsks answered Oct 13 '22 00:10