
Split array of file paths into hierarchical object in JavaScript

I'm using JSZip, which gives me a list of folders and files when I unzip an archive. For example, when I run

files.forEach((relativePath, file) => {
  console.log(relativePath);
});

I get:

three-dxf-master/
three-dxf-master/.DS_Store
three-dxf-master/.gitignore
three-dxf-master/LICENSE
three-dxf-master/README.md
three-dxf-master/bower.json
three-dxf-master/bower_components/

Some of these items are directories and some are files. I can tell which ones are directories by checking file.dir. I would like to split this into a hierarchical data structure. I want to split it up like so:

{
  "three-dxf-master": [
    ".DS_Store",
    ".gitignore",
    "LICENSE",
    "README.md",
    "bower.json",
    {
      "bower_components": [
        ".DS_Store",
        {
          "dxf-parser": [...]
        }
      ]
    }
  ]
}

This way I can send it over to Vue and format it in a nice file viewer. I looked through the docs and I don't see an easy way to create a hierarchical data structure for the files. I started looking into this by grabbing the last segment of each file path after a split.

Johnston asked Apr 15 '17 22:04


3 Answers

Here is some sample code that also handles files at the root.

An explanation follows the snippet.

var paths = [
    "three-dxf-master/",
    "three-dxf-master/.DS_Store",
    "three-dxf-master/.gitignore",
    "three-dxf-master/LICENSE",
    "three-dxf-master/README.md",
    "three-dxf-master/bower.json",
    "three-dxf-master/bower_components/",
    "three-dxf-master/bower_components/.DS_Store",
    "three-dxf-master/bower_components/dxf-parser/",
    "three-dxf-master/bower_components/dxf-parser/foo",
    "three-dxf-master/bower_components/dxf-parser/bar",
    "three-dxf-master/dummy_folder/",
    "three-dxf-master/dummy_folder/foo",
    "three-dxf-master/dummy_folder/hello/",
    "three-dxf-master/dummy_folder/hello/hello",
]

// Extract a filename from a path
function getFilename(path) {
    return path.split("/").filter(function(value) {
        return value && value.length;
    }).reverse()[0];
}

// Find sub paths
function findSubPaths(path) {
    // Escape every regexp metacharacter (".", "+", "(" ...) in the parent path
    var rePath = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    var re = new RegExp("^" + rePath + "[^\\/]*\\/?$");
    return paths.filter(function(i) {
        return i !== path && re.test(i);
    });
}

// Build tree recursively
function buildTree(path) {
    path = path || "";
    var nodeList = [];
    findSubPaths(path).forEach(function(subPath) {
        var nodeName = getFilename(subPath);
        if (/\/$/.test(subPath)) {
            var node = {};
            node[nodeName] = buildTree(subPath);
            nodeList.push(node);
        } else {
            nodeList.push(nodeName);
        }
    });
    return nodeList;
}

// Build tree from root
var tree = buildTree();

// By default, tree is an array
// If it contains only one element which is an object, 
// return this object instead to match OP request
if (tree.length == 1 && (typeof tree[0] === 'object')) {
    tree = tree[0];
}

// Serialize tree for debug purposes
console.log(JSON.stringify(tree, null, 2));

Explanation

function getFilename(path) {
    return path.split("/").filter(function(value) {
        return value && value.length;
    }).reverse()[0];
}

To get the filename, the path is split on /.

path/to/dir/ => ['path', 'to', 'dir', '']

path/to/file => ['path', 'to', 'file']

Only non-empty values are kept, which handles directory paths (they end with / and so produce a trailing empty string).

The filename is the last value of the array. To get it, we simply reverse the array and take the first element.
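As a quick check of the behaviour described above, here is a minimal sketch. lastSegment is a hypothetical name, not part of the answer's code; it uses pop() on the filtered array, which should give the same result as reversing and taking the first element.

```javascript
// Hypothetical helper: same idea as getFilename, using pop() instead of reverse()[0]
function lastSegment(path) {
    var parts = path.split("/").filter(function (value) {
        return value.length > 0; // drop the empty string left by a trailing "/"
    });
    return parts.pop(); // undefined when the path has no segments
}

console.log("a/b/dir/".split("/"));   // [ 'a', 'b', 'dir', '' ]
console.log(lastSegment("a/b/dir/")); // dir
console.log(lastSegment("a/b/file")); // file
```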

function findSubPaths(path) {
    // Escape every regexp metacharacter (".", "+", "(" ...) in the parent path
    var rePath = path.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    var re = new RegExp("^" + rePath + "[^\\/]*\\/?$");
    return paths.filter(function(i) {
        return i !== path && re.test(i);
    });
}

To find the sub paths of a path, we filter the paths list.

The filter uses a regular expression (a demo is available here) to test whether a path starts with the parent path, is followed by a single segment containing no further /, and ends either with a / (a directory path) or at the end of the string (a file path). This keeps direct children only.

If the tested path isn't equal to the parent path and matches the regexp, the filter accepts it. Otherwise it is rejected.
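To see that such a pattern keeps direct children only, here is a small sketch on illustrative data. The names samplePaths, childPattern and directChildren are hypothetical, and the sketch escapes regexp metacharacters in the parent path before building the pattern.

```javascript
// Illustrative data only
var samplePaths = [
    "root/",
    "root/a.txt",
    "root/sub/",
    "root/sub/b.txt"
];

// Build the "direct child of parent" pattern, escaping regexp metacharacters
function childPattern(parent) {
    var escaped = parent.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return new RegExp("^" + escaped + "[^/]*/?$");
}

function directChildren(parent) {
    var re = childPattern(parent);
    return samplePaths.filter(function (p) {
        return p !== parent && re.test(p);
    });
}

console.log(directChildren(""));      // [ 'root/' ]
console.log(directChildren("root/")); // [ 'root/a.txt', 'root/sub/' ]
```

Note that "root/sub/b.txt" is not returned for "root/": it contains a second /, so it is only picked up when the recursion reaches "root/sub/".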

function buildTree(path) {
    path = path || "";
    var nodeList = [];
    findSubPaths(path).forEach(function(subPath) {
        var nodeName = getFilename(subPath);
        if (/\/$/.test(subPath)) {
            var node = {};
            node[nodeName] = buildTree(subPath);
            nodeList.push(node);
        } else {
            nodeList.push(nodeName);
        }
    });
    return nodeList;
}

Now that we have functions to extract a filename from a path and to find sub paths, building the tree is straightforward. The tree is a nodeList, an array of nodes.

If a sub path ends with /, it's a directory: we call buildTree recursively and append the resulting node to nodeList.

Otherwise we simply add the filename to nodeList.

Additional code

if (tree.length == 1 && (typeof tree[0] === 'object')) {
    tree = tree[0];
}

By default, the returned tree is an array.

To match the OP's request, if it contains only one element and that element is an object, we return this object instead.

Stephane Janicaud answered Oct 19 '22 10:10


You can split the lines into records, then split each record into fields. When processing, determine if a field is a directory or file. If a directory, see if it's a subdirectory and create it if it doesn't exist. Then move into it.

If it's a file, just push into the current directory.
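The field split can be sketched with a regular expression that keeps the trailing slash on directory names, so a field is a directory exactly when it ends in /. The helper names splitFields and isDirectoryField are hypothetical and only used for illustration.

```javascript
// Split one record into fields; directory fields keep their trailing "/"
function splitFields(record) {
    return record.match(/[^\/]+\/?/g) || []; // match() returns null for an empty record
}

// A field is treated as a directory when it ends with "/"
function isDirectoryField(field) {
    return /\/$/.test(field);
}

console.log(splitFields("three-dxf-master/bower_components/.DS_Store"));
// [ 'three-dxf-master/', 'bower_components/', '.DS_Store' ]
console.log(splitFields("three-dxf-master/").map(isDirectoryField)); // [ true ]
console.log(splitFields("")); // []
```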

The format in the OP does not allow for files in the root directory, so the following throws an error if one is encountered. To allow files in the root, the base would have to be an array, but in the OP's format it is an object.

The following also allows paths to be in any order and to be created non-sequentially, e.g. it will accept:

foobar/fum

it doesn't need:

foobar/
foobar/fum

Hopefully the comments are sufficient.

var data = 'three-dxf-master/' +
           '\nfoobar/fumm' +
           '\nthree-dxf-master/.DS_Store' +
           '\nthree-dxf-master/.gitignore' +
           '\nthree-dxf-master/LICENSE' +
           '\nthree-dxf-master/README.md' +
           '\nthree-dxf-master/bower.json' +
           '\nthree-dxf-master/bower_components/' +
           '\nthree-dxf-master/bower_components/.DS_Store' +
           '\nthree-dxf-master/bower_components/dxf-parser/';

function parseData(data) {
  var records = data.split(/\n/);
  var result = records.reduce(function(acc, record) {
    var fields = record.match(/[^\/]+\/?/g) || [];
    var currentDir = acc;
       
    fields.forEach(function (field, idx) {

      // If field is a directory...
      if (/\/$/.test(field)) {
        
        // If first one and not an existing directory, add it
        if (idx == 0) {
          if (!(field in currentDir)) {
            currentDir[field] = [];
          }
          
          // Move into subdirectory
          currentDir = currentDir[field];
          
        // If not first, see if it's a subdirectory of currentDir
        } else {
          // Look for field as a subdirectory of currentDir
          var subDir = currentDir.filter(function(element){
            return typeof element == 'object' && element[field];
          })[0];
          
          // If didn't find subDir, add it and set as currentDir
          if (!subDir) {
            var t = Object.create(null);
            t[field] = [];
            currentDir.push(t);
            currentDir = t[field];
            
          // If found, set as currentDir
          } else {
            currentDir = subDir[field];
          }
        }
        
      // Otherwise it's a file. Make sure currentDir is a directory and not the root
      } else {
        if (Array.isArray(currentDir)) {
          currentDir.push(field);
          
        // Otherwise, must be at root where files aren't allowed
        } else {
          throw new Error('Files not allowed in root: ' + field);
        }
      }
    });
    
    return acc;
    
  }, Object.create(null));
  return result;
}

//console.log(JSON.stringify(parseData(data)));
console.log(parseData(data));
RobG answered Oct 19 '22 12:10


Info

I went looking for another implementation after trying all the solutions on this page; each had bugs.

Finally I found this one.

Solution

To use the algorithm, you will need to prepend "/" to each path that JSZip outputs; a forEach loop can do this.
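A minimal sketch of that preparation step, assuming JSZip-style relative paths like those in the question. toRootedPath is a hypothetical helper. Stripping the trailing "/" from directory entries is an extra assumption on my part: without it, split('/') leaves an empty last part, which the algorithm below would add as a node with an empty name.

```javascript
// Hypothetical helper: turn a JSZip-style relative path into "/a/b" form
function toRootedPath(relativePath) {
    var trimmed = relativePath.replace(/\/+$/, ""); // drop trailing "/" on directory entries
    return "/" + trimmed;
}

// Illustrative input, shaped like the paths in the question
var zipPaths = [
    "three-dxf-master/",
    "three-dxf-master/LICENSE",
    "three-dxf-master/bower_components/"
];

var rooted = zipPaths.map(toRootedPath);
console.log(rooted);
// [ '/three-dxf-master', '/three-dxf-master/LICENSE', '/three-dxf-master/bower_components' ]
```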

var paths = [
    '/FolderA/FolderB/FolderC/Item1',
    '/FolderA/FolderB/Item1',
    '/FolderB/FolderD/FolderE/Item1',
    '/FolderB/FolderD/FolderE/Item2',
    '/FolderA/FolderF/Item1',
    '/ItemInRoot'
];

function arrangeIntoTree(paths, cb) {
    var tree = [];

    // This example uses the underscore.js library.
    _.each(paths, function(path) {

        var pathParts = path.split('/');
        pathParts.shift(); // Remove first blank element from the parts array.

        var currentLevel = tree; // initialize currentLevel to root

        _.each(pathParts, function(part) {

            // check to see if the path already exists.
            var existingPath = _.findWhere(currentLevel, {
                name: part
            });

            if (existingPath) {
                // The path to this item was already in the tree, so don't add it again.
                // Set the current level to this path's children
                currentLevel = existingPath.children;
            } else {
                var newPart = {
                    name: part,
                    children: [],
                }

                currentLevel.push(newPart);
                currentLevel = newPart.children;
            }
        });
    });

    cb(tree);
}

arrangeIntoTree(paths, function(tree) {
    console.log('tree: ', tree);
});
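If pulling in underscore.js is not an option, the same idea can be sketched with built-in array methods. This is an adaptation rather than the code I found: arrangeIntoTreePlain is a hypothetical name, Array.prototype.find stands in for _.findWhere, and the tree is returned directly instead of being passed to a callback.

```javascript
// Sketch without underscore.js: Array.prototype.find replaces _.findWhere
function arrangeIntoTreePlain(paths) {
    var tree = [];
    paths.forEach(function (path) {
        // filter(Boolean) drops the empty parts produced by a leading or trailing "/"
        var pathParts = path.split("/").filter(Boolean);
        var currentLevel = tree; // start at the root for every path
        pathParts.forEach(function (part) {
            var existing = currentLevel.find(function (node) {
                return node.name === part;
            });
            if (!existing) {
                // First time this part is seen at this level: create its node
                existing = { name: part, children: [] };
                currentLevel.push(existing);
            }
            currentLevel = existing.children; // descend one level
        });
    });
    return tree;
}

var demo = arrangeIntoTreePlain(["/FolderA/Item1", "/FolderA/FolderB/Item2", "/ItemInRoot"]);
console.log(JSON.stringify(demo, null, 2));
```

As in the original, files and folders are both represented as { name, children } nodes; a file is simply a node whose children array stays empty.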

I also needed to display the data in an interactive tree. I used angular-tree-control, which accepts exactly this format.

Gal Margalit answered Oct 19 '22 10:10