Explode a single file script into a project with proper directory layout

Question

The problem

Suppose that I have written a lengthy script in some language "lang", and now want to convert this single-file script into a directory tree with a project consisting of many files. I want to insert some kind of separators and file-paths into this file, and process it in some way so that in the end I obtain:

a proper project directory layout (something like this),
build-definition file,
readme's,
separate subdirectories for main/src and test/src etc.

For example, given the following script (pseudocode):

// required dependencies, should be moved
// into the build definition build.foo
require "org.foo" % "foo-core" % "1.2.3"
require "org.bar" % "bar-gui" % "3.2.1"

// A longer comment that should be converted
// into a text file and moved into a 'notes'
// subdirectory

/*
#README

Another lengthy comment that should go into
a readme.md
*/

/** A class that should 
  * go to src/main/lang/proj/A.lang
  */
class A {
  def a = "foo"
}

/** Another class
  * that should go to src/main/lang/proj/B.lang
  */
class B {
  def b = "bar"
}

/** Some tests,
  * should end up in 
  * src/test/lang/proj/MySpec.lang
  */
@Test def testFoo() {
  assert(2 + 2 == 5)
}

and assuming that I can insert arbitrary separators, commands, escape-sequences and file paths into this file, I would like to obtain the following project:

project/
|-- build.txt
|-- notes
|   `-- note_01.txt
|-- readme.md
`-- src
    |-- main
    |   `-- lang
    |       `-- proj
    |           |-- A.lang
    |           `-- B.lang
    `-- test
        `-- lang
            `-- proj
                `-- MySpec.lang

Edit:

What follows is a less-sophisticated version of my own answer below

What I've tried

Here is one naive way to do it:

Convert the original script into a bash script by prepending #!/bin/bash
Split the source code into heredocs
Insert package declarations where necessary
Add a bunch of mkdir -p and cd commands between the heredoc pieces
cat the heredoc pieces into appropriately named files
Test the script on empty directories until it works as expected

For the above script, it might look something like this:

#!/bin/bash

mkdir project
cd project

cat <<'EOF' > build.txt
// required dependencies, should be moved
// into the build definition build.foo
require "org.foo" % "foo-core" % "1.2.3"
require "org.bar" % "bar-gui" % "3.2.1"
EOF

mkdir notes
cd notes
cat <<'EOF' > note_01.txt
// A longer comment that should be converted
// into a text file and moved into a 'notes'
// subdirectory
EOF
cd ..

cat <<'EOF' > readme.md
/*
#README

Another lengthy comment that should go into
a readme.md
*/
EOF

mkdir -p src/main/lang/proj
cd src/main/lang/proj
cat <<'EOF' > A.lang
package proj

/** A class
  * that should go to src/main/lang/proj/A.lang
  */
class A {
  def a = "foo"
}
EOF

cat <<'EOF' > B.lang
package proj
/** Another class
  * that should go to src/main/lang/proj/B.lang
  */
class B {
  def b = "bar"
}
EOF
cd ../../..

mkdir -p test/lang/proj
cd test/lang/proj
cat <<'EOF' > MySpec.lang
package proj

/** Some tests,
  * should end up in 
  * src/test/lang/proj/MySpec.lang
  */
@Test def testFoo() {
  // this should end up in test
  assert(2 + 2 == 5)
}
EOF
cd ../../..

What's wrong with this approach

It does generate the correct tree, but this approach seems rather error-prone:

it's too easy to cd ../../.. to the wrong nesting level
too easy to mkdir with a wrong name, and then fail to cd into it.
There is no way to handle the entire tree construction as a single transaction, that is, if something fails later in the script, there is no simple way to clean up the mess generated before the error occurred.

I certainly could try to make it a bit less brittle by defining special functions that mkdir and cd in one go, and then wrap invocations of those functions together with cats into (mkdirAndCd d ; cat) etc.
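A minimal sketch of that idea, assuming a hypothetical helper named mkcd (it is not a standard command) and a scratch directory from mktemp so that nothing real is touched. Each parenthesised group runs in a subshell, so the cd inside it is forgotten when the group ends:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: create a directory (with parents) and enter it.
# It returns non-zero if either step fails.
mkcd() {
    mkdir -p -- "$1" && cd -- "$1"
}

# Scratch location for the demo; a real script would use its own root.
root=$(mktemp -d)

# The cd performed by mkcd only lasts until the closing parenthesis.
(
    mkcd "$root/project/src/main/lang/proj"
    cat <<'EOF' > A.lang
package proj

class A {
  def a = "foo"
}
EOF
)

(
    mkcd "$root/project/src/test/lang/proj"
    cat <<'EOF' > MySpec.lang
package proj

@Test def testFoo() {
  assert(2 + 2 == 5)
}
EOF
)

# Back in the starting directory here; no "cd ../../.." was needed.
```

This removes the cd ../../.. bookkeeping, but it does nothing about cleaning up after a failure.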

But it just doesn't feel quite right. Isn't there a much simpler way to do it? Could one somehow combine the standard bash/linux utilities into a tiny & very restricted domain specific language for generating directory trees with text files? Maybe some newer version of split where one could specify where to split and where to put the pieces?

Related questions:

mkdir and touch in single command
The reverse of tree - reconstruct file and directory structure from text file contents?

Other interesting proposals that don't seem to work:

Use tar. That would mean that one would have to convert the text file manually into a valid tar-archive. While a tar archive indeed is a single plain-text file, its internal format does not look like the most comfortable DSL for such a simple task. It was never intended to be used by humans directly in that way.
A similar argument holds for shar. Since shar uses bash itself to extract the archive, my proposal above is, in principle, a manually generated shar archive in a very uncommon format, so shar seems to share all the drawbacks of that proposal. I'd prefer something more restricted, which allows fewer things but provides more guarantees about the quality of the outcome.

Maybe I should emphasize again that I don't have a tree to begin with, so there is nothing to compress. I have only the single script file and a rough idea of what the tree should look like in the end.

LMC · Accepted Answer

It seems to me that you are trying to write a custom parser. Provided that all the blocks you mention are terminated by a blank line (two consecutive newlines), this could help you:

#!/bin/bash

gawk 'BEGIN{RS="\n\n([/][*]|[/]{2,2})"} 
        { 
        if ($0 ~ /#README/){
                system("echo -e \"\nThis is a Readme.md\n--------\n" $0 "\"")
        }else if ($0 ~ /class /){
                system("echo -e \"\nThis is a class\n---------\n/*" $0 "\"")
        }else if ($0 ~ /require /){
                system("echo -e \"\nthis is a conf\n-----------\n" $0 "\"")
        }else if($0 ~ /[/]{2,2}.*\n[/]{2,2}/){
                system("echo -e \"\nthis is a note\n-----------\n" $0 "\"")
        }

}' your_script.lang

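The key part is the record separator RS, which splits the input wherever a blank line is followed by '//' or '/*'. Instead of echo -e you could write custom scripts for each type of block. Please note that the record separator is not included in $0, so you have to add the missing characters back, as in the /class / example above. Also note that $0 is pasted into a double-quoted shell string, so any double quotes inside a block are consumed by the shell; that is why they are missing from the output below.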

The output of the above code is

this is a conf
-----------
// required dependencies, should be moved
// into the build definition build.foo
require org.foo % foo-core % 1.2.3
require org.bar % bar-gui % 3.2.1

this is a note
-----------
A longer comment that should be converted
// into a text file and moved into a 'notes'
// subdirectory

This is a Readme.md
--------

#README

Another lengthy comment that should go into
a readme.md
*/

This is a class
---------
/** A class that should 
* go to src/main/lang/proj/A.lang
*/
class A {
def a = foo
}

This is a class
---------
/** Another class
* that should go to src/main/lang/proj/B.lang
*/
class B {
def b = bar
}
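
As a hedged illustration of a per-block handler that writes files instead of echoing, here is a sketch that swaps the regex classification above for explicit marker lines. It assumes you insert lines of the hypothetical form "//> relative/path" into the script, and it uses a plain bash read loop rather than gawk. The marker syntax, the function name explode and the demo content are all assumptions, not something from the question.

```shell
#!/bin/bash
set -euo pipefail

# explode SCRIPT OUTDIR
# Split SCRIPT at marker lines of the (hypothetical) form "//> relative/path"
# and write the lines that follow each marker to OUTDIR/relative/path.
# Lines before the first marker are ignored.
explode() {
    local script=$1 outdir=$2 line rel out=""
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ $line == '//> '* ]]; then
            rel=${line#'//> '}
            # Refuse empty or absolute paths and ".." components,
            # so that nothing can be written outside OUTDIR.
            case "/$rel/" in
                //*|*/../*) echo "bad path: $rel" >&2; return 1 ;;
            esac
            out=$outdir/$rel
            mkdir -p -- "$(dirname -- "$out")"
            : > "$out"                      # start the file empty
        elif [[ -n $out ]]; then
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$script"
}

# Demo on a scratch directory.
demo=$(mktemp -d)
cat <<'EOF' > "$demo/script.lang"
//> build.txt
require "org.foo" % "foo-core" % "1.2.3"
//> src/main/lang/proj/A.lang
class A {
  def a = "foo"
}
EOF

explode "$demo/script.lang" "$demo/project"
```

Because the only thing a marker can do is name a file below the output directory, this is much more restricted than a general shell script, at the price of having to place the markers by hand.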

About your concerns:

it's too easy to cd ../../.. to the wrong nesting level -> define a variable holding the root path and cd to it.
too easy to mkdir with a wrong name, and then fail to cd into it. -> define variables for the directory names and check whether they already exist:

path1=src/main/lang/some
if [ -d "$path1" ]; then
    do_something
fi

There is no way to handle the entire tree construction as a single transaction ... -> record the path of every NEW directory/file that you create in a file, and use that list to revert if necessary.
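
The three suggestions above could be combined roughly as follows. This is only a sketch under stated assumptions: the names root, manifest, make_dir, write_file, rollback and build are hypothetical, the tree is built in a scratch directory from mktemp, and mapfile requires bash 4 or newer.

```shell
#!/bin/bash
set -u

root=$(mktemp -d)/project   # absolute root path; every other path hangs off it
manifest=$(mktemp)          # one created path per line, in creation order

# Create a directory and any missing parents, recording only the new ones.
make_dir() {
    local dir=$1
    [ -d "$dir" ] && return 0
    make_dir "$(dirname -- "$dir")" || return 1   # parents first
    mkdir -- "$dir" || return 1
    printf '%s\n' "$dir" >> "$manifest"
}

# Write stdin to a new file; refuse to overwrite, so that a rollback
# can never delete something that existed before the script ran.
write_file() {
    local file=$1
    if [ -e "$file" ]; then
        echo "refusing to overwrite $file" >&2
        return 1
    fi
    make_dir "$(dirname -- "$file")" || return 1
    printf '%s\n' "$file" >> "$manifest"   # record first: partial writes get undone too
    cat > "$file"
}

# Undo everything recorded so far, newest entry first.
rollback() {
    local paths=() i
    mapfile -t paths < "$manifest"
    for (( i = ${#paths[@]} - 1; i >= 0; i-- )); do
        if [ -d "${paths[i]}" ]; then
            rmdir -- "${paths[i]}" || echo "kept non-empty ${paths[i]}" >&2
        else
            rm -f -- "${paths[i]}"
        fi
    done
    : > "$manifest"
}

build() {
    write_file "$root/build.txt" <<'EOF' || return 1
require "org.foo" % "foo-core" % "1.2.3"
EOF
    write_file "$root/src/main/lang/proj/A.lang" <<'EOF' || return 1
class A {
  def a = "foo"
}
EOF
}

# Either the whole tree is built, or everything created so far is removed.
build || rollback
```

Since all paths are absolute and derived from root, no cd is needed at all, and a failed run leaves behind only what existed before it started.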

Tags:

bash

Andrey Tyukin
