
How to do this in PowerShell? Or: what language to use for file and string manipulation?

What language should I use for file and string manipulation?

This might seem subjective, but I don't think it really is. There's a lot to say about this. For example, I can see clearly that for most uses Perl would be a more obvious candidate than Java. I need to do this quite often, and at the moment I use C# for it, but I would like a more script-like language for the job.

I can imagine Perl would be a candidate, but I would like to do it in PowerShell, since PowerShell can easily access the .NET library. Or is Python a better candidate? If I have to learn a new language, Python is certainly on my list, more so than Perl.

What I want to do, for example, is read a file, make some changes and save it again. E.g.: open it, number all lines (say with 3 digits) and close it. Any example, in any language, would be welcome, but the shorter the better. It is utility scripting I'm after here, not OO, TDDeveloped, unit-tested stuff of course.

What I would very much like to see is something like this (pseudocode here):

open foobar.as f

foreach line in f.lines
    line.addBefore(currentIteratorCounter.format('ddd') + '. ')

close f

So:

bar.txt 

Frank Zappa
Cowboy Henk
Tom Waits

numberLines bar.txt

bar.txt 

001. Frank Zappa
002. Cowboy Henk
003. Tom Waits

UPDATE:

The Perl and Python examples here are great, and definitely along the lines of what I was hoping for and expecting. But aren't there any PowerShell guys out there?

Peter asked Dec 02 '22 07:12


2 Answers

This is actually pretty easy in PowerShell:

function Number-Lines($name) {
    Get-Content $name | ForEach-Object { $i = 1 } { "{0:000}. {1}" -f $i++,$_ }
}

Here Get-Content returns the contents of the file as a String[]. I iterate over it with ForEach-Object (the first script block initializes the counter once, the second runs for every line) and apply a format string using the -f operator. The result just drops out of the pipeline as another String[], which can be redirected to a file if needed.

You can shorten it a little by using aliases:

gc .\someFile.txt | %{$i=1}{ "{0:000}. {1}" -f $i++,$_ }

but I wouldn't recommend that for a function definition.

You may want to consider using two passes, though, and constructing the format string on the fly to accommodate larger numbers of lines. If there are 1500 lines, {0:000} won't be sufficient anymore to get neatly aligned output.
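
The two-pass idea can be sketched as follows, in Python since the question lists it as a candidate. This is only an illustration: the function name number_lines and the min_width parameter are hypothetical, not part of the PowerShell function above.

```python
def number_lines(lines, min_width=3):
    """Prefix each line with a zero-padded counter wide enough for the last line.

    `number_lines` and `min_width` are hypothetical names used for this sketch.
    """
    lines = list(lines)  # first pass: materialize the input to learn the line count
    # The width is the digit count of the highest line number, but never below min_width.
    width = max(min_width, len(str(len(lines))))
    # Second pass: format every line with the computed width.
    return [f"{i:0{width}d}. {line}" for i, line in enumerate(lines, start=1)]


if __name__ == "__main__":
    for numbered in number_lines(["Frank Zappa", "Cowboy Henk", "Tom Waits"]):
        print(numbered)
```

With three lines the width stays at 3 digits. With 1500 lines it grows to 4, so the output remains aligned.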

As for which language is best for such tasks, you might look at factors such as

  • conciseness of code (Perl will be hard to beat there, especially that one-liner in another answer)
  • readability and maintainability of code
  • availability of the tools (Perl and Python aren't installed on Windows by default, and PowerShell only since Windows 7, so deployment might be hindered)

In light of the last point, you might even be better off using cmd for this task. The code is similarly simple:

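@echo off
setlocal
rem Counter for the current line number
set line=1
rem Read the file given as the first argument, one line at a time
for /f "delims=" %%l in (%1) do call :process %%l
endlocal
goto :eof

rem Print one line, prefixed with its zero-padded number
:process
call :lz %line%
echo %lz%. %*
set /a line+=1
goto :eof

rem Pad the first argument to three digits and store it in lz
:lz
if %1 LSS 10 set lz=00%1&goto :eof
if %1 LSS 100 set lz=0%1&goto :eof
set lz=%1&goto :eof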

That assumes, of course, that it has to run somewhere other than your own machine. If not, then use whatever fits your needs :-)

Joey answered Dec 20 '22 05:12


perl -i -ne 'printf("00%d. %s",$.,$_)' your-filename-here

You may want %03d instead of 00%d, so that line numbers of 10 and above are still padded to three digits.
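
For comparison, since the question also asks about Python, here is a rough sketch of the same in-place edit using the standard fileinput module. The function name number_file_in_place is made up for this illustration. Like perl -i, it overwrites the file, so try it on a copy first.

```python
import fileinput
import sys


def number_file_in_place(path):
    """Rewrite the file at `path` so every line starts with a 3-digit counter.

    `number_file_in_place` is a hypothetical name used for this sketch.
    """
    # With inplace=True, fileinput redirects sys.stdout into the file being read,
    # so whatever is written inside the loop replaces the original content.
    with fileinput.input(files=[path], inplace=True) as f:
        for line in f:
            # f.lineno() is the 1-based number of the line just read, like Perl's $.
            sys.stdout.write(f"{f.lineno():03d}. {line}")
```

Calling number_file_in_place("bar.txt") on the three-line example from the question should leave bar.txt with the numbered lines.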

brian-brazil answered Dec 20 '22 06:12