What language should I use for file and string manipulation?
This might seem subjective, but I don't think it really is. There's a lot to say about this. For example, it seems clear to me that for most uses Perl would be a more obvious candidate than Java. I need to do this quite often and currently use C# for it, but I would like a more script-like language for the job.
I can imagine Perl would be a candidate, but I would like to do it in PowerShell, since PowerShell can easily access the .NET library. Or is Python a better candidate? If I have to learn a new language, Python is certainly one on my list, rather than Perl.
What I want to do, for example, is read a file, make some changes and save it again. E.g.: open it, number all lines (say with 3 digits) and close it. Any example, in any language, would be welcome, but the shorter the better. It is utility scripting I'm after here, not OO, TDD-developed, unit-tested stuff, of course.
What I would very much like to see is something like this (pseudocode here):
open foobar as f
foreach line in f.lines
    line.addBefore(currentIteratorCounter.format('ddd') + '. ')
close f
So:
bar.txt
Frank Zappa
Cowboy Henk
Tom Waits
numberLines bar.txt
bar.txt
001. Frank Zappa
002. Cowboy Henk
003. Tom Waits
UPDATE:
The Perl and Python examples here are great, and definitely along the lines of what I was hoping for and expecting. But aren't there any PowerShell guys out there?
This is actually pretty easy in PowerShell:
function Number-Lines($name) {
    # The first script block runs once and initialises the counter; the second runs for every line.
    Get-Content $name | ForEach-Object { $i = 1 } { "{0:000}. {1}" -f $i++,$_ }
}
What I'm doing here is getting the contents of the file, which returns a String[], over which I iterate with ForEach-Object and apply a format string using the -f operator. The result just drops out of the pipeline as another String[], which can be redirected to a file if needed.
You can shorten it a little by using aliases:
gc .\someFile.txt | %{$i=1}{ "{0:000}. {1}" -f $i++,$_ }
but I wouldn't recommend that for a function definition.
You may want to consider using two passes, though, and constructing the format string on the fly to accommodate larger numbers of lines. If there are 1500 lines, {0:000} won't be sufficient anymore to get neatly aligned output.
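Since the question also asks about Python, here is a minimal sketch of the two-pass idea in that language. It is not a translation of the PowerShell function above; the name number_lines and the choice to never go below three digits are assumptions made for illustration.

```python
def number_lines(lines):
    """Prefix each line with a zero-padded counter wide enough for the last line."""
    lines = list(lines)                   # first pass: we need the total count
    width = max(3, len(str(len(lines))))  # assumption: at least 3 digits, more if needed
    # second pass: build the prefix with a width computed on the fly
    return [f"{i:0{width}d}. {line}" for i, line in enumerate(lines, start=1)]
```

With the three names from the question this still produces 001., 002., 003.; with 1500 lines the prefixes grow to four digits, so the output stays aligned.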
As for which language is best for such tasks, you might look at factors such as
In the light of the last point you might even be better off using cmd for this task. The code is similarly pretty simple:
@echo off
setlocal
set line=1
for /f "delims=" %%l in (%1) do call :process %%l
endlocal
goto :eof
:process
call :lz %line%
echo %lz%. %*
set /a line+=1
goto :eof
:lz
if %1 LSS 10 set lz=00%1&goto :eof
if %1 LSS 100 set lz=0%1&goto :eof
set lz=%1&goto :eof
goto :eof
That assumes, of course, that it has to run somewhere other than your own machine. If not, then use whatever fits your needs :-)
perl -i -ne 'printf("00%d. %s",$.,$_)' your-filename-here
You may want %03d instead of 00%d, so that line 10 and beyond are still padded to exactly three digits.
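For comparison, a rough Python counterpart to the perl -i one-liner, using the standard fileinput module to rewrite the file in place with the %03d-style padding. The function name number_file_in_place is made up for this sketch, and the width is fixed at three digits as in the question.

```python
import fileinput

def number_file_in_place(path):
    """Rewrite the file at `path`, prefixing each line with a 3-digit counter."""
    with fileinput.input(files=[path], inplace=True) as f:
        for line in f:
            # With inplace=True, print() writes into the file instead of the console.
            print(f"{f.lineno():03d}. {line}", end="")
```

Calling number_file_in_place("bar.txt") should turn the example file from the question into the numbered version. Like perl -i without a backup suffix, this overwrites the original, so try it on a copy first.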