I have a fixed width file with records in a format as follows
DDEDM2018890 19960730015000010000
DDETPL015000 20150515015005010000
DDETPL015010 20150515015003010000
DDETPL015020 20150515015002010000
DDETPL015030 20150515015005010000
DDETPL015040 20150515015000010000
the first 3 characters identify the record type, in the above example all records are of type DDE but there are also lines of a different type in the file.
the following regular expression with named capture groups parses the relevant information from each record for my purpose (notice it also filters down to DDE record types:
DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})
play with this regex on this excellent online parser
I have written a script that uses the Get-Content, ForEach-Object and Select-Object cmdlets to convert the fixed width file into a csv file.
I wonder if I could replace the Get-Content and ForEach-Object cmdlets by a single Select-String cmdlet?
#this powershell script reads fixed width file and generates a csv file of the relevant & converted values
#Prepare HashSet object for Select-Object to convert CategoryCode and append with CategoryId
$Category = @{
Name = "Category"
Expression = {
$cat = switch($_.CategoryCode)
{
"50"{"A"}
"54"{"C"}
"60"{"F"}
"66"{"I"}
"74"{"M"}
"88"{"T"}
}
$cat+$_.CategoryId
}
}
gc "C:\Path\To\File.txt" | % {
if($_ -match "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3}).*$")
{
#$matches is a hashset of named capture groups, convert to object to allow Select-Object to handle hashset elements as object properties
[PSCustomObject]$matches
}
} | select Database, $Category, Length #| export-csv "AnalysisLengths.csv" -NoTypeInformation
Before I finalized the script, I was trying to use the Select-String cmdlet but could not figure out how to use it, I believe it can achieve the same result in a more eloquent way... this is what I had:
##Could this be completed with just the Select-String commandlet instead of Get-Content+ForEach+Select-Object?
Select-String -Path "C:\Path\To\File.txt" `
-Pattern "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})" `
| Select-Object -ExpandProperty Matches
Using -ExpandProperty should convert the Microsoft.PowerShell.Commands.MatchInfo Matches property into the actual System.Text.RegularExpressions.Match objects for each line...
see also Powershell Select-Object vs ForEach on Select-String results
Here is one way (I'am not so proud of it)
Select-String -Path "C:\Path\To\File.txt" -Pattern "DDE(?<Database>\w{3})\d{2}(?<CategoryCode>\d{2})(?<CategoryId>\d{1})\d\s+\d{8}\d{3}(?<Length>\d{3})" | %{New-Object -TypeName PSObject -Property @{Database=$_.matches.groups[1];CategoryCode=$_.matches.groups[2];CategoryId=$_.matches.groups[3];Length=$_.matches.groups[4]}} | export-csv "C:\Path\To\File.csv"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With