
Speed up my batch file parsing

I have a batch file that takes its input from a text file that looks like this:

Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.


Server name lak-print01
Printer name Microsoft XPS Document Writer
Share name 
Driver name Microsoft XPS Document Writer
Port name XPSPort:
Comment 
Location 
Print processor WinPrint
Data type RAW
Parameters 
Attributes 64
Priority 1
Default priority 1
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

Server name lak-print01
Printer name 4250_Q1
Share name 4250_Q1
Driver name Canon iR5055/iR5065 PCL5e
Port name IP_192.168.202.84
Comment Audit Department in Lakewood Operations
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters 
Attributes 10826
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

Server name lak-print01
Printer name 3130_Q1
Share name 3130_Q1
Driver name Canon iR1020/1024/1025 PCL5e
Port name IP_192.168.202.11
Comment Canon iR1025 
Location Operations Center
Print processor WinPrint
Data type RAW
Parameters 
Attributes 10824
Priority 1
Default priority 0
Average pages per minute 0
Printer status Idle 
Extended printer status Unknown 
Detected error state Unknown 
Extended detected error state Unknown 

It parses the file to pull certain fields out of each block (server name, printer name, driver name, and so on) and then writes each block as its own comma-delimited row. The result has one row per block of text and one column per field. Some of these text files have 100+ entries, and each file I try to parse takes 5-10 minutes.

The parsing code is as follows:

:Parselak-print01
SETLOCAL enabledelayedexpansion
:: remove variables starting $
FOR  /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
(FOR /f "delims=" %%a IN (lak-print01.txt) DO CALL :analyse "%%a")>lak-print01.csv
attrib +h lak-print01.csv
GOTO :EOF

:analyse
SET "line=%~1"
SET /a fieldnum=0
FOR %%s IN ("Server name" "Printer name" "Driver name"
            "Port name" "Location" "Comment" "Printer status"
            "Extended detected error state") DO CALL :setfield %%~s
GOTO :eof

:setfield
SET /a fieldnum+=1
SET "linem=!line:*%* =!"
SET "linet=%* %linem%"
IF "%linet%" neq "%line%" GOTO :EOF 
IF "%linem%"=="%line%" GOTO :EOF
SET "$%fieldnum%=%linem%"
IF NOT DEFINED $8 GOTO :EOF 
SET "line="
FOR /l %%q IN (1,1,7) DO SET "line=!line!,!$%%q!"
ECHO !line:~1!
:: remove variables starting $
FOR  /F "delims==" %%a In ('set $ 2^>Nul') DO SET "%%a="
GOTO :eof

The output I get is:

lak-print01,Microsoft XPS Document Writer,Microsoft XPS Document Writer,XPSPort:,,,Idle 
lak-print01,4250_Q1,Canon iR5055/iR5065 PCL5e,IP_192.168.202.84,Operations Center,Audit Department in Lakewood Operations,Idle 
lak-print01,3130_Q1,Canon iR1020/1024/1025 PCL5e,IP_192.168.202.11,Operations Center,Canon iR1025 ,Idle 
lak-print01,1106_TRN,HP LaserJet P2050 Series PCL6,IP_172.16.10.97,Monroe,HP P2055DN,Idle 
lak-print01,1101_TRN,HP LaserJet P2050 Series PCL6,IP_10.3.3.22,Burlington,Training Room printer,Idle 
lak-print01,1096_Q3,Canon iR1020/1024/1025 PCL5e,IP_192.168.96.248,Silverdale,Canon iR 1025,Idle 
lak-print01,1096_Q2,Kyocera Mita KM-5035 KX,IP_192.168.96.13,Silverdale,Kyocera CS-5035 all in one,Idle 
lak-print01,1096_Q1,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.96.12,Silverdale,HP 4015,Idle 
lak-print01,1095_Q3,HP LaserJet P4010_P4510 Series PCL 6,IP_192.168.95.247,Sequim,HP LaserJet 4015x,Idle 

Everything is perfect and the code works as intended, but it's just super freaking slow!

How do I speed this up? The problem is that there is no true delimiter and the number of tokens in each label varies. For instance, the value for Comment starts at token 2, but the value for Printer name starts at token 3.
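To illustrate why the varying token count does not have to matter: if each line is matched against the known labels as whole prefixes, the value is simply whatever follows the label, however many words the label has. The following is a minimal sketch in Python rather than batch (a swapped-in language, not the script above). The name `split_field` is hypothetical, and the label list is copied from the `FOR %%s IN (...)` list in the parse code.

```python
# Labels copied from the field list in the batch script above.
LABELS = [
    "Server name", "Printer name", "Driver name", "Port name",
    "Location", "Comment", "Printer status",
    "Extended detected error state",
]


def split_field(line, labels=LABELS):
    """Return (label, value) if the line starts with a known label, else None.

    The label is matched as a whole prefix, so it does not matter whether
    the label is one word ("Comment") or several ("Printer name").
    """
    for label in labels:
        if line == label or line.startswith(label + " "):
            # Everything after the label is the value; surrounding
            # whitespace is stripped (the batch output keeps it).
            return label, line[len(label):].strip()
    return None
```

Because the match is anchored at the start of the line, `Extended printer status Unknown` is not mistaken for `Printer status`, and lines such as `Share name 4250_Q1` are ignored.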

Any help speeding up the parsing would be appreciated. The program works perfectly; it is just very slow at that step.

Alkemdah asked Dec 02 '25 21:12



1 Answer

If speed is what you need, I'd suggest Marpa, a general BNF parser, in Perl (code, output).

It takes some time to get used to, but it does the job and gives you a very powerful tool that is easy to use. Note how closely the grammar resembles the input.
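For comparison, the same "describe the input once, then parse in a single pass" idea can be sketched without a parser generator. The following is a minimal Python sketch, not Marpa and not the Perl code linked above. The names `COLUMNS`, `END_OF_BLOCK`, `blocks_to_rows` and `convert` are hypothetical. The column order follows the CSV shown in the question, and it assumes, as the batch script does, that `Extended detected error state` is the last line of every block.

```python
import csv
import io
import re

# Column order of the CSV in the question.
COLUMNS = ["Server name", "Printer name", "Driver name", "Port name",
           "Location", "Comment", "Printer status"]
# Assumption: this label is the last line of every printer block.
END_OF_BLOCK = "Extended detected error state"

# One alternation of all labels, anchored at the start of the line,
# followed by an optional " value" part.
LINE_RE = re.compile(
    r"^(%s)(?: (.*))?$"
    % "|".join(re.escape(label) for label in COLUMNS + [END_OF_BLOCK])
)


def blocks_to_rows(lines):
    """Yield one list of column values per printer block, in a single pass."""
    record = {}
    for raw in lines:
        match = LINE_RE.match(raw.rstrip("\r\n"))
        if not match:
            continue  # banner lines, blank lines, and unwanted fields
        label = match.group(1)
        value = (match.group(2) or "").strip()
        if label == END_OF_BLOCK:
            # Block finished: emit the row and start a fresh record.
            yield [record.get(column, "") for column in COLUMNS]
            record = {}
        else:
            record[label] = value


def convert(lines):
    """Return the CSV text for an iterable of input lines."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(blocks_to_rows(lines))
    return out.getvalue()
```

Passing an open file works directly, for example `convert(open("lak-print01.txt"))`, since a file object iterates over its lines. Each line is examined once and no subprocess or subroutine call is made per line. Two differences from the batch output are worth knowing: trailing spaces are stripped from values, and `csv.writer` quotes any value that itself contains a comma.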

Hope this helps.

rns answered Dec 05 '25 19:12
