Can you help me list the browsers from this file, http://techpatterns.com/downloads/firefox/useragentswitcher.xml, in a txt file, with the columns separated by a %tab% delimiter?
There should be 3 or 4 columns:
1) folder description from example data: <folder description="Browsers - Windows">
2) browser type from example data: <folder description="Legacy Browsers">
3) user agent from example data:<useragent description="Avant Browser 1.2" useragent="Avant Browser/1.2.789rel1 (http://www.avantbrowser.com)" app
Here I see the first problem: some browsers aren't in a folder such as <folder description="Legacy Browsers">, but come after a <separator/>.
So the 1st column should define the system, the second the type, and the third the browser.
The next problem is that the Devices folder contains one more folder.
@echo off
Setlocal EnableDelayedExpansion
SET file=useragentswitcher.xml
SET delim="
FOR /F "tokens=* skip=1" %%F IN (!file!) DO (
    REM echo %%F
    call :parse "%%F" > temp.txt
    FOR /F "tokens=1,2,3,4,5,6,7 skip=1 delims=" %%A IN (temp.txt) DO (
        IF "%%A"=="folder" (
            SET /A level=!level!+1
            echo Level:!level!
        ) ELSE IF "%%A"=="/folder" (
            SET /A level=!level!-1
            echo Level:!level!
        )
        echo A:%%A
    )
    pause
)
exit /b
:parse
Setlocal EnableDelayedExpansion
SET A=%*
REM Remove the surrounding double quotes and <>
SET A=!A:~2,-2!
REM Replace double quotes
SET A=!A:"=µ!
FOR /F "tokens=1,2 delims=µ=" %%A IN ("!A!") DO (
    SET first=%%A
    SET second=%%B
    echo !first!
    FOR /F "tokens=1,2 delims= " %%A IN ("!first!") DO (
        echo %%A
        echo %%B
    )
    echo !second!
)
endlocal
exit /b
This parses one tag of the line and I am going to work with it now.
It seems you ought to be able to find a much better tool than batch to parse XML...
But I believe the code below is what you are looking for.
Because the number of folders varies, I swapped the order of the columns in the output. I put the browser description first, followed by the folders, one per column. This allows the definition of each column to be fixed.
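As a rough illustration of that column layout (not part of the batch solution), here is a minimal Python sketch. The helper name `make_row` and the second browser entry are made up for the example; only "Avant Browser 1.2" and the two folder names come from the question.

```python
def make_row(description, folders):
    """Put the browser description first, then one column per enclosing folder."""
    return "\t".join([description] + list(folders))


# Hypothetical sample rows: the folder depth differs, but column 1 is always
# the browser description, column 2 the outermost folder, and so on.
rows = [
    make_row("Avant Browser 1.2", ["Browsers - Windows", "Legacy Browsers"]),
    make_row("Example Browser", ["Browsers - Windows"]),
]

for row in rows:
    print(row)
```

Rows with fewer folders simply end earlier, so no column ever changes meaning.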
I used the info in jeb's answer to include " as a FOR delimiter.
EDIT - I simplified the code
Note - This first attempt was written to work with a copy of the XML that was retrieved using Internet Explorer. I've since discovered that IE altered the format of the file. This code is highly dependent on the exact format of the file, so it will not work on the original XML. It also serves as an example of why batch is a poor choice for parsing XML.
@echo off
setlocal enableDelayedExpansion
::Define the files to use - change as needed
set input="test.xml"
set output="result.txt"
::The assignment below should have exactly one TAB character between = and "
set "TAB=	"
set cnt=0
set "folder0="
>%output% (
  for /f usebackq^ tokens^=1^,2^ delims^=^=^" %%A in (%input%) do (
    for %%N in (!cnt!) do (
      if "%%A"=="- <folder description" (
        set /a cnt+=1
        for %%M in (!cnt!) do set "folder%%M=!folder%%N!%TAB%%%B"
      )
      if "%%A"==" </folder>" (
        set /a cnt-=1
      )
      if "%%A"==" <useragent description" (
        echo %%B!folder%%N!
      )
    )
  )
)
The code will fail if ! appears in any of the descriptions because delayed expansion will corrupt expansion of any FOR variable that contains !. I checked, and your file does not contain ! in any description.
The code could be modified to handle ! in the description, but it would get more complicated. It requires toggling of delayed expansion on and off, and preservation of variable values across the ENDLOCAL barrier.
The above code is highly dependent on the format of the XML. It will fail if the non-standard dashes are removed, or if the white space arrangement changes.
The following variation is a bit more robust, but it still requires that each line contains exactly one XML tag.
@echo off
setlocal enableDelayedExpansion
::Define the files to use - change as needed
set input="test.xml"
set output="result.txt"
::The assignment below should have exactly one TAB character between = and "
set "TAB=	"
set cnt=0
set "folder0="
>%output% (
  for /f usebackq^ tokens^=1^,2^ delims^=^=^" %%A in (%input%) do (
    for %%N in (!cnt!) do (
      set "test=%%A"
      if "!test:<folder description=!" neq "!test!" (
        set /a cnt+=1
        for %%M in (!cnt!) do set "folder%%M=!folder%%N!%TAB%%%B"
      )
      if "!test:</folder>=!" neq "!test!" (
        set /a cnt-=1
      )
      if "!test:<useragent description=!" neq "!test!" (
        echo %%B!folder%%N!
      )
    )
  )
)
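The bookkeeping in the batch code above (a depth counter plus one accumulated folder string per depth) may be easier to follow in a language with lists. The following is only a Python sketch of the same line-by-line idea, under the same assumption of one XML tag per line. The function name `list_agents` and the sample lines are hypothetical, and the regular expression only approximates how FOR /F splits on = and ".

```python
import re


def list_agents(lines):
    """Yield 'description<TAB>folder...' for each useragent line.

    Like the batch version, this assumes each line holds exactly one XML tag.
    """
    prefixes = [""]  # prefixes[n] plays the role of folder<n> in the batch code
    for line in lines:
        # Split on runs of = and ", roughly like FOR /F with those delimiters.
        tokens = [t for t in re.split(r'[="]+', line) if t]
        if not tokens:
            continue
        tag = tokens[0]
        value = tokens[1] if len(tokens) > 1 else ""
        if "<folder description" in tag:
            # Entering a folder: extend the parent's prefix by one column.
            prefixes.append(prefixes[-1] + "\t" + value)
        elif "</folder>" in tag:
            # Leaving a folder: drop back to the parent's prefix.
            if len(prefixes) > 1:
                prefixes.pop()
        elif "<useragent description" in tag:
            yield value + prefixes[-1]


# Hand-written stand-in for the real file; only the Avant Browser values
# and the folder names come from the question.
sample = [
    '<useragentswitcher>',
    '  <folder description="Browsers - Windows">',
    '    <useragent description="Avant Browser 1.2" useragent="Avant Browser/1.2.789rel1 (http://www.avantbrowser.com)"/>',
    '    <folder description="Legacy Browsers">',
    '      <useragent description="Example Legacy Browser" useragent="Example/0.1"/>',
    '    </folder>',
    '  </folder>',
    '</useragentswitcher>',
]

for row in list_agents(sample):
    print(row)
```

Because the test is a substring match rather than an exact comparison, leading white space on a line does not matter, which is the same robustness gain the batch variation gets from its search-and-replace test.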
EDIT - One last version
Here is a version that can handle ! in the data. I've added an additional column to the output. The first column is still the browser description. The 2nd column is the useragent string. The remaining columns are the folders. The solution uses the delayed expansion toggling technique. It also uses an additional FOR /F to preserve a variable value across the ENDLOCAL barrier.
@echo off
setlocal disableDelayedExpansion
::Define the files to use - change as needed
set input="test.xml"
set output="result.txt"
::The assignment below should have exactly one TAB character between = and "
set "TAB=	"
set cnt=0
set folder0=""
>%output% (
  for /f usebackq^ tokens^=1-4^ delims^=^=^" %%A in (%input%) do (
    set "test=%%A"
    set "desc=%%B"
    set "agent=%%D"
    setlocal enableDelayedExpansion
    for %%N in (!cnt!) do (
      if "!test:<folder description=!" neq "!test!" (
        set /a cnt+=1
        for %%M in (!cnt!) do for /f "delims=" %%E in ("!folder%%N!") do (
          endlocal
          set "folder%%M=%%~E%TAB%%%B"
          set "cnt=%%M"
        )
      ) else if "!test:</folder>=!" neq "!test!" (
        endlocal
        set /a cnt-=1
      ) else if "!test:<useragent description=!" neq "!test!" (
        echo !desc!%TAB%!agent!!folder%%N!
        endlocal
      ) else endlocal
    )
  )
)
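For comparison with the remark that a better tool than batch should exist for this job, here is a minimal sketch using Python's standard `xml.etree.ElementTree` parser. It produces the same column order as the last batch version (description, useragent string, then folders). The function names `walk` and `xml_to_tsv`, the root element name, and the sample document are assumptions made for the illustration; the real file may differ.

```python
import xml.etree.ElementTree as ET


def walk(element, folders, rows):
    """Recursively collect [description, useragent, folder...] rows."""
    for child in element:
        if child.tag == "folder":
            # Descend with this folder's description appended to the path.
            walk(child, folders + [child.get("description", "")], rows)
        elif child.tag == "useragent":
            rows.append(
                [child.get("description", ""), child.get("useragent", "")] + folders
            )
        # Any other element, such as <separator/>, is ignored.
    return rows


def xml_to_tsv(xml_text):
    """Return one tab-separated line per useragent element."""
    root = ET.fromstring(xml_text)
    return "\n".join("\t".join(row) for row in walk(root, [], []))


# Hand-written stand-in for the real file (assumed structure).
SAMPLE = """<useragentswitcher>
  <folder description="Browsers - Windows">
    <useragent description="Avant Browser 1.2" useragent="Avant Browser/1.2.789rel1 (http://www.avantbrowser.com)"/>
    <separator/>
    <folder description="Legacy Browsers">
      <useragent description="Example Legacy Browser" useragent="Example/0.1"/>
    </folder>
  </folder>
</useragentswitcher>"""

print(xml_to_tsv(SAMPLE))
```

A real parser does not care about white space, line breaks, or ! characters, and it handles any nesting depth. To read from disk instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.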