Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I multiply all values in a csv using Powershell or Batch?

I have a load of personal ID numbers, which I need to download automatically. Rather than store thousands of unique personal IDs on my machine, I want to multiply all the IDs by a random integer, save the result, and delete the original file.

This will end up in Excel, but VBA is not an option because of org policy. I can't install anything - limited to what comes with Windows. As far as I understand, Excel's power query isn't able to delete a source file after reading it, but I could be wrong - if so that could be another direction to go in.

Sample of csv (the ids are not real)

136983,96771,216453,211521,589046,513320,195746,404556,190103,312261
,,,,,,,,,
,,,,,,,,,
,,,,,,,,,
413213,122904,135502,282999,460212,343970,433526,530707,,
,,,,,,,,,
,,,,,,,,,
216327,121969,165339,538695,440435,175431,396049,360505,160665,375980
,,,,,,,,,
251625,327979,155758,231189,385458,342295,365599,100644,312974,385772
,,,,,,,,,
398438,114618,216349,356734,504029,256328,466559,534053,144303,211043
306499,515840,90297,434565,,,,,,

Features

  • It could be any number of rows, probably between a couple of hundred and ten thousand.
  • Many of the rows are blank. Blank rows need to be preserved, because another column will need to be imported to go next to id (a blank row indicates no people correspond to this value from the imported column)
  • Many of the rows have less than 10 id numbers - less than the maximum, in other words.

Desired output

Let's say that the random integer was 2, this would produce:

273966,193542,432906,423042,1178092,1026640,391492,809112,380206,624522
,,,,,,,,,
,,,,,,,,,
,,,,,,,,,
826426,245808,271004,565998,920424,687940,867052,1061414,,
,,,,,,,,,
,,,,,,,,,
432654,243938,330678,1077390,880870,350862,792098,721010,321330,751960
,,,,,,,,,
503250,655958,311516,462378,770916,684590,731198,201288,625948,771544
,,,,,,,,,
796876,229236,432698,713468,1008058,512656,933118,1068106,288606,422086
612998,1031680,180594,869130,,,,,,

What I've tried

This answer looks promising, but it only deals with one column. I couldn't figure how to adapt it so that it accepts any number of columns.

@echo off
setlocal enabledelayedexpansion
FOR /F "tokens=1-18* delims=," %%A IN (mycsv.csv) DO (
set sum1="%%~C"
set /a cole=!sum1! * 2
echo %%~A,%%~B,%%~C,%%~D,!cole!
) >> output.csv

I don't really understand this Powershell based answer.

$csv = Import-Csv mycsv.csv
foreach ($row in $csv) {
  [int]$row.B *= -1
  [int]$row.F *= -1
}
$csv | Export-Csv output.csv

If I try to use it, it returns

powershell : Exception setting "B": "The property 'B' cannot be found on this object. Verify that the property exists and can be 
At line:1 char:1
+ powershell -ep Bypass .\t.ps1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (Exception setti...sts and can be :String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
 
set."
At C:\Users\user\OneDrive\Personal\t.ps1:3 char:3
+   [int]$row.B *= -1
+   ~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], SetValueInvocationException
    + FullyQualifiedErrorId : ExceptionWhenSetting
 
Exception setting "F": "The property 'F' cannot be found on this object. Verify that the property exists and can be 
set."
At C:\Users\user\OneDrive\Personal\t.ps1:4 char:3
+   [int]$row.F *= -1
+   ~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], SetValueInvocationException
    + FullyQualifiedErrorId : ExceptionWhenSetting
 
Exception setting "B": "The property 'B' cannot be found on this object. Verify that the property exists and can be 
set."
At C:\Users\user\OneDrive\Personal\t.ps1:3 char:3
+   [int]$row.B *= -1
+   ~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], SetValueInvocationException
    + FullyQualifiedErrorId : ExceptionWhenSetting
 
Exception setting "F": "The property 'F' cannot be found on this object. Verify that the property exists and can be 
set."
At C:\Users\user\OneDrive\Personal\t.ps1:4 char:3
+   [int]$row.F *= -1
+   ~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], SetValueInvocationException
    + FullyQualifiedErrorId : ExceptionWhenSetting

I assumed that B and F just referred to the rows in the csv, but if I understand the error message (I may not) they're names that have to be defined (how?) Also I need this to work across all rows which have values, and skip empty rows and empty columns within rows. So hardcoding the rows and columns to be multiplied isn't going to work for me here.

Groping around in the dark, replacing

  [int]$row.B *= -1
  [int]$row.F *= -1

with

  [int]$row *= -1

returns

powershell : Cannot convert the "@{12=18}" value of type "System.Management.Automation.PSCustomObject" to type "System.Int32".
At line:1 char:1
+ powershell -ep Bypass .\t.ps1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (Cannot convert ..."System.Int32".:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
 
At C:\Users\user\OneDrive\Personal\t.ps1:3 char:3
+   [int]$row *= -1
+   ~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : ConvertToFinalInvalidCastException
 
Cannot convert the "@{12=}" value of type "System.Management.Automation.PSCustomObject" to type "System.Int32".
At C:\Users\user\OneDrive\Personal\t.ps1:3 char:3
+   [int]$row *= -1
+   ~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (:) [], RuntimeException
    + FullyQualifiedErrorId : ConvertToFinalInvalidCastException
like image 792
Ne Mo Avatar asked Oct 18 '25 17:10

Ne Mo


2 Answers

The logic for multiplying the values on each row of your CSV can be like this, assuming the CSV has a fixed number of columns:

function MultiplyRows {
[cmdletbinding()]
param(
    [parameter(Mandatory, ValueFromPipeline)]
    [object]$InputObject,
    [int]$MultiplyBy = 2
)

    begin { $isFirstObject = $true }
    process {
        if($isFirstObject) {
            $headers = $InputObject.PSObject.Properties.Name
            $isFirstObject = $false
        }

        $out = [ordered]@{}
        foreach($prop in $headers) {
            if(-not ($thisValue = [int]$InputObject.$prop)) {
                $out[$prop] = $null
                continue
            }
            $out[$prop] = $thisValue * $MultiplyBy
        }
        [pscustomobject]$out
    }
}

Import-Csv ... | MultiplyRows -MultiplyBy 3 | ConvertTo-Csv

If the CSV has no headers, you will need to determine how many columns it has and use the -Header parameter on Import-Csv. You can use the following, which, by looking at your CSV it should work but note that it may not work for all CSVs since some of then can have embedded commas like:

[pscustomobject]@{
    ExampleCol = 'value with , comma'
    AnotherCol = 'some value'
} | ConvertTo-Csv

# Results in:
"ExampleCol","AnotherCol"
"value with , comma","some value"

Using the function from before, and the CSV provided in question this is how the code would look:

$headers = (Get-Content ./some.csv -TotalCount 1).Split(',').ForEach({
    begin { $i = 0 }
    process { "Col{0}" -f $i++ }
})

# -MultiplyBy 2 is implied (Default Param value)
Import-Csv ./some.csv -Header $headers | MultiplyRows | ConvertTo-Csv

Which results in:

"Col0","Col1","Col2","Col3","Col4","Col5","Col6","Col7","Col8","Col9"
"6931098","5009052","5009132","6931104","5012678","6931102","4635438","5012902","6931094","6931100"
,,,,,,,,,
,,,,,,,,,
,,,,,,,,,
"6931092","6931088","5012886","6931084","6931090","5013392","6931086","5013550",,
,,,,,,,,,
,,,,,,,,,
"4713760","4713558","4713656","4713888","4713830","4713834","4713856","4713566","4713700","4713562"
,,,,,,,,,
"6883024","4934280","5143272","5143534","5143624","4934294","6883028","5143296","6883026","6883022"
,,,,,,,,,
"4761264","4761164","4761258","4761172","4761226","4761074","4761202","4761272","4761078","4760942"
"6442852","6442848","6442850","6442858",,,,,,
"5142022","4973532","3409286","5142074","5142080","5142036","5142086","5141880","4973484","4973540"
"5900466","5900428","5900536","6304092","6304100","6304094","6304098","6304102","6304096","5900768"
like image 77
Santiago Squarzon Avatar answered Oct 20 '25 09:10

Santiago Squarzon


The for /F loop available in a batch-file combines multiple consecutive delimiters to one, so you cannot use it with the comma as delimiter to read a CSV file with empty cells/values/fields. However, you could use it to read full lines, together with a standard for loop to split each at commas, given that there are none of the characters ?, * and <, > (so for does not need to access the file system to derive matching file names and therefore maintains the given strings):

@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem // Define constants here:
set "_FILE=%~dp0data.csv"         & rem // (full path to target file)
set "_SEPC=,"                     & rem // (separator character)
set "_HEAD="                      & rem // (set to something if header)
set /A "_MULT=%RANDOM%%%(1<<7)+1" & rem // (multiplicator; result < 2 Gi!)
set /A "_ZPAD=0"                  & rem // (optionally zero-pad to width)

setlocal EnableDelayedExpansion
if defined _HEAD (set "LINE=") else (set "LINE=#")
rem // Read file line by line:
for /F usebackq^ delims^=^ eol^= %%J in ("!_FILE!") do (
    if defined LINE (
        set "LINE=%%J" & set "COLL=%_SEPC%"
        rem // Iterate through separated items, even blank ones:
        for %%I in ("!LINE:%_SEPC%=" "!") do (
            rem // Retain blank items in case:
            if not "%%~I"=="" (
                rem set /A "ITEM=%%~I*_MULT"
                rem // Multiply item and correctly handle zero-padded numbers:
                set "ITEM=00000000%%~I" & set "ITEM=1!ITEM:~-9!"
                set /A "ITEM%%=1000000000, ITEM*=_MULT"
                rem // Optionally zero-pad resulting number:
                if %_ZPAD% gtr 0 (
                    if "!ITEM:~,-%_ZPAD%!"=="" (
                        set "ITEM=000000000!ITEM!" & set "ITEM=!ITEM:~-10!"
                        set "ITEM=!ITEM:~-%_ZPAD%!"
                    )
                )
                rem // Append new number to new line string:
                set "COLL=!COLL!%_SEPC%!ITEM!"
            ) else set "COLL=!COLL!%_SEPC%"
        )
        rem // Return current altered line string:
        echo(!COLL:~2!
    ) else set "LINE=#" & echo(%%J
)
endlocal

endlocal
exit /B

In the top section, set variable _MULT to a multiplication factor so that the resulting values are always less than 231 − 1, because otherwise, negative values may result due to overflow.

The variable _ZPAD defines the width which the resulting products are padded to by preceding them with enough zeros. Numbers with more digits are not truncated. The maximal width is 10 digits. No padding occurs when _ZPAD is set to zero or less.

The variable _HEAD must be set to something in case the CSV file contains a headline to be kept.

Note, that this script would fail when the CSV file contains (quoted) strings and/or quoted values that contain separator characters on their own.

The script, let us call it multiply.bat, returns the resulting lines in the console. To write them to a file, use redirection (assuming the CSV file is data.csv in the current directory):

multiply.bat "data.csv" > "data_NEW.csv"

To overwrite the original CSV file, simply execute the following command line afterwards:

move /Y "data_NEW.csv" "data.csv"
like image 45
aschipfl Avatar answered Oct 20 '25 09:10

aschipfl