Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to select the first 10 columns of a headerless csv file using PowerShell?

Tags:

powershell

csv

I have a CSV file called test.csv ($testCSV). There are many columns in this file but I would simply like to select the first 10 columns and put these 10 columns in to another CSV file. Please note that I DO NOT HAVE ANY COLUMN HEADERS so can not select columns based on a column name.

The below line of code will get the first 10 ROWS of the file:

$first10Rows = Get-Content $testCSV | select -First 10

However I need all the data for the first 10 COLUMNS and I am struggling to find a solution.

I have also had a look at splitting the file and attempting to return the first column as follows:

$split = ( Get-Content $testCSV) -split ','
$FirstColumn = $split[0]

I had hoped the $split[0] would return the entire first column but it only returns the very first field in the file.

Any help in solving this problem is very much appreciated.

Thanks in advance.

******UPDATE******

I am using the method as answered below by vonPryz to solve this problem, i.e.:

Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV  | Select a,b

However I am now also trying to import the CSV file only where column b is not null by adding this extra bit of code:

Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV  | Select a,b | where b -notmatch $null

I need to do this to speed up the script as there are tens of thousands of lines where column b is null and I do not need to import these lines. However, the above code returns no data, either meaning the code must be wrong or it thinks the field b is not null. An example of 2 lines of the text file is:

1,2,3
x,,z

And I only want the line(s) where the second column is occupied.

I hope I've explained that well and again, any help is appreciated.

*******************ANSWER********************

Import-Csv -Delimiter "," -Header @("a","b","c") -Path $testCSV | Select a,b | Where-Object { $_.b -ne '' }

Thanks!

like image 346
Johnathan Avatar asked Dec 07 '25 09:12

Johnathan


2 Answers

Lack of column headers is no problem. The cmdlet Import-CSV can specify headers with -Header switch. Assuming test data is saved as C:\temp\headerless.csv and contains

val11,val12,val13,val14
val21,val22,val23,val24
val31,val32,val33,val34

Importing it as CSV is trivial:

Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv

#Output
a     b     c     d
-     -     -     -
val11 val12 val13 val14
val21 val22 val23 val24
val31 val32 val33 val34

Selecting just columns a and b is not hard either:

Import-Csv -Delimiter "," -Header @("a","b","c","d") -Path C:\temp\headerless.csv | select a,b | ft -auto

#Output    
a     b
-     -
val11 val12
val21 val22
val31 val32
like image 110
vonPryz Avatar answered Dec 10 '25 02:12

vonPryz


To start I want to mention that vonPryz's answer is a superb way of dealing with this. I just wanted to chime in about what you were trying to do and why it was not working.

You had the right idea. You were splitting the data on commas. However you were not doing this on every line. Just the file as a whole which was the source of your woes.

Get-Content $testCSV | ForEach-Object{
    $split = $_ -split ","
    $FirstColumn = $split[0]
}

That would split each line individually and then you could have populated the $FirstColumn variable.

like image 31
Matt Avatar answered Dec 10 '25 03:12

Matt



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!