
How to read the contents of a CSV file inside a zip file using PowerShell

I have a zip file which contains several CSV files inside it. How do I read the contents of those CSV files without extracting the zip files using PowerShell?

I have been using the Read-Archive cmdlet, which is included in the PowerShell Community Extensions (PSCX).

This is what I have tried so far.

$path = "$env:USERPROFILE\Downloads\"
$fullpath = Join-Path $path filename.zip

Read-Archive $fullpath | Foreach-Object {
    Get-Content $_.Name
}

But when I run the code, I get this error message: Get-Content : An object at the specified path filename.csv does not exist, or has been filtered by the -Include or -Exclude parameter.

However, when I run Read-Archive $fullpath on its own, it lists all the files inside the zip file.

Ishan asked Jun 01 '16 06:06


2 Answers

There are multiple ways of achieving this:

1. Here's an example using the Ionic.Zip DLL (DotNetZip):

clear
Add-Type -Path "E:\sw\NuGet\Packages\DotNetZip.1.9.7\lib\net20\Ionic.Zip.dll"
$zip = [Ionic.Zip.ZipFile]::Read("E:\E.zip")

$file = $zip | where-object { $_.FileName -eq "XMLSchema1.xsd"}

$stream = new-object IO.MemoryStream
$file.Extract($stream)
$stream.Position = 0

$reader = New-Object IO.StreamReader($stream)
$text = $reader.ReadToEnd()
$text

$reader.Close()
$stream.Close()
$zip.Dispose()

It picks the file by name (XMLSchema1.xsd) and extracts it into a memory stream. You then read the memory stream into whatever form you need (a string in my example).

2. In PowerShell 5, you can use Expand-Archive, see: https://technet.microsoft.com/en-us/library/dn841359.aspx?f=255&MSPPError=-2147217396

It extracts the entire archive into a folder:

Expand-Archive "E:\E.zip" "e:\t"

Keep in mind that extracting the entire archive takes time, and you then have to clean up the temporary files.

3. Another way, which extracts just one file to a folder using the Shell.Application COM object:

$shell = new-object -com shell.application
$zip = $shell.NameSpace("E:\E.zip")
$file =  $zip.items() | Where-Object { $_.Name -eq "XMLSchema1.xsd"}
$shell.Namespace("E:\t").copyhere($file)

4. Another way, using only the .NET Framework (System.IO.Compression), which reads the file without writing it to disk:

Add-Type -assembly "system.io.compression.filesystem"
$zip = [io.compression.zipfile]::OpenRead("e:\E.zip")
$file = $zip.Entries | where-object { $_.Name -eq "XMLSchema1.xsd"}
$stream = $file.Open()

$reader = New-Object IO.StreamReader($stream)
$text = $reader.ReadToEnd()
$text

$reader.Close()
$stream.Close()
$zip.Dispose()
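
Since the question is about CSV files specifically, approach 4 can be extended to parse every CSV entry in the archive without writing anything to disk. The following is only a sketch under a few assumptions: the function name Get-CsvFromZip is made up for illustration, the CSV files are assumed to have a header row, and the default StreamReader encoding (UTF-8) is assumed to suit the data. Pass a full path, because .NET resolves relative paths against the process working directory rather than the current PowerShell location.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper (not part of any module): emits one object per CSV entry
# in the archive, carrying the entry's path and its parsed rows.
function Get-CsvFromZip {
    param([Parameter(Mandatory)][string]$ZipPath)

    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        foreach ($entry in $zip.Entries) {
            # Skip anything that is not a CSV file (the match is case-insensitive).
            if ($entry.Name -notlike '*.csv') { continue }

            $reader = New-Object System.IO.StreamReader($entry.Open())
            try {
                # Assumes the first line of each file is the header row.
                $rows = @($reader.ReadToEnd() | ConvertFrom-Csv)
            }
            finally {
                # Disposing the reader also closes the underlying entry stream.
                $reader.Dispose()
            }

            [pscustomobject]@{ Name = $entry.FullName; Rows = $rows }
        }
    }
    finally {
        $zip.Dispose()
    }
}

# Possible usage with the path from the question:
# $fullpath = Join-Path "$env:USERPROFILE\Downloads" 'filename.zip'
# Get-CsvFromZip $fullpath | ForEach-Object { $_.Name; $_.Rows | Format-Table }
```

Each entry is read fully into memory before parsing, which is fine for small files; for very large CSVs, reading line by line from the StreamReader may be a better fit.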
Andrey Marchuk answered Oct 18 '22 20:10


Based on Andrey's solution 4, I propose the following function:

(Keep in mind that the ZipFile class is only available from .NET Framework 4.5 onward.)

Add-Type -assembly "System.IO.Compression.FileSystem"

function Read-FileInZip($ZipFilePath, $FilePathInZip) {
    try {
        if (![System.IO.File]::Exists($ZipFilePath)) {
            throw "Zip file ""$ZipFilePath"" not found."
        }

        $Zip = [System.IO.Compression.ZipFile]::OpenRead($ZipFilePath)
        $ZipEntries = [array]($Zip.Entries | where-object {
                return $_.FullName -eq $FilePathInZip
            });
        if (!$ZipEntries -or $ZipEntries.Length -lt 1) {
            throw "File ""$FilePathInZip"" couldn't be found in zip ""$ZipFilePath""."
        }
        if ($ZipEntries.Length -gt 1) {
            throw "More than one file ""$FilePathInZip"" found in zip ""$ZipFilePath""."
        }

        $ZipStream = $ZipEntries[0].Open()

        $Reader = [System.IO.StreamReader]::new($ZipStream)
        return $Reader.ReadToEnd()
    }
    finally {
        if ($Reader) { $Reader.Dispose() }
        if ($Zip) { $Zip.Dispose() }
    }
}
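
The function above compares $FilePathInZip with each entry's FullName, so the argument has to match the path exactly as it is stored in the archive (usually with forward slashes, for example data/file.csv, though that depends on the tool that created the zip). To check what to pass, the entry names can be listed first. This is a small sketch; the helper name Get-ZipEntryName is hypothetical.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: returns the FullName of every entry in the archive,
# i.e. the exact strings that a FullName comparison would match against.
function Get-ZipEntryName {
    param([Parameter(Mandatory)][string]$ZipPath)

    $zip = [System.IO.Compression.ZipFile]::OpenRead($ZipPath)
    try {
        $zip.Entries | ForEach-Object { $_.FullName }
    }
    finally {
        $zip.Dispose()
    }
}

# Possible usage (the path is an example only):
# Get-ZipEntryName 'E:\E.zip'
```

One of the returned names can then be passed as $FilePathInZip, and the resulting text piped to ConvertFrom-Csv if the file is a CSV.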
Kino101 answered Oct 18 '22 18:10