I've been trying to use the AWS CLI to download all files from a sub-folder in AWS; however, after the first few files download, it fails to download the rest. I believe this is because it adds an extension to the filename, which it then sees as an invalid file path.
I'm using the following command:
aws s3 cp s3://my_bucket/sub_folder /tmp/ --recursive
It gives me the following error for almost all of the files in the sub-folder:
[Errno 22] Invalid argument: 'C:\\tmp\\2019-08-15T16:15:02.tif.deDBF2C2'
I think this is because of the .deDBF2C2 extension it appends to files while downloading, though I don't know why it does that. In the actual bucket, the filenames all end with .tif.
Does anyone know what causes this?
Update: The command worked once I executed it from a Linux machine. It seems to be specific to Windows.
This is an oversight by AWS: it uses Windows reserved characters in log file names! When you execute the command, it creates all the directories, but any logs with :: in the name fail to download.
Issue is discussed here: https://github.com/aws/aws-cli/issues/4543
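To see why the copy fails, note that : is a reserved character in Windows file names (along with < > " | ? and *). A minimal repro in plain Python, assuming a Windows machine with a C:\tmp folder, shows the same error the CLI reports:

try:
    # The ':' characters in the timestamp make this an invalid NTFS file name.
    with open(r"C:\tmp\2019-08-15T16:15:02.tif.deDBF2C2", "wb") as f:
        f.write(b"")
except OSError as e:
    print(e)  # [Errno 22] Invalid argument: 'C:\\tmp\\2019-08-15T16:15:02.tif.deDBF2C2'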
So, frustrated, I came up with a workaround: execute a "dryrun", which prints the expected log output, and redirect that to a text file, e.g.:
>aws s3 cp s3://config-bucket-7XXXXXXXXXXX3 c:\temp --recursive --dryrun > c:\temp\aScriptToDownloadFilesAndReplaceNames.txt
The output file is filled with these aws log entries, which we can turn into aws script commands:
(dryrun) download: s3://config-bucket-7XXXXXXXXXXX3/AWSLogs/7XXXXXXXXXXX3/Config/ap-southeast-2/2019/10/1/ConfigHistory/7XXXXXXXXXXX3_Config_ap-southeast-2_ConfigHistory_AWS::RDS::DBInstance_20191001T103223Z_20191001T103223Z_1.json.gz to \AWSLogs\7XXXXXXXXXXX3\Config\ap-southeast-2\2019\10\1\ConfigHistory\7XXXXXXXXXXX3_Config_ap-southeast-2_ConfigHistory_AWS::RDS::DBInstance_20191001T103223Z_20191001T103223Z_1.json.gz
In Notepad++ or another text editor, replace the (dryrun) download: prefix with aws s3 cp.
You will then see lines consisting of the command aws s3 cp, the bucket file, and the local file path, separated by the word to. We need to remove the :: in the local file path, on the right-hand side of the to:
aws s3 cp s3://config-bucket-7XXXXXXXXXXX3/AWSLogs/7XXXXXXXXXXX3/Config/ap-southeast-2/2019/10/1/ConfigHistory/7XXXXXXXXXXX3_Config_ap-southeast-2_ConfigHistory_AWS::RDS::DBInstance_20191001T103223Z_20191001T103223Z_1.json.gz to AWSLogs\7XXXXXXXXXXX3\Config\ap-southeast-2\2019\10\1\ConfigHistory\7XXXXXXXXXXX3_Config_ap-southeast-2_ConfigHistory_AWS::RDS::DBInstance_20191001T103223Z_20191001T103223Z_1.json.gz
We can replace the :: with - in the local paths only (not in the S3 bucket paths) by using the regex (.*):: with the replacement $1-. Because (.*) is greedy, each pass matches only the last occurrence of :: on a line, so the S3 path on the left is left untouched. Each local path contains two ::, so click 'Replace All' twice.
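If you want to sanity-check that the greedy regex really does leave the S3 key alone, here is a small sketch of the same two-pass substitution in Python (the sample line is shortened for readability):

import re

# Shortened sample line: S3 key on the left, local path on the right.
line = (r"aws s3 cp s3://bucket/AWS::RDS::DBInstance_1.json.gz "
        r"to AWSLogs\AWS::RDS::DBInstance_1.json.gz")

# (.*) is greedy, so each pass rewrites only the LAST :: on the line.
# The local path holds two ::, hence two passes; the S3 key is untouched.
for _ in range(2):
    line = re.sub(r"(.*)::", r"\1-", line)

print(line)
# aws s3 cp s3://bucket/AWS::RDS::DBInstance_1.json.gz to AWSLogs\AWS-RDS-DBInstance_1.json.gz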
Next, remove the word to (it should be replaced with nothing).
FIND: json.gz to AWSLogs
REPLACE: json.gz AWSLogs
Finally, select all the lines and copy/paste them into a command prompt to download all the files with reserved characters in their names!
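If you'd rather not do the find-and-replace steps by hand, the whole transformation can be scripted. A minimal sketch in Python, assuming the dryrun output file from above (the output .bat name is just an example):

# Hypothetical file names; adjust to your paths.
src = r"c:\temp\aScriptToDownloadFilesAndReplaceNames.txt"
dst = r"c:\temp\downloadCommands.bat"

with open(src) as fin, open(dst, "w") as fout:
    for line in fin:
        line = line.strip()
        if not line.startswith("(dryrun) download:"):
            continue
        # Turn the dryrun log entry into an aws s3 cp command.
        line = line.replace("(dryrun) download:", "aws s3 cp", 1)
        # Split on the last " to " so the S3 key on the left stays untouched.
        left, _, local = line.rpartition(" to ")
        # Replace the Windows-reserved :: in the local path only.
        fout.write(f"{left} {local.replace('::', '-')}\n")

Running the resulting .bat file downloads everything, with hyphens in place of the ::s.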
UPDATE:
If you have WSL (Windows Subsystem for Linux), you should be able to download the files and then do a simple file rename, replacing the ::s, before copying them to the mounted Windows file system.
I tried it from my Raspberry Pi and it worked. It seems to be an issue with Windows only.
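Another option, if Python is available on the Windows box, is to skip the CLI entirely and download with boto3, sanitizing each key into a Windows-safe local name as you go. A sketch under those assumptions (bucket name and prefix are placeholders):

import os
import re
import boto3

bucket = "my_bucket"    # placeholder
prefix = "sub_folder/"  # placeholder
dest = r"C:\tmp"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):  # skip folder placeholder objects
            continue
        # Replace Windows-reserved characters in each path component.
        parts = [re.sub(r'[<>:"|?*]', "-", p) for p in key.split("/")]
        local = os.path.join(dest, *parts)
        os.makedirs(os.path.dirname(local), exist_ok=True)
        s3.download_file(bucket, key, local)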