Handling files with carriage return in filename on Windows

I have an external USB, NTFS-formatted hard drive which contains many files which I need to eventually copy to a drive on a Windows Server 2008 R2 machine.

The files on the drive were placed there by scripts run with the drive mounted on Solaris. The user who did this copy was careless and edited their copy script on a Windows machine, resulting in shell script lines such as:

cp /sourceDir/sourceFileName /externalDrivePath/targetFileName\r\n

and as such, the files on the external drive have a trailing carriage return in their filenames. Standard Windows copy utilities (copy, xcopy, robocopy) fail to copy these files with error 0x7B / 123 : "The filename, directory name, or volume label syntax is incorrect."

I have tested, and am fairly sure that if I had the drive mounted again on a Linux box, I should be able to repair the files with commands such as:

mv /externalDrivePath/targetFileName\r /externalDrivePath/targetFileName\n
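
For a large number of files, the same repair could be scripted rather than typed one `mv` at a time. Below is a minimal sketch in Python, assuming the drive is mounted on a POSIX system (Linux, Solaris) where a carriage return is a legal filename character. The function name `strip_trailing_cr` and the mount point in the usage note are hypothetical, not something from the original scripts.

```python
import os


def strip_trailing_cr(root):
    """Rename every file or directory under `root` whose name ends in '\\r'.

    Returns a list of (old_path, new_path) pairs for the entries renamed.
    """
    renamed = []
    # Walk bottom-up so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            if not name.endswith("\r"):
                continue
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, name.rstrip("\r"))
            # Skip instead of overwriting if the clean name is already taken.
            if os.path.lexists(new_path):
                continue
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

Usage would be something like `strip_trailing_cr("/externalDrivePath")`. Entries whose clean name already exists are left alone, so they can be reviewed by hand.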

However, I do not have immediate access to a Linux machine.

What I have tried so far to repair/move these files:

"Application" solutions on Windows Server 2008 R2:

  1. Renaming files in Windows Explorer -- this would be infeasible given the sheer volume of files, but it doesn't work anyway.
  2. Wildcard pattern matching the filenames from cmd prompt, e.g. copy E:\externalDrivePath\targetFileName* anotherPath. Fails with 0x7B error.
  3. Copying files from cmd prompt using 8.3 (short) filenames. Files in question do not have short names, per output of dir /x

"Programming" solutions on Windows Server 2008 R2:

  1. Copying/Renaming files using Python/Java: any attempt to open/copy the carriage-return file throws exception tracing back to the same 0x7B Windows error.
  2. Copying files using Windows C 'CopyFile' API: fails with 0x7B error. Here I found the files using FindNextFile API, and passed that source path into CopyFile, but the OS still fails to copy the file.
  3. Writing my own file copy function in C using fopen, ofstream, etc. The fopen call again fails with 0x7B.
  4. Copying files using C++ boost::filesystem APIs: fails with 0x7B error. Again, found the files using a boost::filesystem::directory_iterator and passed the found file's path to boost::filesystem::copy_file()
  5. Providing the file path to the Win32 APIs CopyFile / MoveFile as "\\?\E:\externalDrivePath\targetFileName\r". The calls fail again with the 0x7B error.

I also dabbled with mounting this drive on an OS X machine to run the copy, expecting it would provide support for the NTFS drive more like Solaris did. However, it fails to copy with similar error messages to Windows -- I guess OS X's NTFS implementation is more "Windows-like"?

If this is solvable on Windows, I feel like it's going to require either a very low-level C function that manipulates the FILE itself, without 'opening' it based on its string filename (I'm not sure how to go about that), or some file repair utility I'm unaware of that already incorporates this functionality.

Any alternative approaches or suggestions how to implement what I'm describing would be most appreciated.

cuberoot8 asked Nov 10 '22 13:11


1 Answer

TLDR: Try CreateFileW with a unicode path prefixed with \\?\ and containing the trailing carriage return.

The \\?\ path syntax bypasses much of the usual validation, Unicode expansion, etc., and allows long file paths and even (dangerously) characters like slashes inside a filename.

Given that, I'd imagine a carriage return should be fairly trivial to handle...
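
To make the suggestion concrete, here is a rough sketch in Python using `ctypes` to call `CreateFileW` with a `\\?\`-prefixed wide-character path. The helper names `to_extended_path` and `open_for_read` are made up for illustration. This is untested against a real file with a carriage return in its name, so treat it as a way to try the idea rather than a confirmed fix; the driver may still reject the name with error 123.

```python
import ctypes
import sys

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\

# Win32 constants from the CreateFileW documentation.
GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x80


def to_extended_path(abs_path):
    """Return abs_path with the \\\\?\\ prefix, leaving any trailing '\\r' intact."""
    if abs_path.startswith(EXTENDED_PREFIX):
        return abs_path
    # The prefix disables normalization, so forward slashes must be converted.
    return EXTENDED_PREFIX + abs_path.replace("/", "\\")


def open_for_read(abs_path):
    """Try to open abs_path via CreateFileW; returns a raw HANDLE value."""
    if sys.platform != "win32":
        raise OSError("CreateFileW is only available on Windows")
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE,
    ]
    kernel32.CreateFileW.restype = wintypes.HANDLE

    handle = kernel32.CreateFileW(
        to_extended_path(abs_path), GENERIC_READ, FILE_SHARE_READ,
        None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None,
    )
    # INVALID_HANDLE_VALUE is (HANDLE)-1; a NULL handle comes back as None.
    if handle is None or handle == ctypes.c_void_p(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())
    return handle
```

A call would look like `open_for_read("E:\\externalDrivePath\\targetFileName\r")`. If that returns a handle, the file can then be read through handle-based APIs such as `ReadFile` and closed with `CloseHandle`, without passing the name to the usual copy utilities.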

This page relating to long filenames has more details. Relevant parts quoted below

There is no need to perform any Unicode normalization on path and file name strings for use by the Windows file I/O API functions because the file system treats path and file names as an opaque sequence of WCHARs. Any normalization that your application requires should be performed with this in mind, external of any calls to related Windows file I/O API functions.

When using an API to create a directory, the specified path cannot be so long that you cannot append an 8.3 file name (that is, the directory name cannot exceed MAX_PATH minus 12). The shell and the file system have different requirements. It is possible to create a path with the Windows API that the shell user interface is not able to interpret properly.

And from here

On newer file systems, such as NTFS, exFAT, UDFS, and FAT32, Windows stores the long file names on disk in Unicode, which means that the original long file name is always preserved. This is true even if a long file name contains extended characters and regardless of the code page that is active during a disk read or write operation. The case of the file name is preserved, even when the file system is not case-sensitive ...

Basic answered Nov 14 '22 21:11