
How to rename a file with non-ASCII character encoding to ASCII

I have a file named "abc枚.xlsx", which contains a non-ASCII character, and I'd like to strip all non-ASCII characters so that it is renamed to "abc.xlsx".

Here is what I've tried:

import os
import string
os.chdir(src_dir)  #src_dir is a path to my directory that contains the odd file
for file_name in os.listdir(): 
    new_file_name = ''.join(c for c in file_name if c in string.printable)
    os.rename(file_name, new_file_name)

The following error results at os.rename():

builtins.WindowsError: (2, 'The system cannot find the file specified')

This is on a Windows system; sys.getfilesystemencoding() returns mbcs, if that helps.

How can I get around this error and rename the file?

Vijchti asked Jul 25 '13 22:07


1 Answer

Pass full paths to os.rename instead of relying on os.chdir. This works with Python 2.7 as well:

import os
import string

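# src_dir is the directory that contains the file to rename
for file_name in os.listdir(src_dir):
    # Keep only the characters found in string.printable (all of them ASCII)
    new_file_name = ''.join(c for c in file_name if c in string.printable)
    # Use full paths so the rename does not depend on the current directory
    os.rename(os.path.join(src_dir, file_name), os.path.join(src_dir, new_file_name))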
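
For anyone who wants something a bit more defensive, here is a minimal sketch of the same idea wrapped in a function. The names ascii_only, rename_to_ascii and ALLOWED are made up for this example, and the whitelist of allowed characters is only an assumption (string.printable also lets through tabs, newlines and characters such as ? and * that Windows rejects in file names). It skips names that need no change and never overwrites an existing file.

```python
import os
import string

# Hypothetical whitelist: ASCII letters, digits and a few symbols that are
# safe in file names. Adjust to taste; this is an assumption, not a standard.
ALLOWED = set(string.ascii_letters + string.digits + "._- ()")


def ascii_only(name):
    """Return name with every character outside ALLOWED dropped."""
    return "".join(c for c in name if c in ALLOWED)


def rename_to_ascii(src_dir):
    """Rename entries in src_dir whose names contain disallowed characters.

    Returns a list of (old_name, new_name) pairs for the renames performed.
    """
    renamed = []
    for old_name in os.listdir(src_dir):
        new_name = ascii_only(old_name)
        if new_name == old_name:
            continue  # nothing to strip, leave the file alone
        old_path = os.path.join(src_dir, old_name)
        new_path = os.path.join(src_dir, new_name)
        if not new_name or os.path.exists(new_path):
            continue  # skip empty results and never clobber an existing file
        os.rename(old_path, new_path)
        renamed.append((old_name, new_name))
    return renamed
```

Calling rename_to_ascii(src_dir) on the directory from the question should turn "abc枚.xlsx" into "abc.xlsx" and leave every other file untouched. On Python 2, os.listdir only returns unicode names when it is given a unicode path, so pass src_dir as a unicode string there.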

Cheers! Don't forget to up-vote if you find this answer useful! ;)

Simanas answered Sep 28 '22 06:09