I have a file in Unicode format on a Windows machine. Is there any way to convert it to ASCII on the same machine using a Perl script?
It's UTF-16 with a BOM.
If you want to convert Unicode to ASCII, be aware that some characters can't be converted because they simply don't exist in ASCII. If you can live with that, you can try this:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use open IN => ':encoding(UTF-16)';
use open OUT => ':encoding(ascii)';
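# Slurp the whole input file; it is decoded from UTF-16 while reading.
my $buffer;
open(my $ifh, '<', 'utf16bom.txt');
read($ifh, $buffer, -s $ifh);
close($ifh);
# Write the text back out; it is encoded as ASCII while writing.
open(my $ofh, '>', 'ascii.txt');
print($ofh $buffer);
close($ofh);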
If you do not have autodie, just remove that line. You should then add an explicit error check to each open/close statement, for example:
open(...) or die "error: $!\n";
If the input contains characters that can't be converted, you will get warnings on the console and your output file will contain escape sequences such as
\x{00e4}\x{00f6}\x{00fc}\x{00df}
in it. BTW: If the file doesn't have a BOM but you know it is big-endian (or little-endian), you can change the encoding line to
use open IN => ':encoding(UTF-16BE)';
or
use open IN => ':encoding(UTF-16LE)';
Hope it works under Windows as well. I can't give it a try right now.
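If the default \x{....} escapes are not what you want, the Encode module lets you choose what happens to characters that have no ASCII equivalent. Here is a minimal sketch, assuming the text is already decoded into a Perl string; the sample string and the <U+XXXX> replacement format are made up for illustration:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample text with characters that do not exist in ASCII.
my $text = "Gr\x{00fc}\x{00df}e aus K\x{00f6}ln";

# No CHECK argument: each unmappable character is replaced by the
# encoding's substitution character.
my $with_substitute = encode('ascii', $text);

# FB_PERLQQ produces the \x{....} escapes mentioned above.
# LEAVE_SRC keeps encode() from modifying $text in place.
my $with_escapes =
    encode('ascii', $text, Encode::FB_PERLQQ() | Encode::LEAVE_SRC());

# A code reference is called with the code point of each unmappable
# character and returns the replacement text (format chosen for this example).
my $with_custom = encode('ascii', $text, sub { sprintf '<U+%04X>', shift });

print "$with_substitute\n";
print "$with_escapes\n";
print "$with_custom\n";
```

Which variant fits depends on whether you would rather lose the characters quietly or keep a visible marker of what was there.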
Take a look at the encoding option of Perl's open function. You can specify the encoding when opening a file for reading or writing.
Something like this should work:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
# UTF-16BE assumes big-endian input without a BOM;
# use ':encoding(UTF-16)' instead if the file starts with a BOM.
open(my $utf16_fh, '<:encoding(UTF-16BE)', 'test.utf16.txt');
open(my $ascii_fh, '>:encoding(ASCII)', 'test.ascii.txt');
# Copy line by line; each line is decoded on input and re-encoded on output.
while (my $line = <$utf16_fh>) {
    print $ascii_fh $line;
}
close $utf16_fh;
close $ascii_fh;
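One thing to watch for on Windows: the default :crlf layer sits below :encoding(...), so it sees the raw UTF-16 bytes rather than the decoded characters, and line endings may not be translated the way you expect. A commonly suggested workaround is to start from :raw and push :crlf on top of the encoding layer. The sketch below has not been tried on Windows; the file names are hypothetical and live in a temporary directory, and it first writes a small UTF-16 sample (with BOM) so that it runs on its own:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical file names inside a throwaway directory.
my $dir = tempdir(CLEANUP => 1);
my $in  = File::Spec->catfile($dir, 'sample.utf16.txt');
my $out = File::Spec->catfile($dir, 'sample.ascii.txt');

# Create a UTF-16 sample with a BOM and Windows (CRLF) line endings.
open(my $mk, '>:raw:encoding(UTF-16)', $in);
print $mk "first line\r\nsecond line\r\n";
close $mk;

# :raw             start without the platform's default :crlf layer
# :encoding(...)   decode UTF-16, using the BOM to pick the byte order
# :crlf            translate CRLF to "\n" on the decoded characters
open(my $ifh, '<:raw:encoding(UTF-16):crlf', $in);

# ASCII is a single-byte encoding, so the default layers are fine for output.
open(my $ofh, '>:encoding(ascii)', $out);

while (my $line = <$ifh>) {
    print $ofh $line;
}

close $ifh;
close $ofh;
```

If the file has no BOM, replace UTF-16 with UTF-16LE or UTF-16BE in the input layer, as described in the other answer.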