I have some subtitle files in UTF-8. Occasionally these files contain stray multibyte characters, which cause problems in some applications. How can I check on Linux whether a given file contains any multibyte characters and, if possible, locate them?

Check if file contains multibyte character

2 Answers

You can use the file command, which reports whether a file is plain ASCII or UTF-8 text:

chalet16$ echo test > a.txt
chalet16$ echo testก > b.txt  # ก is a Thai character
chalet16$ file *.txt
a.txt: ASCII text
b.txt: UTF-8 Unicode text
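
The file command only says whether non-ASCII text is present. To locate the offending lines, one possible approach is to search for bytes outside the ASCII range, since every multibyte UTF-8 character consists only of bytes in 0x80-0xFF. The following is a minimal sketch, not part of the original answer; the function names count_non_ascii and find_non_ascii are made up for illustration, and it assumes a grep and tr that work bytewise under LC_ALL=C.

```shell
# Count the bytes outside the ASCII range (0x00-0x7F) in a file.
# A result of 0 means the file is plain ASCII.
# (count_non_ascii is a hypothetical helper name.)
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c
}

# Print "line-number:line" for every line that contains a non-ASCII byte.
# In the C locale, [:print:] and [:cntrl:] together cover exactly the
# ASCII range, so the negated bracket expression matches bytes 0x80-0xFF.
# (find_non_ascii is a hypothetical helper name.)
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:cntrl:]]' "$1"
}
```

With the two files from the listing above, count_non_ascii a.txt should report 0, while find_non_ascii b.txt should print the first line of b.txt prefixed with its line number.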


answered Sep 24 '22 13:09

chalet16

You can use the file or chardet command.
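
For checking many subtitle files at once, file can print just the detected encoding. The sketch below is an illustration rather than part of the original answer: list_non_ascii_files is a hypothetical helper name, and it assumes a file implementation that supports the -b and --mime-encoding options and reports us-ascii for ASCII-only text.

```shell
# Print the name and detected encoding of every given file that is
# not plain ASCII. Relies on file(1) reporting "us-ascii" for
# ASCII-only text. (list_non_ascii_files is a hypothetical helper name.)
list_non_ascii_files() {
    for f in "$@"; do
        enc=$(file -b --mime-encoding "$f")
        [ "$enc" = "us-ascii" ] || printf '%s: %s\n' "$f" "$enc"
    done
}
```

Calling list_non_ascii_files *.txt in a directory of subtitle files would then list only the files worth inspecting further.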

answered Sep 23 '22 13:09

kev


Tags: linux, multibyte

Masroor
