
Reading srt (subtitle) files with Python3

I want to be able to read an SRT file with Python 3.

These files can be found here: http://www.opensubtitles.org/

With info here: http://en.wikipedia.org/wiki/SubRip

SubRip supports any encoding: ASCII or Unicode, for example.

If I understand correctly, I need to specify which decoder to use when I call Python's read function. Am I right that I need to know how each file is encoded in order to choose one? If so, how do I establish that for each file when I have a hundred such files from different sources and with different language support?

Ultimately, I would prefer to convert all the files to UTF-8 to start with. But some of them might be in some obscure encoding, for all I know.

Please help,

Barry

Baz asked Jun 12 '26 15:06


1 Answer

You could use the chardet package to detect the encoding. (charade was a Python 3-compatible fork of chardet that has since been merged back into it.)
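A minimal sketch of how that suggestion could be applied to the question's goal of converting every file to UTF-8. It assumes the third-party chardet package is installed (`pip install chardet`). The helper names `detect_encoding` and `convert_to_utf8`, the fallback list of candidate encodings, and the `subtitles` directory are hypothetical choices for illustration. Detection is a statistical guess, so it can be wrong, especially for short files.

```python
from pathlib import Path

try:
    import chardet  # third-party detector: pip install chardet
except ImportError:
    chardet = None  # the fallback below is used instead

# Assumption: candidates to try, in order, when no detector result is available.
FALLBACK_ENCODINGS = ("utf-8-sig", "cp1252")


def detect_encoding(raw: bytes) -> str:
    """Return a best-guess encoding name for the given bytes."""
    if chardet is not None:
        # chardet.detect returns a dict with 'encoding' and 'confidence' keys;
        # 'encoding' can be None when no guess could be made.
        guess = chardet.detect(raw)
        if guess["encoding"]:
            return guess["encoding"]
    for candidate in FALLBACK_ENCODINGS:
        try:
            raw.decode(candidate)
            return candidate
        except UnicodeDecodeError:
            continue
    return "latin-1"  # maps every byte to a character, so decoding never fails


def convert_to_utf8(src: Path, dst: Path) -> str:
    """Re-encode src as UTF-8 into dst and return the encoding that was assumed."""
    raw = src.read_bytes()
    encoding = detect_encoding(raw)
    # errors="replace" keeps going if the guess does not fit every byte.
    text = raw.decode(encoding, errors="replace")
    # Writing bytes preserves the file's original line endings.
    dst.write_bytes(text.encode("utf-8"))
    return encoding


if __name__ == "__main__":
    # Hypothetical layout: convert every .srt file found in ./subtitles.
    for path in Path("subtitles").glob("*.srt"):
        out = path.with_name(path.stem + ".utf8.srt")
        print(path.name, "->", convert_to_utf8(path, out))
```

Once the files are in UTF-8, each one can be read with `open(path, encoding="utf-8")`. It is worth spot-checking a few converted files by eye, since a wrong guess produces readable-looking but incorrect characters rather than an error.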

Thomas answered Jun 14 '26 03:06


