
Ruby: How to determine if file being read is binary or text

Tags: file-io, ruby

I am writing a program in Ruby that searches for strings in the text files within a directory, similar to grep.

I don't want it to search binary files, but I can't find a way in Ruby to determine whether a file is binary or text.

The program needs to work on both Windows and Linux.

If anyone could point me in the right direction that would be great.

Thanks,

Xanthalas

Xanthalas asked Mar 01 '10 12:03




2 Answers

libmagic is a library that detects file types. For this solution I assume that all MIME types starting with text/ represent text files; everything else is a binary file. This assumption is not correct for all MIME types (e.g. application/x-latex, application/json), but libmagic detects these as text/plain.

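require "filemagic" # from the ruby-filemagic gem; needs libmagic installed

# Returns true unless libmagic reports a text/* MIME type for the file.
def binary?(filename)
  # Create the handle before `begin` so `ensure` never calls close on nil.
  fm = FileMagic.new(FileMagic::MAGIC_MIME)
  begin
    !fm.file(filename).start_with?("text/")
  ensure
    fm.close # always release the libmagic handle
  end
end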

johannes answered Sep 22 '22 01:09
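
The answer above points out that a few text formats carry MIME types outside text/. If libmagic ever did report one of those, the check could be widened with a small allow-list. This is only a sketch: the helper name `text_mime?` and the constant `TEXT_LIKE_MIME_TYPES` are hypothetical, the list holds just the two types mentioned in the answer, and it assumes the MIME string may carry a `; charset=...` suffix.

```ruby
# Hypothetical allow-list: non-"text/" MIME types that still hold text.
# Only the two examples named in the answer are included here.
TEXT_LIKE_MIME_TYPES = %w[application/json application/x-latex].freeze

# Returns true when a MIME string such as "text/plain; charset=us-ascii"
# should be treated as text.
def text_mime?(mime)
  # Drop any "; charset=..." parameter and normalise case and whitespace.
  type = mime.to_s.split(";").first.to_s.strip.downcase
  type.start_with?("text/") || TEXT_LIKE_MIME_TYPES.include?(type)
end
```

With this in place, `binary?` could return `!text_mime?(fm.file(filename))` instead of matching the prefix directly.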


# Install the gem first, from a shell: gem install ptools
require 'ptools'
File.binary?(file) # file is the path to check

quadruplebucky answered Sep 21 '22 01:09
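
For a grep-like tool that should run on both Windows and Linux without extra gems or native libraries, a rough pure-Ruby heuristic may be enough. The sketch below is not taken from either answer: it assumes that a NUL byte in the first few kilobytes marks a file as binary. The name `looks_binary?` and the 4096-byte sample size are arbitrary choices, and UTF-16 text (which contains NUL bytes) would be misreported as binary.

```ruby
# Number of leading bytes to inspect; an arbitrary choice for this sketch.
SAMPLE_SIZE = 4096

# Heuristic: a file looks binary if its first SAMPLE_SIZE bytes contain a NUL.
def looks_binary?(path)
  # binread opens in binary mode, so Windows does no newline translation.
  sample = File.binread(path, SAMPLE_SIZE) || ""
  sample.include?("\x00")
end
```

A search loop could then skip any path for which `looks_binary?(path)` is true before scanning its lines.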