
How to tesseract multiple files in the same folder from command prompt?

I know how to Tesseract multiple files in the same directory using Terminal on OS X.

for i in *.tif; do tesseract "$i" outtext; done

Does anyone have suggestions for how to do this on the Command Prompt on a computer running Windows?

Thomas Padilla asked Jul 28 '15 15:07



2 Answers

What is the Windows equivalent of the Unix for i command?

Without knowing exactly what the tesseract command does on Unix compared to Windows, it is difficult to give a comprehensive answer.

On Windows you can use the for command to perform a command on several files.

From a command line:

for %i in (*.tif) do tesseract "%i" outtext

In a batch file:

for %%i in (*.tif) do tesseract "%%i" outtext

Further Reading

  • An A-Z Index of the Windows CMD command line - An excellent reference for all things Windows cmd line related.
  • for - Conditionally perform a command on several files.
DavidPostill answered Oct 25 '22 16:10


In the above example:

for %%i in (*.tif) do tesseract "%%i" outtext

Tesseract overwrites the same output file, outtext.txt, on each iteration, so you end up with a single file containing only the text from the last image. Each output file needs a unique name. You could replace the string outtext with %%i as shown below. Tesseract appends .txt to the output name, so image.tif produces image.tif.txt; use %%~ni as the output name instead if you want image.txt.

for %%i in (*.tif) do tesseract "%%i" "%%i"

However, if you want a different output file name, you can assign an additional variable using the set command. Then increment this variable for each iteration.

set /a j=1
for %%i in (*.tif) do (
    tesseract "%%i" output_file%j%
    set /a j+=1
)

However, %j% will expand to 1 on every iteration, so you end up with a single file named output_file1.txt. Windows expands %j% once, when it parses the whole parenthesized block before the loop starts, and reuses that value for each iteration. Running setlocal enabledelayedexpansion and replacing %j% with !j! forces Windows to expand !j! each time the line executes. Issue a matching endlocal afterwards to restore the previous environment settings.

setlocal enabledelayedexpansion
set /a j=1
rem With delayed expansion on, !j! is re-evaluated on every iteration.
for %%i in (*.tif) do (
    tesseract "%%i" output_file!j!
    set /a j+=1
)
endlocal

I tested this successfully on Microsoft Windows 7 Home Premium edition. I hope it helps you.
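
For comparison, the Terminal loop from the question overwrites outtext.txt in the same way. A minimal POSIX shell sketch of the two naming schemes above might look like this. The function names ocr_by_name and ocr_numbered are made up for illustration, and the sketch assumes tesseract is on the PATH and appends .txt to the output name it is given:

```shell
# Sketch only: ocr_by_name and ocr_numbered are hypothetical helper names.
# Both assume `tesseract INPUT OUTPUTBASE` writes OUTPUTBASE.txt.

# Name each output after its image: scan.tif -> scan.txt
ocr_by_name() {
  local i
  for i in *.tif; do
    [ -e "$i" ] || continue        # no match: skip the literal "*.tif"
    tesseract "$i" "${i%.tif}"     # strip the .tif suffix for the output name
  done
}

# Number the outputs instead: output_file1.txt, output_file2.txt, ...
ocr_numbered() {
  local i
  local j=1
  for i in *.tif; do
    [ -e "$i" ] || continue
    tesseract "$i" "output_file$j"
    j=$((j + 1))                   # $j is re-read on every pass
  done
}
```

Run either function from inside the folder that holds the .tif files. The shell re-evaluates $j on every pass, so nothing like delayed expansion is needed there.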

Joe W. answered Oct 25 '22 16:10