I'm trying to write a Rust program that gets a separated list of filenames on stdin.
On Windows, I might invoke it from a cmd window with something like:
dir /b /s | findstr .*,v$ | rust-prog -n
On Unix I'd use something like:
find . -name '*,v' -print0 | rust-prog -0
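Both invocations deliver a byte stream with one separator between names (NUL for -0, a line ending for -n). As a rough sketch of the splitting step only, assuming the whole input is already in memory, something like the following could work. The function name split_entries is made up for illustration, and the sample bytes stand in for real stdin.

```rust
/// Split raw input bytes on a single separator byte, skipping empty entries.
/// `split_entries` is a hypothetical helper name, not part of any library.
fn split_entries(input: &[u8], sep: u8) -> Vec<&[u8]> {
    input
        .split(|&b| b == sep)
        .filter(|entry| !entry.is_empty())
        .collect()
}

fn main() {
    // Stand-in for bytes read from stdin, shaped like `find ... -print0` output.
    let input = b"./a,v\0./dir/b,v\0";
    for entry in split_entries(input, 0) {
        println!("{}", String::from_utf8_lossy(entry));
    }
}
```

Note that splitting Windows output on b'\n' alone would leave a trailing carriage return on each name, so that case needs extra trimming.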
I'm having trouble converting what I receive on stdin into something that can be used by std::path::Path. As I understand it, to get something that will compile on Windows or Unix, I'm going to need to use conditional compilation, and std::os::windows::ffi or std::os::unix::ffi as appropriate.
Furthermore, it seems on Windows I'll need to use kernel32::MultiByteToWideChar with the current code page to create something usable by std::os::windows::ffi::OsStrExt.
Is there an easier way to do this? Does what I'm suggesting even seem workable?
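To make the conditional-compilation idea concrete, here is a minimal sketch of what the two branches might look like. The name path_from_bytes is hypothetical. The Unix branch uses std::os::unix::ffi::OsStrExt::from_bytes; the Windows branch simply assumes the input is already UTF-8, which is an assumption made only to keep the sketch short and is not true of code-page console output.

```rust
use std::path::PathBuf;

/// Unix: paths are arbitrary bytes, so no decoding is needed.
#[cfg(unix)]
fn path_from_bytes(bytes: &[u8]) -> Option<PathBuf> {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    Some(PathBuf::from(OsStr::from_bytes(bytes)))
}

/// Windows: ASSUMPTION for this sketch only: the input is already UTF-8.
/// Input in a console code page would first need converting to UTF-16.
#[cfg(windows)]
fn path_from_bytes(bytes: &[u8]) -> Option<PathBuf> {
    std::str::from_utf8(bytes).ok().map(PathBuf::from)
}

fn main() {
    // 0xbf on its own is not valid UTF-8, but it is a legal Unix filename byte.
    let raw = b"\xbf.txt";
    match path_from_bytes(raw) {
        Some(path) => println!("{:?}", path),
        None => println!("could not decode {:?}", raw),
    }
}
```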
As an example, it's easy to convert a string to a path, so I tried to use the string handling functions of stdin:
use std::io;

fn main() {
    let mut buffer = String::new();
    match io::stdin().read_line(&mut buffer) {
        Ok(_) => println!("{}", buffer),
        Err(error) => println!("error: {}", error),
    }
}
On Windows, if I have a directory with a single file called ¿.txt (that's 0xbf) and pipe the name into stdin, I get: error: stream did not contain valid UTF-8.
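The error comes from read_line insisting on valid UTF-8, and a lone 0xbf byte is not valid UTF-8. One way around that part of the problem is to read raw bytes with BufRead::read_until and decide how to decode them afterwards. This is only a sketch: read_raw_line is a made-up helper, and a Cursor stands in for io::stdin().lock() so the example runs without piped input.

```rust
use std::io::{BufRead, Cursor};

/// Read one line as raw bytes, without requiring valid UTF-8.
/// Returns None at end of input. A trailing "\n" or "\r\n" is trimmed.
/// `read_raw_line` is a hypothetical helper name.
fn read_raw_line<R: BufRead>(reader: &mut R) -> std::io::Result<Option<Vec<u8>>> {
    let mut buf = Vec::new();
    if reader.read_until(b'\n', &mut buf)? == 0 {
        return Ok(None);
    }
    if buf.last() == Some(&b'\n') {
        buf.pop();
    }
    if buf.last() == Some(&b'\r') {
        buf.pop();
    }
    Ok(Some(buf))
}

fn main() -> std::io::Result<()> {
    // Cursor stands in for io::stdin().lock(); the bytes mimic the failing input.
    let mut input = Cursor::new(b"\xbf.txt\r\n".to_vec());
    while let Some(line) = read_raw_line(&mut input)? {
        // Reading succeeds; only a later UTF-8 check rejects these bytes.
        let valid = std::str::from_utf8(&line).is_ok();
        println!("bytes: {:?}, valid UTF-8: {}", line, valid);
    }
    Ok(())
}
```

This gets the bytes into the program intact, but it does not say which code page they are in; that still has to be settled separately on Windows.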
Here's a reasonable-looking version for Windows. It converts the console-supplied string to a wide string using Win32 API functions, then wraps it in an OsString using OsString::from_wide.
I'm not convinced it uses the correct code page yet. dir seems to use the OEM code page, so maybe that should be the default. There's also a distinction between the input code page and the output code page in a console.
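To illustrate why the choice of code page matters, here is a toy sketch that decodes a single byte under two common Windows code pages. The CodePage enum and decode function are invented for this example and cover only ASCII plus the two bytes of interest. It is not a real decoder and is no substitute for MultiByteToWideChar.

```rust
/// Two code pages used for illustration: CP437 (a common OEM code page)
/// and CP1252 (a common ANSI code page). Hypothetical type for this sketch.
#[derive(Clone, Copy)]
enum CodePage {
    Cp437,
    Cp1252,
}

/// Decode one byte. Only ASCII and the bytes 0xa8 and 0xbf are handled.
fn decode(byte: u8, cp: CodePage) -> Option<char> {
    match (cp, byte) {
        (_, 0x00..=0x7f) => Some(byte as char),
        (CodePage::Cp1252, 0xbf) => Some('¿'),
        (CodePage::Cp1252, 0xa8) => Some('¨'),
        (CodePage::Cp437, 0xa8) => Some('¿'),
        (CodePage::Cp437, 0xbf) => Some('┐'),
        _ => None,
    }
}

fn main() {
    for &byte in &[0xbf_u8, 0xa8] {
        println!(
            "0x{:02x}: CP437 {:?}, CP1252 {:?}",
            byte,
            decode(byte, CodePage::Cp437),
            decode(byte, CodePage::Cp1252)
        );
    }
}
```

So a name received as 0xbf is ¿ only if the bytes are in CP1252; under CP437 the same character would arrive as 0xa8, and converting with the wrong code page would produce a path that does not exist.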
In my Cargo.toml:
[dependencies]
winapi = "0.2"
kernel32-sys = "0.2.2"
Code to read a list of filenames piped through stdin on Windows, as per the question:
extern crate kernel32;
extern crate winapi;
use std::io::{self, Read};
use std::ptr;
use std::fs::metadata;
use std::ffi::OsString;
use std::os::windows::ffi::OsStringExt;
/// Convert Windows console input to a wide string that can
/// be used by OS functions
fn wide_from_console_string(bytes: &[u8]) -> Vec<u16> {
    assert!(bytes.len() < std::i32::MAX as usize);
    let mut wide;
    let mut len;
    unsafe {
        let cp = kernel32::GetConsoleCP();
        // First call: ask how many UTF-16 units the converted text needs.
        len = kernel32::MultiByteToWideChar(cp, 0, bytes.as_ptr() as *const i8, bytes.len() as i32, ptr::null_mut(), 0);
        wide = Vec::with_capacity(len as usize);
        // Second call: convert into the buffer just allocated.
        len = kernel32::MultiByteToWideChar(cp, 0, bytes.as_ptr() as *const i8, bytes.len() as i32, wide.as_mut_ptr(), len);
        wide.set_len(len as usize);
    }
    wide
}
/// Extract paths from a list supplied as a CR LF
/// separated wide string
/// Would use a generic split on substring if it existed
fn paths_from_wide(wide: &[u16]) -> Vec<OsString> {
    let mut r = Vec::new();
    let mut start = 0;
    let mut i = start;
    let len = wide.len();
    // Compare i + 1 against len so an empty buffer cannot underflow.
    while i + 1 < len {
        if wide[i] == 13 && wide[i + 1] == 10 {
            if i > start {
                r.push(OsString::from_wide(&wide[start..i]));
            }
            start = i + 2;
            i = i + 2;
        } else {
            i = i + 1;
        }
    }
    // Keep a final entry that is not terminated by CR LF, including its last unit.
    if len > start {
        r.push(OsString::from_wide(&wide[start..len]));
    }
    r
}
fn main() {
    let mut bytes = Vec::new();
    if let Ok(_) = io::stdin().read_to_end(&mut bytes) {
        let pathlist = wide_from_console_string(&bytes[..]);
        let paths = paths_from_wide(&pathlist[..]);
        for path in paths {
            match metadata(&path) {
                Ok(stat) => println!("{:?} is_file: {}", &path, stat.is_file()),
                Err(e) => println!("Error: {:?} for {:?}", e, &path)
            }
        }
    }
}
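The comment on paths_from_wide wishes for a generic split on a sub-slice. A possible sketch of such a helper is below. The name split_on is hypothetical, it skips empty pieces the same way the hand-written loop does, and it runs on any platform because it only deals with slices.

```rust
/// Split `data` on every occurrence of the sub-slice `sep`, skipping empty pieces.
/// `split_on` is a hypothetical helper, not a standard library function.
fn split_on<'a, T: PartialEq>(data: &'a [T], sep: &[T]) -> Vec<&'a [T]> {
    let mut pieces = Vec::new();
    if sep.is_empty() {
        // An empty separator cannot match anything useful; return the input whole.
        if !data.is_empty() {
            pieces.push(data);
        }
        return pieces;
    }
    let mut start = 0;
    let mut i = 0;
    while i + sep.len() <= data.len() {
        if data[i..i + sep.len()] == *sep {
            if i > start {
                pieces.push(&data[start..i]);
            }
            i += sep.len();
            start = i;
        } else {
            i += 1;
        }
    }
    // Whatever follows the last separator is the final piece.
    if start < data.len() {
        pieces.push(&data[start..]);
    }
    pieces
}

fn main() {
    // UTF-16 code units separated by CR LF, the shape paths_from_wide expects.
    let wide: Vec<u16> = "a.txt\r\nb.txt\r\n".encode_utf16().collect();
    for piece in split_on(&wide[..], &[13u16, 10]) {
        println!("{}", String::from_utf16_lossy(piece));
    }
}
```

On Windows, each returned piece could be passed to OsString::from_wide in place of the manual index arithmetic.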