I'm trying to write a simple program that compares the files in separate folders. I'm currently using LINQ to Objects to parse the folder, and I would like to include information extracted from each file name in my result set as well.
Here's what I have so far:
    FileInfo[] fileList = new DirectoryInfo(@"G:\Norton Backups").GetFiles();
    var results = from file in fileList
                  orderby file.CreationTime
                  select new { file.Name, file.CreationTime, file.Length };
    foreach (var x in results)
        Console.WriteLine(x.Name);
This produces:
AWS025.sv2i
AWS025_C_Drive038.v2i
AWS025_C_Drive038_i001.iv2i
AWS025_C_Drive038_i002.iv2i
AWS025_C_Drive038_i003.iv2i
AWS025_C_Drive038_i004.iv2i
AWS025_C_Drive038_i005.iv2i
...
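I would like to modify the LINQ query so that:
- It only returns actual backup files (you can tell the backup files because they have _C_Drive038 in the examples above, though 038 and possibly the drive letter could change).
- It sets a flag if the file is the "main" backup file (i.e., it doesn't have _i0XX at the end of the file name).
- It includes the image number of the file (in this case, 038).
- It includes the increment number if the file is an increment of a base file (e.g., 001 would be an increment number).
I believe the basic layout of the query would look like the following, but I'm not sure how best to complete it (I've got some ideas for how some of this might be done, but I'm interested to hear how others might do it):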
    var results = from file in fileList
                  let IsMainBackup = // ??
                  let ImageNumber = // ??
                  let IncrementNumber = // ??
                  where // it is a backup file
                  orderby file.CreationTime
                  select new { file.Name, file.CreationTime, file.Length,
                               IsMainBackup, ImageNumber, IncrementNumber };
When looking for the ImageNumber and IncrementNumber, I would like to assume that the location of this data is not always fixed, so I'd like to know a good way to parse it (if this requires RegEx, please explain how I might use it).
NOTE: Most of my past experience in parsing text involved using location-based string functions, such as LEFT, RIGHT, or MID. I'd rather not fall back on those if there is a better way.
Using regular expressions:
    Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");
    var results = from file in fileList
                  let match = regex.Match(file.Name)
                  let IsMainBackup = !match.Groups["Increment"].Success
                  let ImageNumber = match.Groups["ImageNumber"].Value
                  let IncrementNumber = match.Groups["IncrementNumber"].Value
                  where match.Groups["Backup"].Success
                  orderby file.CreationTime
                  select new { file.Name, file.CreationTime, file.Length,
                               IsMainBackup, ImageNumber, IncrementNumber };
Here is a description of the regular expression:
^ Start of string.
.* Allow anything at the start.
(?<Backup>...) Match a backup description (explained below).
\. Match a literal period.
[^.]+ Match the extension (one or more characters other than a period).
$ End of string.
Backup is:
_\w_Drive A literal underscore, any word character (letter, digit, or underscore), another underscore, then the string "Drive".
(?<ImageNumber>\d+) At least one digit, saved as ImageNumber.
(?<Increment>...)? An optional increment description.
Increment is:
_i A literal underscore, then the letter i.
(?<IncrementNumber>\d+) At least one digit, saved as IncrementNumber.
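To make the group behaviour described above concrete, here is a minimal sketch that runs the same pattern against three of the sample names and reports which groups matched. The class name RegexGroupsDemo and the helper Describe are made up for this illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexGroupsDemo
{
    // Same pattern as in the answer above.
    static readonly Regex BackupPattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: summarises what the pattern captured for one name.
    public static string Describe(string fileName)
    {
        Match match = BackupPattern.Match(fileName);
        if (!match.Success)
            return fileName + ": not a backup file";

        // The optional Increment group only succeeds for incremental files.
        bool isMain = !match.Groups["Increment"].Success;
        string image = match.Groups["ImageNumber"].Value;
        // Value is an empty string when the group did not take part in the match.
        string increment = match.Groups["IncrementNumber"].Value;

        return string.Format("{0}: main={1}, image={2}, increment='{3}'",
                             fileName, isMain, image, increment);
    }

    static void Main()
    {
        Console.WriteLine(Describe("AWS025.sv2i"));
        // AWS025.sv2i: not a backup file
        Console.WriteLine(Describe("AWS025_C_Drive038.v2i"));
        // AWS025_C_Drive038.v2i: main=True, image=038, increment=''
        Console.WriteLine(Describe("AWS025_C_Drive038_i001.iv2i"));
        // AWS025_C_Drive038_i001.iv2i: main=False, image=038, increment='001'
    }
}
```

Because the whole Backup group is required by the pattern, checking match.Success here is equivalent to checking match.Groups["Backup"].Success in the query.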
Here is the test code I used:
    using System;
    using System.IO;
    using System.Text.RegularExpressions;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            FileInfo[] fileList = new FileInfo[] {
                new FileInfo("AWS025.sv2i"),
                new FileInfo("AWS025_C_Drive038.v2i"),
                new FileInfo("AWS025_C_Drive038_i001.iv2i"),
                new FileInfo("AWS025_C_Drive038_i002.iv2i"),
                new FileInfo("AWS025_C_Drive038_i003.iv2i"),
                new FileInfo("AWS025_C_Drive038_i004.iv2i"),
                new FileInfo("AWS025_C_Drive038_i005.iv2i")
            };

            Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");
            var results = from file in fileList
                          let match = regex.Match(file.Name)
                          let IsMainBackup = !match.Groups["Increment"].Success
                          let ImageNumber = match.Groups["ImageNumber"].Value
                          let IncrementNumber = match.Groups["IncrementNumber"].Value
                          where match.Groups["Backup"].Success
                          orderby file.CreationTime
                          select new { file.Name, file.CreationTime,
                                       IsMainBackup, ImageNumber, IncrementNumber };

            foreach (var x in results)
            {
                Console.WriteLine("Name: {0}, Main: {1}, Image: {2}, Increment: {3}",
                    x.Name, x.IsMainBackup, x.ImageNumber, x.IncrementNumber);
            }
        }
    }
And here is the output I get:
Name: AWS025_C_Drive038.v2i, Main: True, Image: 038, Increment:
Name: AWS025_C_Drive038_i001.iv2i, Main: False, Image: 038, Increment: 001
Name: AWS025_C_Drive038_i002.iv2i, Main: False, Image: 038, Increment: 002
Name: AWS025_C_Drive038_i003.iv2i, Main: False, Image: 038, Increment: 003
Name: AWS025_C_Drive038_i004.iv2i, Main: False, Image: 038, Increment: 004
Name: AWS025_C_Drive038_i005.iv2i, Main: False, Image: 038, Increment: 005
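As the output shows, ImageNumber and IncrementNumber come back as strings, and the increment is empty for the main backup. If numeric values are more convenient (for sorting, say), one possible variation is sketched below. The class name NumericBackupInfo and the helper ToNullableInt are hypothetical, and the sketch works on plain file names rather than FileInfo objects to stay short. Note that parsing drops leading zeros, so "038" becomes 38.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class NumericBackupInfo
{
    // Same pattern as in the answer above.
    static readonly Regex BackupPattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: returns null when the group did not match
    // or its text cannot be parsed as an int.
    public static int? ToNullableInt(Group group)
    {
        int value;
        return group.Success && int.TryParse(group.Value, out value)
            ? value
            : (int?)null;
    }

    static void Main()
    {
        string[] names =
        {
            "AWS025_C_Drive038_i002.iv2i",
            "AWS025.sv2i",
            "AWS025_C_Drive038.v2i",
            "AWS025_C_Drive038_i001.iv2i"
        };

        var results = from name in names
                      let match = BackupPattern.Match(name)
                      where match.Success
                      let imageNumber = ToNullableInt(match.Groups["ImageNumber"])
                      let incrementNumber = ToNullableInt(match.Groups["IncrementNumber"])
                      // null sorts before any number, so the main backup comes first.
                      orderby imageNumber, incrementNumber
                      select new { Name = name, ImageNumber = imageNumber, IncrementNumber = incrementNumber };

        foreach (var x in results)
            Console.WriteLine("{0}: image {1}, increment {2}", x.Name, x.ImageNumber, x.IncrementNumber);
    }
}
```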
It was a bit of fun working out a good answer for this one :)
The code below gives you what you need. Note the use of a search pattern when retrieving the files: there is no point retrieving more files than necessary. Also notice the parseNumber() function, which just shows how to convert the string result from the regex to a number should you need it in that format.
    using System;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    using System.Windows.Forms; // only needed for the Application calls below

    static class Program
    {
        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            //Application.Run(new Form1());
            GetBackupFiles(@"c:\temp\backup files");
        }

        static void GetBackupFiles(string path)
        {
            // The search pattern limits the listing to likely backup files.
            FileInfo[] fileList = new DirectoryInfo(path).GetFiles("*_Drive*.*v2i");
            var results = from file in fileList
                          orderby file.CreationTime
                          select new
                          {
                              file.Name,
                              file.CreationTime,
                              file.Length,
                              IsMainBackup = file.Extension.ToLower() == ".v2i",
                              // .Value yields the captured text instead of the Group object.
                              ImageNumber = Regex.Match(file.Name, @"drive([\d]{0,5})", RegexOptions.IgnoreCase).Groups[1].Value,
                              IncrementNumber = parseNumber(Regex.Match(file.Name, @"_i([\d]{0,5})\.iv2i", RegexOptions.IgnoreCase).Groups[1].Value)
                          };
            foreach (var x in results)
                Console.WriteLine(x.Name);
        }

        // Returns the parsed number, or null if the input is null or not a valid integer.
        static int? parseNumber(object num)
        {
            int temp;
            if (num != null && int.TryParse(num.ToString(), out temp))
                return temp;
            return null;
        }
    }
Note that with the regexes I am assuming some consistency in the file names; if they deviate from the format you mentioned, you will have to adjust the patterns.
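One way to spot such deviations is to check every name that passed the wildcard search against a stricter pattern and list the ones that do not fit. The sketch below is only an illustration: the class PatternCheck, the method FindUnexpectedNames, the stricter pattern, and the sample name Notes_Drive_list.sv2i are all assumptions, not part of the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class PatternCheck
{
    // Assumed stricter pattern: _<char>_Drive<digits>, an optional _i<digits>,
    // then a .v2i or .iv2i extension at the end of the name.
    static readonly Regex Strict = new Regex(
        @"_\w_Drive\d+(_i\d+)?\.i?v2i$", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the names that do not follow the expected format.
    public static List<string> FindUnexpectedNames(IEnumerable<string> names)
    {
        return names.Where(n => !Strict.IsMatch(n)).ToList();
    }

    static void Main()
    {
        string[] names =
        {
            "AWS025_C_Drive038.v2i",
            "AWS025_C_Drive038_i001.iv2i",
            "Notes_Drive_list.sv2i" // made-up name that would still pass "*_Drive*.*v2i"
        };

        foreach (string name in FindUnexpectedNames(names))
            Console.WriteLine("Unexpected file name: " + name);
    }
}
```

In a real run the names would come from the FileInfo objects returned by GetFiles, for example via fileList.Select(f => f.Name).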