
Reading a text file and getting the lines with date values

Tags:

c#

readfile

Is there an easy way to find the lines that start with a date and time?

So far I can read the text file, and my next step is to parse it, but I would like some guidance before I proceed. Here is my current read script:

List<string> Temp = new List<string>();            
string[] filePaths = Directory.GetFiles(@"C:\Temp", "*.txt");

foreach (string files in filePaths)
{
    var fileStream = new FileStream(files, FileMode.Open, FileAccess.Read);
    using (var streamReader = new StreamReader(fileStream, Encoding.UTF8))
    {
        Temp.Add(streamReader.ReadToEnd());
    }
}

foreach (string i in Temp)
{
    if (i.Contains("Events"))
    {
        Console.WriteLine(i);        
    }
}

Here is a sample of the tool-generated text that I need to parse.

"[Output]"
"[Events]"
"Time"  "Duration"  "Severity"  "Event" "Text1" "Text2"


"[Acquisition Settings_1]"
"Data Set"  "DataSet1"
"Data Stream"   "Data"


"[Scan Data (Pressures in Torr)]"
"Time"  "Scan"  "Mass 1"    "Mass 2"    "Mass 3"    
"10/25/2018 4:59:27 PM" 1   5.5816e-008 1.3141e-008 -1.6109e-010    
"10/25/2018 4:59:35 PM" 2   5.5484e-008 1.3403e-008 6.9720e-010 
"10/25/2018 4:59:41 PM" 3   5.5633e-008 1.3388e-008 8.8094e-011 
"10/25/2018 4:59:48 PM" 4   5.7289e-008 1.2343e-008 1.4095e-010 
"10/25/2018 4:59:54 PM" 5   5.2841e-008 1.3219e-008 7.5257e-010 

"10/25/2018 4:59:57 PM" "After Calibration due to marginal data of daily pm3 rga checking"  
"10/25/2018 5:49:51 PM" "RGA Base Pressure
Flat pallet (2018-10-25_011_a1a)"   
"10/25/2018 6:21:53 PM" "PM3 SiNFILL_27A
2018-10-25_011_A4A" 
"10/25/2018 9:51:29 PM" "IBE1 STEP
FULL TAPE
NO PRE-BAKE"    
"10/25/2018 9:58:48 PM" "IBE2 STEP

My expected result is to get only the scan lines that start with a datetime value:

"10/25/2018 4:59:27 PM" 1   5.5816e-008 1.3141e-008 -1.6109e-010    
"10/25/2018 4:59:35 PM" 2   5.5484e-008 1.3403e-008 6.9720e-010 
"10/25/2018 4:59:41 PM" 3   5.5633e-008 1.3388e-008 8.8094e-011 
"10/25/2018 4:59:48 PM" 4   5.7289e-008 1.2343e-008 1.4095e-010 
"10/25/2018 4:59:54 PM" 5   5.2841e-008 1.3219e-008 7.5257e-010 

Any suggestions are welcome. Thanks in advance.

Syntax Rommel asked Nov 06 '22 19:11


1 Answer

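You could probably (tentatively) get away with a pattern like the one below. It allows for values in scientific notation with a negative exponent, and for the tabs that separate the fields in the original file (the tabs are not visible in the sample above). The pattern is shown as plain regex here; inside a C# verbatim string (@"...") each double quote has to be written as "".

^"\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)"\s+-?\d+\s+\d+\.?\d+e-\d+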

Note: I am not going to write the regex explanation as it's too long.
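For readers who want the breakdown anyway, here is a rough sketch of the same pattern split into commented parts. It relies on RegexOptions.IgnorePatternWhitespace, which ignores unescaped white space in the pattern and treats # as the start of a comment, so each literal space is written as an escaped space. The class and field names (ScanLinePattern, ScanLine) are made up for this illustration, and the decimal point is escaped, which is an assumption about what the pattern intends.

```csharp
using System.Text.RegularExpressions;

public static class ScanLinePattern
{
    // In a verbatim string (@"..."), a double quote is written as "".
    // With IgnorePatternWhitespace, a literal space must be escaped.
    public static readonly Regex ScanLine = new Regex(@"
        ^""                     # line starts with an opening quote
        \d+/\d+/\d+             # date, e.g. 10/25/2018
        \                       # one literal space (escaped)
        \d+:\d+:\d+             # time, e.g. 4:59:27
        \ (AM|PM)               # escaped space, then AM or PM
        ""                      # closing quote
        \s+-?\d+                # white space, then the scan number
        \s+\d+\.?\d+e-\d+       # white space, then the first mass value, e.g. 5.5816e-008
        ", RegexOptions.IgnorePatternWhitespace);
}
```

Only the first mass value is checked, and the pattern has no end anchor, so anything after that value is accepted. Comment rows such as the "After Calibration ..." line are rejected because a quote, not a digit, follows the white space after the date.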

Example

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = @"^""\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)""\s+-?\d+\s+\d+\.?\d+e-\d+";
var regex = new Regex(pattern, RegexOptions.Compiled);

var filePaths = Directory.GetFiles(@"C:\Temp", "*.txt");

var results = new List<string>();

foreach (var file in filePaths)
{
   // Read each file lazily and keep only the lines that match the pattern.
   var lines = File.ReadLines(file).Where(x => regex.IsMatch(x));
   results.AddRange(lines);
}
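If a regex feels too loose, a possible alternative (not part of the approach above) is to let the framework validate the fields. The sketch below assumes the fields are tab-separated and that a scan row always has an integer in its second field; ScanLineFilter and IsScanLine are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class ScanLineFilter
{
    // Hypothetical helper: true when the first tab-separated field is a quoted
    // date/time and the second field is an integer scan number.
    public static bool IsScanLine(string line)
    {
        var fields = line.Split('\t');
        if (fields.Length < 2)
        {
            return false;
        }

        var hasDate = DateTime.TryParseExact(
            fields[0].Trim('"'),
            "M/d/yyyy h:m:s tt",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out _);

        // Comment rows also start with a date, but their second field is quoted text.
        return hasDate
            && int.TryParse(fields[1], NumberStyles.Integer, CultureInfo.InvariantCulture, out _);
    }
}
```

With System.IO and System.Linq imported, it could be plugged into the same loop as File.ReadLines(file).Where(ScanLineFilter.IsScanLine).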

However, to take this a step further you could do the following. This will put all the data parsed into a class.

Given

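using System;
using System.Globalization;
using System.Linq;

public class ScanData
{
   public DateTime Time { get; set; }
   public int Scan { get; set; }
   public decimal?[] MassResults  { get; set; }

   public static ScanData FromString(string data)
   {
      var split = data.Split('\t');

      // Returns null for fields that are not numbers, such as an empty trailing field.
      decimal? Local(string value)
      {
         return decimal.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out var output) ? output : (decimal?)null;
      }

      var scanData = new ScanData()
                     {
                        // The invariant culture keeps parsing independent of the machine's regional settings.
                        Time = DateTime.ParseExact(split[0].Trim('"'), "M/d/yyyy h:m:s tt", CultureInfo.InvariantCulture),
                        Scan = int.Parse(split[1], CultureInfo.InvariantCulture),
                        MassResults = split.Skip(2).Select(Local).ToArray()
                     };

      return scanData;
   }
}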
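To see why MassResults is an array of nullable decimals, here is a small standalone sketch of the mass-value parsing on its own. It assumes each row ends with a trailing tab, as the sample appears to, so the last field is empty and comes back as null. MassFields and ParseOrNull are hypothetical names used only for this illustration.

```csharp
using System.Globalization;
using System.Linq;

public static class MassFields
{
    // Hypothetical helper mirroring the mass-value parsing in ScanData.FromString.
    public static decimal?[] Parse(string line)
    {
        return line.Split('\t')
                   .Skip(2)               // skip the Time and Scan fields
                   .Select(ParseOrNull)
                   .ToArray();
    }

    // NumberStyles.Float allows a sign, a decimal point and an exponent,
    // so values such as -1.6109e-010 are accepted.
    private static decimal? ParseOrNull(string value)
    {
        return decimal.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out var parsed)
            ? parsed
            : (decimal?)null;
    }
}
```

Under that assumption, a row with three mass values and a trailing tab yields four elements, the last of which is null, so callers may want to filter out nulls before doing arithmetic.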

Example

var pattern = @"^""\d+/\d+/\d+ \d+:\d+:\d+ (AM|PM)""\s+-?\d+\s+\d+\.?\d+e-\d+";
var regex = new Regex(pattern, RegexOptions.Compiled);

var filePaths = Directory.GetFiles(@"C:\Temp", "*.txt");

var results = new List<ScanData>();

foreach (var file in filePaths)
{
   // Keep the matching lines of each file and convert them to ScanData objects.
   var lines = File.ReadLines(file)
                   .Where(x => regex.IsMatch(x))
                   .Select(x => ScanData.FromString(x));
   results.AddRange(lines);
}
TheGeneral answered Nov 14 '22 23:11