I have a string that looks like this:
random text 12234
another random text
User infos:
User name : John
ID : 221223
Date : 23.02.2018
Job: job1
User name : Andrew
ID : 378292
Date : 12.08.2017
Job: job2
User name : Chris
ID : 930712
Date : 05.11.2016
Job : job3
some random text
And this class:
class User
{
public string UserName { get; set; }
public string ID { get; set; }
public string Date { get; set; }
public string Job { get; set; }
public User(string _UserName, string _ID, string _Date, string _Job)
{
UserName = _UserName;
ID = _ID;
Date = _Date;
Job = _Job;
}
}
I want to create a List of Users from the information in that string.
This is what I have tried:
List<User> Users = new List<User>();
string Data = (the data above)
string[] lines = Data.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
List<string> UserNames = new List<string>();
List<string> IDs = new List<string>();
List<string> Dates = new List<string>();
List<string> Jobs = new List<string>();
foreach (var line in lines)
{
if (line.StartsWith("User name : "))
{
UserNames.Add(line.Remove(0, 12));
}
if (line.StartsWith("ID : "))
{
IDs.Add(line.Remove(0, 5));
}
if (line.StartsWith("Date : "))
{
Dates.Add(line.Remove(0, 7));
}
if (line.StartsWith("Job : "))
{
Jobs.Add(line.Remove(0, 6));
}
}
var AllData = UserNames.Zip(IDs, (u, i) => new { UserName = u, ID = i });
foreach (var data in AllData)
{
Users.Add(new User(data.UserName, data.ID, "date", "job"));
}
But I can only combine two lists using this code. Also, I have more than 4 values for each user (the string above was just a short example).
Is there a better method? Thanks.
Since there always seem to be 4 lines of information per user, you could loop through the split array lines in steps of 4. At each step you would split each line by the colon : and take the last item, which is the desired value.
EDIT: In this case I would suggest looking for the START of the data:
int startIndex = Data.IndexOf("User name");
EDIT 2:
If your string also ends with another line of text, you can use LastIndexOf to find the end of the important information:
int endIndex = Data.LastIndexOf("Job");
int lengthOfLastLine = Data.Substring(endIndex).IndexOf(Environment.NewLine);
endIndex += lengthOfLastLine;
Then simply take a Substring from startIndex up to endIndex and loop over its lines:
string [] lines = Data.Substring(startIndex, endIndex - startIndex)
.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
List<User> allUsers = new List<User>();
for (int i = 0; i < lines.Length; i += 4)
{
string name = lines[i].Split(':').Last().Trim();
string ID = lines[i + 1].Split(':').Last().Trim();
string Date = lines[i + 2].Split(':').Last().Trim();
string Job = lines[i + 3].Split(':').Last().Trim();
allUsers.Add(new User(name, ID, Date, Job));
}
Ahhh, and you should Trim the spaces away.
This solution should be readable. The hard-coded step size of 4 is admittedly an annoying part of my solution.
Disclaimer: This solution works only as long as the format does not change. If the order of the lines changes, it will return wrong results.
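One way around the hard-coded step size and the dependence on line order is to let the labels drive the parsing. The following is only a minimal sketch, not part of the original answer: UserBlockParser and its firstLabel parameter are hypothetical names, and it assumes every record begins with the "User name" line and that the text after the last record contains no colon.

```csharp
using System;
using System.Collections.Generic;

public static class UserBlockParser
{
    // Hypothetical helper: returns one dictionary (label -> value) per user.
    public static List<Dictionary<string, string>> Parse(string data, string firstLabel = "User name")
    {
        var users = new List<Dictionary<string, string>>();
        Dictionary<string, string> current = null;

        foreach (var rawLine in data.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            int colon = rawLine.IndexOf(':');
            if (colon < 0) continue; // no label on this line, treat it as noise

            // Split only at the first colon so values may contain further colons.
            string label = rawLine.Substring(0, colon).Trim();
            string value = rawLine.Substring(colon + 1).Trim();

            // The first label of a record opens a new user.
            if (label == firstLabel)
            {
                current = new Dictionary<string, string>();
                users.Add(current);
            }

            // Labelled lines before the first record (e.g. "User infos:") are skipped.
            if (current != null) current[label] = value;
        }
        return users;
    }
}
```

Each dictionary could then be mapped onto the class from the question, for example new User(u["User name"], u["ID"], u["Date"], u["Job"]). Extra fields per user need no change to the loop, only to that mapping.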
Instead of checking each line and adding it to a separate list, you can create your list of User objects directly. Here you go:
Code:
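// Assumes the noise before and after has been removed and users are separated by a blank line.
var users = data.Split(new[] {"\n\n" }, StringSplitOptions.None).Select(lines =>
{
var line = lines.Split(new[] { "\n" }, StringSplitOptions.None);
// Trim() drops the space left after the colon, whether the label is "Job:" or "Job :".
return new User(line[0].Substring(11).Trim(), line[1].Substring(4).Trim(), line[2].Substring(6).Trim(), line[3].Substring(5).Trim());
});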
As in @Mong Zhu's answer, remove everything before and after the user data first. At this point that is a separate question, which I won't try to solve here. Remove the noise before and after, then parse your data.
For a robust, flexible and self-documenting solution that will allow you to easily add new fields, ignore all the extraneous text and also cater for variations in your file format (this seems to be the case with, for example, no space in "ID:" only in the 3rd record), I would use a Regex and some LINQ to return a collection of records as follows:
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
public class Record
{
public string Name { get; set; }
public string ID { get; set; }
public string Date { get; set; }
public string Job { get; set; }
}
public List<Record> Test()
{
string s = @"User name : John
ID : 221223
Date : 23.02.2018
Job: job1
User name : Andrew
ID : 378292
Date : 12.08.2017
Job: job2
User name : Chris
ID: 930712
Date : 05.11.2016
Job: job3
";
// \s*:\s* tolerates a missing or extra space around the colon ("ID:" as well as "ID :").
Regex r = new Regex(@"User\sname\s*:\s*(?<name>\w+).*?ID\s*:\s*(?<id>\w+).*?Date\s*:\s*(?<date>[0-9.]+).*?Job\s*:\s*(?<job>\w\w+)",RegexOptions.Singleline);
return (from Match m in r.Matches(s)
select new Record
{
Name = m.Groups["name"].Value,
ID = m.Groups["id"].Value,
Date = m.Groups["date"].Value,
Job = m.Groups["job"].Value
}).ToList();
}
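To illustrate the point about adding new fields easily, the pattern can be generated from a list of labels instead of being written out by hand. This is a rough sketch rather than the code from the answer above: LabelRegexParser and the group names f0, f1, ... are made up for the example, each value is assumed to run to the end of its line, and the labels are assumed to appear in the given order within every record.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LabelRegexParser
{
    // Hypothetical helper: returns one string[] per record, values in label order.
    public static List<string[]> Parse(string text, params string[] labels)
    {
        // One "<label> : <rest of line>" fragment per label, joined by lazy gaps
        // so that anything between two fields is skipped.
        string pattern = string.Join(".*?", labels.Select((label, i) =>
            Regex.Escape(label) + @"\s*:\s*(?<f" + i + @">[^\r\n]+)"));

        // Singleline lets ".*?" cross line breaks, as in the hand-written pattern.
        var regex = new Regex(pattern, RegexOptions.Singleline);

        return regex.Matches(text).Cast<Match>()
            .Select(m => labels.Select((unused, i) => m.Groups["f" + i].Value.Trim()).ToArray())
            .ToList();
    }
}
```

A call such as LabelRegexParser.Parse(s, "User name", "ID", "Date", "Job") yields one array per user. Supporting another field is then a matter of appending its label to the argument list.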
The CSV format seems to be what you're looking for (since you want to keep some header text in this file, the actual CSV starts only after those header lines):
random text 12234
another random text
User infos:
UserName;ID;Date;Job
John;221223;23.02.2018;job1
Andrew;378292;12.08.2017;job2
Chris;930712;05.11.2016;job3
And then you could read this file and parse it:
var lines = File.ReadAllLines("pathToFile");
var dataStartIndex = Array.IndexOf(lines, "UserName;ID;Date;Job");
var Users = lines.Skip(dataStartIndex + 1).Select(s =>
{
var splittedStr = s.Split(';');
return new User(splittedStr[0], splittedStr[1], splittedStr[2], splittedStr[3]);
}).ToList();
If you're working with console input, just skip the header part and let the user enter semicolon-separated values for each user on a separate line. Parse it in the same way:
var splittedStr = Console.ReadLine().Split(';');
var userToAdd = new User(splittedStr[0], splittedStr[1], splittedStr[2] , splittedStr[3]);
Users.Add(userToAdd);