Maybe is it duplicate but I can't find proper way to do following correctly.
Generally I want to retrieve all data from Employee table which are related with List. Type MyEmployee contains EntitySourceID which I use to map with EmployeeID. So, I want to retrieve all Employees which have match EmployeeID with EntitySourceID in List collection.
Type MyEmployee looks like:
public class MyEmployee
{
public long PersonID { get; set; }
public string ConnectionString { get; set; }
public long EntitySourceID { get; set; }
public int EntitySourceTypeID { get; set; }
}
My query looks like:
internal IEnumerable<Person> GetPersons(List<MyEmployee> myEmployees)
{
return (from p in _context.Employee
join pList in myEmployees on p.EmployeeID equals pList.EntitySourceID
select new Person
{
PersonID = pList.PersonID,
FirstName = p.FirstName,
LastName = p.LastName,
Name = p.Name,
Suffix = p.Suffix,
Title = p.Title
}).ToList();
}
You can see in query, when I create new Person object I used pList.PersonID from List myEmployees collection to fill Person's.
So, my question is how I can efficient retrieve data from Employee table which have match with List collection, and also use pList.PersonID (from collection) to create return result?
I use EF 6, database first approach.
Also, I didn't mention. This query produce following exception: Unable to create a constant value of type 'MyEmployee'. Only primitive types or enumeration types are supported in this context.
A good start for solving some of your deeper questions would be spending some time discovering the differences between
IQueryable
andIEnumerable<T>
and maybe also between
Func<T>
and "Expressions" such as Expression<T>
because while they have similar forms they differ in purpose and behavior.
For starters let's name a few things:
MyEmployee
instances THE LIST
Sadly you didn't specify a few very important details while writing your question. That leads me to proposing 4 different answers. The answers will be categorized based on the truth values of the following 2 questions:
We have 4 very different cases:
You can imagine by now that the fourth is maybe the ugliest.
In cases 1 and 2 you could think about your problem from a different perspective:
Say you need to fetch ONE (or zero) record(s) from the database based on Precisely 1 parameter which is an ID. Should you be performing a JOIN?
The answer is: absolutely NOT. Take a look at this code:
var query = from employee in _context.Employee
where employee.EmployeeId == 23
select employee;
var found = query.FirstOrDefault();
What if I wanted to fetch the records associated with Precisely 2 parameters? I could achieve that in a similar manner:
var query = from employee in _context.Employee
where employee.EmployeeId == 23 || employee.EmployeeId == 24
select employee;
var results = query.ToArray();
if (results.Length == 0)
// didn't find anyone of the presumably existing records
else if (results.Length == 1) {
if (results[0].EmployeeId == 23)
// then we found the 23
else
// the other one
} else if (results.Length == 2)
// found both, look inside to see which is which
I have intentionally written the finishing touches of the algorithms (the if
part) in a silly manner in order to avoid extra confusion.
This would be a more humane approach for the finishing touches:
...
var results = ... got them (see above)
var map = results.ToDictionary(keySelector: x => x.EmployeeId);
var count = map.Count; // this gives you the number of results, same as results.Length
var have23 = map.ContainsKey(23); // this tells you whether you managed to fetch a certain id
var record23 = map[23]; // this actually gives you the record
foreach (var key in map.Keys) { .. } // will iterate over the fetched ids
foreach (var record in map.Values) { .. } // will iterate over the fetched values
Worry not about the ToDictionary
extension method.
It has NOTHING to do with EntityFramework (look it up by clicking on it).
Now.. back to our story: What if you wanted to bring the records associated with 15 ids? Stop. Where is this going? Am I asking you to hardcode a different query for each possible number of ids?
Of course not. As long as the number of ids is "relatively small" (meaning you are allowed by someone, or by yourself to bombard the database with that request magnitude) you could very well use an "column IN list of params" SQL construct.
How can you instruct LINQ to SQL, or EF to translate into an "x IN y" operation instead of an "x = y" operation, on the SQL side?
By using a primitive array of the respective type and the Contains
method.
In other words, get a load of:
var query = from employee in _context.Employee
where listOfIds.Contains( employee.EmployeeId )
select employee;
var results = query.ToArray();
But you need a "list of Ids" not a "list of MyEmployee instances". You could pull that off very easily like so:
List<MyEmployee> originalList = new List<MyEmployee>();
// ... say you populate this somehow, or you've received it from elsewhere
int[] listOfIds = (from employee in originalList
select employee.EntityId).ToArray();
// .. and then carry on with the EF query
Please note that queries on collections manifest as IEnumerable<T>
instances, not as IQueryable<T>
instances and have nothing to do with EF or LINQ to SQL or anyother DB or external data service.
IF THE TABLE IS NOT HUGE
Then you could refrain from actually using EF with complex queries, use it just for a "Full table fetch", temporarily store the results in your .NET process and use regular LINQ however you like.
The key to this story is fetching the entire table from the beginning. In your question you wrote:
return (from p in _context.Employee
join pList in myEmployees on p.EmployeeID equals pList.EntitySourceID
select new Person
{
PersonID = pList.PersonID,
FirstName = p.FirstName
... etc
Simply augment that with:
var entityList = _context.Employee.ToArray();
return (from p in entityList // PLEASE NOTE THIS CHANGE ALSO
join pList in myEmployees on p.EmployeeID equals pList.EntitySourceID
select ...
TO WRAP IT UP
You can either:
Either one side or the other (the database or the .NET process) needs to have all the cards (needs to have a clone of the other side) in order be able to perform the JOIN.
So it's just a game of compromise.
HOW ABOUT THE REMAINING CASE
If both THE TABLE and THE LIST are huge, then you're s****d. No- I'm just kidding.
Nobody heard of someone asking someone else to do wonders when they can't actually be done.
If this is the case, then you have to simplify the problem into a large number of smaller problems. I would suggest transforming into a TABLE HUGE + LIST NOT SO HUGE problem multiplied by N.
So how do you go about that?
List<MyEmployee> original = ...
// you take your list
// and you split it in sections of .. say 50 (which in my book is not huge for a database
// although be careful - the pressure on the database will be almost that of 50 selects running in parallel for each select)
// how do you split it?
// you could try this
public static IEnumerable<List<MyEmployee>> Split(List<MyEmployee> source, int sectionLength) {
List<MyEmployee> buffer = new List<MyEmployee>();
foreach (var employee in source) {
buffer.Add(employee);
if (buffer.Count == sectionLength) {
yield return buffer.ToList(); // MAKE SURE YOU .ToList() the buffer in order to clone it
buffer.Clear(); // or otherwise all resulting sections will actually point to the same instance which gets cleared and refilled over and over again
}
}
if (buffer.Count > 0) // and if you have a remainder you need that too
yield return buffer; // except for the last time when you don't really need to clone it
}
List<List<MyEmployee>> sections = Split(original, 50).ToList();
// and now you can use the sections
// as if you're in CASE 2 (the list is not huge but the table is)
// inside a foreach loop
List<Person> results = new List<Person>(); // prepare to accumulate results
foreach (var section in sections) {
int[] ids = (from x in section select x.EntityID).ToArray();
var query = from employee in _context.Employee
where ids.Contains(employee.EmployeeId)
... etc;
var currentBatch = query.ToArray();
results.AddRange(currentBatch);
}
Now you could say that this is simply a way of fooling the database into believing it has little work to do when in fact we're still flooding it with a lot of work and maybe making life hard for the other concurrent clients.
Well- yeah, but at least you could throttle down.
You could Thread.Sleep
between sections... You could use iterators
(look them up) and actually not flood the RAM with records that will take a long time to process anyway but rather "stream things".
You have more control over the situation.
Good luck!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With