
Full outer join, on 2 data tables, with a list of columns

I have two DataTables whose columns I do not know in advance. The column list must be extracted at run time and used for the full outer join.

The columns the two tables share need to be merged, and all data from both tables must appear in the result.

So far, this is what I am doing:

  1. Get the common columns, using Intersect with a custom IEqualityComparer
  2. Create a new DataTable with these columns, so that the 2 DataTables can be merged into it

However, I am having trouble with the LINQ for the second step.

So far I have:

Get common columns


    // Get common columns
    var commonColumns = dt1.Columns.OfType<DataColumn>().Intersect(dt2.Columns.OfType<DataColumn>(), new DataColumnComparer());

Create new data table


    // Create the result which is going to be sent to the user
    DataTable result = new DataTable();

    // Add all the columns from both tables
    result.Columns.AddRange(
        dt1.Columns.OfType<DataColumn>()
            .Union(dt2.Columns.OfType<DataColumn>(), new DataColumnComparer())
            .Select(c => new DataColumn(c.Caption, c.DataType, c.Expression, c.ColumnMapping))
            .ToArray());

How can I efficiently build a full outer join dynamically from the list of DataColumns that is extracted at run time?

Mez asked May 29 '13 09:05



3 Answers

This might work for you

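// Columns that exist in both tables (compared with DataColumnComparer)
var commonColumns = dt1.Columns.OfType<DataColumn>().Intersect(dt2.Columns.OfType<DataColumn>(), new DataColumnComparer());
DataTable result = new DataTable();

// Use the shared columns as the key that Merge matches rows on
dt1.PrimaryKey = commonColumns.ToArray();

// AddWithKey adds missing columns and key information to result; rows with equal keys are combined
result.Merge(dt1, false, MissingSchemaAction.AddWithKey);
result.Merge(dt2, false, MissingSchemaAction.AddWithKey);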
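
To see what the two Merge calls produce, here is a minimal sketch on two small tables. The table contents, the column names (Id, Name, Age) and the MergeJoinDemo class are made up for illustration, and the shared columns are picked by name here rather than through DataColumnComparer, so treat it as one possible arrangement rather than the exact code above.

```csharp
using System;
using System.Data;
using System.Linq;

public static class MergeJoinDemo
{
    // Full outer join of two tables on the columns they share (matched by name).
    // Assumption: the shared columns uniquely identify a row in each table.
    public static DataTable FullOuterJoin(DataTable dt1, DataTable dt2)
    {
        // Columns of dt1 whose names also exist in dt2.
        DataColumn[] common = dt1.Columns.Cast<DataColumn>()
            .Where(c => dt2.Columns.Contains(c.ColumnName))
            .ToArray();

        // Key columns must belong to the table they are set on, hence dt1's own columns.
        dt1.PrimaryKey = common;

        var result = new DataTable();
        result.Merge(dt1, false, MissingSchemaAction.AddWithKey);
        result.Merge(dt2, false, MissingSchemaAction.AddWithKey);
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample data: Id 2 appears in both tables.
        var people = new DataTable();
        people.Columns.Add("Id", typeof(int));
        people.Columns.Add("Name", typeof(string));
        people.Rows.Add(1, "Ann");
        people.Rows.Add(2, "Bob");

        var ages = new DataTable();
        ages.Columns.Add("Id", typeof(int));
        ages.Columns.Add("Age", typeof(int));
        ages.Rows.Add(2, 41);
        ages.Rows.Add(3, 35);

        DataTable joined = FullOuterJoin(people, ages);

        // Print every row; cells with no match on the other side hold DBNull.
        foreach (DataRow row in joined.Rows)
        {
            Console.WriteLine(string.Join(" | ", row.ItemArray));
        }
    }
}
```

Rows whose key exists in only one table are kept, with DBNull in the columns that come from the other table, which is the full outer join behaviour the question asks for.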
Matthew Grima answered Oct 19 '22 04:10


Based on Matthew's answer, I've created a function that accepts more than 2 datatables. I hope it helps:

Usage:

var table123 = FullOuterJoinDataTables(table1, table2, table3);

Here is the function source:

public DataTable FullOuterJoinDataTables(params DataTable[] datatables) // supports as many datatables as you need.
{
    DataTable result = datatables.First().Clone();

    var commonColumns = result.Columns.OfType<DataColumn>();

    foreach (var dt in datatables.Skip(1))
    {
        commonColumns = commonColumns.Intersect(dt.Columns.OfType<DataColumn>(), new DataColumnComparer());
    }

    result.PrimaryKey = commonColumns.ToArray();

    foreach (var dt in datatables)
    {
        result.Merge(dt, false, MissingSchemaAction.AddWithKey);
    }

    return result;
}

/* also create this class */
public class DataColumnComparer : IEqualityComparer<DataColumn>
{
    public bool Equals(DataColumn x, DataColumn y) => x.Caption == y.Caption;       

    public int GetHashCode(DataColumn obj) => obj.Caption.GetHashCode();        
}
Gerardo Grignoli answered Oct 19 '22 04:10


I also struggled to find an answer, so I am pasting my entire code here in the hope that it helps.

You just need DataTable1, DataTable2 and the primary keys of both tables on which the join will be performed. You can set a DataTable's primary key like this:

datatable1.PrimaryKey = new DataColumn[] { datatable1.Columns["Your Key Name"] };

// Your Code

    /// <summary>
    /// Combines the data of two data tables into a single data table. The rows are matched
    /// on the primary key provided for each table.
    /// </summary>
    /// <param name="table1"></param>
    /// <param name="table2"></param>
    /// <param name="table1PrimaryKey"></param>
    /// <param name="table2PrimaryKey"></param>
    /// <returns></returns>
    private DataTable DataTablesOuterJoin(DataTable table1, DataTable table2, string table1PrimaryKey, string table2PrimaryKey)
    {
        DataTable flatDataTable = new DataTable();

        foreach (DataColumn column in table2.Columns)
        {
            flatDataTable.Columns.Add(new DataColumn(column.ToString()));
        }
        foreach (DataColumn column in table1.Columns)
        {
            flatDataTable.Columns.Add(new DataColumn(column.ToString()));
        }

        // Return an empty table with the required columns to generate an empty extract
        if (table1.Rows.Count <= 0 && table2.Rows.Count <= 0)
        {
            flatDataTable.Columns.Remove(table2PrimaryKey);
            return flatDataTable;
        }

        var dataBaseTable2 = table2.AsEnumerable();
        var groupDataT2toT1 = dataBaseTable2.GroupJoin(table1.AsEnumerable(),
                                br => new { id = br.Field<string>(table2PrimaryKey).Trim().ToLower() },
                                jr => new { id = jr.Field<string>(table1PrimaryKey).Trim().ToLower() },
                                (baseRow, joinRow) => joinRow.DefaultIfEmpty()
                                    .Select(row => new
                                    {
                                        flatRow = baseRow.ItemArray.Concat((row == null) ? new object[table1.Columns.Count] :
                                        row.ItemArray).ToArray()
                                    })).SelectMany(s => s);

        var dataBaseTable1 = table1.AsEnumerable();
        var groupDataT1toT2 = dataBaseTable1.GroupJoin(table2.Select(),
                                br => new { id = br.Field<string>(table1PrimaryKey).Trim().ToLower() },
                                jr => new { id = jr.Field<string>(table2PrimaryKey).Trim().ToLower() },
                                (baseRow, joinRow) => joinRow.DefaultIfEmpty()
                                    .Select(row => new
                                    {
                                        flatRow = (row == null) ? new object[table2.Columns.Count].Concat(baseRow.ItemArray).ToArray() :
                                        row.ItemArray.Concat(baseRow.ItemArray).ToArray()
                                    })).SelectMany(s => s);

        // Get the union of both group data to single set
        groupDataT2toT1 = groupDataT2toT1.Union(groupDataT1toT2);

        // Load the grouped data to newly created table 
        foreach (var result in groupDataT2toT1)
        {
            flatDataTable.LoadDataRow(result.flatRow, false);
        }

        // Get the distinct rows only
        IEnumerable<DataRow> rows = flatDataTable.Select().Distinct(DataRowComparer.Default);

        // Create a new distinct table with same structure as flatDataTable
        DataTable distinctFlatDataTable = flatDataTable.Clone();
        distinctFlatDataTable.Rows.Clear();

        // Push all the rows into distinct table.
        // Note: There will be two different columns for primary key1 and primary key2. In grouped rows,
        // primary key1 or primary key2 can have empty values. So copy all the primary key2 values to
        // primary key1 only if primary key1 value is empty and then delete the primary key2. So at last
        // we will get only one primary key. Please make sure the non-deleted key must be present in 
        foreach (DataRow row in rows)
        {
            if (string.IsNullOrEmpty(row[table1PrimaryKey].ToString()))
                row[table1PrimaryKey] = row[table2PrimaryKey];

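            // CaptureBusDateColumn, _businessDate, CaptureUserIDColumn and StatsUserIDColumn are
            // members of the original application and are not defined in this snippet.
            if (string.IsNullOrEmpty(row[CaptureBusDateColumn].ToString()))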
                row[CaptureBusDateColumn] = _businessDate;

            if (string.IsNullOrEmpty(row[CaptureUserIDColumn].ToString()))
                row[CaptureUserIDColumn] = row[StatsUserIDColumn];

            distinctFlatDataTable.ImportRow(row);
        }

        // Sort the table based on primary key.
        DataTable sortedFinaltable = (from orderRow in distinctFlatDataTable.AsEnumerable()
                                      orderby orderRow.Field<string>(table1PrimaryKey)
                                      select orderRow).CopyToDataTable();

        // Remove primary key2 as we have already copied it to primary key1 
        sortedFinaltable.Columns.Remove(table2PrimaryKey);

        return ReplaceNulls(sortedFinaltable, "0");
    }

    /// <summary>
    /// Replace all the null values from data table with specified string 
    /// </summary>
    /// <param name="dt"></param>
    /// <param name="replaceStr"></param>
    /// <returns></returns>
    private DataTable ReplaceNulls(DataTable dt, string replaceStr)
    {
        for (int a = 0; a < dt.Rows.Count; a++)
        {
            for (int i = 0; i < dt.Columns.Count; i++)
            {
                if (dt.Rows[a][i] == DBNull.Value)
                {
                    dt.Rows[a][i] = replaceStr;
                }
            }
        }
        return dt;
    }
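
The same idea (take every key from either table and pair it with the matching rows from each side, or null when there is none) can be sketched more compactly. This version swaps the two GroupJoin calls for ToLookup; the LookupJoinDemo class, its method signature and the sample data are assumptions for illustration only. It keeps the Trim/ToLower style key normalization used above and returns row pairs instead of building a flat table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class LookupJoinDemo
{
    // Full outer join of two tables on one string key column per table.
    // Each result item holds the normalized key, the left row (or null) and the right row (or null).
    public static List<Tuple<string, DataRow, DataRow>> FullOuterJoin(
        DataTable left, DataTable right, string leftKey, string rightKey)
    {
        // Normalize keys the same way for both sides; a null key is treated as "".
        Func<DataRow, string, string> norm =
            (row, col) => (row.Field<string>(col) ?? "").Trim().ToLowerInvariant();

        ILookup<string, DataRow> leftByKey = left.AsEnumerable().ToLookup(r => norm(r, leftKey));
        ILookup<string, DataRow> rightByKey = right.AsEnumerable().ToLookup(r => norm(r, rightKey));

        // Every distinct key that occurs in either table.
        IEnumerable<string> keys = leftByKey.Select(g => g.Key)
            .Union(rightByKey.Select(g => g.Key));

        // A lookup returns an empty sequence for a missing key,
        // so DefaultIfEmpty() supplies the null partner for unmatched rows.
        return (from key in keys
                from l in leftByKey[key].DefaultIfEmpty()
                from r in rightByKey[key].DefaultIfEmpty()
                select Tuple.Create(key, l, r)).ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data: "b" (in different casing) appears in both tables.
        var t1 = new DataTable();
        t1.Columns.Add("Code", typeof(string));
        t1.Rows.Add("A");
        t1.Rows.Add("B ");

        var t2 = new DataTable();
        t2.Columns.Add("Ref", typeof(string));
        t2.Rows.Add("b");
        t2.Rows.Add("c");

        foreach (var item in FullOuterJoin(t1, t2, "Code", "Ref"))
        {
            Console.WriteLine("{0}: left={1}, right={2}",
                item.Item1, item.Item2 != null, item.Item3 != null);
        }
    }
}
```

Because the keys are made distinct before the rows are paired, this sketch does not need the later Distinct pass; copying the pairs into a flat DataTable would still be a separate step.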
PawanS answered Oct 19 '22 02:10