Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get list of ONLY excel worksheet names in Excel using OLEDB; filter out non-worksheets that show up in metadata

Tags:

c#

excel

oledb

I have an issue getting worksheet names from an Excel spreadsheet using OLEDB. The problem is that when I use GetOleDbSchemaTable, the resulting DataTable has more than just the actual worksheet names; it has extra rows for "Tables" that I can only assume are used internally by Excel.

So for example, if I have a worksheet named myWorksheet, the code below might end up with a list that contains myWorksheet$, myWorksheet$PrintTable and myWorksheet$_. Only the first myWorksheet$ record is for the actual worksheet. The others are just garbage that I don't need. When you look at them in metadata they look just like regular tables, even with the type of TABLE.

For now I just manually filtered out anything with "$_" or "$Print" in the name, but who knows what other Excel feature might make these extra records turn up in a different format.

Does anyone know the best way to get ONLY actual worksheet names, and not these internal tables that aren't worksheets? Is there something in metadata that would differentiate them?

 private ArrayList getXlsWorksheetNames(OleDb.OleDbConnection conn)
    {
        ArrayList wsList = new ArrayList();
        DataTable schemaTable;

        try
        {
            conn.Open();
            schemaTable = conn.GetOleDbSchemaTable(OleDb.OleDbSchemaGuid.Tables, null);

            foreach (DataRow row in schemaTable.Rows)
            {
                //form.appendToResultsTxt("Adding worksheet to list: " + Environment.NewLine +
                //    "Name = " + row.Field<string>("TABLE_NAME") + "," + Environment.NewLine +
                //    "Type = " + row.Field<string>("TABLE_TYPE") + "," + Environment.NewLine + Environment.NewLine);
                wsList.Add(row.Field<string>("TABLE_NAME"));
            }
            conn.Close();
        }
        catch (Exception ex)
        {
            if (this.mode == Cps2TxtUtilModes.GUI_MODE)
            {
                this.form.appendToResultsTxt(ex.ToString());
            }
            throw;
        }

        return wsList;
    }

I read through the article at this link, but they don't seem to be doing anything differently than I am, and I don't see any filtering out of extra non-worksheet tables, so Microsoft doesn't seem to have provided the right answer.

http://support.microsoft.com/kb/318452

And I've also looked around alot of StackOverflow, like at the thread from the link below which was helpful, but doesn't solve this one problem.

Using Excel OleDb to get sheet names IN SHEET ORDER

Before anyone asks, I'd also like to say that I don't really have control over what features are used in the spreadsheet, so I can't just tell them "Don't turn on filtering" or "Don't use print tables".

Any ideas are much appreciated. Thanks!

like image 413
Jim Avatar asked May 18 '12 14:05

Jim


People also ask

How read Oledb file in Excel?

Using Code Before start Reading/Writing from/in Excel file, we need to connect to OLEDB using connection string, here OLEDB will act as Bridge between your program and EXCEL. Rows and columns of Excel sheet can be directly imported to data-set using OLEDB, no need to open Excel file using INTROP EXCEL object.

How do I summarize all tabs in Excel?

Click Data>Consolidate (in the Data Tools group). In the Function box, click the summary function that you want Excel to use to consolidate the data. The default function is SUM. Select your data.


2 Answers

The question is old, but for those who found it now, the skipping can be done as Jim found...

// skip those that do not end correctly
foreach (DataRow row in schemTable.Rows)
{
    string sheetName = row["TABLE_NAME"].ToString();
    if (!sheetName.EndsWith("$") && !sheetName.EndsWith("$'"))
        continue;
    Console.WriteLine(sheetName);
}

That is the wanted are or those that end with $ or those that end with $'.

like image 126
pepr Avatar answered Sep 23 '22 20:09

pepr


From experience, it seems to be all those whose name ends in a dollar sign. I've come across scenarios from clients where extra worksheets seemed to be appearing which weren't present in the data - these later turned out to be hidden worksheets in Excel!

like image 28
Bridge Avatar answered Sep 21 '22 20:09

Bridge