
How can I determine the character encoding of an excel file? [duplicate]

Possible Duplicate:
Excel to CSV with UTF8 encoding

Scenario: I have an excel file containing a large amount of global customer data. I do not know what encoding was used when the file was created.

Question: How can I determine the character encoding used in the excel file so I can import it correctly into another piece of software?

samaspin asked Nov 05 '12 15:11


1 Answer

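For Excel 2010 it should be UTF-8: an .xlsx workbook is a ZIP package of XML files, and each file states its encoding in its XML declaration. From the Microsoft documentation at
http://msdn.microsoft.com/en-us/library/bb507946: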

"The basic document structure of a SpreadsheetML document consists of the Sheets and Sheet elements, which reference the worksheets in the Workbook. A separate XML file is created for each Worksheet. For example, the SpreadsheetML for a workbook that has two worksheets named MySheet1 and MySheet2 is located in the Workbook.xml file and is shown in the following code example.

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
    <sheets>
        <sheet name="MySheet1" sheetId="1" r:id="rId1" />
        <sheet name="MySheet2" sheetId="2" r:id="rId2" />
    </sheets>
</workbook>

The worksheet XML files contain one or more block level elements such as SheetData. sheetData represents the cell table and contains one or more Row elements. A row contains one or more Cell elements. Each cell contains a CellValue element that represents the value of the cell. For example, the SpreadsheetML for the first worksheet in a workbook, that only has the value 100 in cell A1, is located in the Sheet1.xml file and is shown in the following code example.

<?xml version="1.0" encoding="UTF-8" ?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <sheetData>
        <row r="1">
            <c r="A1">
                <v>100</v>
            </c>
        </row>
    </sheetData>
</worksheet>

"

Detection of cell encodings:

https://metacpan.org/pod/Spreadsheet::ParseExcel::Cell

http://forums.asp.net/t/1608228.aspx/1
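The Spreadsheet::ParseExcel module linked above targets the older binary .xls format, while the UTF-8 statement applies to the XML-based .xlsx format, so it helps to know which container you actually have before choosing a route. Here is a rough Python sketch that looks at the first bytes of the file; the function name and the returned labels are invented for this example. A ZIP signature only shows that the file is ZIP-based, not that it is a valid workbook.

```python
import io

ZIP_MAGIC = b"PK\x03\x04"                         # ZIP local file header (.xlsx, .xlsm)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE2 compound file (legacy .xls)


def sniff_excel_container(stream):
    """Classify a binary stream by its leading signature bytes.

    Returns 'ooxml-zip', 'ole2-binary', or 'unknown'.
    """
    head = stream.read(8)
    if head.startswith(ZIP_MAGIC):
        return "ooxml-zip"
    if head.startswith(OLE2_MAGIC):
        return "ole2-binary"
    return "unknown"  # e.g. CSV or HTML saved with an Excel extension


if __name__ == "__main__":
    # In-memory stand-ins for real files; with a real file use open(path, "rb").
    print(sniff_excel_container(io.BytesIO(ZIP_MAGIC + b"...")))
    print(sniff_excel_container(io.BytesIO(OLE2_MAGIC)))
    print(sniff_excel_container(io.BytesIO(b"name,country\n")))
```

An 'unknown' result usually means the file is plain text with an Excel extension, in which case the encoding question has to be answered for the text itself.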

Jüri Ruut answered Oct 04 '22 10:10