In ColdFusion 11, I'm using cfSpreadsheet to convert a .xls file into a query object. Here is a screenshot of my demo spreadsheet:
I use this code to see the query object as soon as it is created:
<cfspreadsheet action="read" src="demo_spreadsheet.xls"
excludeHeaderRow="true"
headerrow="1" query="demo_query"/>
<cfdump var="#demo_query#">
... and I get these results:
Notice that all of my 4-digit years from the spreadsheet are now 2-digit years? When I go to output the data in the query object using this code:
<ul>
<cfoutput query="demo_query">
<li>#name# - #dateformat(start_date, 'medium')#</li>
</cfoutput>
</ul>
... I get the following output (ok, I'm new here, so I can't post more than two screen shots so you'll have to trust me on this copy/paste):
The year 1907 is now 2007, 1917 is now 2017, 1929 is now 2029, and 2030 is now 1930. It appears that the year of any date before Jan 1, 1930 is read as 20xx, and after Dec 31, 2029 is read as 19xx.
Am I missing something? I thought we figured this kind of thing out with Y2K. Is there a simple setting somewhere that I have incorrect? I've Googled the heck out of this issue and I can't find anything about it.
Any advice would be most welcome.
If you enter a date with a two-digit year in a text formatted cell or as a text argument in a function, such as =YEAR("1/1/31"), Excel interprets the year as follows: 00 through 29 is interpreted as the years 2000 through 2029. For example, if you type the date 5/28/19, Excel assumes the date is May 28, 2019.
Click Start, point to Settings, and then click Control Panel. Double-click the Regional Settings icon. Click the Date tab. In the When a two digit year is entered, interpret a year between box, type the cutoff year that you want, and then click OK.
We can just open the Format Cells dialog by pressing Ctrl + 1. After that select the Custom category on the Number tab, and enter one of the codes below in the Type box: yy – to display 2-digit years, as 00-99. yyyy – to display 4-digit years, as 1900-9999.
Most likely your spreadsheet cells are using the built in regional format *m/d/yy
, meaning the displayed value (or in this case "read") may vary depending on the environment or client used.
Date and time formats that begin with an asterisk (*) respond to changes in regional date and time settings that are specified in Control Panel. Formats without an asterisk are not affected by Control Panel settings.
That seems to be what is happening with cfspreadsheet. Not sure why Excel displays a four digit year, rather than two digit, for format *m/d/yy
. However, CF/POI are returning the correct result according to the Excel specifications. Notice if you switch the cell format to the non-regional, four digit year ie m/d/yyyy
the output is what you expected:
Update: As to why your CF code displays different years than you expected, it is due to how ambiguous date strings are handled by CF. It is important to note, the query returned by CFSpreadsheet contains strings, not date objects. When you pass those strings into DateFormat
, CF must first interpret the strings and convert them into date objects before it can apply the date mask. According to CF's rules, two digit years are interpreted as follows:
A string containing a date/time value formatted according to U.S. locale conventions. Can represent a date/time in the range 100 AD–9999 AD. Years 0-29 are interpreted as 2000-2029; years 30-99 are interpreted as 1930-1999.
Honestly, CFSpreadsheet is designed to provide a simple way to read and write spreadsheets without a lot of bells and whistles. AFAIK, it does not support changing the way the cell values are interpreted. If you want to force a four digit year, you must change the spreadsheet to use a non-regional date format, either manually or programmatically (ie read the spreadsheet with CF, and apply a new cell format). That is probably the simplest option.
If you want more flexibility in terms of code, you can also use spreadsheet functions instead of cfspreadsheet. Though in this specific case, I think they too lack the necessary features. So you might look into using the underlying POI library and a bit of java code. This thread demonstrates how to obtain all kinds of details about the spreadsheet cells and values. It could easily be modified to build your own query or an array of structures containing the values, formats, etcetera:
Code:
<cfscript>
// get the sheet you want to read
cfSheet = SpreadSheetRead("c:/temp/demo_spreadsheet.xls");
workbook = cfSheet.getWorkBook();
sheetIndex = workbook.getActiveSheetIndex();
sheet = workbook.getSheetAt( sheetIndex );
// utility used to distinguish between dates and numbers
dateUtil = createObject("java", "org.apache.poi.ss.usermodel.DateUtil");
// process the rows and columns
rows = sheet.rowIterator();
while (rows.hasNext()) {
currentRow = rows.next();
data = {};
cells = currentRow.cellIterator();
while (cells.hasNext()) {
currentCell = cells.next();
col = {};
col.value = "";
col.type = "";
col.column = currentCell.getColumnIndex()+ 1;
col.row = currentCell.getRowIndex()+ 1;
col.format = currentCell.getCellStyle().getDataFormatString();
if (currentCell.getCellType() EQ currentCell.CELL_TYPE_STRING) {
col.value = currentCell.getRichStringCellValue().getString();
col.type = "string";
}
else if (currentCell.getCellType() EQ currentCell.CELL_TYPE_NUMERIC) {
if (DateUtil.isCellDateFormatted(currentCell)) {
col.value = currentCell.getDateCellValue();
col.type = "date";
}
else {
col.value = currentCell.getNumericCellValue();
col.type = "number";
}
}
else if (currentCell.getCellType() EQ currentCell.CELL_TYPE_BOOLEAN) {
col.value = currentCell.getBooleanCellValue();
col.type = "boolean";
}
// ... handle other types CELL_TYPE_BLANK, CELL_TYPE_ERROR, CELL_TYPE_FORMULA
data["COL"& col.column] = col;
}
// this row is finished. display all values
WriteDump(data);
}
</cfscript>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With