Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to get the content created time of an excel file in r?

Tags:

r

openxlsx

I have been trying to obtain the time at which the content of an .xlsx file was created without any success so far. I can track the much-desired information on Windows either through File Properties -> Details -> Origin -> Content created, or by opening the Excel file and navigating to File -> Info -> Related Dates -> Created.

I was hoping that I would be able to obtain this information through openxlsx but while I am able to track down the creators by using the getCreators() function there does not appear to exist a similar function for the time.

I have also tried the file.info() function but it won't cut it as mtime, ctime, and atime all point to the time of the download.

Any help would be much appreciated!

like image 354
manos-atx Avatar asked Oct 21 '25 16:10

manos-atx


2 Answers

I don't think openxlsx is going to do it for you, but you might want to submit a FR for them to add/extend file metadata availability. Here's something in a pinch, assuming that the XLSX file is in a newer zip-based format and not the previous binary format.

myfile <- "path/to/yourfile.xlsx"
docProps <- xml2::read_xml(unz(myfile, "docProps/core.xml"))
docProps
# {xml_document}
# <coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
# [1] <dc:creator>r2</dc:creator>
# [2] <cp:lastModifiedBy>r2</cp:lastModifiedBy>
# [3] <dcterms:created xsi:type="dcterms:W3CDTF">2021-05-09T20:01:41Z</dcterms:created>
# [4] <dcterms:modified xsi:type="dcterms:W3CDTF">2021-05-10T00:14:14Z</dcterms:modified>

xml2::xml_text(xml2::xml_find_all(docProps, "dcterms:created"))
# [1] "2021-05-09T20:01:41Z"

It's a text file, so in a pinch you can look at it manually, but I recommend not trying to do regex on XML in general. (You could get away with it here, but it's still fraught with peril.)

like image 123
r2evans Avatar answered Oct 23 '25 07:10

r2evans


I don't think openxlsx is going to do it for you, but you might want to submit a FR for them to add/extend file metadata availability. Here's something in a pinch, assuming that the XLSX file is in a newer zip-based format and not the previous binary format.

myfile <- "path/to/yourfile.xlsx"
docProps <- xml2::read_xml(unz(myfile, "docProps/core.xml"))
docProps
# {xml_document}
# <coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
# [1] <dc:creator>r2</dc:creator>
# [2] <cp:lastModifiedBy>r2</cp:lastModifiedBy>
# [3] <dcterms:created xsi:type="dcterms:W3CDTF">2021-05-09T20:01:41Z</dcterms:created>
# [4] <dcterms:modified xsi:type="dcterms:W3CDTF">2021-05-10T00:14:14Z</dcterms:modified>

xml2::xml_text(xml2::xml_find_all(docProps, "dcterms:created"))
# [1] "2021-05-09T20:01:41Z"

It's a text file, so in a pinch you can look at it manually, but I recommend not trying to do regex on XML in general. (You could get away with it here, but it's still fraught with peril.)

like image 34
r2evans Avatar answered Oct 23 '25 05:10

r2evans



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!