A passage in the file
documentation caught my eye:
## We can do the same thing with an anonymous file.
Tfile <- file()
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
close(Tfile)
What exactly is this anonymous file? Does it exist on disk, or only in memory? I'm interested in this as I'm contemplating a program that will potentially need to create/delete thousands of temporary files, and if this happens only in memory it seems like it would have a much lesser impact on system resources.
This linux SO Q appears to suggest this file could be a real disk file, but I'm not sure how relevant to this particular example that is. Additionally, this big memory doc seems to hint at a real disk based storage (though I'm assuming the file
based anonymous file is being used):
It should also be noted that a user can create an “anonymous” file-backed big.matrix by specifying "" as the filebacking argument. In this case, the backing resides in the temporary directory and a descriptor file is not created. These should be used with caution since even anonymous backings use disk space which could eventually fill the hard drive. Anonymous backings are removed either manually, by a user, or automatically, when the operating system deems it appropriate.
Alternatively, if textConnection
is appropriate for use for this type of application (opened/closed hundreds/thousands of times) and is memory only that would satisfy my needs. I was planning on doing this until I read the note in that function's documentation:
As output text connections keep the character vector up to date line-by-line, they are relatively expensive to use, and it is often better to use an anonymous file() connection to collect output.
An Anonymous file is just a temporary file without a descriptor( sort of header).
My C is very rusty, so hopefully more experienced people can correct me, but I think the answer to your question "What exactly is this anonymous file? Does it exist on disk, or only in memory?" is "It exists on disk".
Here is what happens at C level (I'm looking at the source code at http://cran.r-project.org/src/base/R-3/R-3.0.2.tar.gz):
A. Function file_open
, defined in src/main/connections.c:554
, has the following logic related to anonymous file (with an empty description), lines 565-568:
if(strlen(con->description) == 0) {
temp = TRUE;
name = R_tmpnam("Rf", R_TempDir);
} else name = R_ExpandFileName(con->description);
So a new temporary filename is generated if no file name was supplied to file
.
B. If the name of the file is not equal to stdin
, the call R_fopen(name, con->mode)
happens at line 585 (there some subtleties with Win32 and UTF8 names, but we can ignore them now).
C. Finally, the file name
is unlinked at line 607. The documentation for unlink
says:
The unlink() function removes the link named by path from its directory and decrements the link count of the file which was referenced by the link. If that decrement reduces the link count of the file to zero, and no process has the file open, then all resources associated with the file are reclaimed. If one or more process have the file open when the last link is removed, the link is removed, but the removal of the file is delayed until all references to it have been closed.
So in effect the directory entry is removed but file exists as long as it's being open by R process.
D. Finally, R_fopen
is defined in src/main/sysutils.c:135
and just calls fopen
internally.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With