
What Exactly are Anonymous Files

Tags: file, r

A passage in the file documentation caught my eye:

## We can do the same thing with an anonymous file.
Tfile <- file()
cat("abc\ndef\n", file = Tfile)
readLines(Tfile)
close(Tfile)

What exactly is this anonymous file? Does it exist on disk, or only in memory? I'm interested in this because I'm contemplating a program that may need to create and delete thousands of temporary files, and if this happens only in memory it seems like it would have a much smaller impact on system resources.

This Linux SO question appears to suggest that such a file could be a real disk file, but I'm not sure how relevant that is to this particular example. Additionally, the bigmemory documentation seems to hint at real disk-based storage (though I'm only assuming that it uses the same kind of anonymous file as file()):

It should also be noted that a user can create an “anonymous” file-backed big.matrix by specifying "" as the filebacking argument. In this case, the backing resides in the temporary directory and a descriptor file is not created. These should be used with caution since even anonymous backings use disk space which could eventually fill the hard drive. Anonymous backings are removed either manually, by a user, or automatically, when the operating system deems it appropriate.

Alternatively, if textConnection is appropriate for this type of application (opened and closed hundreds or thousands of times) and is memory-only, that would satisfy my needs. I was planning on using it until I read the note in that function's documentation:

As output text connections keep the character vector up to date line-by-line, they are relatively expensive to use, and it is often better to use an anonymous file() connection to collect output.

asked Feb 07 '14 01:02 by BrodieG



1 Answer

My C is very rusty, so hopefully more experienced people can correct me, but I think the answer to your question "What exactly is this anonymous file? Does it exist on disk, or only in memory?" is "It exists on disk".

Here is what happens at C level (I'm looking at the source code at http://cran.r-project.org/src/base/R-3/R-3.0.2.tar.gz):

A. Function file_open, defined in src/main/connections.c:554, contains the following logic for an anonymous file (one with an empty description) at lines 565-568:

if(strlen(con->description) == 0) {
    temp = TRUE;
    name = R_tmpnam("Rf", R_TempDir);
} else name = R_ExpandFileName(con->description);

So a new temporary filename is generated if no file name was supplied to file.

B. If the name of the file is not equal to stdin, the call R_fopen(name, con->mode) happens at line 585 (there are some subtleties with Win32 and UTF-8 names, but we can ignore them for now).

C. The file name is then unlinked at line 607. The documentation for unlink says:

The unlink() function removes the link named by path from its directory and decrements the link count of the file which was referenced by the link. If that decrement reduces the link count of the file to zero, and no process has the file open, then all resources associated with the file are reclaimed. If one or more process have the file open when the last link is removed, the link is removed, but the removal of the file is delayed until all references to it have been closed.

So in effect the directory entry is removed, but the file continues to exist on disk for as long as the R process has it open.

D. Finally, R_fopen is defined in src/main/sysutils.c:135 and just calls fopen internally.
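
Steps A through D can be approximated outside R with a small POSIX C sketch. This is not R's code: the function name anon_file_roundtrip is hypothetical, mkstemp plus fdopen stand in for R_tmpnam plus R_fopen, and the /tmp location is an assumption (R uses its own session temporary directory). It only illustrates that a file remains readable and writable after its name has been unlinked.

```c
/* POSIX-only sketch; it will not build on Windows as written. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of R): writes `text` to a temporary file
 * whose name has already been unlinked, then reads it back into `out`.
 * Returns the number of bytes read back, or -1 on error.
 * If `name_gone` is non-NULL it is set to 1 when the path no longer
 * resolves after unlink(), and to 0 otherwise. */
long anon_file_roundtrip(const char *text, char *out, size_t outsize,
                         int *name_gone)
{
    assert(text != NULL && out != NULL && outsize > 0);

    char path[] = "/tmp/RfXXXXXX";      /* assumed location; mkstemp fills in the Xs */
    int fd = mkstemp(path);             /* creates and opens a real file on disk */
    if (fd < 0) return -1;

    FILE *fp = fdopen(fd, "w+");
    if (fp == NULL) { close(fd); unlink(path); return -1; }

    /* Remove the directory entry while the stream is still open. */
    if (unlink(path) != 0) { fclose(fp); return -1; }

    struct stat st;
    if (name_gone != NULL)
        *name_gone = (stat(path, &st) != 0);

    /* The open stream still works: the data outlives the name. */
    if (fputs(text, fp) == EOF) { fclose(fp); return -1; }
    rewind(fp);                         /* flushes and seeks back to the start */
    size_t n = fread(out, 1, outsize - 1, fp);
    out[n] = '\0';

    fclose(fp);                         /* last reference closed: storage is reclaimed */
    return (long) n;
}
```

Calling anon_file_roundtrip("abc\ndef\n", buf, sizeof buf, &gone) from a main function mirrors the cat/readLines example in the question: the text comes back intact even though no file with that name can be found in /tmp while the stream is in use.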

answered Oct 15 '22 16:10 by Victor K.