While learning C, I made some mistakes and printed elements of a character array that were uninitialized.
If I make the array quite large, say 1 million elements, and then print its contents, what comes out is not always unreadable garbage; some of it appears to be runtime information.
Consider the following code:
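#include <stdio.h>

int main(void) {
    char s[1000000];   /* deliberately left uninitialized */
    int c, i;

    printf("Enter input string:\n");
    /* Stop at newline, at EOF, or when the buffer is full. */
    for (i = 0; i < 999999 && (c = getchar()) != '\n' && c != EOF; i++) {
        s[i] = c;
    }
    printf("Contents of input string:\n");
    /* Prints far past the input, including bytes that were never written. */
    for (i = 0; i < 999999; i++) {
        putchar(s[i]);
    }
    printf("\n");
    return 0;
}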
Just scrolling through the output, I find things such as:
???l????????_dyldVersionNumber_dyldVersionString_dyld_all_image_infos_dyld_fatal_error_dyld_shared_cache_ranges_error_string__mh_dylinker_header_stub_binding_helper_dyld_func_lookup_offset_to_dyld_all_image_infos__dyld_start__ZN13dyldbootstrapL30randomizeExecutableLoadAddressEPK12macho_headerPPKcPm__ZN13dyldbootstrap5startEPK12macho_headeriPPKcl__ZN4dyldL17setNewProgramVarsERK11ProgramVars__ZN4dyld17getExecutablePathEv__ZN4dyld22mainExecutablePreboundEv__ZN4dyld14mainExecutableEv__ZN4dyld21findImageByMachHeaderEPK11mach_header__ZN4dyld26findImageContainingAddressEPKv
and also,
Apple Inc.1&0$U ?0?*?H??ot CA0?"0ple Certification Authority10U ?䑩 ??GP??^y?-?6?WLU????Kl??"0?>?P ?A?????f?$kУ????z ?G?[?73??M?i??r?]?_???d5#KY?????P??XPg? ?ˬ, op??0??C??=?+I(??ε??^??=?:??? ?b??q?GSU?/A????p??LE~LkP?A??tb
?!.t?< ?A?3???0X?Z2?h???es?g^e?I?v?3e?w??-??z0?v0U?0U?0?0U+?iG?v ??k?.@??GM^0U#0?+?iG?v ??k?.@??GM^0?U 0?0? ?H??cd0??0+https://www.apple.com/appleca/0?+0????Reliance on this certificate by any party assumes acceptance of the then applicable standard terms and conditions of use, certificate poli?\6?L-x?팛??w??v?w0O????=G7?@?,Ա?ؾ?s???d?yO4آ>?x?k??}9??S ?8ı??O 01?H??[d?c3w?:,V??!ںsO??6?U٧??2B???q?~?R??B$*??M?^c?K?P????????7?uu!0?0??0
I believe one time my $PATH environment variable was even printed out.
Can the contents of an uninitialized variable ever pose a security risk?
Update 1
Update 2
So it seems clear from the answers that this is indeed a security risk. This surprises me.
Is there no way for a program to declare its memory content protected to allow the OS to restrict any access to it other than the program that initialized that memory?
Most C programs use malloc to allocate memory. A common misunderstanding is that malloc zeros out the memory it returns. It does not.
As a result, because memory chunks are "recycled", it is quite possible to get one that still holds information of "value".
An example of this vulnerability was the tar program on Solaris, which emitted contents of /etc/passwd. The root cause was that the memory tar allocated to read a block from disk was not initialized, and before obtaining this memory chunk the tar utility had made an OS system call to read /etc/passwd. Because of the memory recycling, and because tar did not initialize the chunk, fragments of /etc/passwd were printed to logs. This was solved by replacing malloc with calloc.
This is a real example of the security implications of not explicitly and properly initializing memory.
So yes, do initialize your memory properly.
Update:
Is there no way for a program to declare its memory content protected to allow the OS to restrict any access to it other than the program that initialized that memory?
The answer is yes (see the end) and no.
I think you are looking at it the wrong way. A more appropriate question would be, for example: why doesn't malloc initialize memory on request, or clear memory on release, instead of simply recycling it?
The answer is that the designers of the API explicitly decided not to initialize (or clear) memory, because doing this for large blocks of memory 1) would impact performance and 2) is not always necessary (for example, your application, or several parts of it, may not handle any data whose exposure you actually care about). So the designers decided not to do it, as it would needlessly impact performance, and to leave the decision to the programmer.
Carrying this over to the OS: why should it be the OS's responsibility to clear the pages? You expect your OS to hand you memory in a timely manner, but security is up to the programmer.
Having said that, there are some mechanisms you can use to make sure that sensitive data is not stored in swap, such as mlock on Linux.
mlock() and mlockall() respectively lock part or all of the calling process's virtual address space into RAM, preventing that memory from being paged to the swap area. munlock() and munlockall() perform the converse operation, respectively unlocking part or all of the calling process's virtual address space, so that pages in the specified virtual address range may once more to be swapped out if required by the kernel memory manager. Memory locking and unlocking are performed in units of whole pages.
Yes, at least on systems where the data may be transmitted to outside users.
There has been a whole series of attacks on web servers (and even iPods) in which the attacker gets the system to dump the contents of memory belonging to other processes, and so obtains details of the type and version of the OS, the data in other apps, and even things like password tables.
It's quite possible to perform some sensitive work in an area of memory and not clear that buffer afterwards.
Later code can then retrieve that uncleared work via a call to malloc() or by reading an uninitialised buffer/array declaration on the stack. It could inspect the data (maliciously) or inadvertently copy it. If you're doing anything sensitive, it thus makes sense to clear that memory before freeing it (memset() or similar), and perhaps before using/copying it.
From the C standard:
6.7.8 Initialization
"If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate."
Indeterminate value is defined as:
either an unspecified value or a trap representation.
Trap representation is defined as:
Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.
Accessing such a value can lead to undefined behaviour and can pose security threats.
The paper "Attacks on uninitialized variables" gives some insight into how they can be used to exploit a system.