
What happens if I use "&" with string in scanf function?

Tags: c, string, scanf

I just saw some code in a blog. It used

scanf("%s",&T);

But as we know, we shouldn't use an ampersand with a string, because the name of the array already gives the address of its first character. I ran that code and, surprisingly, it works, so I want to know: what happens when I use & with a string?

#include <stdio.h>
int main()
{
    char T[2];
    scanf("%s", &T);
    printf("You entered %s\n", T);
}
mahin hossen asked Jan 28 '23 03:01



2 Answers

The relevant part of the code snippet is:

char T[2];
scanf("%s", &T);

&T is a pointer to the array of two characters (char (*)[2]). This is not the type that scanf needs for a %s specifier: it needs a pointer to a character (char *). So the behavior of the program is undefined.

The correct way to write this program, as you know, is

char T[2];
scanf("%s", T);

Since T is an array, when it is used in most contexts, it “decays” to a pointer to the first character: T is equivalent to &(T[0]) which has the type char *. This decay does not happen when you take the address of the array (&T) or its size (sizeof(T)).

In practice, almost all platforms use the same representation for all pointers to the same address. So the compiler generates exactly the same code for T and &T. There are some rare platforms that may generate different code (I've heard of them but I couldn't name one).

Some platforms use different encodings for “byte pointers” and “word pointers”, because their processor natively addresses words, not bytes. On such platforms, an int * and a char * that point to the same address have different encodings. A cast between those types converts the value, but misuse in something like a variable argument list would result in the wrong address. I would expect such platforms to use byte addresses for a char array, however.

There are also rare platforms where a pointer encodes not only the address of the data, but also some type or size information. However, on such platforms, the type and size information would have to be equivalent: it's a block of 2 bytes, starting at the address of T, and addressable byte by byte. So this particular mistake is unlikely to have any practical impact.

Note that it would be completely different if you had a pointer instead of an array in the first place:

char *T; // known to point to an array of two characters
scanf("%s", &T); // bad

Here &T is a pointer to the location in memory that contains the address of the character array. So scanf would write the characters that it reads at the location where the pointer T is stored in memory, not at the location that T points to. Many compilers (for example GCC and Clang with -Wformat, which -Wall enables) check the format string of functions like printf and scanf against the argument types and would emit a warning here.

Note that char T[2] only has room for two characters, and this includes the null byte at the end of the string. So scanf("%s", T) only has room to read a single character. If the input contains more than one non-whitespace character at this point, the program will overflow the buffer. To read a single character and make it a one-character string, use

char T[2];
scanf("%c", T);
T[1] = 0;

Unlike scanf("%s", T), this reads any character, even whitespace. To read a string with a length limit, add a limit to the %s specification. You should never use an unlimited %s in scanf, since it keeps storing non-whitespace characters for as long as the input provides them, regardless of how much room there is to store this input in memory.

char T[2];
scanf("%1s", T); // one less than the array size
Gilles 'SO- stop being evil' answered Feb 07 '23 18:02



Technically speaking, this is a type mismatch, leading to undefined behavior. For scanning a string, the expected argument is a pointer to the initial element of a character array.

When you have an array t of type char[somevalue], when you say

scanf("%s",t);

t decays to a pointer to the first element, so that is OK.

On the other hand, &t has type char (*)[somevalue]: a pointer to the whole array, not a pointer to the initial element of the array.

Since the address of the array and the address of its first element refer to the same memory location, writing the scanned value to the supplied address will often cause no visible problem and appear to work as intended. That behavior is neither defined by the standard nor recommended.

Sourav Ghosh answered Feb 07 '23 19:02
