Can anyone explain this C program..

silverscorpion · 2009-06-20T02:31:58+00:00

Updated: Oct 26, 2024


Hi,

I was just trying something with C and I stumbled upon this little piece of code.

void main()
{
    union
    {
        int a;

        struct
        {
            char b;
            char c;
        } ch;
    } num;

    num.a = 0;
    num.ch.c++;

    printf("%d", num.a);
}
The answer came out to be 256.
Each time I increment num.ch.c, the value of num.a increases by 256.
i.e., if the value of num.ch.c is 3, the value of num.a would be 768.
Why is this so?

Note that, if I change the value of num.a, the value of num.ch.c does not change. It remains the same as num.a, which should be expected.

Please explain..

Replies

skipper

Member • Jun 20, 2009

The value sequence comes from the way the compiler lays out storage: the chars are stored in the reverse order to how they get loaded into registers.
The printf statement interprets the 16-bit value.

ms_cs

Member • Jun 20, 2009

One question, SS: how did you tell that the value is incremented by 256?
Actually, if you use a union data type, memory is allocated based on the members' data types, and here the memory allocated is 2 bytes. First you assigned 0 to a, so this integer takes those 2 bytes of allocated memory. When you increment the member of the struct, it occupies those same 2 bytes. So when you access the integer again, the same 2 bytes are used, which do not hold any value or might hold some garbage value. That's why the result is 256.
Correct me if I am wrong.

ali_shakiba

Member • Jun 21, 2009

Salam = Hi,
I've also run the code, with some corrections to make it buildable by gcc.
The result was 0 when I commented out the num.ch.c++ line, but after I uncommented it the result was the same.
Dear skipper, I cannot understand what you said:

The value sequence comes from the way the compiler lays out storage: the chars are stored in the reverse order to how they get loaded into registers.
The printf statement interprets the 16-bit value.
Is it a standard? If yes, would you mind pointing to a reference for it?

Thanks in advance;
& With the hope of rising of Mahdi;
Ali

silverscorpion

Member • Jun 21, 2009

ms_cs
One question, SS: how did you tell that the value is incremented by 256?
Actually, if you use a union data type, memory is allocated based on the members' data types, and here the memory allocated is 2 bytes. First you assigned 0 to a, so this integer takes those 2 bytes of allocated memory. When you increment the member of the struct, it occupies those same 2 bytes. So when you access the integer again, the same 2 bytes are used, which do not hold any value or might hold some garbage value. That's why the result is 256.
Correct me if I am wrong.
I didn't follow you.
How is the 'garbage value' increasing in steps of 256?

silverscorpion

Member • Jun 21, 2009

Or, for your better understanding, I'll reframe the question. Just insert a for loop before the print statement, and you'll see.

void main()
{
    int i;
    union
    {
        int a;

        struct
        {
            char b;
            char c;
        } ch;
    } num;

    num.a = 0;

    for (i = 0; i < 5; i++)
    {
        num.ch.c++;
        printf("%d\n", num.a);
    }
}
The answer came out to be,

256
512
768
1024
1280
Can you understand my question now?

ms_cs

Member • Jun 21, 2009

Your question is clear now. I will check and tell...

ali_shakiba

Member • Jun 21, 2009

Salam = Hi,
Dear silverscorpion, can you explain the context in which you came across this piece of code? Or did you come across it while you were playing with the C language?
If I know the context in which you came across it, I might be able to give you a solution and leave this problem to be studied later on.

Thanks
&
with the hope of rising of Mahdi;
Ali

skipper

Member • Jun 21, 2009

It's because the byte ORDER is different for character strings than for ints on the machine the compiler targets.
So if you store the characters "XY" in b and c, they are read back as an int like "YX", except that printf interprets the result as a signed decimal integer. Try a second printf that prints the two characters in octal, say, or hex.

ms_cs

Member • Jun 21, 2009

void main()
{
    int i;
    union
    {
        int a;

        struct
        {
            char b;
            char c;
        } ch;
    } num;

    num.a = 0;
    num.ch.c++;

    for (i = 0; i < 5; i++)
        printf("%d\n", num.a);
}
The result for this is,

256
256
256
256
256
I didn't get the following

256
512
768
1024
1280

silverscorpion

Member • Jun 21, 2009

My mistake.. The code is this:

void main()
{
    int i;
    union
    {
        int a;

        struct
        {
            char b;
            char c;
        } ch;
    } num;

    num.a = 0;

    for (i = 0; i < 5; i++)
    {
        num.ch.c++;
        printf("%d\n", num.a);
    }
}
i.e., you should put the increment statement inside the for loop. Sorry.

silverscorpion

Member • Jun 21, 2009

ali_shakiba
Salam = Hi,
Dear silverscorpion, can you explain the context in which you came across this piece of code? Or did you come across it while you were playing with the C language?
If I know the context in which you came across it, I might be able to give you a solution and leave this problem to be studied later on.

Thanks
&
with the hope of rising of Mahdi;
Ali
Hi,
I faced this problem while I was just playing with C. There's no particular situation, so there's no need for a solution, per se.
I just wanted to know why this happens.

And thank you!! 😀😀

Munguti

Member • Jun 21, 2009

My C programming skills are hazy, but I think a union cannot have a structure like the one you are using; it has something to do with a union holding only one value at a time because of its memory layout.
I will confirm this later when I get time to go through my C programming books. For the time being, do some digging on the memory layout of unions.

skipper

Member • Jun 24, 2009

I propose that the reason is differences between compiler implementations. Try printing characters as well as signed integer values, and address the values differently, say as boolean strings, etc.
Also, it isn't something you would do with a C compiler unless you were sure how it worked (my guess). C is intentionally not a safe language; it has been labeled the closest you get to the underlying machine with compilation, and you can write nice compilers (including C compilers) in cc and lex.

pradeep_agrawal

Member • Jun 28, 2009
To understand the logic behind the behavior, it is good to have an understanding of unions. This is described below under "Understanding Union". Readers who already know this, or who want to go straight to the rationale, can skip to the second section.

Understanding Union

Like a structure, a union is a way to create a user-defined data type that can hold variables of different types and sizes. The difference is that the size of a structure is the sum of the sizes of all its members (plus any padding for memory alignment) and every member has its own memory location, whereas the size of a union is that of its widest member and the same memory location is shared by all members.

The members of a union are accessed the same way as the members of a structure, i.e.,
<union name>.member
or
<union pointer name>->member

Consider below code:
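```
#include <stdio.h>
#include <stdint.h>

struct s {
  int i;
  char *p;
};

union u {
  int i;
  char *p;
};

int main(void) {
  struct s svar = { 0 };
  union u uvar = { 0 };
  char c = 0;

  svar.p = &c;
  uvar.p = &c;

  /* sizeof yields a size_t, which is printed with %zu */
  printf("Structure size: %zu\n", sizeof(svar));
  printf("Union size: %zu\n", sizeof(uvar));

  printf("i of structure: %d\n", svar.i);
  /* reading i after writing p reinterprets the pointer's bytes as an int */
  printf("i of Union: %d\n", uvar.i);
  /* cast so the address prints as a decimal int; this assumes a 32-bit
     machine where a pointer fits in an int */
  printf("Address of c: %d\n", (int)(intptr_t)&c);

  return 0;
}
```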
The output of code is similar to:
Structure size: 8
Union size: 4
i of structure: 0
i of Union: 2293459
Address of c: 2293459

From the code and its output (taken on a 32-bit machine) it can be noticed that:
1. The structure and the union contain the same members, but their sizes differ. The structure occupies 8 bytes (the sum of the sizes of the int and the pointer to char), whereas the union occupies only 4 bytes (both the int and the pointer to char need 4 bytes, and the memory is shared between them).

2. We assigned the address of variable 'c' to the char pointers of both the structure and the union. When we print member 'i' of the structure, it is '0', i.e., the value assigned during initialization has not been modified, because every member of a structure has its own memory location. When we print member 'i' of the union, it is the same as the address of 'c' that was assigned to member 'p' of the union. This is because the memory is shared between the members of the union.

Rationale Behind Behavior

Let's discuss the actual problem statement now.

The code in the given problem statement is:
```
void main()
{
union
{
int a;

struct
{
char b;
char c;
}ch;
}num;

num.a=0;
num.ch.c++;

printf("%d",num.a);
}
```
Here the union defined is:
```
union {
  int a;
  struct {
    char b;
    char c;
  } ch;
};
```
For the above union, the memory is shared between the integer variable 'a' and the structure variable 'ch'.

On a 32-bit machine, the integer variable 'a' takes 4 bytes and the structure 'ch' takes 2 bytes. The diagrams below assume a little-endian machine (such as x86), where the byte at the lowest address is the least significant byte of 'a'. The memory is then shared as:
```
Variable a :  MSB-> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX  <-LSB
Variable ch:  MSB->                   XXXXXXXX XXXXXXXX  <-LSB
                                       char c   char b
```
Consider the given case. When we assign '0' to variable 'a' the above shared memory will look like:
```
Variable a :  MSB-> 00000000 00000000 00000000 00000000  <-LSB
Variable ch:  MSB->                   00000000 00000000  <-LSB
                                       char c   char b
```
And when we increment variable 'ch.c' by '1', the changes in shared memory will look like:
```
Variable a :  MSB-> 00000000 00000000 00000001 00000000  <-LSB
Variable ch:  MSB->                   00000001 00000000  <-LSB
                                       char c   char b
```
The decimal equivalent of the binary value "00000000 00000000 00000001 00000000" is 256. Hence incrementing 'ch.c' by one increments the value of 'a' by 256.

Consider one more increment in 'ch.c', the changes in shared memory will look like:
```
Variable a :  MSB-> 00000000 00000000 00000010 00000000  <-LSB
Variable ch:  MSB->                   00000010 00000000  <-LSB
                                       char c   char b
```
The decimal equivalent of the binary value "00000000 00000000 00000010 00000000" is 512. Hence each increment of 'ch.c' increments the value of 'a' by 256.

Consider a case when we increment 'ch.b' instead of 'ch.c', the changes in shared memory will look like:
```
Variable a :  MSB-> 00000000 00000000 00000000 00000001  <-LSB
Variable ch:  MSB->                   00000000 00000001  <-LSB
                                       char c   char b
```
The decimal equivalent of the binary value "00000000 00000000 00000000 00000001" is 1. Hence each increment of 'ch.b' increments the value of 'a' by 1.

The case can be extended to a union where the structure ch is having three character variables as:
```
union {
  int a;
  struct {
    char b;
    char c;
    char d;
  } ch;
};
```
For the above union the sharing of the memory will be as:
```
Variable a :  MSB-> XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX  <-LSB
Variable ch:  MSB->          XXXXXXXX XXXXXXXX XXXXXXXX  <-LSB
                              char d   char c   char b
```
Here if we initially assign '0' to variable 'a' and then increment variable 'ch.d' by 1, the memory will look like:
```
Variable a :  MSB-> 00000000 00000001 00000000 00000000  <-LSB
Variable ch:  MSB->          00000001 00000000 00000000  <-LSB
                              char d   char c   char b
```
The decimal equivalent of binary "00000000 00000001 00000000 00000000" is 65536, hence incrementing 'ch.d' by 1 will increment the value of 'a' by 65536.

Let me know if any item needs more clarification.

-Pradeep
Saandeep Sreerambatla

Member • Jun 28, 2009

Very good Explanation Pradeep.

Thanks😀

On a 32-bit machine an integer takes 4 bytes; why does it take 4?
In the above code, if we increment ch.b instead of ch.c, are the values printed 1, 2, 3, etc.?

pradeep_agrawal

Member • Jun 28, 2009

English-Scared
on 32-bit machine Integer takes 4 bytes , why does it take 4 ?
The data type 'int' was intended to have the natural word size of the processor, although the C standard itself only guarantees that it is at least 16 bits wide.

On a 16-bit machine the word size is 2 bytes, and hence 'int' is typically 2 bytes. On a 32-bit machine the word size is 4 bytes, and hence 'int' is 4 bytes. There are machines (e.g., 24-bit DSPs) where the word size is 24 bits and the size of 'int' is also 24 bits.

English-Scared
In the above code if we increment ch.b instead of ch.c are the values printed are 1 ,2 ,3 etc .
Yes, as I stated earlier in my explanation, when we increment ch.b instead of ch.c by 1, the value of 'a' also gets incremented by 1 instead of 256.

-Pradeep

silverscorpion

Member • Jun 28, 2009

Nice explanation Pradeep. Thanks very much..

However, I still have a doubt. Why does ch.b occupy the first position and ch.c the second? Is it because of the way we have defined them??

If the code had been,

struct
{
    char c;
    char b;
} ch;
then will the increments be 1 by 1 and not by 256??

Saandeep Sreerambatla

Member • Jun 28, 2009

Yes SS, I guess it depends upon your declaration.

pradeep_agrawal

Member • Jun 28, 2009

silverscorpion
Why does ch.b occupy the first position and ch.c the second? Is it because of the way we have defined them??
Yes it is because of the way you have defined the structure.

If you declare ch.c first and ch.b later, then (on the same little-endian layout as in the diagrams earlier) incrementing ch.c by 1 will increment 'a' by 1 and incrementing ch.b by 1 will increment 'a' by 256.

-Pradeep
