The C standard does not define what a pointer is internally or how it works internally. This is intentional, so as not to limit the number of platforms on which C can be implemented, whether as a compiled or an interpreted language.
A pointer value can be some kind of ID or handle or a combination of several IDs (say hello to x86 segments and offsets) and not necessarily a real memory address. This ID could be anything, even a fixed-size text string. Non-address representations may be especially useful for a C interpreter.
Answer from Alexey Frunze on Stack Overflow
What exactly is a C pointer if not a memory address? - Stack Overflow
Why Use Pointers in C? - Stack Overflow
programming languages - What is a Pointer? - Stack Overflow
Should the pointer datatype be always same as the data type of variable it points in embedded C? - Electrical Engineering Stack Exchange
I'm not sure about your source, but the type of language you're describing comes from the C standard:
6.5.3.2 Address and indirection operators
[...]
3. The unary & operator yields the address of its operand. [...]
So... yeah, pointers point to memory addresses. At least that's what the wording of the C standard suggests.
To say it a bit more clearly, a pointer is a variable holding the value of some address. The address of an object (which may be stored in a pointer) is returned with the unary & operator.
I can store the address "42 Wallaby Way, Sydney" in a variable (and that variable would be a "pointer" of sorts, but since that's not a memory address it's not something we'd properly call a "pointer"). Your computer has addresses for its buckets of memory. Pointers store the value of an address (i.e. a pointer stores the value "42 Wallaby Way, Sydney", which is an address).
Edit: I want to expand on Alexey Frunze's comment.
What exactly is a pointer? Let's look at the C standard:
6.2.5 Types
[...]
20. [...]
A pointer type may be derived from a function type or an object type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called ‘‘pointer to T’’. The construction of a pointer type from a referenced type is called ‘‘pointer type derivation’’. A pointer type is a complete object type.
Essentially, pointers store a value that provides a reference to some object or function. Kind of. Pointers are intended to store a value that provides a reference to some object or function, but that's not always the case:
6.3.2.3 Pointers
[...]
5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
The above quote says that we can turn an integer into a pointer. If we do that (that is, if we stuff an integer value into a pointer instead of a specific reference to an object or function), then the pointer "might not point to an entity of the referenced type" (i.e. it may not provide a reference to an object or function). It might provide us with something else. And this is one place where you might stick some kind of handle or ID in a pointer (i.e. the pointer isn't pointing to an object; it's storing a value that represents something, but that value may not be an address).
So yes, as Alexey Frunze says, it's possible a pointer isn't storing an address to an object or function. It's possible a pointer is instead storing some kind of "handle" or ID, and you can do this by assigning some arbitrary integer value to a pointer. What this handle or ID represents depends on the system/environment/context. So long as your system/implementation can make sense of the value, you're in good shape (but that depends on the specific value and the specific system/implementation).
Normally, a pointer stores an address to an object or function. If it isn't storing an actual address (to an object or function), the result is implementation defined (meaning that exactly what happens and what the pointer now represents depends on your system and implementation, so it might be a handle or ID on a particular system, but using the same code/value on another system might crash your program).
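As a rough illustration of the integer/pointer conversions discussed above, here is a minimal sketch that round-trips an object pointer through an integer. It assumes the implementation provides the optional uintptr_t type from <stdint.h>; the function name roundtrip is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: converts a pointer to an integer and back.
 * Assumes the optional type uintptr_t exists on this implementation. */
int *roundtrip(int *p)
{
    /* The integer value is implementation-defined; it may or may not
     * look like a flat memory address. */
    uintptr_t bits = (uintptr_t)(void *)p;

    /* Converting that same value back gives a pointer that compares
     * equal to the original one. */
    return (int *)(void *)bits;
}
```

Only integer values obtained this way are guaranteed to convert back to a usable pointer. What an arbitrary integer means once it is cast to a pointer is up to the implementation.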
That ended up being longer than I thought it would be...
One common place where pointers are helpful is when you are writing functions. Functions take their arguments 'by value', which means that they get a copy of what is passed in and if a function assigns a new value to one of its arguments that will not affect the caller. This means that you couldn't write a "doubling" function like this:
void doubling(int x)
{
    x = x * 2;
}
This makes sense because otherwise what would the program do if you called doubling like this:
doubling(5);
Pointers provide a tool for solving this problem because they let you write functions that take the address of a variable, for example:
void doubling2(int *x)
{
    (*x) = (*x) * 2;
}
The function above takes the address of an integer as its argument. The one line in the function body dereferences that address twice: on the left-hand side of the equal sign we are storing into that address, and on the right-hand side we are getting the integer value from that address and then multiplying it by 2. The end result is that the value found at that address is now doubled.
As an aside, when we want to call this new function we can't pass in a literal value (e.g. doubling2(5)) as it won't compile because we are not properly giving the function an address. One way to give it an address would look like this:
int a = 5;
doubling2(&a);
The end result of this would be that our variable a would contain 10.
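To see the two versions side by side, here is a small sketch that puts both functions from this answer in one file. Nothing is added beyond the comments.

```c
/* Receives a copy of the argument, so the caller's variable is untouched. */
void doubling(int x)
{
    x = x * 2;
}

/* Receives the address of the caller's variable and modifies it in place. */
void doubling2(int *x)
{
    *x = *x * 2;
}
```

With int a = 5, calling doubling(a) leaves a at 5, while doubling2(&a) changes it to 10.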
A variable itself is a pointer to data
No, it is not. A variable represents an object, an lvalue. The concept of lvalue is fundamentally different from the concept of a pointer. You seem to be mixing the two.
In C it is not possible to "rebind" an lvalue to make it "point" to a different location in memory. The binding between lvalues and their memory locations is determined and fixed at compile time. It is not always 100% specific (e.g. absolute location of a local variable is not known at compile time), but it is sufficiently specific to make it non-user-adjustable at run time.
The whole idea of a pointer is that its value is generally determined at run time and can be made to point to different memory locations at run time.
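A minimal sketch of that difference, using a hypothetical helper named larger: the variables a and b stay bound to their own storage, while the pointer returned by the function is chosen at run time and can be re-pointed later.

```c
/* Hypothetical helper: returns a pointer to whichever argument holds the
 * larger value. Which object the result points at is decided at run time. */
int *larger(int *a, int *b)
{
    return (*a >= *b) ? a : b;
}
```

For example, with int a = 1, b = 2; the call int *p = larger(&a, &b); makes p point at b, so *p = 10; modifies b. A later p = &a; makes the same pointer refer to a instead. Nothing comparable can be done to the variables a and b themselves.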
This Wikipedia article will give you detailed information on what a pointer is:
In computer science, a pointer is a programming language data type whose value refers directly to (or "points to") another value stored elsewhere in the computer memory using its address. Obtaining or requesting the value to which a pointer refers is called dereferencing the pointer. A pointer is a simple implementation of the general reference data type (although it is quite different from the facility referred to as a reference in C++). Pointers to data improve performance for repetitive operations such as traversing string and tree structures, and pointers to functions are used for binding methods in Object-oriented programming and run-time linking to dynamic link libraries (DLLs).
A pointer is a variable that contains the address of another variable. This allows you to reference another variable indirectly. For example, in C:
// x is an integer variable
int x = 5;
// xpointer is a variable that references (points to) integer variables
int *xpointer;
// We store the address (& operator) of x into xpointer.
xpointer = &x;
// We use the dereferencing operator (*) to say that we want to work with
// the variable that xpointer references
*xpointer = 7;
if (5 == x) {
    // Not true
} else if (7 == x) {
    // True since we used xpointer to modify x
}
This is a somewhat complex topic. Generally, unless you are a C veteran, my advice is to never convert a pointer to a different type. Even conversions to/from void pointers are very often questionable.
If we are to restrict the topic to object pointers (and ignore function pointers), then first there are the mentioned void pointers: every object pointer type in C can be implicitly converted to a void pointer and vice versa. That is, the conversion itself is safe; what happens when you de-reference the data is another story.
Other than void pointers, you generally get a compiler error when trying to assign between pointers to different types. C has a much stronger type system for pointers than for, say, integers.
C also allows all manner of wild pointer conversions by the means of a cast. The conversion itself is almost always fine - what might not be fine is what happens when you de-reference the pointed-at data. The C standard says this (C17 6.3.2.3/7):
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
So if you have, for example, a uint8_t* pointer pointing at an aligned address, increase it by 1 byte, then convert it to uint16_t*, you may get a misaligned access. Depending on the MCU core used, this may or may not be a problem. Generally, 8 bitter MCUs don't care about alignment. Some 16 bitters do, some don't. Pretty much all 32 bitters do. There are also CPUs which can give instruction traps for misalignment at the point of conversion, even before de-referencing.
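One common way to sidestep the alignment problem is to copy the bytes with memcpy instead of converting the pointer. This technique is not part of the answer above and the function name read_u16_unaligned is hypothetical, so treat it as a sketch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 16-bit value starting at any byte address.
 * memcpy has no alignment requirement, so no misaligned access occurs.
 * The byte order of the result is whatever the CPU uses natively. */
uint16_t read_u16_unaligned(const uint8_t *bytes)
{
    uint16_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Calling read_u16_unaligned(buffer + 1) is then fine even when buffer + 1 is an odd address.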
And then if we continue to read the same text quoted above, it continues:
When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
This means that we can use a character type pointer, such as unsigned char*, to inspect individual bytes of a larger object. (uint8_t* almost certainly counts as a character type on any mainstream system.) This is useful for serialization of data, if you for example want to send a 32 bit integer over some serial bus, one byte at a time.
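A minimal sketch of that byte-wise inspection, assuming a system with 8-bit bytes so that uint32_t occupies exactly 4 bytes. The function name u32_to_bytes is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies the bytes of `value` into `out` in the
 * order they sit in memory (so the order depends on endianness).
 * Assumes sizeof(uint32_t) == 4, i.e. 8-bit bytes. */
void u32_to_bytes(uint32_t value, unsigned char out[4])
{
    /* Accessing any object through a character-type pointer is allowed. */
    const unsigned char *p = (const unsigned char *)&value;

    for (size_t i = 0; i < sizeof value; i++) {
        out[i] = p[i];
    }
}
```

Each element of out could then be handed to a serial driver one byte at a time. The receiver has to agree on the byte order.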
However, we cannot grab a chunk of raw bytes and access that through a pointer to a larger type. There is no special rule like the one above for such scenarios, rather it is something called a "strict aliasing violation" (What is the strict aliasing rule?):
uint8_t array [n] = { ... };
uint16_t* ptr = (uint16_t*)array; // C allows this conversion in itself, but...
*ptr = something; // this is BAD, undefined behavior - a strict aliasing violation
In order to dodge this dangerous part of C, we would rather invent custom union types for "type punning" purposes like the one above:
typedef union
{
    uint8_t array8 [n];
    uint16_t array16 [n/2];
} array_t;
This allows us to access the data as different types without using dangerous pointer conversions.
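Here is a small self-contained sketch of the union approach. The type word_view_t and the function bytes_to_word are hypothetical names, and the example relies on C's rule that reading a union member other than the one last written reinterprets the stored bytes (C++ does not allow this).

```c
#include <stdint.h>

/* Hypothetical union: the same two bytes viewed as bytes or as one word. */
typedef union
{
    uint8_t  bytes[2];
    uint16_t word;
} word_view_t;

/* Writes two bytes, then reads them back as a 16-bit value.
 * The result depends on the endianness of the platform. */
uint16_t bytes_to_word(uint8_t first, uint8_t second)
{
    word_view_t view;
    view.bytes[0] = first;
    view.bytes[1] = second;
    return view.word;
}
```

No pointer conversion takes place here, so neither the alignment rule nor the strict aliasing rule comes into play.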
Other special exception scenarios do exist, for example we are allowed to convert between a struct pointer and a pointer to that struct's first member. The special rule for this is (C17 6.7.2.1/15):
pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
This is only safe for the first member of the struct! Unless that member is an array (or a union between an array and something else).
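The first-member rule can be sketched like this. The types header_t and message_t and the function as_header are invented for the illustration.

```c
/* Hypothetical types: message_t begins with a header_t member. */
typedef struct
{
    int id;
} header_t;

typedef struct
{
    header_t header;   /* initial member: shares the struct's address */
    double   payload;
} message_t;

/* A pointer to the struct, suitably converted, points to its initial member. */
header_t *as_header(message_t *msg)
{
    return (header_t *)msg;
}
```

The same conversion applied to reach payload would not be covered by this rule, because there may be padding between the members.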
And finally, there's the matter of qualified pointers. If you have data that is qualified with const or volatile, then the pointer to that data must use the same qualifier(s). We should never "cast away" qualifiers; using the resulting pointer can lead to undefined behavior and may result in strange program behavior. It is however always fine to go from a non-qualified pointer to a qualified one.
int* i_ptr;
const int* ci_ptr = i_ptr; // fine, and no need to cast either
int* another_ptr = ci_ptr; // BAD, undefined behavior
volatile uint8_t some_register;
volatile uint8_t* reg = &some_register; // fine
uint8_t* plain_reg = reg; // BAD, undefined behavior
void my_func (const uint8_t* data)
{
    uint8_t* ptr = (uint8_t*)data; // BAD, undefined behavior
}
But here as well, the undefined behavior doesn't occur until you try to de-reference the pointer. The specific rule (C17 6.7.3/6):
If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.
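The safe direction mentioned earlier, going from a non-qualified pointer to a qualified one, is what happens whenever a function takes a pointer to const. A minimal sketch, with sum_ints as a hypothetical name:

```c
#include <stddef.h>

/* Hypothetical helper: the const in the parameter type promises that the
 * function only reads the data, so callers can pass either const or
 * non-const arrays without a cast. */
long sum_ints(const int *data, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i];
    }
    return total;
}
```

Passing a plain int array works because int* converts implicitly to const int*. The function body never needs to cast the qualifier away.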
In general, yes, the pointer to a variable should have the same type as the variable it points to. Basically, respect the principle of least surprise: don't use a void pointer where an int pointer will do.
int x = 100;
int * xPtr = &x;
However, sometimes when you have data structures, you can use a pointer to "interpret" the structure as an array. This can be kind of dangerous, as the memory alignment of variables is often platform-dependent and compiler-dependent. Edit: it's actually undefined behavior; however, you still encounter this kind of code.
typedef struct
{
    int x;
    int y;
    int z;
} coordinates_t;
coordinates_t coordinates = {-1, 0, 1};
int * coordinatesPtr = (int *) &coordinates;
coordinatesPtr[2] = 3; // coordinates.z = 3 now
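If indexed access to the members is what you are after, one way to get it without relying on the struct layout is a small lookup function. This is only a sketch of an alternative; coordinate_at is a hypothetical name.

```c
#include <stddef.h>

typedef struct
{
    int x;
    int y;
    int z;
} coordinates_t;

/* Hypothetical helper: maps index 0, 1, 2 to x, y, z.
 * Returns NULL for any other index instead of reading out of bounds. */
int *coordinate_at(coordinates_t *c, int index)
{
    switch (index) {
    case 0:  return &c->x;
    case 1:  return &c->y;
    case 2:  return &c->z;
    default: return NULL;
    }
}
```

With this, *coordinate_at(&coordinates, 2) = 3; sets coordinates.z without treating the struct as an array.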
It is also useful if you want to transfer floating-point data on a serial port in a binary format for example. This example will not work on all platforms, so be careful. The format of the bytes will change whether your platform is little-endian or big-endian. Plus, I've seen some weird 24-bit floating-point numbers on PIC18 microcontrollers in the past.
float dummy = 32.3456f;
uint8_t * bytePtr = (uint8_t *) &dummy;
rs232Tx(&myUart, bytePtr, sizeof dummy); // send floating-point number byte-by-byte
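A sketch of the same idea with the transmit call replaced by a plain byte buffer, so that it can be tried without UART hardware. The names float_to_bytes and bytes_to_float are hypothetical, and memcpy is used here instead of a pointer cast.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the raw bytes of a float into a buffer.
 * Byte order and float format are whatever the platform uses. */
void float_to_bytes(float value, uint8_t out[sizeof(float)])
{
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: rebuilds a float from bytes produced by
 * float_to_bytes on a platform with the same format and byte order. */
float bytes_to_float(const uint8_t in[sizeof(float)])
{
    float value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

Sender and receiver must agree on endianness and on the floating-point format for the round trip to work across different machines.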
Finally, when you want to use some kind of DMA (or memcpy), you usually use void pointers, even if it's not always explicit:
void * memcpy ( void * destination, const void * source, size_t num );
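To show why void pointers are convenient in such interfaces, here is a minimal sketch of a type-agnostic routine in the same style as memcpy. The function swap_objects is a hypothetical example, not a standard library function.

```c
#include <stddef.h>

/* Hypothetical helper: swaps two non-overlapping objects of `size` bytes.
 * The void* parameters accept any object pointer without a cast; inside,
 * the bytes are accessed through unsigned char*, which is always allowed. */
void swap_objects(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call such as swap_objects(&x, &y, sizeof x); works for ints, doubles or structs alike, as long as both arguments really have the same type and size.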