Pointers are not arrays. The basic reasons for choosing which to use are the same as they always are with arrays versus pointers. In the special case of flexible array members, here are some reasons you may prefer them over a pointer:
- Reducing storage requirements. A pointer will enlarge your structure by (typically) 4 or 8 bytes, and you'll spend much more in overhead if you allocate the pointed-to storage separately rather than with a single call to malloc.
- Improving access efficiency. A flexible array member is located at a constant offset from the structure base. A pointer requires a separate dereference. This affects both the number of instructions required to access it and register pressure.
- Atomicity of allocation success/failure. If you allocate the structure and allocate storage for it to point to as two separate steps, your code for cleaning up in the failure cases will be much uglier, since you have the case where one succeeded and the other failed. This can be avoided with some pointer arithmetic to carve both out of the same malloc request, but it's easy to get the logic wrong and invoke UB due to alignment issues.
- Avoiding need for deep-copy. If you use a flexible array instead of a pointer, you can simply memcpy (not assign, since assignment can't know the flexible array length) to copy the structure rather than having to copy the pointed-to data too and fix up the pointer in the new copy.
- Avoiding need for deep-free. It's very convenient and clean to be able to just free a single object rather than having to free pointed-to data too. This can also be achieved with the "carving up a single malloc" approach mentioned above, of course, but flexible arrays make it easier and less error-prone.
- Surely many more reasons...
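To make the single-allocation, memcpy-copy and single-free points concrete, here is a minimal sketch. The struct blob type and the blob_new / blob_clone helpers are hypothetical names made up for illustration; they are not from any library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type: a length followed by a flexible array member. */
struct blob {
    size_t len;
    unsigned char data[];   /* storage lives in the same allocation */
};

/* Total bytes occupied by a blob holding len payload bytes. */
static size_t blob_bytes(size_t len) {
    return sizeof(struct blob) + len;
}

/* One malloc: either the whole object exists or nothing does. */
struct blob *blob_new(const void *src, size_t len) {
    if (len > SIZE_MAX - sizeof(struct blob)) return NULL;  /* overflow guard */
    struct blob *b = malloc(blob_bytes(len));
    if (b == NULL) return NULL;
    b->len = len;
    if (len > 0) memcpy(b->data, src, len);
    return b;
}

/* Copy with a single memcpy; there is no pointer to fix up afterwards. */
struct blob *blob_clone(const struct blob *b) {
    assert(b != NULL);
    struct blob *c = malloc(blob_bytes(b->len));
    if (c != NULL) memcpy(c, b, blob_bytes(b->len));
    return c;
}
```

Cleanup is a single free(b) for either object, with no inner pointer to release first.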
Those concepts are definitely not necessary as you have pointed out yourself.
The differences between the two that you have demonstrated are where your data is located in memory.
In the first example with flexible array your metadata and the array itself are in the same block of memory and can be moved as one block (pointer) if you have to.
In the second example your metadata is on the stack and your array is elsewhere on the heap. In order to move/copy it you will now need to move two blocks of memory and update the pointer in your metadata structure.
Generally, flexible array members are used when you need to place an array and its metadata together in memory.
One case where this is definitely useful is storing an array with its metadata in a file: you have only one contiguous block of memory, and each time you load it, it will (most likely) be placed at a different address in your virtual memory.
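As a rough sketch of the file case, assuming a hypothetical struct samples record and made-up helper names: because header and array are contiguous and contain no pointers, they can be written with one fwrite and read back into a block at any address. The resulting format is only meaningful on the same platform (same size_t width, byte order and padding).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical on-disk record: a count followed by that many values. */
struct samples {
    size_t count;
    double values[];
};

/* Header and array are contiguous, so one fwrite covers both. */
int samples_write(FILE *f, const struct samples *s) {
    assert(f != NULL && s != NULL);
    size_t bytes = sizeof *s + s->count * sizeof s->values[0];
    return fwrite(s, 1, bytes, f) == bytes ? 0 : -1;
}

/* Read the fixed part to learn the count, then the rest into one block. */
struct samples *samples_read(FILE *f) {
    assert(f != NULL);
    struct samples head;   /* fixed part only; its flexible array is empty */
    if (fread(&head, sizeof head, 1, f) != 1) return NULL;
    if (head.count > (SIZE_MAX - sizeof head) / sizeof(double)) return NULL;
    size_t tail = head.count * sizeof(double);
    struct samples *s = malloc(sizeof head + tail);
    if (s == NULL) return NULL;
    memcpy(s, &head, sizeof head);
    if (tail > 0 && fread((unsigned char *)s + sizeof head, 1, tail, f) != tail) {
        free(s);
        return NULL;
    }
    return s;
}
```

No pointer needs patching after loading, which is the point of keeping the metadata and the array in one block.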
Your structure declaration is incorrect: char foo[]; can only appear as the last member, and the declaration is missing a ; at the end. Here is a correct declaration:
struct ss {
    char *foo;
    char bar[3];
    int bazSize;
    char baz[];
};
We have a pointer foo and a flexible array baz at the end. When allocating such a structure from the heap, the actual space for the last member must be known and cannot be changed without reallocating the whole structure, which may be complicated if the structure is referred to from various other places. Flexible arrays save space but are not flexible at all.
Advantages of the flexible array:
- save space
- save one indirection
- allocate in one step
- baz is never NULL
Conversely, making baz a pointer requires separate allocation of the array it points to. This disadvantage in size, code and speed comes with compensations.
Advantages of the pointer version:
- baz can be NULL to specify no data.
- baz can be allocated on demand, when the actual size is known.
- baz can be reallocated easily.
So which you should use depends on how you use these structures. The syntax at point of use is the same, but the compiler has seen the actual declaration and will generate the appropriate code.
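For the pointer version, a minimal sketch of the "NULL means no data" and "reallocate easily" points might look like this. The struct buf type and buf_resize helper are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pointer-based variant: the array lives in its own allocation. */
struct buf {
    size_t size;
    char *baz;      /* NULL means "no data yet" */
};

/* Resize the array only; the struct itself never moves, so other
   pointers to it stay valid. Returns 0 on success, -1 on failure. */
int buf_resize(struct buf *b, size_t new_size) {
    assert(b != NULL);
    if (new_size == 0) {
        free(b->baz);
        b->baz = NULL;
        b->size = 0;
        return 0;
    }
    char *p = realloc(b->baz, new_size);  /* realloc(NULL, n) acts like malloc(n) */
    if (p == NULL) return -1;             /* the old storage is still intact */
    b->baz = p;
    b->size = new_size;
    return 0;
}
```

With a flexible array member, the equivalent resize would have to realloc the whole structure, and every pointer to it held elsewhere would need updating.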
The problem is that char *foo and char foo[] are only the same thing in some contexts (like function parameter declaration) and not others (like structure field declarations).
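A small sketch of that distinction, using made-up names (count_chars, with_pointer, with_array):

```c
#include <assert.h>
#include <stddef.h>

/* In a parameter list, char s[] is adjusted to char *s, so these two
   lines declare the same function and compile without a conflict. */
size_t count_chars(char s[]);
size_t count_chars(char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') n++;
    return n;
}

/* In a struct, the two spellings mean different things. */
struct with_pointer { int n; char *foo; };  /* holds an address */
struct with_array   { int n; char foo[]; }; /* holds the characters themselves */
```

On common platforms sizeof(struct with_array) is smaller than sizeof(struct with_pointer), because the flexible array member contributes no storage of its own to the struct's size.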
(I have not hacked C for a long while.)
The difference is how the struct is stored. In the first example you over-allocate memory but that doesn't magically mean that the data pointer gets set to point at that memory. Its value after malloc is in fact indeterminate, so you can't reliably print it.
Sure, you can set that pointer to point beyond the part allocated by the struct itself, but that means potentially slower access since you need to go through the pointer each time. Also you allocate the pointer itself as extra space (and potentially extra padding because of it), whereas in a flexible array member sizeof doesn't count the flexible array member. Your first design is overall much more cumbersome than the flexible version, but other than that well-defined.
The reason why people malloc twice when using a struct with pointers could be that they aren't aware of flexible array members, that they are using C90, or that the code isn't performance-critical and they just don't care about the overhead caused by fragmented allocation.
I am wondering whether it is safe and able to malloc struct with pointers like this, and what is the exactly reason programmers malloc twice when using struct with pointers.
If you use the pointer method and malloc only once, there is one extra thing you need to take care of in the calculation: alignment.
Let's add one extra field to the structure:
struct Vector {
    size_t size;
    uint32_t extra;
    double *data;
};
Let's assume that we are on a system where each field is 4 bytes, there is no trailing padding in the struct, and its total size is 12 bytes. Let's also assume that double is 8 bytes and requires 8-byte alignment.
Now there is a problem: the expression (char*)newVector + sizeof*newVector no longer gives an address that is divisible by 8. There needs to be 4 bytes of manual padding between the structure and the data. This complicates both the malloc size calculation and the data pointer offset calculation.
So the main reason you see the single-malloc pointer version less often is that it is harder to get right. With a pointer and two mallocs, or with a flexible array member, the compiler takes care of the necessary alignment calculation and padding so you don't have to.
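For reference, the manual padding described above could be sketched as follows. This assumes C11 (for alignof from <stdalign.h>); the vector_new and round_up helpers are hypothetical names, and the sketch relies on malloc returning storage suitably aligned for any fundamental type.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

struct Vector {
    size_t size;
    uint32_t extra;
    double *data;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* One malloc for header plus data, with manual padding so data is aligned. */
struct Vector *vector_new(size_t size) {
    size_t offset = round_up(sizeof(struct Vector), alignof(double));
    if (size > (SIZE_MAX - offset) / sizeof(double)) return NULL;  /* overflow guard */
    struct Vector *v = malloc(offset + size * sizeof(double));
    if (v == NULL) return NULL;
    void *payload = (char *)v + offset;   /* first suitably aligned byte after the header */
    v->size = size;
    v->extra = 0;
    v->data = payload;
    assert((uintptr_t)v->data % alignof(double) == 0);
    return v;
}
```

A single free(v) releases both parts. Note how much of the function is alignment bookkeeping that a flexible array member would have made unnecessary.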
Structures with a flexible array as their last member cannot be used as members of other structures or as array elements. In such constructions, the flexible array cannot be used as it has a size of 0 elements. The C Standard quoted by Jonathan Leffler is explicit, although the language used is quite technical and the paragraphs cannot be found in the Standard by searching for flexible.
The compiler should have issued an error for your array of struct vector.
In your program, you should instead use an array of pointers to struct vector, each pointing to an object allocated for the appropriate number of elements in its flexible array.
Here is a modified version:
#include <stdio.h>
#include <stdlib.h>

struct vector {
    size_t length;
    double array[];
};

/* Allocate a vector with n elements in a single block; NULL on failure. */
struct vector *make_vector(size_t n) {
    struct vector *v = malloc(sizeof(*v) + n * sizeof(v->array[0]));
    if (v == NULL) {
        return NULL;
    }
    v->length = n;
    for (size_t i = 0; i < n; i++) {
        v->array[i] = (double)i;
    }
    return v;
}

int main(void) {
    struct vector *arr[3];
    arr[0] = make_vector(10);
    arr[1] = make_vector(5);
    arr[2] = make_vector(20);
    for (size_t n = 0; n < 3; n++) {
        if (arr[n] == NULL) {
            continue;   /* allocation failed; nothing to print */
        }
        for (size_t i = 0; i < arr[n]->length; i++) {
            /* index arr[n], not arr[0], so each vector prints its own data */
            printf("arr[%zu]->array[%2zu] equals %2.0lf.\n",
                   n, i, arr[n]->array[i]);
        }
        free(arr[n]);
    }
    return 0;
}
You can't have arrays of structures with flexible array members.
The C standard, ISO/IEC 9899:2011, says:
6.7.2.1 Structure and union specifiers
¶3 A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array.
Emphasis added: the final clause of that paragraph (after the semicolon) prohibits arrays of structures with flexible array members. You can have arrays of pointers to such structures, though each structure will be separately allocated.
¶18 As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.
This defines a flexible array member.
If you think about it, it makes sense. Pointer arithmetic and arrays rely on all the objects in the array being the same size (hence the equivalence of a[i] == *(a + i), etc), so having an array of objects of varying size would break pointer arithmetic. An array of pointers isn't a problem because the pointers are all the same size, even if the objects pointed at are of different sizes.
If you manage to get a compiler to ignore the violated constraint, then each element of the array will have a zero length flexible array member because the structures will be treated as having the size of the structure without the array member (that's the 'in most situations, the flexible array member is ignored' rule at work). But the compiler should reject an array of a structure type with a flexible array member; such code is violating a constraint (¶3 is in the constraints section; ¶18 is in the semantics section).
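The "flexible array member is ignored" rule can be seen in a short sketch. The vector_bytes helper is a hypothetical name, and the static_assert (C11, from <assert.h>) states a property that holds on common compilers rather than something the quoted paragraphs spell out in those words.

```c
#include <assert.h>
#include <stddef.h>

struct vector {
    size_t length;
    double array[];
};

/* sizeof covers only the fixed part (plus any trailing padding), so the
   flexible array starts no later than the end of that fixed part. */
static_assert(offsetof(struct vector, array) <= sizeof(struct vector),
              "flexible array member starts within sizeof(struct vector)");

/* The element storage must be added by hand: different n, different size.
   This varying size is why such objects cannot be array elements. */
size_t vector_bytes(size_t n) {
    return sizeof(struct vector) + n * sizeof(double);
}
```

Every struct vector * has the same size, whatever vector_bytes(n) was used for the object it points to, which is why an array of pointers works where an array of the structures themselves cannot.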
The C FAQ answers precisely this question. The quick answer is that this structure will include the double array inside the structure rather than a pointer to an array outside the structure. As a quick example, you could use your structure as in this example:
struct s *mystruct = malloc(sizeof(struct s) + 5 * sizeof(double));
mystruct->n = 12;
mystruct->d[0] = 4.0;
mystruct->d[1] = 5.0;
mystruct->d[2] = 6.0;
And so on - the size of the array you care about is included in the allocation, and then you can use it just like any array. Normally such a type contains the size as part of the structure, since using the + trick to skip through an array of type s will be necessarily complicated by this situation.
To your added question 'how is this construct any more or less powerful than keeping a [pointer] as the 2nd element?', it's no more powerful per se, but you don't need to keep a pointer around, so you would save at least that much space - also when you are copying the structure, you would also copy the array, rather than a pointer to an array - a subtle difference sometimes, but very important other times. 'You-can-do-it-in-multiple-ways' is probably a good explanation, but there are cases where you would specifically want one design or the other.
The primary advantage is that a flexible array member allows you to allocate a single block of memory for the array along with the other data in the struct (with a pointer, you'd typically end up with two separately allocated blocks).
It's also useful with data transmitted by quite a few network protocols, where the incoming stream is defined the same way -- an integer defining a length, followed by that many units (typically bytes/octets) of data. You can (typically) use a type-pun to overlay a struct with a flexible array member onto a buffer filled with such data, and work with it directly instead of having to parse it out into pieces and then work with the pieces individually.
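Here is a hedged sketch of the length-prefixed case. The wire format (one length byte followed by the payload), struct message and message_parse are all hypothetical. Rather than the type-pun mentioned above, this version copies the bytes into a freshly allocated struct, which sidesteps alignment and aliasing questions at the cost of one memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wire format: one length byte, then that many payload bytes. */
struct message {
    uint8_t len;
    uint8_t payload[];
};

/* Copy one message out of buf (avail bytes long) into a fresh allocation.
   Returns NULL if the buffer is too short or the allocation fails. */
struct message *message_parse(const uint8_t *buf, size_t avail) {
    assert(buf != NULL);
    if (avail < 1) return NULL;
    size_t len = buf[0];
    if (avail - 1 < len) return NULL;     /* truncated input */
    struct message *m = malloc(sizeof *m + len);
    if (m == NULL) return NULL;
    m->len = (uint8_t)len;
    memcpy(m->payload, buf + 1, len);
    return m;
}
```

The in-memory layout mirrors the wire layout, so the rest of the code can work with m->len and m->payload directly and release the message with a single free(m).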
