First, a C string is not just a char, but an array of char with the last element (or at least the last one that's counted as part of the string) set to the null character (numerically 0, also '\0' as a character constant).
Next, in the code you posted you probably meant char buffer[50] rather than char *buffer[50]... the version you have is an array of 50 char *s, but you need an array of 50 chars. After that's corrected, then...
Since fgets() writes a null character after the characters it read whenever it succeeds (it returns NULL on end of file or error), buffer is already a valid C string after a successful call to fgets(). Note that fgets() also stores the trailing newline if it fits in the buffer. If you'd like to copy it to another string so you can reuse the buffer to read more input, you can use the usual string handling functions from <string.h>, such as strcpy(). Just make sure the string you copy it into is large enough to hold all the used characters plus a terminating null character.
This code copies the string into a newly malloc()ed string (error checking omitted):
char buffer[50];
char *str;
fgets(buffer,50,stdin);
str = malloc(strlen(buffer) + 1);
strcpy(str,buffer);
This code does the same, but copies to a char array on the stack (not malloc()ed):
char buffer[50];
char str[50];
fgets(buffer,50,stdin);
strcpy(str,buffer);
strlen() will tell you how many characters are used in the string, but doesn't count the terminating null (so you need to have one more character allocated than what strlen() returns). strcpy() will copy the characters and the null at the end from one string/buffer to another. It stops after the null, and doesn't know how much space you've allocated -- so you need to make sure it will find a null character before running out of space in the destination, or reaching the end of the source buffer. If in doubt, place a null at the end of the buffer yourself to make sure.
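Putting the pieces above together, a minimal sketch with the error checking filled in might look like this. The function name read_line_copy and the decision to strip the trailing newline are assumptions made for illustration; they are not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line (at most 49 characters) from `in`
 * and return a malloc()ed copy without the trailing newline.
 * Returns NULL on end of file, read error, or allocation failure.
 * The caller must free() the result. */
char *read_line_copy(FILE *in)
{
    char buffer[50];

    assert(in != NULL);

    if (fgets(buffer, sizeof buffer, in) == NULL)
        return NULL;                        /* nothing was read */

    buffer[strcspn(buffer, "\n")] = '\0';   /* drop the newline, if present */

    char *str = malloc(strlen(buffer) + 1); /* +1 for the terminating null */
    if (str == NULL)
        return NULL;

    strcpy(str, buffer);                    /* safe: str is exactly large enough */
    return str;
}
```

Passing sizeof buffer instead of a literal 50 keeps the limit in step with the declaration if the array size ever changes.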
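It should be char buffer[50]; and yes, you can then use strncpy (which does not care whether the destination is a stack array, a static array, or heap-allocated memory). Keep in mind that strncpy does not null-terminate the destination when the source is at least as long as the count you pass.
But I would recommend using getline (a POSIX function) in your case.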
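A rough sketch of the getline approach, assuming a POSIX system; the helper name read_any_line is made up for this example. getline allocates and grows the buffer itself, so there is no fixed 50-character limit to worry about.

```c
#define _POSIX_C_SOURCE 200809L /* getline() is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read one line of any length from `in`.
 * Returns a malloc()ed string without the trailing newline, or NULL
 * on end of file or error. The caller must free() the result. */
char *read_any_line(FILE *in)
{
    assert(in != NULL);

    char *line = NULL;   /* getline() allocates when given NULL and size 0 */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, in);

    if (len < 0) {       /* -1 means end of file or read error */
        free(line);
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';
    return line;
}
```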
The array animals is an array of pointers. It is not an array of buffers of some size. Therefore, if you do
sizeof(*animals)
you will get the size of the first element of that array. This is equivalent to
sizeof(char*)
Because your array stores pointers. So, in the line that reads
char *output[sizeof(*animals)];
you create an array of 4 or 8 pointers (depending on how wide a pointer is on your platform; usually it is 4 or 8 bytes). But that of course makes no sense! What you wanted to do is create an array of pointers with the same number of elements as animals. To do that, first get the total size of the animals array, and then divide by the size of one element:
char *output[sizeof(animals)/sizeof(*animals)];
Now, that is what you want. But the pointers still have indeterminate values... Next, you pass the array using *&animals (same for the other). Why? You can pass animals directly. Taking its address and then dereferencing it is the same as doing nothing in the first place.
Then, in the function you call, you copy the strings pointed to by the elements of animals to some indeterminate destination (remember, the elements of the output array, the pointers, still have indeterminate values. We have not assigned them yet!). You first have to allocate the right amount of memory and make the elements point to it.
while(*animals) {
    // now after this line, the pointer points to something sensible
    *output = malloc(sizeof("new animal ") + strlen(*animals));
    sprintf(*output, "new animal %s", *animals);
    output++; // no need to dereference the result
    animals++; // don't forget to increment animals too!
}
Addendum: about the sizeof used above
There's one important thing you have to be sure about: the way we calculate the size. Whatever you do, make sure you always have enough room for your string! A C string consists of characters and a terminating null character, which marks the end of the string. So, *output should point to a buffer that is large enough to hold both "new animal " and *animals. The first contains 11 characters. The second depends on what we actually copy over; its length is what strlen returns. So, in total we need
12 + strlen(*animals)
bytes of space for all characters including the terminating null. Now, it's not good style to hardcode that number into your code. The prefix could change and you could forget to update the number, or miscount by one or two characters. That is why we use sizeof, applied to the string literal we want to prepend. Recall that a sizeof expression evaluates to the size of its operand. You used it in main to get the total size of your array. Now you use it for the string literal. All string literals are arrays of characters, consisting of the characters you type plus the terminating null character. So the following condition holds, because strlen counts the length of a C string and does not include the terminating null character in that length:
// "abc" would have the type char[4] (array of 4 characters)
sizeof "..." == strlen("...") + 1
We don't have to divide by the size of one element, because sizeof(char) is always one, so it won't make a difference. Why do we use sizeof instead of strlen? Because it already accounts for the terminating null character, and it is evaluated at compile time. The compiler can literally substitute the value that the sizeof expression yields.
You haven't allocated any space in your output array to put the copy into. You'll need to use malloc to allocate some space before using sprintf to copy into that buffer.
void p_init(const char **animals, char **output)
{
    while (*animals)
    {
        /* sizeof already includes space for the prefix's terminating null */
        size_t stringSize = sizeof("new animal ") + strlen(*animals);
        *output = malloc(stringSize);
        sprintf(*output, "new animal %s", *animals);
        output++;
        animals++;
    }
}
Don't forget to call free() on that allocated memory when you are done with it.
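For reference, here is one way the allocation, the sizing rule, and the cleanup could fit together. This is only a sketch: the names prefix_animals and free_animals, the return value, and the decision to undo everything on allocation failure are assumptions, not part of the original p_init.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of p_init. `animals` must end with a NULL pointer and
 * `output` must have room for one pointer per animal.
 * Returns the number of strings created, or -1 if an allocation failed
 * (in which case everything allocated so far has been freed). */
int prefix_animals(const char **animals, char **output)
{
    static const char prefix[] = "new animal ";
    int count = 0;

    assert(animals != NULL && output != NULL);

    for (; animals[count] != NULL; count++) {
        /* sizeof prefix already includes the terminating null */
        size_t size = sizeof prefix + strlen(animals[count]);

        output[count] = malloc(size);
        if (output[count] == NULL) {
            while (count > 0)
                free(output[--count]);
            return -1;
        }
        snprintf(output[count], size, "%s%s", prefix, animals[count]);
    }
    return count;
}

/* Release the strings created by prefix_animals(). */
void free_animals(char **output, int count)
{
    for (int i = 0; i < count; i++)
        free(output[i]);
}
```

Using snprintf with the computed size means a miscalculation would truncate the string instead of writing past the end of the allocation.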
OK, reading the answer to the question linked to in the comments (thanks, @"John Zwinck" and @"eryksun"), there are two ways of storing the data, either in a bytearray or a numpy.array. In all these snippets, image_data is of type POINTER(c_ubyte), and we have array_type defined as -
array_type = c_ubyte * num_channels * width * height
We can create a bytearray first and then loop over and set the bytes
arr_bytes = bytearray(array_size)
for i in range(array_size):
    arr_bytes[i] = image_data[i]
Or a better way is to create a C array instance using from_address and then initialize a bytearray with it -
image_data_carray = array_type.from_address(addressof(image_data.contents))
# Copy into bytearray
image_data_bytearray = bytearray(image_data_carray)
And when writing the image (not part of the question, just sharing for completeness), we can obtain a pointer to the bytearray data like this and give it to stbi_write_png
image_data_carray = array_type.from_buffer(image_data_bytearray)
image_data = cast(image_data_carray, POINTER(c_ubyte))
The numpy based way of doing it is as answered in the linked question
address = addressof(image_data.contents)
image_data_ptr = np.ctypeslib.as_array(array_type.from_address(address))
On its own, however, this only creates a view of the memory returned by the C function; it does not copy the data into a Python-managed array object. We can make a copy by creating a new numpy array:
image_data = np.array(image_data_ptr)
To confirm that the two approaches agree, I checked that assert all(arr_np == arr_bytes) passes, and that arr_np.dtype is uint8.
And during writing the image, we can obtain a pointer to the numpy array's data like this
image_data = image_data_numpy.ctypes.data_as(POINTER(c_ubyte))
Your variable array_type is not an initialized C array. It is a ctypes array type (a Python class) that still has to be instantiated before there is any actual array.
You should be doing there an equivalent of:
unsigned char array[channels*width*height];
in C. There, array decays to a pointer to unsigned char pointing at the first byte of the array (index 0). In ctypes, a cast() gives you a pointer of the type you need. So doing:
array = (c.c_ubyte*(channels*width*height))()
should do the trick. But you don't need extra allocated memory. So you can create a pointer as suggested in a comment.
But I suggest you use:
image_data = bytearray(c.string_at(image_data))
It works only if the returned image is null terminated and contains no zero bytes before the end, because without a size argument string_at() stops at the first null byte. Raw image data usually does contain zeros, so this is fragile. string_at() treats the data as a C string of chars, but the byte values come through unchanged. If you wrote the C portion, you can allocate one extra byte for the memory that will contain the image (declared or cast as unsigned chars) and set the last item to 0. Then leave the algorithm to work as before. If you do not null terminate it, string_at() keeps reading past the end of the buffer until it happens to find a zero byte. That is an out-of-bounds read rather than a memory leak, and it is very undesirable.
I used this trick in my C module for colorspace conversion. It works extremely fast as there are no Python-level loops and nothing extra: string_at() just copies the buffer into a Python bytes object. Then you can use numpy.frombuffer(...) (numpy.fromstring(...) is deprecated for binary data) or array.array("B", image_data) or use bytearray() as above etc.
Otherwise, well, I saw your answer just now. You can do it as you wrote as well, but I think that my dirty trick is better (if you can change the C code, of course).
P.S. Whoops! I just saw in a doc string that string_at() takes an optional size argument. When it is given, string_at() reads exactly that many bytes and ignores null termination, so there is no over-read. I am asking myself now why I didn't use it in my project but messed with null termination instead. Perhaps out of laziness. Using size doesn't require any modifications to the C code. So it would be:
image_data = bytearray(c.string_at(image_data, channels*width*height))
The fundamental copying, where you know how many bytes you want to copy, where you want to take them from, and where you want to put them, can be done with memcpy:
#include <string.h>
char const* src = "888";
char dest[] = "Hello World!";
memcpy(&dest[2], src, 3);
// dest now contains "He888 World!"
That copies 3 bytes from src to &dest[2] (which is two bytes past the start). Like most things in C, it’s up to you to make sure the operation is valid.
If the length of src is variable, you can use strlen to find its length:
char const* src = "1234";
char dest[] = "Hello World!";
memcpy(&dest[2], src, strlen(src));
// dest now contains "He1234World!"
If you want to produce the result as a separate string from both inputs, you can allocate memory for a copy of the string and copy it using strdup before making any changes (this memory has to be freed with free):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char const* src = "1234";
char const* dest = "Hello World!";
char* result = strdup(dest);
if (result == NULL) {
    fputs("failed to allocate memory\n", stderr);
    return EXIT_FAILURE;
}
memcpy(&result[2], src, strlen(src));
// result now contains "He1234World!"
free(result);
If it’s not otherwise guaranteed that src will fit in dest, you get to experience the wonderful world of avoiding integer overflow:
size_t offset = 2;
if (offset > strlen(dest) || strlen(src) > strlen(dest) - offset) {
    fprintf(
        stderr,
        "src (%zu bytes) is too long to copy into dest (%zu bytes) at offset %zu\n",
        strlen(src), strlen(dest), offset
    );
    return EXIT_FAILURE;
}
memcpy(&result[offset], src, strlen(src));
Hoping that’s not necessary for your purposes.
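If the check is needed in more than one place, it can be wrapped up in a small function. The sketch below is one way to do that; the name overwrite_at and the 0 / -1 return convention are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: overwrite part of the string `dest`, starting at byte
 * `offset`, with the characters of `src` (without src's terminating null).
 * Returns 0 on success, or -1 without touching dest if src does not fit
 * inside the existing contents of dest. */
int overwrite_at(char *dest, size_t offset, const char *src)
{
    assert(dest != NULL && src != NULL);

    size_t dest_len = strlen(dest);
    size_t src_len = strlen(src);

    /* Written as a subtraction so that offset + src_len cannot overflow. */
    if (offset > dest_len || src_len > dest_len - offset)
        return -1;

    memcpy(dest + offset, src, src_len);
    return 0;
}
```

Because only bytes inside the existing string are replaced, the terminating null of dest is never overwritten.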
You could try copying it byte by byte to your destination string.
strcpy(result, dest);
for (size_t i = 0; i < strlen(src); i++)
{
    result[start + i] = src[i];
}
start is the index of your nth byte, and result must be large enough to hold a copy of dest.
You need to assign to dest inside the function
int tokenCopy(char *dest, const char *src, int destSize) {
    *dest = 0;
    // rest of the code
}
After getting rid of the first for loop, you can modify the second for loop to something like
for ( i = 0; *src != '\0' && *src != ' ' && (i < destSize-1) ; i++)
{
    *dest++ = *src++;
}
*dest = '\0';
return new_value;
to get what you want. The modification includes
- getting rid of the redundant check on empty string.
- combining the copying of value and the pointer increment
- adding the destination length check as a terminating condition for the loop itself
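Assembled into a complete function, the modified loop might look like the sketch below. The name token_copy, the size_t parameter, and returning the number of characters copied are all assumptions, since the original function's return value (new_value) is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical complete version: copy the first space-delimited token of
 * `src` into `dest`, writing at most dest_size bytes including the null.
 * Returns the number of characters copied (an assumption; the original
 * return value is not shown). */
size_t token_copy(char *dest, const char *src, size_t dest_size)
{
    assert(dest != NULL && src != NULL);

    if (dest_size == 0)
        return 0;            /* no room even for the terminator */

    size_t i = 0;
    while (src[i] != '\0' && src[i] != ' ' && i < dest_size - 1) {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';          /* always terminate, even when truncated */
    return i;
}
```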
Thanks to @ErykSun the solution:
Python code
string1 = "my string 1"
string2 = "my string 2"
# create byte objects from the strings
b_string1 = string1.encode('utf-8')
b_string2 = string2.encode('utf-8')
# send strings to c function
my_c_function.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
my_c_function(b_string1, b_string2)
I think you just need to use c_char_p() instead of create_string_buffer().
string1 = "my string 1"
string2 = "my string 2"
# create byte objects from the strings
b_string1 = string1.encode('utf-8')
b_string2 = string2.encode('utf-8')
# send strings to c function
my_c_function(ctypes.c_char_p(b_string1),
ctypes.c_char_p(b_string2))
If you need mutable strings then use create_string_buffer() and cast those to c_char_p using ctypes.cast().
(as opposed to doing something similar natively in C)
I'm running into an issue where a function in a DLL expects an unsigned char * as a buffer for byte data (specifically, this function reads one byte from a device and stores that byte in the passed-in data buffer). So I do ctypes.create_string_buffer(size) and pass that in as the data arg to this DLL function. This works sometimes, but I just spent some time debugging why it doesn't work at other times. It seems to happen when the byte being read and stored has a value of 0: the string buffer (which I know is actually a ctypes array of c_chars, but string buffer is more concise) then treats this value as a null character and goes wacky, specifically when I try to access that byte via `data.value[0]` (this causes an index out of range error). If the byte being read and stored is any other value, it seems to work fine and 0 is a valid index into this string buffer.
I don't have a full 100% grasp on what's going on here, but it *seems* like there's just something under the hood with how these string buffers are used. I think in C these issues don't exist because if you're using a buffer of chars to store byte data rather than characters, then you won't ever really parse the bytes as a string and therefore the value of 0 anywhere in the buffer won't cause weird issues.
But I guess in ctypes/python it's different? Just wanted to get other opinions here to see if my current understanding is correct or at least headed in the right direction.
Let me know if anything isn't clear!
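For what it's worth, that understanding matches how ctypes string buffers behave: the .value attribute returns the contents up to the first null byte, so it is empty when the byte read is 0, while .raw returns every byte in the buffer. The same split exists in C between string-style length and an explicit byte count, which the rough sketch below tries to show. read_fake_device is a made-up stand-in for the DLL call, and length_as_string is a hypothetical helper that mimics what a string function would see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up stand-in for the DLL call: stores `count` copies of `value`. */
void read_fake_device(unsigned char *data, size_t count, unsigned char value)
{
    assert(data != NULL);
    memset(data, value, count);
}

/* Length as a string function sees it: stops at the first zero byte,
 * or at `count` if there is none. */
size_t length_as_string(const unsigned char *data, size_t count)
{
    const unsigned char *zero = memchr(data, 0, count);
    return zero != NULL ? (size_t)(zero - data) : count;
}
```

After reading a 0 byte into a one-byte buffer, length_as_string reports 0 (like an empty data.value), yet data[0] is still a perfectly valid byte to read (like data.raw[0]).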
The correct answer is:
p.communicate(b"insert into egg values ('egg')")
Note the leading b, telling you that it's a string of bytes, not a string of unicode characters. Also, if you are reading this from a file:
value = open('thefile', 'rt').read()
p.communicate(value)
Then change that to:
value = open('thefile', 'rb').read()
p.communicate(value)
Again, note the 'b'.
Now if your value is a string you get from an API that only returns strings no matter what, then you need to encode it.
p.communicate(value.encode('latin-1'))
Latin-1, because unlike ASCII it supports all 256 bytes. But that said, having binary data in unicode is asking for trouble. It's better if you can make it binary from the start.
You can convert it to bytes with encode method:
>>> "insert into egg values ('egg');".encode('ascii') # ascii is just an example
b"insert into egg values ('egg');"
I just fell foul of the fact that strncpy does not add a null terminator if the destination buffer is shorter than the source string. Is there a single-function standard library replacement that I could drop in to the various places strncpy is used, one that would copy a null-terminated string up to the length of the destination buffer and guarantee early (but correct) termination of the destination string if the destination buffer is too short?
Edit:
Yes, I do need C-null terminated strings. This C API is called by something else that provides a buffer for me to copy into, with the expectation that it’s null terminated
Edit 2:
I know I can write a helper function that’s shared across relevant parts of the code, but I don’t want to do that because then each of those modules that need the function becomes coupled to a shared helper header file, which is fine in isolation but “oh I want to use this code in another project, better make sure I take all the misc dependencies” is best avoided. Necessary if necessary, but if possible using a standard function, even better.
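One option that stays within the standard library is snprintf with a "%s" format: when the size argument is nonzero it always null-terminates the destination and truncates if needed. A short sketch follows. The wrapper copy_terminated is hypothetical and exists only so the behaviour can be shown and checked; the snprintf call itself is the part that would be dropped in. Unlike strncpy, it does not pad the rest of the destination with zeros.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative wrapper only; the snprintf() call is the actual replacement.
 * Copies src into dst (capacity dst_size > 0), always null-terminating.
 * Returns 1 if src had to be truncated, 0 otherwise. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf returns the length the full string would have needed */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed < 0 || (size_t)needed >= dst_size;
}
```

strlcpy does the same job more directly, but whether it is available depends on the platform and C library.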