The only portable way to determine if two memory ranges overlap is:
int overlap_p(void *a, void *b, size_t n)
{
    char *x = a, *y = b;
    /* Only == is used, which is defined for any two pointers. */
    for (size_t i=0; i<n; i++) if (x+i==y || y+i==x) return 1;
    return 0;
}
This is because comparison of pointers with the relational operators is undefined unless they point into the same array. In reality, the comparison does work on most real-world implementations, so you could do something like:
int overlap_p(void *a, void *b, size_t n)
{
    char *x = a, *y = b;
    return (x<=y && x+n>y) || (y<=x && y+n>x);
}
I hope I got that logic right; you should check it. You can simplify it even more if you want to assume you can take differences of arbitrary pointers.
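As a rough sketch of that last simplification: on implementations with a flat address space, converting both pointers to uintptr_t and comparing the distance between them to n gives the same answer. The name overlap_flat is made up for illustration, uintptr_t is an optional type, and the ordering of converted pointers is implementation-defined, so this is an assumption about the platform rather than portable C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the original answer.
 * Assumes a flat address space in which uintptr_t values
 * are ordered the same way as the addresses they came from. */
int overlap_flat(const void *a, const void *b, size_t n)
{
    uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;
    uintptr_t d = x > y ? x - y : y - x;  /* distance between the two starts */
    return d < n;  /* two n-byte ranges overlap iff the starts are closer than n */
}
```

With n equal to 0 the function reports no overlap, which matches the loop-based version above.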
Answer from R.. GitHub STOP HELPING ICE on Stack Overflow
What you want to check is the position in memory of the source relative to the destination:
If the source is ahead of the destination (i.e. source < destination), then you should start from the end. If the source is after the destination, you start from the beginning. If they are equal, you don't have to do anything (trivial case).
Here are some crude ASCII drawings to visualize the problem.
|_;_;_;_;_;_|          (source)
      |_;_;_;_;_;_|    (destination)
>-----^                start from the end to shift the values to the right

      |_;_;_;_;_;_|    (source)
|_;_;_;_;_;_|          (destination)
^-----<                start from the beginning to shift the values to the left
Following a very accurate comment below, I should add that you can use the difference of the pointers (destination - source), but to be on the safe side cast those pointers to char * beforehand.
In your current setting, I don't think that you can check if the operation will fail. Your memcpy prototype prevents you from doing any form of checking for that, and with the rule given above for deciding how to copy, the operation will succeed (outside of any other considerations, like prior memory corruption or invalid pointers).
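When both ranges are known to live in the same buffer, the direction rule can be written with offsets instead of raw pointers, which sidesteps the question of comparing unrelated pointers. The following is only a sketch under that assumption; move_within and its parameters are hypothetical names, and the caller is assumed to keep both ranges inside the buffer.

```c
#include <stddef.h>

/* Hypothetical helper: moves n bytes inside one buffer, from
 * base[src_off..] to base[dst_off..]. Comparing offsets rather than
 * pointers keeps the direction test well-defined. */
void move_within(char *base, size_t dst_off, size_t src_off, size_t n)
{
    if (dst_off < src_off) {
        /* Destination is lower: copy from the beginning. */
        for (size_t i = 0; i < n; i++)
            base[dst_off + i] = base[src_off + i];
    } else if (dst_off > src_off) {
        /* Destination is higher: copy from the end. */
        for (size_t i = n; i > 0; i--)
            base[dst_off + i - 1] = base[src_off + i - 1];
    }
    /* Equal offsets: nothing to do. */
}
```

For a buffer holding "abcdef", move_within(buf, 2, 0, 4) shifts right and leaves "ababcd", while move_within(buf, 0, 2, 4) on a fresh copy shifts left and leaves "cdefef".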
I believe you mean memmove, which takes care of memory overlapping, as opposed to memset. But what is memory overlapping anyway?
suppose we have an array of 5 chars, where each char is a byte long
+++++++++++++++++++++++++++++++
| 'a' | 'b' | 'c' | 'd' | 'e' |
+++++++++++++++++++++++++++++++
0x100 0x101 0x102 0x103 0x104
now according to the man page of memcpy, it takes 3 arguments, a pointer to the destination block of memory, a pointer to the source block of memory, and the size of bytes to be copied.
what if the destination is 0x102, the source is 0x100 and the size is 3? memory overlapping happens here. that is, 0x100 would be copied into 0x102, 0x101 would be copied into 0x103 and 0x102 would be copied into 0x104.
notice that we first copied into 0x102 then we copied from 0x102 which means that the value which was originally in 0x102 was lost as we overwrote it with the value we copied into 0x102 before we copy from it. so we would end up with something like
+++++++++++++++++++++++++++++++
| 'a' | 'b' | 'a' | 'b' | 'a' |
+++++++++++++++++++++++++++++++
0x100 0x101 0x102 0x103 0x104
instead of
+++++++++++++++++++++++++++++++
| 'a' | 'b' | 'a' | 'b' | 'c' |
+++++++++++++++++++++++++++++++
0x100 0x101 0x102 0x103 0x104
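The corruption described above can be reproduced with a hand-written forward copy. This is only a sketch: naive_forward_copy is a made-up name, and a real memcpy is free to copy in any order and in larger units, so it will not necessarily produce the same bytes.

```c
#include <stddef.h>

/* Hypothetical byte-by-byte forward copy, lowest address first. */
void naive_forward_copy(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];  /* may read a byte that an earlier iteration already overwrote */
}
```

With buf holding 'a' 'b' 'c' 'd' 'e', the call naive_forward_copy(buf + 2, buf, 3) leaves 'a' 'b' 'a' 'b' 'a': by the time the third byte is read, the original 'c' has already been replaced.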
How does a function like memmove take care of memory overlapping? According to its man page, it behaves as though the bytes were first copied into a temporary array and then pasted into the destination block, as opposed to memcpy, which copies directly from the source block to the destination block. Real implementations usually avoid the temporary array by choosing the direction of the copy.
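The temporary array is best read as an "as if" model of the result rather than a description of what the library actually does. A literal version of that model might look like the sketch below; move_via_temp is a hypothetical name, and returning NULL on allocation failure is a choice made for this illustration (the real memmove cannot fail).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration of the "as if" model.
 * Returns dst, or NULL if the temporary buffer cannot be allocated. */
void *move_via_temp(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    unsigned char *tmp = malloc(n);
    if (tmp == NULL)
        return NULL;
    memcpy(tmp, src, n);  /* tmp is fresh memory, so it overlaps neither src nor dst */
    memcpy(dst, tmp, n);
    free(tmp);
    return dst;
}
```

On the five-byte example above, move_via_temp(buf + 2, buf, 3) yields 'a' 'b' 'a' 'b' 'c', the expected result.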
Let's see:
memset: sets a memory segment to a constant value, so, there is no "overlapping" possible here, because there is just a unique, contiguous, memory segment to "set".
memcpy: you are reading from one memory segment and, well, copying it to another memory segment. If the memory segments coincide at some point, an overlap occurs. Imagine one memory segment starts at address 0x51 and the other starts at address 0x70, and you try to copy 50 bytes from 0x51 to 0x70... at some point, the process will start reading from address 0x70 and copying to address 0x8F. This is most likely not what you wanted to do.
At a lower level, in assembly, you should be able to find several ways of doing this, including MMX, SSE2 and other SIMD instructions. If you download glibc source code (https://www.gnu.org/software/libc/download.html), you will see some implementations done in assembly.
C is a "high-level" language, but is quite close to assembly, you can get memory address for variables and even for functions, so, it is quite powerful, allowing you to do all kind of things, like reading/writing an array after its "official" end (the OS will stop you once you try to access memory outside your process' memory), so, yes, memory overlapping is totally possible in C. Something like this would create two potentially overlapping memory "segments" (actually, the same segment, that I am manually dividing and assigning to two pointers).
This is a funny-behaving program. It is definitely, and intentionally, buggy, just to show what kind of odd things can happen if the memory does overlap with memcpy:
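#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    char *a, *b;
    a = malloc(100 * sizeof(char));
    if (a == NULL) return 1;
    b = a + 25; /* b points into the same block as a */
    strcpy(a, "This is just a test");
    strcpy(b, "And this is another test, longer test string.");
    printf("a: %s\nb: %s\n", a, b);
    printf("Now, I am copying b in a, and let's see what happens...\n");
    memcpy(a, b, 75); /* intentionally wrong: the two 75-byte ranges overlap, so this is undefined behaviour */
    printf("a: %s\nb: %s\n", a, b);
    free(a);
    return 0;
}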
Save it to a .c file, like test.c, and compile it using gcc, like this:
gcc -O0 -o test test.c
Run it, then compile it again like this:
gcc -O2 -o test test.c
It will (most likely) behave differently.
Try replacing memcpy with strncpy and see what happens.
I hope the example is useful.
From the comments your problem is that you don't understand what "overlapping" means:
Overlapping means this:
Here the two memory regions src and dst do overlap:
src: |==========|
dst:       |==========|
But here they don't:
src: |==========|
dst:               |==========|
So if you have overlapping memory regions, then you cannot use memcpy but you have to use memmove.
Second question:
Yes, size_t is an unsigned integer type. The third argument is the number of bytes to copy, so it can hardly be anything else than an unsigned integer type.
memcpy doesn't use any temporary memory to copy from src to dst.
Let's say:
src starts @104
dst starts @108
src = "abcdefgh"
Then 'a' will be @104 and 'e' will be @108.
Assuming char is 1 byte and the bytes are copied one at a time from the lowest address upwards, after copying 8 bytes:
dst = "abcdabcd".
As n denotes the number of bytes to be copied, it is always a non-negative integer (size_t).
To copy overlapping areas, use the memmove function, which behaves as if it copied through temporary memory.
Edit: Thank you so much.
Please help.
I understand that overlapping of src and dest is allowed in memmove but not how.
If I understood correctly, it will copy up or down (depending on whether src is shorter or longer than dst) but why does that matter and how does it prevent undefined behavior?
I've read through all this but I'm still struggling to understand the implementation of memmove. 1, 2, 3, 4, 5.
I tried to test it according to the second top answer from link 1 but there is no undefined behaviour. Can someone show me a way of testing it to see the differences? I have also tried different lengths of both strings but I think I'm not testing it correctly.
I'm not entirely surprised that your example exhibits no strange behaviour. Try copying str1 to str1+2 instead and see what happens then. (May not actually make a difference, depends on compiler/libraries.)
In general, memcpy is implemented in a simple (but fast) manner. Simplistically, it just loops over the data (in order), copying from one location to the other. This can result in the source being overwritten while it's being read.
memmove does more work to ensure it handles the overlap correctly.
EDIT:
(Unfortunately, I can't find decent examples, but these will do). Compare the memcpy and memmove implementations shown here. memcpy just loops, while memmove performs a test to determine which direction to loop in to avoid corrupting the data. These implementations are rather simple. Most high-performance implementations are more complicated (involving copying word-size blocks at a time rather than bytes).
The memory in memcpy cannot overlap or you risk undefined behaviour, while the memory in memmove can overlap.
char a[16];
char b[16];
memcpy(a,b,16); // Valid.
memmove(a,b,16); // Also valid, but possibly slower than memcpy.
memcpy(&a[0], &a[1],10); // Not valid since it overlaps.
memmove(&a[0], &a[1],10); // Valid.
Some implementations of memcpy might still work for overlapping inputs, but you cannot count on that behaviour. However, memmove must allow for overlapping inputs.
I've done some research on this in the past... on Linux, up until fairly recently, the implementation of memcpy() worked in a way that was similar enough to memmove() that overlapping memory wasn't an issue, and in my experience, other UNIXs were the same. This doesn't change the fact that this is undefined behavior according to the standard, and you are just lucky that on some platforms it sometimes works -- and memmove() is the standard-supported right answer.
However, in 2010, the glibc maintainers rolled out a new, optimized memcpy() that changed the behavior of memcpy() for some Intel core types where the C standard library is compiled to be faster, but no longer works like memmove() [1]. (I seem to recall also that this is new code triggered only for memory segments larger than 80 bytes). Interestingly, this caused things like the Linux version of Adobe's Flash player to break[2], as well as several other open-source packages (back in 2010 when Fedora Linux became the first to adopt the changed memcpy() in glibc).
- [1] https://sourceware.org/bugzilla/show_bug.cgi?id=12518
- [2] https://bugzilla.redhat.com/show_bug.cgi?id=638477
memcpy() doesn't support overlapping memory. This allows for optimizations that won't work if the buffers do overlap.
There's not much to really look into, however, because C provides an alternative that does support overlapping memory: memmove(). Its usage is identical to memcpy(). You should use it if the regions might overlap, as it accounts for that possibility.
A long time ago, when I was a wee little lad learning C, before I was learning pointers, I learned how to swap values between two variables:
uint32_t x=1; uint32_t y=2; uint32_t tmp; tmp = y; y=x; x=tmp;
Of course, with pointers, we can do:
void xorSwap (int* x, int* y) {
    if (x != y) {
        *x ^= *y;
        *y ^= *x;
        *x ^= *y;
    }
}
Yesterday at work (we write bare-metal C for firmware, in a RISC-V environment) I had a uniquely weird issue. I have to use memcpy(), with a prototype similar to
memcpy( uint32_t * dest, const uint32_t * source, size_t sz );
In the linker I had specified the maximum buffer size too, so it should have worked.
This below is actual code:
memcpy(sample_waveform, (void *)(0x7C000), sizeof(sample_waveform));
When sample_waveform was a GLOBAL array of signed doubles, 3K bytes in size, it would not copy. However, it was able to copy the first 500 bytes when I made sample_waveform smaller.
But when I made sample_waveform local on the stack, taking 3K of stack space, memcpy() worked.
I can't explain why this is so. I didn't want to copy and use that much space on the stack.
Has this happened to anyone?
How does memcpy() work? Does it copy to a temporary location and then copy to your destination, or does it copy like that pointer method? How is this different from memmove()?
Thanks!
I'm trying to reproduce the behaviour of memcpy.
Bug: Wrong return value
When n == 0, memcpy() returns dest. OP's code should perform likewise.
// if (src == NULL || n == 0)
//     return (NULL);
if (src == NULL) // memcpy(..., NULL, ...) is not defined. Do whatever you want.
    return NULL; // or maybe `n = 0;`
if (n == 0)
    return dest;
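Put together, a byte-wise replacement that gets the return value right could look like the sketch below. The name ft_memcpy_sketch is hypothetical (it is not OP's code), and like memcpy it assumes valid, non-overlapping buffers.

```c
#include <stddef.h>

/* Sketch of a byte-wise memcpy replacement; the name is hypothetical.
 * Like memcpy, it assumes valid, non-overlapping buffers. */
void *ft_memcpy_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;  /* reached with nothing copied when n == 0 */
}
```

No special case is needed for n == 0: the loop body never runs and dest is returned unchanged.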
Bug: Insufficient space
With char buffer[] = "Overlap Test";, ft_memcpy(buffer + 3, buffer, strlen(buffer) + 1); is a problem as buffer + 3 does not have enough room for strlen(buffer) + 1 characters. @chqrlie
How is memcpy.c supposed to react to memory overlap?
Would such a result be expected in case of memory overlap?
Given the 2 restrict in the function signature, using overlapping buffers is undefined behavior (UB). There are no expected results.
void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
Even if OP's ft_memcpy() functioned just like OP's compiler's memcpy() today, it is not certain it will tomorrow or on some other machine. It is UB.
I recommend to not test for equivalent functionality with overlapping buffers - since it is UB.
Notice, that focus on this issue missed proper behavior when n == 0.
Consider instead "How is memmove() supposed to react to memory overlap?"
That is trickier.
Copying takes place as if the n characters from the object pointed to by s2 are first copied into a temporary array of n characters that does not overlap the objects pointed to by s1 and s2, and then the n characters from the temporary array are copied into the object pointed to by s1.
C23dr § 7.26.2.3 2
OP's code needs changes to meet that. To do well is a challenge as the usual approach is to compare addresses. Depending on which is greater, copy from the beginning or end of the source buffer. Yet for user code, there is no defined way to compare arbitrary addresses for order. Various tricks abound that usually work.
The below does not have UB, yet may not compile nor function exactly like memmove() on all machines.
#include <stdlib.h>
#include <stdint.h>
void *ft_memmove(void *dst, const void *src, size_t n) {
    // Not needed as passing `NULL` for either pointer parameter is UB in memmove().
    if (dst == NULL || src == NULL) {
        n = 0;
    }
    // Weakness: uintptr_t is an optional type.
    uintptr_t d = (uintptr_t) dst;
    uintptr_t s = (uintptr_t) src;
    // Weakness: `d < s` is not a defined order compare of the pointers.
    if (d < s) {
        for (size_t i = 0; i < n; i++) {
            ((char*)dst)[i] = ((const char*)src)[i];
        }
    } else {
        while (n > 0) {
            n--;
            ((char*)dst)[n] = ((const char*)src)[n];
        }
    }
    return dst;
}
I'm trying to reproduce the behaviour of memcpy. However when I try it with overlapping memory tests, instead of a trace trap I receive a different result.
In this case (by the C standard) the result of memcpy is undefined.
Would such a result be expected in case of memory overlap?
No, as it is UNDEFINED and you should have not been expecting anything.
I did put the restrict keywords in the function declaration, and I don't understand why the compiler wouldn't put it as a trace trap in this case.
Because restrict does not work this way. It does not add any checks. It is your promise that your function will behave a certain way, allowing compiler to do more aggressive optimizations.
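To make that concrete, here is a small sketch of a restrict-qualified function. The function scale_into and its parameters are invented for illustration. Nothing in the generated code verifies the promise; the qualifier only tells the compiler that it may assume out and in never refer to the same memory.

```c
#include <stddef.h>

/* Hypothetical example. restrict is the caller's promise that out and in
 * do not overlap; no run-time check is inserted, and the compiler may
 * reorder or vectorise the loads and stores based on that promise. */
void scale_into(int *restrict out, const int *restrict in, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * k;
}
```

A call such as scale_into(a + 1, a, n, k) breaks the promise. The compiler is not required to diagnose it, and the behaviour is undefined rather than trapped.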
Nonsense UB deliberations:
Also, many x86 glibc memcpy implementations do not use single-byte copy, so your (naive) function will not be able to reproduce their behaviour.