You don't need to copy a Python string. They are immutable, and the copy module always returns the original in such cases, as do str(), the whole string slice, and concatenating with an empty string.
Moreover, your 'hello' string is interned (certain strings are). Python deliberately tries to keep just the one copy, as that makes dictionary lookups faster.
One way you could work around this is to actually create a new string, then slice that string back to the original content:
>>> a = 'hello'
>>> b = (a + '.')[:-1]
>>> id(a), id(b)
(4435312528, 4435312432)
But all you are doing now is waste memory. It is not as if you can mutate these string objects in any way, after all.
If all you wanted to know is how much memory a Python object requires, use sys.getsizeof(); it gives you the memory footprint of any Python object.
For containers this does not include the contents; you'd have to recurse into each container to calculate a total memory size:
>>> import sys
>>> a = 'hello'
>>> sys.getsizeof(a)
42
>>> b = {'foo': 'bar'}
>>> sys.getsizeof(b)
280
>>> sys.getsizeof(b) + sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in b.items())
360
You can then choose to use id() tracking to take an actual memory footprint or to estimate a maximum footprint if objects were not cached and reused.
You don't need to copy a Python string. They are immutable, and the copy module always returns the original in such cases, as do str(), the whole string slice, and concatenating with an empty string.
Moreover, your 'hello' string is interned (certain strings are). Python deliberately tries to keep just the one copy, as that makes dictionary lookups faster.
One way you could work around this is to actually create a new string, then slice that string back to the original content:
>>> a = 'hello'
>>> b = (a + '.')[:-1]
>>> id(a), id(b)
(4435312528, 4435312432)
But all you are doing now is waste memory. It is not as if you can mutate these string objects in any way, after all.
If all you wanted to know is how much memory a Python object requires, use sys.getsizeof(); it gives you the memory footprint of any Python object.
For containers this does not include the contents; you'd have to recurse into each container to calculate a total memory size:
>>> import sys
>>> a = 'hello'
>>> sys.getsizeof(a)
42
>>> b = {'foo': 'bar'}
>>> sys.getsizeof(b)
280
>>> sys.getsizeof(b) + sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in b.items())
360
You can then choose to use id() tracking to take an actual memory footprint or to estimate a maximum footprint if objects were not cached and reused.
I'm just starting some string manipulations and found this question. I was probably trying to do something like the OP, "usual me". The previous answers did not clear up my confusion, but after thinking a little about it I finally "got it".
As long as a, b, c, d, and e have the same value, they reference to the same place. Memory is saved. As soon as the variable start to have different values, they get start to have different references. My learning experience came from this code:
import copy
a = 'hello'
b = str(a)
c = a[:]
d = a + ''
e = copy.copy(a)
print map( id, [ a,b,c,d,e ] )
print a, b, c, d, e
e = a + 'something'
a = 'goodbye'
print map( id, [ a,b,c,d,e ] )
print a, b, c, d, e
The printed output is:
[4538504992, 4538504992, 4538504992, 4538504992, 4538504992]
hello hello hello hello hello
[6113502048, 4538504992, 4538504992, 4538504992, 5570935808]
goodbye hello hello hello hello something
Safest way to copy a string?
Are there better ways to copy one string to another?
Python copy.copy() appears to make a deep copy
New to python strange copy string behavior
Videos
I just fell foul of the fact that strncpy does not add an old terminator if the destination buffer is shorter than the source string. Is there a single function standard library replacement that I could drop in to the various places strncpy is used that would copy a null terminated string up to the length of the destination buffer, guaranteeing early (but correct) termination of the destination string, if the destination buffer is too short?
Edit:
-
Yes, I do need C-null terminated strings. This C API is called by something else that provides a buffer for me to copy into, with the expectation that it’s null terminated
Edit 2:
-
I know I can write a helper function that’s shared across relevant parts of the code, but I don’t want to do that because then each of those modules that need the function becomes coupled to a shared helper header file, which is fine in isolation but “oh I want to use this code in another project, better make sure I take all the misc dependencies” is best avoided. Necessary if necessary, but if possible using a standard function, even better.