This question is a duplicate of https://stackoverflow.com/q/300522/773113, but since that question is on Stack Overflow, technically it is not a duplicate. (I tried to mark it as a duplicate, but I was prevented, because the other question is not on Programmers SE.)
So, here is what is happening: it is all a matter of convention, and it is all arbitrary. Different languages and environments have their own conventions (sometimes even self-contradictory ones), and you need to learn the conventions of the language you are using and follow them.
In the old times when C ruled, "size" was the fixed number of bytes allocated for something, while "length" was the smaller, variable number of bytes actually in use. Generally, "size" stood for something fixed, while "length" stood for something variable. But it was still not uncommon for someone to say that "a machine word is 32 bits long" instead of "the size of a machine word is 32 bits", despite the fact that the number of bits in a machine word is, of course, very fixed.
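To make the old C-era distinction concrete, here is a minimal sketch. The NameBuffer struct and the function names are made up for this illustration; only sizeof, strlen and strncpy are real library facilities.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity buffer, named only for this illustration.
struct NameBuffer {
    char data[64];  // "size": 64 bytes are always allocated
};

// "size": the fixed number of bytes allocated, known at compile time.
constexpr std::size_t buffer_size() { return sizeof(NameBuffer::data); }

// "length": the bytes actually in use, found by scanning to the NUL terminator.
std::size_t buffer_length(const NameBuffer& b) { return std::strlen(b.data); }

// Store a name (truncating if needed) and report how many bytes are in use.
std::size_t store_and_measure(NameBuffer& b, const char* name) {
    std::strncpy(b.data, name, sizeof(b.data) - 1);
    b.data[sizeof(b.data) - 1] = '\0';  // strncpy does not always terminate
    return buffer_length(b);
}
```

The size never changes, while the length depends on whatever was stored last.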
And then came Java, which has arrays of fixed size, but their size is returned via a length property, and strings of fixed size, but their size is returned via a length() method, and collections of variable size, but their length is returned via a size() method. So, Java decided to turn things around.
Then came C#, which keeps the term "length" for stuff of fixed size, but for variable size stuff it uses the term "count", which would be perfect, if it was not for the unfortunate fact that besides being a noun it is also a verb, which can be taken to mean that when you get the "count" of a collection, the items in the collection will be counted one by one. (O(N) instead of O(1).)
So, go figure. There is no definitive answer. Carefully study the documentation of the system you are dealing with, and understand the precise definition of the terms "length" and "size" within the context of that system. Be prepared for the possibility that these terms have no precise definition there and are used interchangeably and arbitrarily.
Answer from Mike Nakis on Stack Exchange
I see the terms "length" and "size" used interchangeably in a lot of places, even in the Linux kernel. Examples:
- "array size" when referring to the number of elements in an array.
- "len" or "length" when referring to sizeof(something) i.e. number of chars (which is a byte in 99.9% of the cases).
An example where this causes confusion is with strings, where the size of a string would likely include the terminating nul byte, whereas the length of the string (i.e. strlen) does not.
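The off-by-one between the two is easy to see in a small sketch (the names greeting, greeting_size and greeting_length are just for illustration):

```cpp
#include <cstddef>
#include <cstring>

// The array is sized by its initializer: 5 characters plus the terminating NUL.
const char greeting[] = "hello";

// Bytes of storage, terminator included.
constexpr std::size_t greeting_size() { return sizeof(greeting); }

// Characters before the terminator.
std::size_t greeting_length() { return std::strlen(greeting); }
```

Here the size is 6 and the length is 5, which is exactly the gap that bites when allocating or copying buffers.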
Both strlen and sizeof have been around for decades, so why do people not follow this "convention"?
Sorry for the rant!
As per the documentation, these are just synonyms. size() is there to be consistent with other STL containers (like vector, map, etc.) and length() is to be consistent with most people's intuitive notion of character strings. People usually talk about a word, sentence or paragraph's length, not its size, so length() is there to make things more readable.
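A short sketch of why size() matters for consistency: generic code can only rely on the name every container shares. The helper names element_count and length_matches_size are made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Generic helper (hypothetical name): works for any container exposing size().
template <typename Container>
std::size_t element_count(const Container& c) { return c.size(); }

// length() exists on std::string but not on std::vector or std::map;
// for strings it returns the same value as size().
bool length_matches_size(const std::string& s) { return s.length() == s.size(); }
```

Swapping c.size() for c.length() in element_count would still compile for strings, but not for vectors.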
Ruby's just the same, btw, offering both #length and #size as synonyms for the number of items in arrays and hashes (C++ only does it for strings).
Minimalists and people who believe "there ought to be one, and ideally only one, obvious way to do it" (as the Zen of Python recites) will, I guess, mostly agree with your doubts, @Naveen, while fans of Perl's "There's more than one way to do it" (or SQL's syntax with a bazillion optional "noise words" giving umpteen identically equivalent syntactic forms to express one concept) will no doubt be complaining that Ruby, and especially C++, just don't go far enough in offering such synonymical redundancy;-).
Both have the same complexity: Constant.
From the N4431 working draft, §21.4.4
size_type size() const noexcept;
Returns: A count of the number of char-like objects currently in the string. Complexity: Constant time.
And
size_type length() const noexcept;
Returns: size().
[...] iterates through all the characters and counts the length [...]
That's C strings you're thinking of.
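One way to see the difference, as a rough sketch with made-up function names: put a NUL in the middle of a std::string. size() reports the stored count, while strlen has to scan and stops at the first NUL.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Build a 5-character std::string with an embedded NUL: 'a','b','\0','c','d'.
std::string make_sample() { return std::string("ab\0cd", 5); }

// Reads the stored count; it does not inspect the characters.
std::size_t stored_size(const std::string& s) { return s.size(); }

// Walks the bytes until the first NUL, as the C string functions do.
std::size_t scanned_length(const std::string& s) { return std::strlen(s.c_str()); }
```

stored_size gives 5 and scanned_length gives 2 for the same object, so the two clearly do not work the same way.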
According to the documentation, length and size are the same:
Both string::size and string::length are synonyms and return the same value.
Also if you take a look at the code, length is cached, so the complexity is O(1). (Code from MS implementation but I'm sure other libraries are done the same way.)
size_type length() const _NOEXCEPT
    {   // return length of sequence
    return (this->_Mysize);
    }
size_type size() const _NOEXCEPT
    {   // return length of sequence
    return (this->_Mysize);
    }
In general, length() is used when something has a constant length, while size() is used on something with a variable length. Past that, I know of no good reason for using two nearly-synonymous terms.
Ideally, count would be the number of items, and size would be the amount of storage taken up (as in sizeof).
In practice, all three (including length, which is the most ambiguous) are muddled up in many widely-used libraries, so there's no point trying to impose a pattern on them at this stage.
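As a sketch of that ideal split (the function names are invented here, and this is not how any standard library names things): count is the number of items, size is the bytes they occupy.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// "count": the number of items held.
std::size_t item_count(const std::vector<std::uint32_t>& v) { return v.size(); }

// "size" in the sizeof sense: bytes occupied by those items
// (ignores the vector's own bookkeeping and any spare capacity).
std::size_t payload_bytes(const std::vector<std::uint32_t>& v) {
    return v.size() * sizeof(std::uint32_t);
}
```

Note that the standard library already muddles this: the member that returns the count is itself called size().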
C arrays do keep track of their length, as the array length is a static property:
int xs[42]; /* a 42-element array */
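The language has no operator that returns this length directly, but you rarely need one because it's static anyway: just declare a macro XS_LENGTH for the length, and you're done. (While xs is still in scope as an array, sizeof xs / sizeof xs[0] also yields it.)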
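A minimal sketch of the macro approach, alongside the sizeof idiom that works as long as the name still refers to the array itself. The function name xs_length_via_sizeof is made up for this example.

```cpp
#include <cstddef>

// The macro approach from the text: one name for the static length.
#define XS_LENGTH 42

int xs[XS_LENGTH];  // a 42-element array

// While xs is still an array (not a pointer), sizeof can recover the
// element count: total bytes divided by bytes per element.
constexpr std::size_t xs_length_via_sizeof() { return sizeof(xs) / sizeof(xs[0]); }
```

Both give 42 here; the macro has the advantage that it keeps working in places where only a pointer to the first element is available.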
The more important issue is that C arrays implicitly decay into pointers, e.g. when passed to a function. This does make some sense, and allows for some nice low-level tricks, but it loses the information about the length of the array. So a better question would be why C was designed with this implicit decay to pointers.
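Here is a small sketch of what gets lost. It is written in C++ so that the non-decaying alternative (a reference to an array, which plain C does not have) can sit next to the decaying one; both function names are invented for the illustration.

```cpp
#include <cstddef>

// Taking a pointer: the array argument decays, so only sizeof(int*) is
// visible here, not the array's length.
std::size_t bytes_seen_through_pointer(const int* p) { return sizeof(p); }

// Taking a reference to an array: no decay, so the length N is deduced
// from the argument's type at compile time.
template <std::size_t N>
constexpr std::size_t length_of(const int (&)[N]) { return N; }
```

Passing the same int xs[42] to both, the first only ever sees the size of a pointer, while the second still knows there are 42 elements.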
Another matter is that pointers need no storage except the memory address itself. C allows us to cast integers to pointers, pointers to other pointers, and to treat pointers as if they were arrays. While doing this, C is not insane enough to fabricate some array length into existence, but seems to trust in the Spiderman motto: with great power the programmer will hopefully fulfill the great responsibility of keeping track of lengths and overflows.
A lot of this had to do with the computers available at the time. Not only did the compiled program have to run on a limited-resource computer, but, perhaps more importantly, the compiler itself had to run on these machines. When Thompson developed B, C's direct predecessor, he was using a PDP-7 with 8K words of memory. Complex language features that didn't have an immediate analog in the actual machine code were simply not included in the language.
A careful read through the history of C sheds more light on the above, and shows that it wasn't entirely a result of the machine limitations they had:
Moreover, the language (C) shows considerable power to describe important concepts, for example, vectors whose length varies at run time, with only a few basic rules and conventions. ... It is interesting to compare C's approach with that of two nearly contemporaneous languages, Algol 68 and Pascal [Jensen 74]. Arrays in Algol 68 either have fixed bounds, or are `flexible:' considerable mechanism is required both in the language definition, and in compilers, to accommodate flexible arrays (and not all compilers fully implement them.) Original Pascal had only fixed-sized arrays and strings, and this proved confining [Kernighan 81].
C arrays are inherently more powerful. Adding bounds to them restricts what the programmer can use them for. Such restrictions may be useful for programmers, but necessarily are also limiting.