Hello developers, I want to know how I can return multiple values from a function. I have read a few articles but I am still confused.
Named tuples were added in Python 2.6 for this purpose. Also see os.stat for a similar example in the standard library.
>>> import collections
>>> Point = collections.namedtuple('Point', ['x', 'y'])
>>> p = Point(1, y=2)
>>> p.x, p.y
(1, 2)
>>> p[0], p[1]
(1, 2)
In Python 3.6 and later, the typing module's NamedTuple class supports a class-based syntax that makes named tuples easier to create and more powerful. Inheriting from typing.NamedTuple lets you use type annotations and, from 3.6.1, docstrings, default values, and methods.
Example (from the docs):
from typing import NamedTuple

class Employee(NamedTuple):  # inherit from typing.NamedTuple
    name: str
    id: int = 3  # default value

employee = Employee('Guido')
assert employee.id == 3
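As a rough sketch of how this applies to returning several values, a function can return a typing.NamedTuple instance, and the caller can either unpack it or use the field names. The names MinMax and min_max are hypothetical, made up for this illustration.

```python
from typing import NamedTuple


class MinMax(NamedTuple):
    """Smallest and largest value of a sequence (hypothetical example type)."""
    lowest: float
    highest: float


def min_max(values):
    # Returns one object, but the caller can still unpack it like a plain tuple.
    return MinMax(min(values), max(values))


result = min_max([3, 1, 4, 1, 5])
lowest, highest = result               # positional unpacking still works
print(result.lowest, result.highest)   # access by field name
```

Either style works at the call site, so switching from a plain tuple to a named tuple does not break callers that already unpack the result.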
For small projects I find it easiest to work with tuples. When that gets too hard to manage (and not before) I start grouping things into logical structures. However, I think your suggested use of dictionaries and ReturnValue objects is wrong (or too simplistic).
Returning a dictionary with keys "y0", "y1", "y2", etc. doesn't offer any advantage over tuples. Returning a ReturnValue instance with properties .y0, .y1, .y2, etc. doesn't offer any advantage over tuples either. You need to start naming things if you want to get anywhere, and you can do that using tuples anyway:
def get_image_data(filename):
    [snip]
    return size, (format, version, compression), (width, height)

size, type, dimensions = get_image_data(x)
IMHO, the only good technique beyond tuples is to return real objects with proper methods and properties, like you get from re.match() or open(file).
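To illustrate what a "real object" return value looks like from the caller's side, here is a small sketch using re.match from the standard library. The pattern and the "640x480" input are made up for illustration.

```python
import re

match = re.match(r"(\d+)x(\d+)", "640x480")

# One returned object, several ways to pull values out of it.
width, height = (int(group) for group in match.groups())
whole = match.group(0)      # the full matched text
start, end = match.span()   # where the match sits in the input
```

The match object carries all of this at once, and the caller picks only the parts it needs.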
I think the choices need to be considered strictly from the caller's point of view: what is the consumer most likely to need to do?
And what are the salient features of each collection?
- The tuple is accessed in order and immutable
- The list is accessed in order and mutable
- The dict is accessed by key
The list and tuple are equivalent for access, but the list is mutable. Well, that doesn't matter to me, the caller, if I'm going to immediately unpack the results:
score, top_player = play_round(players)
# or
idx, record = find_longest(records)
There's no reason here for me to care if it's a list or a tuple, and the tuple is simpler on both sides.
On the other hand, if the returned collection is going to be kept whole and used as a collection:
points = calculate_vertices(shape)
points.append(another_point)
# Make a new shape
then it might make sense for the return to be mutable. Homogeneity is also an important factor here. Say I've called a function that searches a sequence for repeated patterns. The information I get back is the index in the sequence of the first instance of the pattern, the number of repeats, and the pattern itself. Those aren't the same kinds of thing. Even though I might keep the pieces together, there's no reason that I would want to mutate the collection. This is not a list.
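A minimal sketch of such a heterogeneous return, using a simplified stand-in for the pattern search: it only looks for the first run of equal adjacent items. The function name first_run and its exact behavior are assumptions for illustration.

```python
def first_run(seq):
    """Return (index, repeats, value) for the first run of equal adjacent items.

    Returns None when no item is immediately repeated.
    """
    i = 0
    while i < len(seq) - 1:
        j = i
        # Extend j while the next item equals the item at the start of the run.
        while j + 1 < len(seq) and seq[j + 1] == seq[i]:
            j += 1
        if j > i:
            return i, j - i + 1, seq[i]
        i += 1
    return None


index, repeats, value = first_run("abcccd")
```

The three members (a position, a count, and an item) are different kinds of thing, which is why a tuple fits better than a list here.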
Now for the dictionary.
"the last one creates more readable code because you have named outputs"
Yes, having keys for the fields makes heterogeneous data more explicit, but it also comes with some encumbrance. Again, for the case of "I'm just going to unpack the stuff", this
round_results = play_round(players)
score, top_player = round_results["score"], round_results["top_player"]
(even if you avoid literal strings for the keys), is unnecessary busywork compared to the tuple version.
The question here is threefold: how complex is the collection, how long is the collection going to be kept together, and are we going to need to use this same kind of collection in a bunch of different places?
I'd suggest that a keyed-access return value starts making more sense than a tuple when there are more than about three members, and especially where there is nesting:
shape["transform"]["raw_matrix"][0, 1]
# vs.
shape[2][4][0, 1]
That leads into the next question: is the collection going to leave this scope intact, somewhere away from the call that created it? Keyed access over there will absolutely help understandability.
The third question -- reuse -- points to a simple custom datatype as a fourth option that you didn't present.
Is the structure solely owned by this one function? Or are you creating the same dictionary layout in many places? Do many other parts of the program need to operate on this structure? A repeated dictionary layout should be factored out to a class. The bonus there is that you can attach behavior: maybe some of the functions operating on the data get encapsulated as methods.
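As a sketch of that factoring, the dictionary from the play_round example could become a small class. The fields, the summary method, and the body of play_round are all assumptions here, since the thread never shows them. dataclasses requires Python 3.7+.

```python
from dataclasses import dataclass


@dataclass
class RoundResult:
    """Hypothetical replacement for a {"score": ..., "top_player": ...} dict."""
    score: int
    top_player: str

    def summary(self):
        # Behavior that would otherwise live in a free function taking the dict.
        return f"{self.top_player} leads with {self.score} points"


def play_round(players):
    # Placeholder logic: players maps name -> points scored this round.
    top_player = max(players, key=players.get)
    return RoundResult(score=players[top_player], top_player=top_player)


result = play_round({"ann": 7, "bob": 12})
print(result.summary())  # bob leads with 12 points
```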
A fifth good, lightweight, option is namedtuple(). This is in essence the immutable form of the dictionary return value.
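A short sketch of what "immutable form of the dictionary" means in practice. The RoundScore type and its fields are hypothetical.

```python
from collections import namedtuple

RoundScore = namedtuple("RoundScore", ["score", "top_player"])

result = RoundScore(score=12, top_player="bob")

score, top_player = result    # unpacks like a plain tuple
as_dict = result._asdict()    # keyed view of the same fields

try:
    result.score = 99         # fields cannot be reassigned
except AttributeError:
    mutated = False
else:
    mutated = True
```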
Don't think about functions returning multiple values. Conceptually, it is best to think of functions as both receiving and returning a single value. A function that appears to accept multiple arguments actually receives just a single argument of tuple (formally product) type. Similarly, a function that returns multiple values is simply returning a tuple.
In Python:
def func(a, b, c):
    return b, c
could be rewritten as
def func(my_triple):
    return (my_triple[1], my_triple[2])
to make the comparison obvious.
The first case is merely syntactic sugar for the latter; both receive a triple as an argument, but the first pattern-matches on its argument to perform automatic destructuring into its constituent components. Thus, even languages without full-on general pattern-matching admit some form of basic pattern matching on some of their types (Python admits pattern-matching on both product and record types).
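The correspondence can be checked with argument unpacking. This is only an illustration of the conceptual view, not a claim about how the interpreter passes arguments internally; func_single is a hypothetical name for the one-argument variant.

```python
def func(a, b, c):
    return b, c


def func_single(my_triple):
    a, b, c = my_triple   # explicit destructuring of the one argument
    return (b, c)


triple = (1, 2, 3)

# The * operator spreads the triple over the three parameters.
assert func(*triple) == func_single(triple) == (2, 3)
```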
To return to the question at hand: there is no single answer to your question, because it would be like asking "what should be the return type of an arbitrary function?" It depends on the function and the use case. And, incidentally, if the "multiple return values" are really independent, then they should probably be computed by separate functions.
Absolutely (for the example you provided).
Tuples are first class citizens in Python
There is a builtin function divmod() that does exactly that.
q, r = divmod(x, y)  # same as (x // y, x % y) for integers. Invariant: q*y + r == x
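A quick check of that invariant, including negative operands, where the signs of the results are easy to get wrong. The sample inputs are arbitrary.

```python
pairs = [(22, 7), (-22, 7), (22, -7)]
results = []

for x, y in pairs:
    q, r = divmod(x, y)
    assert q * y + r == x   # the invariant holds for every sign combination
    results.append((q, r))

print(results)  # [(3, 1), (-4, 6), (-4, -6)]
```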
There are other examples: zip, enumerate, dict.items.
for i, e in enumerate([1, 3, 3]):
    print("index=%d, element=%s" % (i, e))
# reverse keys and values in a dictionary
d = dict((v, k) for k, v in adict.items()) # or
d = dict(zip(adict.values(), adict.keys()))
BTW, parentheses are not necessary most of the time. Quoting the Python Library Reference:
Tuples may be constructed in a number of ways:
- Using a pair of parentheses to denote the empty tuple: ()
- Using a trailing comma for a singleton tuple: a, or (a,)
- Separating items with commas: a, b, c or (a, b, c)
- Using the tuple() built-in: tuple() or tuple(iterable)
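The forms listed above can be tried directly. The variable names are only for illustration.

```python
empty = ()
single = 1,               # the trailing comma makes the tuple, not the parentheses
also_single = (1,)
not_a_tuple = (1)         # parentheses alone: this is just the integer 1
triple = 1, 2, 3
from_iterable = tuple("abc")
```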
Functions should serve a single purpose
Therefore they should return a single object. In your case this object is a tuple. Consider a tuple as an ad-hoc compound data structure. There are languages where almost every function returns multiple values (a list in Lisp).
Sometimes it is sufficient to return (x, y) instead of Point(x, y).
Named tuples
With the introduction of named tuples in Python 2.6 it is preferable in many cases to return named tuples instead of plain tuples.
>>> import collections
>>> Point = collections.namedtuple('Point', 'x y')
>>> x, y = Point(0, 1)
>>> p = Point(x, y)
>>> x, y, p
(0, 1, Point(x=0, y=1))
>>> p.x, p.y, p[0], p[1]
(0, 1, 0, 1)
>>> for i in p:
...     print(i)
...
0
1
Firstly, note that Python allows for the following (no need for the parentheses):
q, r = divide(22, 7)
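The divide function itself is not shown in this thread. Assuming it returns a quotient and a remainder, a minimal sketch could look like this:

```python
def divide(x, y):
    # Hypothetical definition: quotient and remainder via floor division.
    quotient = x // y
    remainder = x % y
    return quotient, remainder   # no parentheses needed here either


q, r = divide(22, 7)
print(q, r)  # 3 1
```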
Regarding your question, there's no hard and fast rule either way. For simple (and usually contrived) examples, it may seem that it's always possible for a given function to have a single purpose, resulting in a single value. However, when using Python for real-world applications, you quickly run into many cases where returning multiple values is necessary, and results in cleaner code.
So, I'd say do whatever makes sense, and don't try to conform to an artificial convention. Python supports multiple return values, so use it when appropriate.
It's not a sign of anything, and it is neither good nor bad design or coding style.
Returning multiple values can actually be appropriate and let you write less code. Let's take an example of a method which takes a string like "-123abc" and converts it to an integer like -123:
(bool, int) ParseInteger(string text)
{
    // Code goes here.
}
This method returns both:
- a value indicating whether the operation was a success,
- the number converted from string.
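For comparison, here is a rough Python sketch of the same idea. The parsing rules of ParseInteger are not given, so this version assumes an optional sign followed by leading digits. The name parse_integer and the choice of 0 as the failure value are also assumptions.

```python
import re


def parse_integer(text):
    """Return (is_success, number) for a leading integer in text.

    Sketch only: an optional sign followed by digits at the start of
    the string is assumed to be what counts as an integer.
    """
    match = re.match(r"[+-]?\d+", text)
    if match is None:
        return False, 0
    return True, int(match.group())


is_success, number = parse_integer("-123abc")
```

The caller gets both pieces in one assignment and can ignore the number when is_success is False.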
How can we refactor this?
1. Exceptions
We can add exceptions, if the language supports them. Remember that in most languages, exceptions are expensive in resources. This means that if you have to deal with lots of non-numbers, it's better to avoid throwing an exception every time a string cannot be converted to a number.
2. New class
We can create a class and return an instance of an object of this class.
For example:
class ParsedInteger
{
    bool IsSuccess { get; set; }
    int Number { get; set; }
}
Is it easier to understand? Shorter to write? Does it bring anything? I don't think so.
3. Out parameters
If the language supports it, we can also use out parameters. This is the approach of C#, where returning multiple values is not possible. For example, when parsing a number, we use: bool isSuccess = int.TryParse("-123abc", out i). I'm not sure how out parameters are better than multiple return values. The syntax is not obvious, and even StyleCop itself (the tool used to enforce the default Microsoft style rules on the code) complains about those parameters, suggesting removing them when possible.
Finally, in languages such as C#, where there is no such thing as returning multiple values, features are progressively added to imitate the behavior. For example, Tuple was added to allow returning several values without having to write your own class or use out parameters.
When your function returns a reference to an object that contains multiple members, is it returning one value or many? In the example you show, the function is actually returning an object of type tuple. Python just happens to support syntactic 'sugar' so that you don't have to explicitly dereference the members when making assignments from the return value of the function.
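A small sketch of that point: the function hands back one tuple object, and unpacking is just a shorthand for indexing into it. The swap function is a made-up example.

```python
def swap(a, b):
    return b, a


result = swap(1, 2)
assert isinstance(result, tuple)        # a single tuple object comes back

x, y = result                           # unpacking is the sugar
x2, y2 = result[0], result[1]           # explicit indexing does the same thing
assert (x, y) == (x2, y2) == (2, 1)
```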