from itertools import product

def horizontal():
    for x, y in product(range(20), range(17)):
        print(1 + sum(int(n) for n in grid[x][y: y + 4]))

You should be using the built-in sum function. Of course you can't if you shadow it with a variable of the same name, so I changed that variable to my_sum.
grid = [range(20) for i in range(20)]
sum(sum( 1 + sum(grid[x][y: y + 4]) for y in range(17)) for x in range(20))
The above outputs 13260 for the particular grid created in the first line of code. It uses sum() three times. The innermost sum adds up the numbers in grid[x][y: y + 4], plus the slightly strange initial value of 1 (the sum = 1 shown in the code in the question). The middle sum adds up those values over the 17 possible y values, and the outer sum adds those up over the possible x values.
If elements of grid are strings instead of numbers, replace
sum(grid[x][y: y + 4])
with
sum(int(n) for n in grid[x][y: y + 4])
My mind blanks on how to replace nested loops with something faster and perhaps less wasteful. I know they can't always be replaced, but when is a good time to replace them?
I figure this will come with time, but does anyone have any general or specific tips, use cases, or examples of how to replace loops or improve on this? I feel like this is what's turning good ideas into lame or unimpressive ones because of speed issues.
Been practicing algorithms in C, JavaScript, and Python for about a year, and I seem to constantly get slower solutions when doing LeetCode, for example. Self-taught via Udemy, books, and CS50x.
Ideally I'd like to refactor and improve a Python script I created with interns and others that reads 100,000 reviews and builds a mood-similarity recommendation engine that indexes and sorts by mood, but it takes 7-8 days to update. Once I upload the CSV to a database it works fast, but there have been times when something went wrong and it took 21 days to get a file. Before doing that, I'd like to learn in general so I can make better coding decisions.
Given 2 lists:
list1 = [[1,2,3],[4,5,6],[7,8,9]]
list2 = []
Expected result:
list2 = [[1,4,7],[2,5,8],[3,6,9]]
I can do that by nesting For loops:
for i in range(len(list1[0])):
    temp = []
    for element in list1:
        temp.append(element[i])
    list2.append(temp)

But this is very memory inefficient, so sometimes this procedure can't even return a result. Is there any alternative to this?
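One standard alternative (my suggestion, not from the question) is zip, which transposes the rows without any index bookkeeping:

```python
list1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*list1) pairs up the i-th element of each row, i.e. it transposes.
list2 = [list(row) for row in zip(*list1)]
print(list2)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

zip is lazy, so nothing is materialized until you consume the rows, which helps with the memory concern.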
You can reduce the number of lines of code and make it easier to understand like this:
tasks_id = [task for task in tasks if task["id"] == entry["item_id"]]
folders_dict = dict()
for folder in folders:
    folders_dict[folder["id"]] = folder
for task in tasks_id:
    if task["parent_id"] in folders_dict and folders_dict[task["parent_id"]] in forbiddenFolders:
        body_of_email += ("Using a task within a folder that is not permitted " + str(folders_dict[task["parent_id"]]) + "\r\n")
What you're trying to do here is look up an item (task, folder) based on its id. Python's dictionaries provide an easy way to do this. You will only save time if you'll be doing the searches several times (e.g. if there are many lookups to perform, or if you'll be running the function several times).
Additionally, for forbiddenFolders you just have a list of names (you're not looking up an item, you're just checking if it's present) for which Python's sets are suitable.
Anyway, here is how you build the dictionaries and sets:
tasks_dict = {task['id']: task for task in tasks}
folders_dict = {folder['id']: folder for folder in folders}
forbidden_folders_set = set(forbiddenFolders)
Now, task = tasks_dict[id] is a task such that task['id'] == id, and similarly for folders, so you can replace the loops above with these expressions. The set doesn't allow this, but it allows you to check for presence with folder in forbidden_folders_set.
(Bear in mind that each of those dict(...) operations may take longer than running through one of the for loops above, but they are an investment for faster lookup in future.)
if entry['item_id'] in tasks_dict:
    task = tasks_dict[entry['item_id']]
    if task['parent_id'] in folders_dict:
        folder = folders_dict[task['parent_id']]
        if folder in forbidden_folders_set:
            body_of_email += ...
The x in y and ..._dict[x] operations above are very efficient.
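Here is the whole chain as a self-contained sketch, with made-up tasks/folders data since the original structures aren't shown (I'm also assuming the forbidden set holds folder names, so the membership test uses folder["name"]):

```python
# Hypothetical data standing in for the original structures.
tasks = [{"id": 1, "parent_id": 10}, {"id": 2, "parent_id": 11}]
folders = [{"id": 10, "name": "secret"}, {"id": 11, "name": "public"}]
forbiddenFolders = ["secret"]
entry = {"item_id": 1}

# Build the lookup structures once; each later lookup is O(1).
tasks_dict = {task["id"]: task for task in tasks}
folders_dict = {folder["id"]: folder for folder in folders}
forbidden_folders_set = set(forbiddenFolders)

body_of_email = ""
if entry["item_id"] in tasks_dict:
    task = tasks_dict[entry["item_id"]]
    if task["parent_id"] in folders_dict:
        folder = folders_dict[task["parent_id"]]
        if folder["name"] in forbidden_folders_set:
            body_of_email += "Using a task within a folder that is not permitted " + folder["name"] + "\r\n"
print(body_of_email)
```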
What you show is 'pythonic' in the sense that it uses a Python list and iteration approach. The only use of numpy is in assigning the values, M[i, j] = .... Lists don't take that kind of index.
To make the most of numpy, build index grids or arrays and calculate all the values at once, without an explicit loop. For example, in your case:
In [333]: N=10
In [334]: I,J = np.ogrid[0:10,0:10]
In [335]: I
Out[335]:
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
In [336]: J
Out[336]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
In [337]: M = 1/((I + 2*J + 1)**2)
In [338]: M
Out[338]:
array([[ 1. , 0.11111111, 0.04 , 0.02040816, 0.01234568,
0.00826446, 0.00591716, 0.00444444, 0.00346021, 0.00277008],
...
[ 0.01 , 0.00694444, 0.00510204, 0.00390625, 0.00308642,
0.0025 , 0.00206612, 0.00173611, 0.00147929, 0.00127551]])
ogrid is one of several ways of constructing sets of arrays that can be 'broadcast' together. meshgrid is another common function.
In your case, the equation is one that works well with 2 arrays like this. It depends very much on broadcasting rules, which you should study.
If the function only takes scalar inputs, we will have to use some form of iteration. That has been a frequent SO question; search for [numpy] vectorize.
np.fromfunction is intended for that:

def f(i, j):
    return 1/((i + 2*j + 1)**2)

M = np.fromfunction(f, (N, N))

It's slightly slower than the 'hand-made' vectorised way, but easy to understand.
I wrote a small script to solve a puzzle (from West of Loathing, FWIW): You have three knobs that increase a value by a given amount:
- Knob A increases the value by 411
- Knob B increases the value by 295
- Knob C increases the value by 161
You can turn each knob up to 8 times, and need to arrive at a value of 3200 (starting from 0)
My script looks like this:
import sys
for a in range(0, 9):
    for b in range(0, 9):
        for c in range(0, 9):
            print(f"a={a} b={b} c={c}")
            if (a*411 + b*295 + c*161) == 3200:
                print("that's it!")
                sys.exit()

I love it not, for it is clumsy. How could I have done this better?
This seems to be a linear diophantine equation: 411a + 295b + 161c = 3200.
You can try searching how to solve it. Or you could try using sympy library:
https://docs.sympy.org/latest/modules/solvers/diophantine.html
Either mathematically, or you could use backtracking (https://en.m.wikipedia.org/wiki/Backtracking)
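A flatter brute-force option (my own sketch, not from the answers above) is to let itertools.product generate all knob combinations in one loop instead of three:

```python
from itertools import product

# Knob values and target are taken from the puzzle above;
# solve() is a hypothetical helper name.
def solve(knobs=(411, 295, 161), target=3200, max_turns=8):
    # product(...) yields every (a, b, c) with each count in 0..max_turns,
    # replacing the three nested for loops.
    for turns in product(range(max_turns + 1), repeat=len(knobs)):
        if sum(t * k for t, k in zip(turns, knobs)) == target:
            return turns
    return None

print(solve())  # → (4, 2, 6), since 4*411 + 2*295 + 6*161 == 3200
```

This generalizes to any number of knobs by changing the knobs tuple, which the hand-written nesting cannot.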
Since I cannot create N nested for loops, what is the alternative?
Recursion!
Have a function taking extra arguments, such as the number of terms to sum and the target number, then in its implementation call itself again with one fewer term.
Your stopping condition is when the number of terms to sum is zero. In that case, if the target number to reach is zero, it means you found a valid sum. If it's non-zero, it means you didn't. (Similarly, you could do the last check at 1 term left, checking whether you can pick a final number to match it.)
Since you only need to find sets of distinct numbers that sum up to one, you can assume x > y > z > a > b (or the opposite ordering), to make sure you're not finding the same sequence over and over again, just in a different order.
Also, iterating down from the limit means the reciprocals will grow as you proceed in the iteration. Which also means you can stop looking once the sum goes past one (or the target gets negative), which should help you quickly prune loops that won't ever yield new values.
Finally, Python also supports fractions, which means you can make these calculations with exact precision, without worrying about the rounding issues of floats.
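As a quick illustration of that exactness (a made-up example, not from the original):

```python
from fractions import Fraction

# 1/2 + 1/3 + 1/10 + 1/15 is exactly 1; Fraction arithmetic shows this
# with no binary floating-point rounding in the way.
exact_sum = sum(Fraction(1, d) for d in (2, 3, 10, 15))
print(exact_sum)       # 1
print(exact_sum == 1)  # True
```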
Putting it all together:
from fractions import Fraction

def reciprocal_sums(n=5, limit=30, target=1, partial=()):
    if n == 0:
        if target == 0:
            yield partial
        return
    for i in range(limit, 0, -1):
        new_target = target - Fraction(1, i)
        if new_target < 0:
            return
        yield from reciprocal_sums(
            n - 1, i - 1, new_target, partial + (i,))
Testing it for n=5 (default):
>>> list(reciprocal_sums())
[(30, 20, 12, 3, 2),
(30, 20, 6, 4, 2),
(30, 10, 6, 5, 2),
(28, 21, 12, 3, 2),
(28, 21, 6, 4, 2),
(28, 14, 7, 4, 2),
(24, 12, 8, 4, 2),
(20, 12, 6, 5, 2),
(20, 6, 5, 4, 3),
(18, 12, 9, 4, 2),
(15, 12, 10, 4, 2)]
For n=4:
>>> list(reciprocal_sums(4))
[(24, 8, 3, 2),
(20, 5, 4, 2),
(18, 9, 3, 2),
(15, 10, 3, 2),
(12, 6, 4, 2)]
And n=6:
>>> list(reciprocal_sums(6))
[(30, 28, 21, 20, 3, 2),
(30, 24, 20, 8, 4, 2),
(30, 24, 10, 8, 5, 2),
(30, 20, 18, 9, 4, 2),
(30, 20, 15, 10, 4, 2),
(30, 18, 10, 9, 5, 2),
(30, 12, 10, 5, 4, 3),
(28, 24, 21, 8, 4, 2),
(28, 21, 20, 6, 5, 2),
(28, 21, 18, 9, 4, 2),
(28, 21, 15, 10, 4, 2),
(28, 20, 14, 7, 5, 2),
(28, 14, 12, 7, 6, 2),
(28, 14, 7, 6, 4, 3),
(24, 20, 12, 8, 5, 2),
(24, 20, 8, 5, 4, 3),
(24, 18, 9, 8, 6, 2),
(24, 15, 10, 8, 6, 2),
(24, 12, 8, 6, 4, 3),
(20, 18, 12, 9, 5, 2),
(20, 18, 9, 5, 4, 3),
(20, 15, 12, 10, 5, 2),
(20, 15, 10, 5, 4, 3),
(18, 15, 10, 9, 6, 2),
(18, 12, 9, 6, 4, 3),
(15, 12, 10, 6, 4, 3)]
This solution is pretty fast. Running on a Snapdragon 845 ARM CPU:
%timeit list(reciprocal_sums(4))
365 ms ± 5.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit list(reciprocal_sums(5))
1.94 s ± 8.93 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit list(reciprocal_sums(6))
8.26 s ± 56.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The ordering (lowering the limit at each level) together with pruning the last levels after going above the target will make this solution much faster than ones that evaluate all possible permutations or combinations.
itertools.product(range(N), repeat=N)

may be what you want (there's no need to wrap the range in list() first).
Hi, I'm currently in a situation where my function needs to run with multiple variations, and at this point it takes two days to calculate what I need. Multiprocessing isn't helping. So I want to ask: how can I improve nested for loops whose structure currently looks something like this?
for i in range(50):
    for j in range(50):
        for k in range(50):
            for m in range(50):
                func(i, j, k, m, big_list)

You can try to take a look at itertools.product:
Equivalent to nested for-loops in a generator expression. For example, product(A, B) returns the same as ((x,y) for x in A for y in B).
The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.
Also, there's no need for the 0 when calling range(0, I) and the like: just use range(I).
So in your case it can be:
import itertools

def vectorr(I, J, K):
    return itertools.product(range(K), range(J), range(I))
You said you want it to be faster. Let's use NumPy!
import numpy as np
def vectorr(I, J, K):
    arr = np.empty((I*J*K, 3), int)
    arr[:, 0] = np.tile(np.arange(I), J*K)
    arr[:, 1] = np.tile(np.repeat(np.arange(J), I), K)
    arr[:, 2] = np.repeat(np.arange(K), I*J)
    return arr
There may be even more elegant tweaks possible here, but that's a basic tiling that gives the same result (but as a 2D array rather than a list of lists). The code for this is all implemented in C, so it's very, very fast--this may be important if the input values may get somewhat large.
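A quick sanity check of that claim (my own sketch, with the two vectorr variants above renamed since they share a name): the NumPy array's columns come out in (i, j, k) order, the reverse of the (k, j, i) tuples from itertools.product, so the columns must be flipped before comparing:

```python
import itertools
import numpy as np

def vectorr_product(I, J, K):
    # The itertools version from the earlier answer.
    return itertools.product(range(K), range(J), range(I))

def vectorr_numpy(I, J, K):
    # The tile/repeat version from this answer.
    arr = np.empty((I*J*K, 3), int)
    arr[:, 0] = np.tile(np.arange(I), J*K)
    arr[:, 1] = np.tile(np.repeat(np.arange(J), I), K)
    arr[:, 2] = np.repeat(np.arange(K), I*J)
    return arr

I, J, K = 2, 3, 4
from_product = np.array(list(vectorr_product(I, J, K)))
# Reverse the columns so (i, j, k) lines up with product's (k, j, i) tuples.
print(np.array_equal(from_product, vectorr_numpy(I, J, K)[:, ::-1]))  # True
```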