The httpx documentation covers the ReadTimeout you are experiencing:
HTTPX is careful to enforce timeouts everywhere by default. The default behavior is to raise a TimeoutException after 5 seconds of network inactivity. The read timeout specifies the maximum duration to wait for a chunk of data to be received (for example, a chunk of the response body). If HTTPX is unable to receive data within this time frame, a ReadTimeout exception is raised.
First, try disabling the read timeout (adapted from the example in the above link):
timeout = httpx.Timeout(10.0, read_timeout=None)
response = await httpx.get(rss, headers=headers, verify=False, timeout=timeout)
Then experiment with different timeout durations to see what is reasonable for your use case.
EDIT: the API has since been updated, and the parameter has been renamed from read_timeout to read:
timeout = httpx.Timeout(10.0, read=None)
Answer from ParkerD on Stack Overflow
python - Httpx requests timing out when it shouldn't - Stack Overflow
Hello, I have an asynchronous script that sends some requests, but I have a problem: the URLs time out when they should not. Here's the function:
async with httpx.AsyncClient() as client:
    response = await client.get(url, follow_redirects=True, headers=headers, timeout=timeout)
where
timeout = 30
Input:
https://gmail.com
https://webasto.com
https://unicopower.com
https://bt2energy.com
https://bulletev.com
+ 420 other URLs
I get this in the logs:
2024-04-01T22:13:58.375Z Request to https://gmail.com timed out. Moving to the next url...
2024-04-01T22:13:58.739Z Request to https://webasto.com timed out. Moving to the next url...
2024-04-01T22:13:58.755Z Request to https://unicopower.com timed out. Moving to the next url...
2024-04-01T22:13:58.859Z Request to https://bt2energy.com timed out. Moving to the next url...
2024-04-01T22:13:58.862Z Request to https://bulletev.com timed out. Moving to the next url...
However, when I run the script with only those URLs, it works.
Input:
https://gmail.com
https://webasto.com
https://unicopower.com
https://bt2energy.com
https://bulletev.com
Output:
2024-04-01T22:21:58.787Z INFO Initializing actor...
2024-04-01T22:21:58.789Z INFO System info ({"apify_sdk_version": "1.1.5", "apify_client_version": "1.4.1", "python_version": "3.11.8", "os": "linux"})
2024-04-01T22:21:59.091Z Processing https://bt2energy.com...
2024-04-01T22:21:59.379Z Processing https://bulletev.com...
2024-04-01T22:21:59.817Z Processing https://gmail.com...
2024-04-01T22:21:59.981Z list index out of range https://gmail.com
2024-04-01T22:21:59.983Z Processing https://unicopower.com...
2024-04-01T22:22:00.549Z Processing https://webasto.com...
2024-04-01T22:22:00.705Z INFO Exiting actor ({"exit_code": 0})
Why do those URLs time out when they are processed together with hundreds of other URLs, but work when I process them separately? Note: this happens for other URLs as well.