Based on Greg's answers to my questions, I think the following will work best:

First you'll need something to wrap your open file so that it limits how much data can be read:

class FileLimiter(object):
    def __init__(self, file_obj, read_limit):
        self.read_limit = read_limit
        self.amount_seen = 0
        self.file_obj = file_obj

        # So that requests doesn't try to chunk the upload but will instead stream it:
        self.len = read_limit

    def read(self, amount=-1):
        if self.amount_seen >= self.read_limit:
            return b''
        remaining_amount = self.read_limit - self.amount_seen
        # A negative amount means "read everything", which would blow past the
        # limit, so clamp it to the bytes remaining in this chunk:
        if amount < 0 or amount > remaining_amount:
            amount = remaining_amount
        data = self.file_obj.read(amount)
        self.amount_seen += len(data)
        return data

That should work well enough as a wrapper object. You would then use it like so:

with open('my_large_file', 'rb') as file_obj:
    file_obj.seek(my_offset)
    upload = FileLimiter(file_obj, my_chunk_limit)
    r = requests.post(url, data=upload, headers={'Content-Type': 'application/octet-stream'})

The headers are optional, but when streaming data to a server it's a good idea to be a considerate user and tell the server the type of content you're sending.

Answer from Ian Stapleton Cordasco on Stack Overflow
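To push an entire file this way, one request per chunk, you can advance the offset in a loop. Below is a minimal, hedged sketch of that driver loop: `upload_in_chunks` and the `post` callback are illustrative names (the callback stands in for `requests.post(url, data=upload, ...)` so the loop can run offline), and `FileLimiter` is repeated here, with a negative read amount clamped to the limit, so the example is self-contained:

```python
import io

# Mirrors the FileLimiter above, with read(-1) clamped to the chunk limit.
class FileLimiter:
    def __init__(self, file_obj, read_limit):
        self.read_limit = read_limit
        self.amount_seen = 0
        self.file_obj = file_obj
        self.len = read_limit  # so requests streams with a known Content-Length

    def read(self, amount=-1):
        if self.amount_seen >= self.read_limit:
            return b''
        remaining = self.read_limit - self.amount_seen
        if amount < 0 or amount > remaining:
            amount = remaining
        data = self.file_obj.read(amount)
        self.amount_seen += len(data)
        return data


def upload_in_chunks(file_obj, total_size, chunk_limit, post):
    """Send a whole file as successive chunk uploads; `post` stands in for
    requests.post(url, data=upload, ...) so the loop can be tested offline."""
    offset = 0
    while offset < total_size:
        file_obj.seek(offset)
        upload = FileLimiter(file_obj, min(chunk_limit, total_size - offset))
        post(offset, upload)
        offset += chunk_limit


payload = b'abcdefghij'  # pretend this is the large file
sent = []
upload_in_chunks(io.BytesIO(payload), len(payload), 4,
                 lambda off, up: sent.append(up.read()))
assert b''.join(sent) == payload  # 4 + 4 + 2 bytes reassemble the file
```

Note the final chunk is sized `total_size - offset`, so `self.len` (and thus the Content-Length requests sends) stays accurate for the short last piece.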

I'm just combining two other answers, so bear with me if it doesn't work out of the box; I have no means of testing this:

Lazy Method for Reading Big File in Python?

http://docs.python-requests.org/en/latest/user/advanced/#chunk-encoded-requests

def read_in_chunks(file_object, blocksize=1024, chunks=-1):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while chunks:
        data = file_object.read(blocksize)
        if not data:
            break
        yield data
        chunks -= 1


with open('my_large_file', 'rb') as f:
    requests.post('http://some.url/chunked', data=read_in_chunks(f))
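Passing a generator as `data` makes requests send the body with `Transfer-Encoding: chunked`. The generator itself can be exercised offline before wiring it to a request; a quick sketch (the generator is repeated here so the example is self-contained):

```python
import io

def read_in_chunks(file_object, blocksize=1024, chunks=-1):
    """Lazy generator reading a file piece by piece (as above)."""
    while chunks:
        data = file_object.read(blocksize)
        if not data:
            break
        yield data
        chunks -= 1

# 2500 bytes in 1 KiB blocks -> pieces of 1024, 1024 and 452 bytes:
pieces = list(read_in_chunks(io.BytesIO(b'x' * 2500)))
print([len(p) for p in pieces])  # [1024, 1024, 452]

# The chunks argument caps how many blocks are read, e.g. for a partial upload:
partial = list(read_in_chunks(io.BytesIO(b'x' * 2500), chunks=2))
assert sum(len(p) for p in partial) == 2048
```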
For uploading a file to Google Cloud Storage in chunks, while actually benefiting from FastAPI/Starlette's request.stream() (which lets your FastAPI backend receive the file/request body in chunks, avoiding loading the entire body into memory; see the "Update" section of this answer for more details), you should use resumable uploads. Note that, as described in this comment, one had to pay for multiple resumable upload operations in the past. However, this might have changed since then, so you should check for any recent updates on that matter.

Below is an example, as given in this article, adapted to a FastAPI application. The chunk size is set to 256 KB, as suggested in the official documentation for uploading in multiple chunks. Larger chunk sizes typically make uploads faster, but there is a tradeoff between speed and memory usage. The example is implemented using Google's official Python Client for Google Cloud Storage (see the source code of the python-storage package along with the given samples; python-storage is also part of the google-cloud-python package). For resumable uploads, Google's google-resumable-media-python package is used, which has asyncio support but is still in development.
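If you want a chunk larger than the 256 KB default, note that GCS resumable uploads expect the chunk size to be a multiple of 256 KiB (262144 bytes), except for the final chunk. A small helper sketch for rounding a target size to that granularity (`round_chunk_size` is an illustrative name, not part of any Google library):

```python
# GCS resumable uploads expect chunk sizes in multiples of 256 KiB.
GCS_CHUNK_GRANULARITY = 256 * 1024

def round_chunk_size(target_bytes: int) -> int:
    """Round a desired chunk size down to the nearest 256 KiB multiple,
    never going below a single 256 KiB unit."""
    units = max(1, target_bytes // GCS_CHUNK_GRANULARITY)
    return units * GCS_CHUNK_GRANULARITY

print(round_chunk_size(1_000_000))  # 786432 (3 * 256 KiB)
print(round_chunk_size(100))        # 262144 (minimum of one unit)
```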

Since the example below does not use the asynchronous method for resumable uploads, a chunk upload operation could block the main thread until it completes. One might therefore have a look at this answer and the solutions provided in it for running blocking I/O-bound or CPU-bound operations within async def endpoints (e.g., await run_in_threadpool(s.write, chunk)). However, when the async for block in the example below requests the next chunk from the asynchronous iterator (i.e., request.stream()), the surrounding coroutine is suspended and function control passes back to the event loop, allowing other tasks in the event loop to run until that operation completes. Hence, even though the event loop might get blocked while the synchronous s.write(chunk) call runs, the async for block still gives up time for other tasks in the event loop.

gcs.py

from google.auth.transport.requests import AuthorizedSession
from google.resumable_media import requests, common
from google.cloud import storage


class GCSObjectStreamUpload(object):
    def __init__(
            self, 
            client: storage.Client,
            bucket_name: str,
            blob_name: str,
            chunk_size: int=256 * 1024
        ):
        self._client = client
        self._bucket = self._client.bucket(bucket_name)
        self._blob = self._bucket.blob(blob_name)

        self._buffer = b''
        self._buffer_size = 0
        self._chunk_size = chunk_size
        self._read = 0

        self._transport = AuthorizedSession(
            credentials=self._client._credentials
        )
        self._request = None  # type: requests.ResumableUpload


    def __enter__(self):
        self.start()
        return self


    def __exit__(self, exc_type, *_):
        if exc_type is None:
            self.stop()


    def start(self):
        url = (
            f'https://storage.googleapis.com/upload/storage/v1/b/'
            f'{self._bucket.name}/o?uploadType=resumable'
        )
        self._request = requests.ResumableUpload(
            upload_url=url, chunk_size=self._chunk_size
        )
        self._request.initiate(
            transport=self._transport,
            content_type='application/octet-stream',
            stream=self,
            stream_final=False,
            metadata={'name': self._blob.name},
        )


    def stop(self):
        self._request.transmit_next_chunk(self._transport)


    def write(self, data: bytes) -> int:
        data_len = len(data)
        self._buffer_size += data_len
        self._buffer += data
        del data
        while self._buffer_size >= self._chunk_size:
            try:
                self._request.transmit_next_chunk(self._transport)
            except common.InvalidResponse:
                self._request.recover(self._transport)
        return data_len


    def read(self, chunk_size: int) -> bytes:
        to_read = min(chunk_size, self._buffer_size)
        memview = memoryview(self._buffer)
        self._buffer = memview[to_read:].tobytes()
        self._read += to_read
        self._buffer_size -= to_read
        return memview[:to_read].tobytes()


    def tell(self) -> int:
        return self._read

app.py

from fastapi import FastAPI, Request, HTTPException
from gcs import GCSObjectStreamUpload
from google.cloud import storage

app = FastAPI()

        
@app.post('/upload')
async def upload(request: Request):
    try:
        client = storage.Client()
        with GCSObjectStreamUpload(client=client, bucket_name='test-bucket', blob_name='test-blob') as s:
            async for chunk in request.stream():
                s.write(chunk)
    except Exception:
        raise HTTPException(status_code=500, detail='Something went wrong')

    return {"message": "File successfully uploaded"}

Other approaches for uploading files to Google Cloud Storage can be found in this answer.
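The buffer handoff between write() and read() above, where write() accumulates bytes and transmit_next_chunk() drains them via read() while tell() tracks the total consumed, can be exercised in isolation, with no GCS involved. A minimal sketch with an illustrative class name (`ChunkBuffer` is not part of the answer or any Google library):

```python
class ChunkBuffer:
    """Stands in for the buffering half of GCSObjectStreamUpload: write()
    accumulates bytes, read() hands them out and advances tell()."""
    def __init__(self):
        self._buffer = b''
        self._read = 0

    def write(self, data: bytes) -> int:
        self._buffer += data
        return len(data)

    def read(self, chunk_size: int) -> bytes:
        # memoryview slicing avoids copying the whole buffer twice.
        to_read = min(chunk_size, len(self._buffer))
        memview = memoryview(self._buffer)
        out = memview[:to_read].tobytes()
        self._buffer = memview[to_read:].tobytes()
        self._read += to_read
        return out

    def tell(self) -> int:
        return self._read

buf = ChunkBuffer()
buf.write(b'abcdef')
print(buf.read(4), buf.tell())  # b'abcd' 4
print(buf.read(4), buf.tell())  # b'ef' 6 (short final read drains the buffer)
```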
