Based on Greg's answers to my questions, I think the following will work best:

First you'll need something to wrap your open file so that it limits how much data can be read:

class FileLimiter(object):
    def __init__(self, file_obj, read_limit):
        self.read_limit = read_limit
        self.amount_seen = 0
        self.file_obj = file_obj

        # So that requests doesn't try to chunk the upload but will instead stream it:
        self.len = read_limit

    def read(self, amount=-1):
        if self.amount_seen >= self.read_limit:
            return b''
        remaining_amount = self.read_limit - self.amount_seen
        # A negative amount means "read everything", which would blow past the
        # limit, so clamp it to the bytes remaining in this chunk:
        if amount < 0 or amount > remaining_amount:
            amount = remaining_amount
        data = self.file_obj.read(amount)
        self.amount_seen += len(data)
        return data

That should work well enough as a wrapper object. You would then use it like so:

with open('my_large_file', 'rb') as file_obj:
    file_obj.seek(my_offset)
    upload = FileLimiter(file_obj, my_chunk_limit)
    r = requests.post(url, data=upload, headers={'Content-Type': 'application/octet-stream'})

The headers are optional, but when streaming data to a server it's a good idea to be a considerate user and tell the server the type of content you're sending.

Answer from Ian Stapleton Cordasco on Stack Overflow
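To push an entire file this way, one request per chunk, you can advance the offset in a loop. Below is a minimal, hedged sketch of that driver loop: `upload_in_chunks` and the `post` callback are illustrative names (the callback stands in for `requests.post(url, data=upload, ...)` so the loop can run offline), and `FileLimiter` is repeated here, with a negative read amount clamped to the limit, so the example is self-contained:

```python
import io

# Mirrors the FileLimiter above, with read(-1) clamped to the chunk limit.
class FileLimiter:
    def __init__(self, file_obj, read_limit):
        self.read_limit = read_limit
        self.amount_seen = 0
        self.file_obj = file_obj
        self.len = read_limit  # so requests streams with a known Content-Length

    def read(self, amount=-1):
        if self.amount_seen >= self.read_limit:
            return b''
        remaining = self.read_limit - self.amount_seen
        if amount < 0 or amount > remaining:
            amount = remaining
        data = self.file_obj.read(amount)
        self.amount_seen += len(data)
        return data


def upload_in_chunks(file_obj, total_size, chunk_limit, post):
    """Send a whole file as successive chunk uploads; `post` stands in for
    requests.post(url, data=upload, ...) so the loop can be tested offline."""
    offset = 0
    while offset < total_size:
        file_obj.seek(offset)
        upload = FileLimiter(file_obj, min(chunk_limit, total_size - offset))
        post(offset, upload)
        offset += chunk_limit


payload = b'abcdefghij'  # pretend this is the large file
sent = []
upload_in_chunks(io.BytesIO(payload), len(payload), 4,
                 lambda off, up: sent.append(up.read()))
assert b''.join(sent) == payload  # 4 + 4 + 2 bytes reassemble the file
```

Note the final chunk is sized `total_size - offset`, so `self.len` (and thus the Content-Length requests sends) stays accurate for the short last piece.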

I'm just combining two other answers, so bear with me if it doesn't work out of the box; I have no means of testing this:

Lazy Method for Reading Big File in Python?

http://docs.python-requests.org/en/latest/user/advanced/#chunk-encoded-requests

def read_in_chunks(file_object, blocksize=1024, chunks=-1):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while chunks:
        data = file_object.read(blocksize)
        if not data:
            break
        yield data
        chunks -= 1


with open('my_large_file', 'rb') as f:
    requests.post('http://some.url/chunked', data=read_in_chunks(f))
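Passing a generator as `data` makes requests send the body with `Transfer-Encoding: chunked`. The generator itself can be exercised offline before wiring it to a request; a quick sketch (the generator is repeated here so the example is self-contained):

```python
import io

def read_in_chunks(file_object, blocksize=1024, chunks=-1):
    """Lazy generator reading a file piece by piece (as above)."""
    while chunks:
        data = file_object.read(blocksize)
        if not data:
            break
        yield data
        chunks -= 1

# 2500 bytes in 1 KiB blocks -> pieces of 1024, 1024 and 452 bytes:
pieces = list(read_in_chunks(io.BytesIO(b'x' * 2500)))
print([len(p) for p in pieces])  # [1024, 1024, 452]

# The chunks argument caps how many blocks are read, e.g. for a partial upload:
partial = list(read_in_chunks(io.BytesIO(b'x' * 2500), chunks=2))
assert sum(len(p) for p in partial) == 2048
```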
For uploading a file to Google Cloud Storage in chunks, while actually benefiting from FastAPI/Starlette's request.stream() (which lets your FastAPI backend receive the file/request body in chunks, avoiding loading the entire body into memory; see the "Update" section of this answer for more details), you should use resumable uploads. Note that, as described in this comment, one had to pay for multiple resumable upload operations in the past. However, this might have changed since then, so you should check for any recent updates on that matter.

Below is an example, as given in this article, adapted to a FastAPI application. The chunk size is set to 256 KB, as suggested in the official documentation for uploading in multiple chunks. Larger chunk sizes typically make uploads faster, but there is a tradeoff between speed and memory usage. The example is implemented using Google's official Python Client for Google Cloud Storage (see the source code of the python-storage package along with the given samples; python-storage is also part of the google-cloud-python package). For resumable uploads, Google's google-resumable-media-python package is used, which has asyncio support but is still in development.
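If you want a chunk larger than the 256 KB default, note that GCS resumable uploads expect the chunk size to be a multiple of 256 KiB (262144 bytes), except for the final chunk. A small helper sketch for rounding a target size to that granularity (`round_chunk_size` is an illustrative name, not part of any Google library):

```python
# GCS resumable uploads expect chunk sizes in multiples of 256 KiB.
GCS_CHUNK_GRANULARITY = 256 * 1024

def round_chunk_size(target_bytes: int) -> int:
    """Round a desired chunk size down to the nearest 256 KiB multiple,
    never going below a single 256 KiB unit."""
    units = max(1, target_bytes // GCS_CHUNK_GRANULARITY)
    return units * GCS_CHUNK_GRANULARITY

print(round_chunk_size(1_000_000))  # 786432 (3 * 256 KiB)
print(round_chunk_size(100))        # 262144 (minimum of one unit)
```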

Since the example below does not use the asynchronous method for resumable uploads, a chunk upload operation could block the main thread until it completes. One might therefore have a look at this answer and the solutions provided in it for running blocking I/O-bound or CPU-bound operations within async def endpoints (e.g., await run_in_threadpool(s.write, chunk)). However, when the async for block in the example below requests the next chunk from the asynchronous iterator (i.e., request.stream()), the surrounding coroutine is suspended and function control passes back to the event loop, allowing other tasks in the event loop to run until that operation completes. Hence, even though the event loop might get blocked while the synchronous s.write(chunk) call runs, the async for block still gives up time for other tasks in the event loop.

gcs.py

from google.auth.transport.requests import AuthorizedSession
from google.resumable_media import requests, common
from google.cloud import storage


class GCSObjectStreamUpload(object):
    def __init__(
            self, 
            client: storage.Client,
            bucket_name: str,
            blob_name: str,
            chunk_size: int=256 * 1024
        ):
        self._client = client
        self._bucket = self._client.bucket(bucket_name)
        self._blob = self._bucket.blob(blob_name)

        self._buffer = b''
        self._buffer_size = 0
        self._chunk_size = chunk_size
        self._read = 0

        self._transport = AuthorizedSession(
            credentials=self._client._credentials
        )
        self._request = None  # type: requests.ResumableUpload


    def __enter__(self):
        self.start()
        return self


    def __exit__(self, exc_type, *_):
        if exc_type is None:
            self.stop()


    def start(self):
        url = (
            f'https://storage.googleapis.com/upload/storage/v1/b/'
            f'{self._bucket.name}/o?uploadType=resumable'
        )
        self._request = requests.ResumableUpload(
            upload_url=url, chunk_size=self._chunk_size
        )
        self._request.initiate(
            transport=self._transport,
            content_type='application/octet-stream',
            stream=self,
            stream_final=False,
            metadata={'name': self._blob.name},
        )


    def stop(self):
        self._request.transmit_next_chunk(self._transport)


    def write(self, data: bytes) -> int:
        data_len = len(data)
        self._buffer_size += data_len
        self._buffer += data
        del data
        while self._buffer_size >= self._chunk_size:
            try:
                self._request.transmit_next_chunk(self._transport)
            except common.InvalidResponse:
                self._request.recover(self._transport)
        return data_len


    def read(self, chunk_size: int) -> bytes:
        to_read = min(chunk_size, self._buffer_size)
        memview = memoryview(self._buffer)
        self._buffer = memview[to_read:].tobytes()
        self._read += to_read
        self._buffer_size -= to_read
        return memview[:to_read].tobytes()


    def tell(self) -> int:
        return self._read

app.py

from fastapi import FastAPI, Request, HTTPException
from gcs import GCSObjectStreamUpload
from google.cloud import storage

app = FastAPI()

        
@app.post('/upload')
async def upload(request: Request):
    try:
        client = storage.Client()
        with GCSObjectStreamUpload(client=client, bucket_name='test-bucket', blob_name='test-blob') as s:
            async for chunk in request.stream():
                s.write(chunk)
    except Exception:
        raise HTTPException(status_code=500, detail='Something went wrong')

    return {"message": "File successfully uploaded"}

Other approaches for uploading files to Google Cloud Storage can be found in this answer.
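The buffer handoff between write() and read() above, where write() accumulates bytes and transmit_next_chunk() drains them via read() while tell() tracks the total consumed, can be exercised in isolation, with no GCS involved. A minimal sketch with an illustrative class name (`ChunkBuffer` is not part of the answer or any Google library):

```python
class ChunkBuffer:
    """Stands in for the buffering half of GCSObjectStreamUpload: write()
    accumulates bytes, read() hands them out and advances tell()."""
    def __init__(self):
        self._buffer = b''
        self._read = 0

    def write(self, data: bytes) -> int:
        self._buffer += data
        return len(data)

    def read(self, chunk_size: int) -> bytes:
        # memoryview slicing avoids copying the whole buffer twice.
        to_read = min(chunk_size, len(self._buffer))
        memview = memoryview(self._buffer)
        out = memview[:to_read].tobytes()
        self._buffer = memview[to_read:].tobytes()
        self._read += to_read
        return out

    def tell(self) -> int:
        return self._read

buf = ChunkBuffer()
buf.write(b'abcdef')
print(buf.read(4), buf.tell())  # b'abcd' 4
print(buf.read(4), buf.tell())  # b'ef' 6 (short final read drains the buffer)
```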
