There are two large differences:
The PHP code posts a field named file, your Python code posts a field named bulk_test2.mov.
Your Python code posts an empty file. The Content-Length header is 160 bytes, exactly the amount of space the multipart boundaries and Content-Disposition part header take up. Either the bulk_test2.mov file is indeed empty, or you tried to post the file multiple times without rewinding or reopening the file object.
To fix the first problem, use 'file' as the key in your files dictionary:
files = {'file': open('bulk_test2.mov', 'rb')}
response = requests.post(url, files=files)
I used just the open file object as the value; requests will get the filename directly from the file object in that case.
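To see where that filename comes from, here is a rough sketch of the idea: the base name of the file object's name attribute is used when it looks like a real path. The helper guessed_filename is hypothetical and only approximates what requests does internally; it is not the library's own code.

```python
import os


def guessed_filename(file_obj):
    """Return the base name of file_obj.name, or None if there is no usable name.

    Hypothetical helper: an approximation of how a filename can be derived
    from an open file object, not the actual requests implementation.
    """
    name = getattr(file_obj, "name", None)
    # Skip missing names, non-string names (e.g. file descriptors),
    # and pseudo-names such as "<stdin>".
    if not isinstance(name, str) or not name:
        return None
    if name.startswith("<") and name.endswith(">"):
        return None
    return os.path.basename(name)
```

If the object has no usable name (an in-memory buffer, for example), pass the filename explicitly as a tuple instead, e.g. files = {'file': ('bulk_test2.mov', fileobj)}.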
The second issue is something only you can fix. Make sure you don't reuse files when repeatedly posting. Reopen, or use files['file'].seek(0) to rewind the read position back to the start.
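A minimal illustration of why a reused file object uploads as empty. io.BytesIO stands in for the opened bulk_test2.mov here (an assumption made so the snippet runs without the real file); a file opened with 'rb' behaves the same way.

```python
import io

# Stand-in for open('bulk_test2.mov', 'rb'); BytesIO behaves like a binary file.
fileobj = io.BytesIO(b"fake video bytes")

first_read = fileobj.read()   # consumes everything, as a first upload would
second_read = fileobj.read()  # read position is at EOF, so this is empty

fileobj.seek(0)               # rewind the read position to the start
third_read = fileobj.read()   # the full content is available again
```

A second post with the same object behaves like second_read unless you rewind or reopen first.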
The Expect: 100-continue header is an optional client feature that asks the server to confirm that the body upload can go ahead; it is not a required header and any failure to post your file object is not going to be due to requests using this feature or not. If an HTTP server were to misbehave if you don't use this feature, it is in violation of the HTTP RFCs and you'll have bigger problems on your hands. It certainly won't be something requests can fix for you.
If you do manage to post actual file data, any small variations in Content-Length are due to the (random) boundary being a different length between Python and PHP. This is normal, and not the cause of upload problems, unless your target server is extremely broken. Again, don't try to fix such brokenness with Python.
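To make the size arithmetic concrete, here is a simplified sketch that builds a single-part multipart/form-data body by hand. The multipart_body helper and the boundary strings are made up for illustration; real clients add their own random boundary and may add further part headers, so the absolute numbers will differ from what requests or PHP produce.

```python
def multipart_body(field, filename, payload, boundary):
    """Build a simplified single-part multipart/form-data body as bytes."""
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{f}"; filename="{n}"\r\n'
        "\r\n"
    ).format(b=boundary, f=field, n=filename).encode("ascii")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("ascii")
    return head + payload + tail


# An empty payload: the body consists of boundaries and part headers only.
empty = multipart_body("file", "bulk_test2.mov", b"", "a" * 32)
overhead = len(empty)

# Same payload, boundaries of different length (32 vs. 40 characters).
short = multipart_body("file", "bulk_test2.mov", b"x" * 1000, "a" * 32)
longer = multipart_body("file", "bulk_test2.mov", b"x" * 1000, "a" * 40)
```

A Content-Length equal to the overhead means no file data was sent. The boundary appears twice in a single-part body, so a longer boundary shifts Content-Length by twice the difference in boundary length without changing the payload.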
However, I'd assume you overlooked something much simpler. Perhaps the server blacklists certain User-Agent headers, for example. You could clear some of the default headers requests sets by using a Session object:
files = {'file': open('bulk_test2.mov', 'rb')}
session = requests.Session()
del session.headers['User-Agent']
del session.headers['Accept-Encoding']
response = session.post(url, files=files)
and see if that makes a difference.
If the server fails to handle your request because it fails to handle HTTP persistent connections, you could try to use the session as a context manager to ensure that all session connections are closed:
files = {'file': open('bulk_test2.mov', 'rb')}
with requests.Session() as session:
    response = session.post(url, files=files, stream=True)
and you could add:
response.raw.close()
for good measure.
Answer from Martijn Pieters on Stack Overflow
As Mark Ma mentioned, you can get it done without leaving the standard library by utilizing urllib2. I like to use Requests, so I cooked this up:
import os
import requests
dump_directory = os.path.join(os.getcwd(), 'mp3')
os.makedirs(dump_directory, exist_ok=True)
def dump_mp3_for(resource):
    payload = {
        'api': 'advanced',
        'format': 'JSON',
        'video': resource
    }
    initial_request = requests.get('http://youtubeinmp3.com/fetch/', params=payload)
    if initial_request.status_code == 200:  # good to go
        download_mp3_at(initial_request)
def download_mp3_at(initial_request):
    j = initial_request.json()
    filename = '{0}.mp3'.format(j['title'])
    r = requests.get(j['link'], stream=True)
    with open(os.path.join(dump_directory, filename), 'wb') as f:
        print('Dumping "{0}"...'.format(filename))
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                f.flush()
It's then trivial to iterate over a list of YouTube video links and pass them into dump_mp3_for() one-by-one.
for video in ['http://www.youtube.com/watch?v=i62Zjga8JOM']:
    dump_mp3_for(video)
Its API documentation provides a URL variant that returns the download link as JSON: http://youtubeinmp3.com/fetch/?api=advanced&format=JSON&video=http://www.youtube.com/watch?v=i62Zjga8JOM
We can then use urllib2 to call the API, deserialize the result with json.loads(), and download the MP3 file using urllib2 again.
import urllib2
import json
r = urllib2.urlopen('http://youtubeinmp3.com/fetch/?api=advanced&format=JSON&video=http://www.youtube.com/watch?v=i62Zjga8JOM')
content = r.read()
# extract download link
download_url = json.loads(content)['link']
download_content = urllib2.urlopen(download_url).read()
# save downloaded content to file
f = open('test.mp3', 'wb')
f.write(download_content)
f.close()
Note that the file must be opened in mode 'wb'; otherwise the MP3 file will not play correctly. If the file is big, downloading will be a time-consuming process. Here is a post that describes how to display download progress in a GUI (PySide).
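urllib2 only exists in Python 2; on Python 3 the same calls live in urllib.request. Below is a rough Python 3 sketch of the same two steps. The function names fetch_download_link and download_to_file are my own, and the copy is done in chunks so a big file is never held in memory all at once.

```python
import json
import shutil
import urllib.request


def fetch_download_link(api_url):
    """Call the JSON API and return the value of its 'link' field."""
    with urllib.request.urlopen(api_url) as response:
        return json.loads(response.read().decode("utf-8"))["link"]


def download_to_file(url, path, chunk_size=64 * 1024):
    """Stream url into path, opened in binary mode, chunk by chunk."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as f:
        shutil.copyfileobj(response, f, chunk_size)
```

Usage would look like download_to_file(fetch_download_link(api_url), 'test.mp3'), with api_url being the fetch URL shown above.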
Help with python-requests
Requests should do what you want:
>>> r = requests.post('https://io.cimediacloud.com/upload', files={'video.mp4': open('files/video.mp4', 'rb')})
http://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file
Help! I've been trying to upload a file in Python, but the API server keeps returning a "MissingOrInvalidFileName" error. I am porting Java code over to Python, so I know the API call works, at least in Java. What is different about these two HTTP requests that is causing the problem?
Java uploads fine:
private String uploadAssetAsSinglePartHttp(String fileName, String workspaceId, String folderId)
        throws ClientProtocolException, IOException {
    HttpClient client = buildAuthorizedClient();
    //Set up post
    String url = "https://io.cimediacloud.com/upload";
    HttpPost request = new HttpPost(url);
    //add file to request
    MultipartEntityBuilder builder = MultipartEntityBuilder.create();
    File video = new File("files/"+fileName);
    builder.addPart("file", new FileBody(video));
    //folder info in json
    JsonObject fileInfo = new JsonObject();
    if(workspaceId != null) {
        fileInfo.addProperty("workspaceId", workspaceId);
    }
    if(folderId != null) {
        fileInfo.addProperty("folderId", folderId);
    }
    String fileJson = gson.toJson(fileInfo);
    builder.addTextBody("metadata", fileJson);
    //add json to request
    HttpEntity entity = builder.build();
    request.setEntity(entity);
    //execute request
    HttpResponse response = null;
    try {
        response = client.execute(request);
        System.out.println(response.getStatusLine().getReasonPhrase());
        return getResponseJsonProperty(response, "assetId");
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        request.releaseConnection();
    }
    return null;
}
Python MissingOrInvalidFileName Error:
def upload_singlepart(folder_id=None):
    url = 'https://io.cimediacloud.com/upload'
    files = {'file': ('video.mp4', open('files/video.mp4', 'rb')),}
    metadata = {
        'workspaceId': your_workspace_id,
        'folderId': folder_id
    }
    data = {'metadata': json.dumps(metadata)}
    r = session.post(url, files=files, data=data)
    return r.json()['assetId']
If it helps, here is the cURL equivalent.
curl -XPOST -i "https://io.cimediacloud.com/upload" \
    -H "Authorization: Bearer ACCESS_TOKEN" \
    -F filename=@Movie.mov \
    -F metadata="{'workspaceId' : 'a585b641a60843498543597d16ba0108', 'folderId' : 'a585b641a60843498543597d16ba0108' }"
Note: the metadata is optional, but I want to send it along with the file.
I'm about to switch to urllib3 for more detailed control over my requests -_- but first, I'll ask you guys. Thanks much.
P.S. I hear people talking about requests-toolbelt. Would that help? Thanks.