Basically what you do is correct. Looking at the Redmine docs you linked to, it seems that the suffix after the dot in the URL denotes the type of posted data (.json for JSON, .xml for XML), which agrees with the response you get: Processing by AttachmentsController#upload as XML. My guess is that there's a bug in the docs and, to post binary data, you should try the http://redmine/uploads URL instead of http://redmine/uploads.xml.
Btw, I highly recommend the very good and very popular Requests library for HTTP in Python. It's much better than what's in the standard library (urllib2). It supports authentication as well, but I skipped it here for brevity.
import requests

with open('./x.png', 'rb') as f:
    data = f.read()

res = requests.post(url='http://httpbin.org/post',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})

# let's check if what we sent is what we intended to send...
import base64

prefix = 'data:application/octet-stream;base64,'
assert base64.b64decode(res.json()['data'][len(prefix):]) == data
UPDATE
To find out why this works with Requests but not with urllib2, we have to examine the difference in what's being sent. To see this, I'm routing traffic through an HTTP proxy (Fiddler) running on port 8888:
Using Requests
import requests

data = 'test data'
res = requests.post(url='http://localhost:8888',
                    data=data,
                    headers={'Content-Type': 'application/octet-stream'})
we see
POST http://localhost:8888/ HTTP/1.1
Host: localhost:8888
Content-Length: 9
Content-Type: application/octet-stream
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/1.0.4 CPython/2.7.3 Windows/Vista
test data
and using urllib2
import urllib2
data = 'test data'
req = urllib2.Request('http://localhost:8888', data)
req.add_header('Content-Length', '%d' % len(data))
req.add_header('Content-Type', 'application/octet-stream')
res = urllib2.urlopen(req)
we get
POST http://localhost:8888/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 9
Host: localhost:8888
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7
test data
I don't see any differences which would warrant the different behavior you observe. Having said that, it's not uncommon for HTTP servers to inspect the User-Agent header and vary behavior based on its value. Try changing the headers sent by Requests one by one, making them the same as those sent by urllib2, and see when it stops working.
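One way to run that experiment without opening a connection at all is to build the request and inspect the headers Requests would put on the wire; prepare() fills them in without sending anything (the header values below just mimic the urllib2 capture above):

```python
import requests

# Build the request without sending it; prepare() computes the headers
# exactly as they would be transmitted, so we can tweak them one by one.
req = requests.Request(
    'POST', 'http://localhost:8888',
    data='test data',
    headers={'Content-Type': 'application/octet-stream',
             'User-Agent': 'Python-urllib/2.7',    # mimic urllib2's value
             'Accept-Encoding': 'identity'},
)
prepared = req.prepare()
print(dict(prepared.headers))
```

Once a combination of headers reproduces the failure, you've found the one the server keys off.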
This has nothing to do with a malformed upload. The HTTP error clearly specifies 401 Unauthorized and tells you the CSRF token is invalid. Try sending a valid CSRF token with the upload.
More about csrf tokens here:
What is a CSRF token ? What is its importance and how does it work?
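As a sketch of the usual pattern with Requests (the endpoint URLs, the cookie name, and the X-CSRF-Token header name are assumptions; frameworks differ, so check yours): obtain the token first, then echo it back with the upload.

```python
import requests

session = requests.Session()

# 1. In a real app you would first GET a page that issues the token, e.g.
#      session.get('https://example.com/form')
#      token = session.cookies.get('csrftoken')
#    The cookie and header names vary by framework; these are placeholders.
token = 'example-token'

# 2. Send the token back along with the binary upload, typically in a header.
req = requests.Request(
    'POST', 'https://example.com/uploads',
    data=b'\x89PNG\r\n...',     # raw binary payload (truncated placeholder)
    headers={'Content-Type': 'application/octet-stream',
             'X-CSRF-Token': token},
)
prepared = session.prepare_request(req)
print(prepared.headers['X-CSRF-Token'])
```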
I'm uploading images to an S3 bucket using an API Gateway. The URL generated works fine with binary data in Postman. What I'm having trouble figuring out is, how do I take the file input and pass the image into the body of the request?
I thought I might need to use FileReader to send as binary. (I've never used FileReader, so this is probably wrong.)
var file = document.getElementById("files").files[0];
var reader = new FileReader();
reader.readAsArrayBuffer(file);
var readerResult;
reader.onload = function() {
    readerResult = reader.result;
};

How do I store my binary data for the following request? (Postman generated)
var myHeaders = new Headers();
myHeaders.append("Content-Type", "image/jpeg");

var file = "<file contents here>";

var requestOptions = {
    method: 'PUT',
    headers: myHeaders,
    body: file, // need to send the file as binary here...
    redirect: 'follow'
};

fetch("https://myURL.amazonaws.com/v1/bucket/myImage.jpg", requestOptions)
    .then(response => response.text())
    .then(result => console.log(result))
    .catch(error => console.log('error', error));

Will simply sending base64-encoded data work?
There is no need to use base64 encoding; this will simply increase the number of bytes you must transfer. Mobile operators normally limit mangling of responses to content types that they understand, i.e. images, stylesheets, etc.
How are the HTTP sessions handled?
HTTP sessions are normally handled either via a URL query parameter or via a cookie value. However, from what you have said it doesn't sound like sessions are necessary.
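If you did need one, the cookie-based flavor is straightforward with Requests; here is a minimal sketch (the cookie name 'sessionid' and the URL are placeholders) that shows the Cookie header a session id would produce, without opening a connection:

```python
import requests

# Attach a session identifier as a cookie and inspect the header that
# would be sent on the wire; prepare() builds it without connecting.
req = requests.Request('GET', 'http://example.com/profile',
                       cookies={'sessionid': 'abc123'})
prepared = req.prepare()
print(prepared.headers.get('Cookie'))
```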
Arbitrary sockets can be kept alive for a long time, but HTTP verbs are usually short lived. Does this mean I will need to create a new connection for each packet of data?
HTTP requests can last for an arbitrarily long period of time, just as for raw TCP sockets. A GET request can last for hours if necessary. You need not create a new connection for each request; take a look at the Connection: Keep-Alive HTTP header.
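To illustrate with the standard library (a self-contained sketch; the throwaway local server exists only so the snippet runs on its own), several requests can ride over a single persistent connection:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Tiny local HTTP/1.1 server, only here to make the example runnable.
class Handler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'          # HTTP/1.1 defaults to keep-alive
    def do_GET(self):
        body = b'ok'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):          # silence request logging
        pass

server = HTTPServer(('127.0.0.1', 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One connection, several requests: the socket stays open in between.
conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
replies = []
for _ in range(3):
    conn.request('GET', '/')
    replies.append(conn.getresponse().read())
conn.close()
server.shutdown()
print(replies)
```

With Requests, the same effect comes for free from a requests.Session, which pools and reuses connections.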
Or is there a way to send server responses in chunks, over a single connection?
If you don't know the length of the response you can either omit a Content-Length header or, preferably, use the Transfer-Encoding: chunked HTTP header.
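With Requests, for instance, passing a generator as the body makes it switch to chunked encoding automatically, since the length can't be known up front (shown here on a prepared request, without sending):

```python
import requests

# When the body length isn't known in advance, hand Requests a generator;
# it then omits Content-Length and sets Transfer-Encoding: chunked.
def stream():
    for part in (b'first ', b'second ', b'third'):
        yield part

req = requests.Request('POST', 'http://localhost:8888', data=stream())
prepared = req.prepare()
print(dict(prepared.headers))
```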
In what ways can an ISP proxy mess with the data, or the headers? For example, a proxy can sometimes keep a connection alive, even if the server closes it.
ISPs don't tend to reveal the changes they make to HTTP responses. If you are concerned about this a simple solution would be to encrypt the data and specify a Content-Encoding HTTP header. This would require you to control both the HTTP client and server.
If possible, you could just send the data as HTTP requests and responses.
HTTP is perfectly capable of handling binary data: images are sent over HTTP all the time, and they're binary. People upload and download files of arbitrary data types all the time with no problem.
Just give it a mime type of "application/octet-stream" -- which is basically a generic mime type for binary data with no further specification of just what sort -- and any proxies along the way should leave it alone.
As expected, this is the same as the non-binary request. So, with the results being the same, what exactly is the difference between passing a binary vs. a non-binary request body? Are there any benefits to using binary data? If so, what are some examples of when I would want to use binary data instead?
Many questions at once, let's decipher this:
So, with the results being the same, what exactly is the difference between passing a binary vs. a non-binary request body?
This has been mentioned in the comments and also in answers. The difference is that when the data is not sent as binary, processing may occur that changes its meaning: the raw bytes may be reinterpreted as something different, like a file to upload, or the shell's character encoding may be applied (which can happen to binary data as well, but is less expected there).
Are there any benefits to using binary data?
Yes, it's more expressive: you state more precisely which data you want transferred, under any circumstances (though your shell's handling of the option argument may still modify it).
If so, what are some examples when I would want to use binary data instead?
I'm too lazy to provide these, so just do two tests, one with and one without. Since from your question you expect both to behave the same, encode that expectation in a test case; you can then add a data provider and catch regressions when you run it. These tests will answer your question(s) over time. The point is that if you can't answer the question, even after feedback from Stack Overflow and other resources, you need to verify that your expectations are fulfilled, even if there are two at once.
These are your tests. Write them as you understand them; if they are wrong, fix them later. That is what version control is for. If you are in error, your tests will tell you. Use tests for your own needs; that is basically it. You might need to make changes in the future because you were wrong, but tests normally reflect your state of mind at the time you write them. So in this case, your tests should already specify that you don't understand the difference between the two options: just assert that both do the same thing. Document that by writing the test (don't skip one option because you think it's the same while you're not really sure), and you will have written the right test. You can fix tests later if you become aware of a mistake, and at that point the change should fully document what you expect to happen. Don't hide your questions; assert the answers within the test instead. A test is easy to run again, so you can easily check what you expect.
According to the curl man page:
--data-binary
(HTTP) This posts data exactly as specified with no extra processing whatsoever.
If you start the data with the letter @, the rest should be a filename. Data is posted in a similar manner as -d, --data does, except that newlines and carriage returns are preserved and conversions are never done.
If this option is used several times, the ones following the first will append data as described in -d, --data.
You should not receive base64-encoded data when using the --data-binary option. If you do, it's not curl-related.
Straight to the question: the only benefit I see is that curl will not process the data passed. If you need to preserve newlines etc., it makes sense to use it.
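The practical difference is easy to model in a few lines. This mimics the documented behavior quoted above (it is not curl's actual implementation): -d strips carriage returns and newlines from an @file argument, while --data-binary posts the bytes verbatim.

```python
# Model of curl's documented behavior for `-d @file` vs `--data-binary @file`.
raw = b'line one\r\nline two\n'

data_ascii  = raw.replace(b'\r', b'').replace(b'\n', b'')  # what -d would send
data_binary = raw                                          # what --data-binary sends

print(data_ascii)
print(data_binary)
```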
Yes. HTTP/1.1 message header blocks are text, but the payload of messages can be arbitrary binary data.
RFC 2046 defines the octet-stream subtype as follows:
4.5.1. Octet-Stream Subtype
The "octet-stream" subtype is used to indicate that a body contains arbitrary binary data.
And RFC 2045 defines binary data in context of MIME messages as follows:
2.9. Binary Data
"Binary data" refers to data where any sequence of octets whatsoever is allowed.