Original question: http://stackoverflow.com/questions/5628738/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Strange Base64 encode/decode problem
Asked by Rich Sadowsky
I'm using Grails 1.3.7. I have some code that uses the built-in base64Encode function and base64Decode function. It all works fine in simple test cases where I encode some binary data and then decode the resulting string and write it to a new file. In this case the files are identical.
But then I wrote a web service that took the base64 encoded data as a parameter in a POST call. Although the length of the base64 data is identical to the string I passed into the function, the contents of the base64 data are being modified. I spent DAYS debugging this and finally wrote a test controller that took the base64 data in the POST and also took the name of a local file with the correct base64 encoded data, as in:
data=AAA-base-64-data...&testFilename=/name/of/file/with/base64data
Within the test function I compared every byte in the incoming data parameter with the appropriate byte in the test file. I found that somehow every "+" character in the input data parameter had been replaced with a " " (space, ordinal ascii 32). Huh? What could have done that?
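A minimal Groovy sketch of the character-by-character comparison described above. The helper name `findMismatches` and the sample strings are hypothetical and only illustrate the idea; the original test code was not posted.

```groovy
// Hypothetical helper: report every position where the received text
// differs from the expected text.
def findMismatches(String received, String expected) {
    def mismatches = []
    int n = Math.min(received.length(), expected.length())
    for (int i = 0; i < n; i++) {
        if (received.charAt(i) != expected.charAt(i)) {
            mismatches << [index: i,
                           received: received.charAt(i),
                           expected: expected.charAt(i)]
        }
    }
    return mismatches
}

// Made-up sample data standing in for the file contents and the request parameter.
def expectedText = 'ab+c/d+e='
def receivedText = 'ab c/d e='

findMismatches(receivedText, expectedText).each { m ->
    println "index ${m.index}: got ord(${(int) m.received}), expected ord(${(int) m.expected})"
}
```

With this sample data, both mismatches are a space (32) where a plus sign (43) was expected, which matches the pattern reported in the question.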
To be sure I was correct, I added a line that said:
data = data.replaceAll(' ', '+')
and sure enough the data decoded exactly right. I tried it with arbitrarily long binary files and it now works every time. But I can't figure out for the life of me what would be modifying the data parameter in the post to convert the ord(43) character to ord(32)? I know that the plus sign is one of the 2 somewhat platform dependent characters in the base64 spec, but given that I am doing the encoding and decoding on the same machine for now I am super puzzled what caused this. Sure I have a "fix" since I can make it work, but I am nervous about "fixes" that I don't understand.
The code is too big to post here, but I get the base64 encoding like so:
def inputFile = new File(inputFilename)
def rawData = inputFile.getBytes()
def encoded = rawData.encodeBase64().toString()
I then write that encoded string out to a new file so I can use it for testing later. If I load that file back in like so, I get the same rawData:
def encodedFile = new File(encodedFilename)
String encoded = encodedFile.getText()
byte[] rawData = encoded.decodeBase64()
So all that is good. Now assume I take the "encoded" variable and add it to a param to a POST function like so:
String queryString = "data=$encoded"
String url = "http://localhost:8080/some_web_service"
def results = urlPost(url, queryString)
def urlPost(String urlString, String queryString) {
    def url = new URL(urlString)
    def connection = url.openConnection()
    connection.setRequestMethod("POST")
    connection.doOutput = true
    def writer = new OutputStreamWriter(connection.outputStream)
    writer.write(queryString)
    writer.flush()
    writer.close()
    connection.connect()
    return (connection.responseCode == 200) ? connection.content.text : "error $connection.responseCode, $connection.responseMessage"
}
On the web service side, in the controller, I get the parameter like so:
String data = params?.data
println "incoming data parameter has length of ${data.size()}" //confirm right size
//unless I run the following line, the data does not decode to the same source
data = data.replaceAll(' ', '+')
//as long as I replace spaces with plus, this decodes correctly, why?
byte[] bytedata = data.decodeBase64()
Sorry for the long rant, but I'd really love to understand why I had to do the "replace space with plus sign" to get this to decode correctly. Is there some problem with the plus sign being used in a request parameter?
Accepted answer by ikegami
Whatever populates params expects the request to be a URL-encoded form (specifically, application/x-www-form-urlencoded, where "+" means space), but you didn't URL-encode it. I don't know what functions your language provides, but in pseudo code, queryString should be constructed from
concat(uri_escape("data"), "=", uri_escape(base64_encode(rawBytes)))
which simplifies to
concat("data=", uri_escape(base64_encode(rawBytes)))
The "+" characters will be replaced with "%2B".
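In Groovy, the pseudo code above could look roughly like the following sketch. It assumes the JDK's `java.net.URLEncoder` is an acceptable stand-in for `uri_escape`; the sample bytes are made up so that the Base64 text contains "+", "/" and "=".

```groovy
// Sample bytes (0xFB, 0xFF, 0xFE, 0x3E) whose Base64 form is "+//+Pg==".
byte[] rawBytes = [-5, -1, -2, 62] as byte[]

// Same encoding call as in the question.
String encoded = rawBytes.encodeBase64().toString()

// URL-encode the value before putting it in the form body.
String queryString = 'data=' + URLEncoder.encode(encoded, 'UTF-8')
println queryString   // data=%2B%2F%2F%2BPg%3D%3D

// Simulate what a form parser does with the value on the receiving side.
String received = URLDecoder.decode(queryString.substring('data='.length()), 'UTF-8')
byte[] decoded = received.decodeBase64()
println Arrays.equals(decoded, rawBytes)   // true
```

With the value encoded this way, the `replaceAll(' ', '+')` workaround in the controller should no longer be needed, because the parser turns "%2B" back into "+".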
Answer by Polak
You have to use a special Base64 encoding that is also URL-safe. The problem is that standard Base64 output includes the +, / and = characters, which are replaced by their percent-encoded versions.
http://en.wikipedia.org/wiki/Base64#URL_applications
I'm using the following code in PHP:
/**
 * Custom base64 encoding. Replace unsafe url chars
 *
 * @param string $val
 * @return string
 */
static function base64_url_encode($val) {
    return strtr(base64_encode($val), '+/=', '-_,');
}

/**
 * Custom base64 decode. Replace custom url safe values with normal
 * base64 characters before decoding.
 *
 * @param string $val
 * @return string
 */
static function base64_url_decode($val) {
    return base64_decode(strtr($val, '-_,', '+/='));
}
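For the Groovy code in the question, a comparable URL-safe round trip might be sketched with `java.util.Base64`. This assumes Java 8 or later, which is newer than what Grails 1.3.7 originally ran on. Note that the JDK's URL-safe alphabet swaps "+" and "/" for "-" and "_" and can drop the "=" padding; it does not use the "," substitution from the PHP functions above, so the two are not interchangeable.

```groovy
import java.util.Base64   // Java 8+ (an assumption about the runtime)

// Sample bytes (0xFB, 0xFF, 0xFE, 0x3E) whose standard Base64 form is "+//+Pg==".
byte[] rawBytes = [-5, -1, -2, 62] as byte[]

String standard = Base64.getEncoder().encodeToString(rawBytes)
String urlSafe = Base64.getUrlEncoder().withoutPadding().encodeToString(rawBytes)

println standard   // +//+Pg==
println urlSafe    // -__-Pg

// The URL-safe decoder accepts input without padding.
byte[] decoded = Base64.getUrlDecoder().decode(urlSafe)
println Arrays.equals(decoded, rawBytes)   // true
```

Because the URL-safe text contains none of "+", "/" or "=", it passes through form decoding unchanged, but both sides must agree on this alphabet.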
Answer by Richard Schneider
Because it is a parameter to a POST, you must URL-encode the data.
Answer by han
Paraphrased quote from the Wikipedia link:
The encoding used by default is based on a very early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with "+" instead of "%20"
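The "+" versus "%20" behaviour described in the quote can be observed with the JDK's form-encoding helpers. This is only a small illustration using `java.net.URLEncoder` and `java.net.URLDecoder`; the sample strings are made up.

```groovy
// Encoding: a space becomes "+", and a literal "+" becomes "%2B".
String encodedText = URLEncoder.encode('a b+c', 'UTF-8')
println encodedText                          // a+b%2Bc

// Decoding: both "+" and "%20" come back as a space...
println URLDecoder.decode('a+b', 'UTF-8')    // a b
println URLDecoder.decode('a%20b', 'UTF-8')  // a b

// ...while "%2B" is the only way to get a literal "+" back.
println URLDecoder.decode('a%2Bb', 'UTF-8')  // a+b
```

This is why an unescaped "+" in Base64 text arrives as a space once the form body has been parsed.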
Another hidden pitfall that everyday web developers like myself know little about.