C#: How to decode HTML-encoded characters embedded in a JSON string

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow address: http://stackoverflow.com/questions/1101532/

Time: 2020-08-06 08:15:33  Source: igfitidea

How to decode HTML encoded character embedded in a json string

c# json

Asked by mberube.Net

I have a small question about decoding special characters in a JSON result (in my case \x27, but it could be any valid HTML-encoded character). If the result doesn't contain any escaped characters, it works well, but if it does, I get an "Unrecognized escape sequence" exception. I tried calling HttpUtility.HtmlDecode on the JSON string before deserializing it with JavaScriptSerializer, but it doesn't work: the character is still in its encoded form.

Here's a code snippet:

public IEnumerable<QuoteInfo> ParseJson(string json)
{
    System.Web.Script.Serialization.JavaScriptSerializer jss = new System.Web.Script.Serialization.JavaScriptSerializer();
    List<QuoteInfo> result = jss.Deserialize<List<QuoteInfo>>(System.Web.HttpUtility.HtmlDecode(json));
    return result;
}

I tried to use RegisterConverters to HtmlDecode any string found during deserialization, but I can't figure out how to use it properly.

How can I solve that problem?

As back2dos nicely explained, this problem was not an HtmlDecode problem but was caused by a malformed JSON string.

Accepted answer by back2dos

ok, i have very superficial knowledge of C#, and none of the .NET API, but intuitively HtmlDecode should decode HTML entities (please excuse me if i'm wrong on that one) ... encoding is quite a b*tch, i know, so i will try to clearly explain the differences between what you have, what you tried, and what should work ...

the correct HTML entity would be &#x27; and not \x27 ... \x27 is a hexadecimal ASCII escape sequence, as accepted by some JSON decoders and many programming languages, but it is completely unrelated to HTML ...
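
To make the distinction concrete, here is a minimal sketch, assuming System.Net.WebUtility.HtmlDecode is available (it is used here as a stand-in for the HttpUtility.HtmlDecode call in the question). The sample strings are made up for illustration.

```csharp
using System;
using System.Net;

public static class EntityVersusEscape
{
    public static void Main()
    {
        // An HTML character reference for the apostrophe: HtmlDecode understands it.
        string htmlEncoded = "it&#x27;s";
        Console.WriteLine(WebUtility.HtmlDecode(htmlEncoded));      // it's

        // A backslash escape (the verbatim string keeps the backslash):
        // this is not HTML, so HtmlDecode leaves it untouched.
        string backslashEscaped = @"it\x27s";
        Console.WriteLine(WebUtility.HtmlDecode(backslashEscaped)); // it\x27s
    }
}
```

This is why running HtmlDecode over the JSON text before deserializing changes nothing: the \x27 sequence is still there when the JSON parser sees it.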

and also, it has nothing to do with JSON, which is the problem ... the JSON spec for strings does not allow hexadecimal ASCII escape sequences, only Unicode escape sequences, which is why the escape sequence is unrecognized and why using \u0027 instead should work ... now you could blindly replace \x with \u00 (this should work perfectly on valid JSON, although some comments may get damaged in theory, but who cares ... :D)
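
The replacement described above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name JsonEscapeRepair, the method ReplaceHexEscapes, and the "Text" property in the sample are hypothetical. Instead of a fully blind string replace, it uses a regular expression that skips escaped backslashes (\\), so a literal backslash followed by "x27" is left alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class JsonEscapeRepair
{
    // Matches either an escaped backslash (kept as-is) or a \xNN escape (rewritten).
    private static readonly Regex HexEscape =
        new Regex(@"\\\\|\\x([0-9A-Fa-f]{2})", RegexOptions.Compiled);

    // Hypothetical helper: rewrites \xNN as \u00NN so a strict JSON parser accepts it.
    public static string ReplaceHexEscapes(string json)
    {
        return HexEscape.Replace(json, m =>
            m.Groups[1].Success ? @"\u00" + m.Groups[1].Value : m.Value);
    }

    public static void Main()
    {
        // "Text" is a made-up property name for this example.
        string broken = @"[{""Text"":""it\x27s""}]";
        Console.WriteLine(ReplaceHexEscapes(broken)); // [{"Text":"it\u0027s"}]
    }
}
```

In the question's ParseJson method, the repaired string would be passed to jss.Deserialize in place of the HtmlDecode result.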

but personally, i think that if you have access to the source, you should modify it to make it output valid JSON that matches the spec ...

greetz

back2dos

Answer by Luke Schafer

I'm not sure I understand the requirements, but you could try looking at System.Security.SecurityElement.Escape (that's what I'm using; I'm guessing there's an unescape, but I don't have time to check the API right now, I have to go to a meeting).

Good luck
