C#: Using Protobuf-net, I suddenly got an exception about an unknown wire-type
Original question: http://stackoverflow.com/questions/2152978/
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original address, and attribute it to the original authors (not me): Stack Overflow
Asked by Marc Gravell
(this is a re-post of a question that I saw in my RSS, but which was deleted by the OP. I've re-added it because I've seen this question asked several times in different places; wiki for "good form")
Suddenly, I receive a ProtoException when deserializing, and the message is: unknown wire-type 6
- What is a wire-type?
- What are the different wire-type values and their description?
- I suspect a field is causing the problem, how to debug this?
Accepted answer by Marc Gravell
First thing to check:
IS THE INPUT DATA PROTOBUF DATA? If you try and parse another format (json, xml, csv, binary-formatter), or simply broken data (an "internal server error" html placeholder text page, for example), then it won't work.
What is a wire-type?
It is a 3-bit flag that tells it (in broad terms; it is only 3 bits after all) what the next data looks like.
Each field in protocol buffers is prefixed by a header that tells it which field (number) it represents, and what type of data is coming next; this "what type of data" is essential to support the case where unanticipated data is in the stream (for example, you've added fields to the data-type at one end), as it lets the serializer know how to read past that data (or store it for round-trip if required).
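As a rough illustration of that header (a minimal sketch, not protobuf-net's internal code; the `FieldHeader` class and its method names are made up for this example): the wire-type sits in the low 3 bits of the header value and the field number in the remaining bits. Only the values 0 to 5 are defined as wire-types, which is why a value such as 6 is reported as unknown.

```csharp
using System;

// Hypothetical helper for illustration only; not part of protobuf-net.
public static class FieldHeader
{
    // The wire-type is stored in the low 3 bits of the header value.
    public static int WireType(uint header)
    {
        return (int)(header & 0x7);
    }

    // The field number is stored in the remaining (higher) bits.
    public static int FieldNumber(uint header)
    {
        return (int)(header >> 3);
    }

    // Wire-types 0 to 5 are defined; 6 and 7 are not.
    public static bool IsKnownWireType(int wireType)
    {
        return wireType >= 0 && wireType <= 5;
    }

    // Builds a short human-readable summary of a header value.
    public static string Describe(uint header)
    {
        int wireType = WireType(header);
        string status = IsKnownWireType(wireType) ? "known" : "unknown";
        return string.Format("field {0}, wire-type {1} ({2})", FieldNumber(header), wireType, status);
    }
}
```

For example, `FieldHeader.Describe(0x08)` describes field 1 with wire-type 0, while `0x0E` decodes to field 1 with the undefined wire-type 6. Text data often produces such values by accident: the ASCII character `6` (0x36) decodes to field 6 with wire-type 6.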
What are the different wire-type values and their description?
- 0: variant-length integer (up to 64 bits) - base-128 encoded with the MSB indicating continuation (used as the default for integer types, including enums)
- 1: 64-bit - 8 bytes of data (used for double, or electively for long/ulong)
- 2: length-prefixed - first read an integer using variant-length encoding; this tells you how many bytes of data follow (used for strings, byte[], "packed" arrays, and as the default for child objects properties / lists)
- 3: "start group" - an alternative mechanism for encoding child objects that uses start/end tags - largely deprecated by Google; it is more expensive to skip an entire child-object field since you can't just "seek" past an unexpected object
- 4: "end group" - twinned with 3
- 5: 32-bit - 4 bytes of data (used for float, or electively for int/uint and other small integer types)
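To make wire-type 0 more concrete, here is a minimal sketch of base-128 ("varint") decoding, assuming the input is a plain byte array. The `Varint` class is hypothetical and is not how protobuf-net exposes this.

```csharp
using System;

// Hypothetical helper for illustration only; not part of protobuf-net.
public static class Varint
{
    // Reads one base-128 value starting at 'offset' and advances 'offset' past it.
    // Each byte carries 7 bits of payload, least-significant group first;
    // the most-significant bit (0x80) means "another byte follows".
    public static ulong Read(byte[] data, ref int offset)
    {
        ulong value = 0;
        int shift = 0;
        while (true)
        {
            if (offset >= data.Length)
                throw new InvalidOperationException("Data ended in the middle of a varint.");
            if (shift >= 64)
                throw new InvalidOperationException("Varint is longer than 64 bits.");

            byte b = data[offset++];
            value |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return value;
            shift += 7;
        }
    }
}
```

With this sketch, the two bytes 0x96 0x01 decode to 150: the first byte contributes 22 and signals continuation, and the second contributes 1 shifted left by 7 bits (128).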
I suspect a field is causing the problem, how to debug this?
Are you serializing to a file? The most likely cause (in my experience) is that you have overwritten an existing file, but have not truncated it; i.e. it was 200 bytes; you've re-written it, but with only 182 bytes. There are now 18 bytes of garbage on the end of your stream that are tripping it up. Files must be truncated when re-writing protocol buffers. You can do this with FileMode:
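// FileMode.Truncate requires the file to already exist;
// FileMode.Create also truncates, and creates the file if it is missing.
using (var file = new FileStream(path, FileMode.Truncate)) {
    // write
}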
or alternatively by calling SetLength after writing your data:
file.SetLength(file.Position);
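The effect is easy to reproduce without any serializer. The sketch below only uses System.IO; the `TruncateDemo` class and its method names are made up for this illustration.

```csharp
using System;
using System.IO;

// Illustration only: shows how stale bytes survive when a file is rewritten without truncation.
public static class TruncateDemo
{
    // Overwrites from the start of the file but keeps the old length (the problem case).
    public static long OverwriteWithoutTruncating(string path, byte[] payload)
    {
        using (var file = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        {
            file.Write(payload, 0, payload.Length);
        }
        return new FileInfo(path).Length;
    }

    // Same write, then trims the file to exactly what was written.
    public static long OverwriteAndSetLength(string path, byte[] payload)
    {
        using (var file = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        {
            file.Write(payload, 0, payload.Length);
            file.SetLength(file.Position);
        }
        return new FileInfo(path).Length;
    }
}
```

Writing 200 bytes and then 182 bytes with the first method leaves a 200-byte file, so the last 18 bytes are left over from the earlier write; the second method leaves exactly 182 bytes.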
Other possible cause
You are (accidentally) deserializing a stream into a different type than what was serialized. It's worth double-checking both sides of the conversation to ensure this is not happening.
Answer by Chriseyre2000
This can also be caused by an attempt to write more than one protobuf message to a single stream. The solution is to use SerializeWithLengthPrefix and DeserializeWithLengthPrefix.
Why this happens:
The protobuf specification supports a fairly small number of wire-types (the binary storage formats) and data-types (the .NET etc. data-types). Additionally, this is not 1:1, nor is it 1:many or many:1 - a single wire-type can be used for multiple data-types, and a single data-type can be encoded via any of multiple wire-types. As a consequence, you cannot fully understand a protobuf fragment unless you already know the schema, so that you know how to interpret each value. When you are, say, reading an Int32 data-type, the supported wire-types might be "varint", "fixed32" and "fixed64", whereas when reading a String data-type, the only supported wire-type is "string".
If there is no compatible map between the data-type and wire-type, then the data cannot be read, and this error is raised.
Now let's look at why this occurs in the scenario here:
using System;
using System.IO;
using ProtoBuf;

[ProtoContract]
public class Data1
{
    [ProtoMember(1, IsRequired = true)]
    public int A { get; set; }
}

[ProtoContract]
public class Data2
{
    [ProtoMember(1, IsRequired = true)]
    public string B { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var d1 = new Data1 { A = 1 };
        var d2 = new Data2 { B = "Hello" };
        var ms = new MemoryStream();
        // Two messages written back to back, with nothing marking where the first ends
        Serializer.Serialize(ms, d1);
        Serializer.Serialize(ms, d2);
        ms.Position = 0;
        var d3 = Serializer.Deserialize<Data1>(ms); // This will fail
        var d4 = Serializer.Deserialize<Data2>(ms);
        Console.WriteLine("{0} {1}", d3, d4);
    }
}
In the above, two messages are written directly after each other. The complication is: protobuf is an appendable format, with append meaning "merge". A protobuf message does not know its own length, so the default way of reading a message is: read until EOF. However, here we have appended two different types. If we read this back, it does not know when we have finished reading the first message, so it keeps reading. When it gets to data from the second message, we find ourselves reading a "string" wire-type, but we are still trying to populate a Data1 instance, for which member 1 is an Int32. There is no map between "string" and Int32, so it explodes.
The *WithLengthPrefix methods allow the serializer to know where each message finishes; so, if we serialize a Data1 and a Data2 using the *WithLengthPrefix methods, then deserialize a Data1 and a Data2 using the *WithLengthPrefix methods, it correctly splits the incoming data between the two instances, only reading the right value into the right object.
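The idea behind a length prefix can be sketched without protobuf-net at all. The following is a simplified illustration that assumes a fixed 4-byte length in front of each payload; it is not protobuf-net's actual implementation, and the `Framing` class is made up for this example.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not protobuf-net's implementation.
public static class Framing
{
    // Writes a 4-byte length followed by the payload bytes.
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        byte[] length = BitConverter.GetBytes(payload.Length);
        stream.Write(length, 0, length.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one frame; returns null when the stream is already at its end.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] length = new byte[4];
        int got = ReadFully(stream, length);
        if (got == 0) return null;
        if (got < length.Length) throw new EndOfStreamException("Truncated length prefix.");

        byte[] payload = new byte[BitConverter.ToInt32(length, 0)];
        if (ReadFully(stream, payload) < payload.Length)
            throw new EndOfStreamException("Truncated payload.");
        return payload;
    }

    // Stream.Read may return fewer bytes than requested, so loop until done or EOF.
    private static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Because the reader knows exactly how many bytes belong to each frame, it can stop at the boundary instead of reading until EOF, which is what keeps the second message from being merged into the first.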
Additionally, when storing heterogeneous data like this, you might want to additionally assign (via *WithLengthPrefix) a different field-number to each class; this provides greater visibility of which type is being deserialized. There is also a method in Serializer.NonGeneric which can then be used to deserialize the data without needing to know in advance what we are deserializing:
// Data1 is "1", Data2 is "2"
Serializer.SerializeWithLengthPrefix(ms, d1, PrefixStyle.Base128, 1);
Serializer.SerializeWithLengthPrefix(ms, d2, PrefixStyle.Base128, 2);
ms.Position = 0;

var lookup = new Dictionary<int, Type> { { 1, typeof(Data1) }, { 2, typeof(Data2) } };
object obj;
while (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(ms,
    PrefixStyle.Base128, fieldNum => lookup[fieldNum], out obj))
{
    Console.WriteLine(obj); // writes Data1 on the first iteration,
                            // and Data2 on the second iteration
}
Answer by Kirk Woll
Since the stack trace references this StackOverflow question, I thought I'd point out that you can also receive this exception if you (accidentally) deserialize a stream into a different type than what was serialized. So it's worth double-checking both sides of the conversation to ensure this is not happening.
Answer by Tomasito
Also check the obvious: that all your subclasses have the [ProtoContract] attribute. It is easy to miss when you have rich DTOs.
Answer by Tobias
Previous answers already explain the problem better than I can. I just want to add an even simpler way to reproduce the exception.
This error will also occur simply if the type of a serialized ProtoMember is different from the expected type during deserialization.
For instance if the client sends the following message:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Foo { get; set; }
}
But what the server deserializes the message into is the following class:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public string Foo { get; set; }
}
Then this will result in an error message that is slightly misleading for this case:
ProtoBuf.ProtoException: Invalid wire-type; this usually means you have over-written a file without truncating or setting the length
It will even occur if the property name changed. Let's say the client sent the following instead:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Bar { get; set; }
}
This will still cause the server to deserialize the int Bar to the string Foo, which causes the same ProtoBuf.ProtoException. Members are matched by their field number (1 in both classes), not by property name.
I hope this helps somebody debugging their application.
Answer by Micah
I've seen this issue when using the wrong Encoding type to convert the bytes to and from strings.
You need to use Encoding.Default and not Encoding.UTF8.
string str;
using (var ms = new MemoryStream())
{
    Serializer.Serialize(ms, obj);
    var bytes = ms.ToArray();
    str = Encoding.Default.GetString(bytes);
}
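A word of caution on this approach: text encodings are not designed to round-trip arbitrary binary data, and on .NET Core and later Encoding.Default is itself UTF-8, so the snippet above may still corrupt the payload there. If the bytes have to travel as a string, Base64 is a common alternative. A minimal sketch comparing the two (the `BinaryAsText` class is made up for this example):

```csharp
using System;
using System.Linq;
using System.Text;

// Illustration only: compares two ways of carrying binary data inside a string.
public static class BinaryAsText
{
    // Round-trips bytes through UTF-8 text; invalid sequences are replaced, so data can change.
    public static byte[] ViaUtf8(byte[] bytes)
    {
        string text = Encoding.UTF8.GetString(bytes);
        return Encoding.UTF8.GetBytes(text);
    }

    // Round-trips bytes through Base64, which is designed to carry arbitrary bytes.
    public static byte[] ViaBase64(byte[] bytes)
    {
        string text = Convert.ToBase64String(bytes);
        return Convert.FromBase64String(text);
    }

    // Compares two byte arrays element by element.
    public static bool SameBytes(byte[] a, byte[] b)
    {
        return a.SequenceEqual(b);
    }
}
```

Taking the three bytes 0x08 0x96 0x01 (field 1 holding the varint 150) as input, the UTF-8 round trip does not return the original bytes, because 0x96 on its own is not valid UTF-8 and gets replaced; the Base64 round trip returns them unchanged.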
Answer by Chris Xue
If you are using SerializeWithLengthPrefix, be aware that casting the instance to the object type breaks the deserialization code and causes ProtoBuf.ProtoException : Invalid wire-type.
using (var ms = new MemoryStream())
{
    var msg = new Message();
    // Casting msg to object breaks the deserialization code.
    Serializer.SerializeWithLengthPrefix(ms, (object)msg, PrefixStyle.Base128);
    ms.Position = 0;
    Serializer.DeserializeWithLengthPrefix<Message>(ms, PrefixStyle.Base128);
}
Answer by kamil-mrzyglod
This happened in my case because I had something like this:
var ms = new MemoryStream();
Serializer.Serialize(ms, batch);
_queue.Add(Convert.ToBase64String(ms.ToArray()));
So basically I was putting a Base64 string into a queue and then, on the consumer side, I had:
var stream = new MemoryStream(Encoding.UTF8.GetBytes(myQueueItem));
var batch = Serializer.Deserialize<List<EventData>>(stream);
So although the type of each myQueueItem was correct, I forgot that I had converted the bytes to a Base64 string. The solution was to convert it back:
var bytes = Convert.FromBase64String(myQueueItem);
var stream = new MemoryStream(bytes);
var batch = Serializer.Deserialize<List<EventData>>(stream);