C#: What's the best way to create a short hash, similar to what TinyURL does?
Original question: http://stackoverflow.com/questions/1116860/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Arron S
I'm currently using MD5 hashes, but I would like to find something that creates a shorter hash using only [a-z][A-Z][0-9]. It only needs to be around 5-10 characters long.
Is there something out there that already does this?
Update 1:
I like the CRC32 hash. Is there a clean way of calculating it in .NET?
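For reference, a CRC32 can be computed without any external dependency. The following is only a minimal sketch, not the code from the link mentioned in this thread: the names `Crc32Sketch` and `Compute` are made up for illustration, and it processes the common reflected polynomial 0xEDB88320 bit by bit instead of using the faster lookup table.

```csharp
using System;
using System.Text;

public static class Crc32Sketch
{
    // Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the variant used by zip and PNG).
    public static uint Compute(byte[] data)
    {
        uint crc = 0xFFFFFFFFu;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
            {
                // Shift right; XOR in the polynomial when the low bit was set.
                crc = (crc & 1u) != 0 ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
            }
        }
        return ~crc;
    }

    // Convenience overload; hashing the UTF-8 bytes is an assumption of this sketch.
    public static uint Compute(string text) => Compute(Encoding.UTF8.GetBytes(text));
}
```

For the standard check input "123456789" this variant yields 0xCBF43926.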
Update 2:
I'm using the CRC32 function from the link Joe provided. How can I convert the uint into the characters defined above?
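One possible way to turn a uint into [a-z][A-Z][0-9] characters is a plain change of base to 62. This is a sketch under assumptions: `Base62Sketch`, `Encode`, and the ordering of the alphabet are illustrative choices, not something taken from the answers in this thread.

```csharp
using System;
using System.Text;

public static class Base62Sketch
{
    // 26 lowercase + 26 uppercase + 10 digits = 62 symbols (order is arbitrary).
    private const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    // Repeatedly divide by 62 and prepend the symbol for each remainder.
    public static string Encode(ulong value)
    {
        if (value == 0) return Alphabet[0].ToString();
        var sb = new StringBuilder();
        while (value > 0)
        {
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }
}
```

A 32-bit value such as a CRC32 fits in at most 6 characters this way, because 62^6 is larger than 2^32. A uint converts implicitly to the ulong parameter.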
Accepted answer by Vlad
The .NET string object has a GetHashCode() function that returns an integer. Convert it to hexadecimal, padded to an 8-character string.
Like so:
string hashCode = String.Format("{0:X8}", sourceString.GetHashCode());
More on that: http://msdn.microsoft.com/en-us/library/system.string.gethashcode.aspx
UPDATE: Added the remarks from the link above to this answer:
The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.
If two string objects are equal, the GetHashCode method returns identical values. However, there is not a unique hash code value for each unique string value. Different strings can return the same hash code.
Notes to Callers
The value returned by GetHashCode is platform-dependent. It differs on the 32-bit and 64-bit versions of the .NET Framework.
Answer by Arron S
You can use CRC32. Its value is 32 bits, usually written as 8 hex characters, so it is much shorter than an MD5 hash. Uniqueness can be improved by adding a timestamp to the actual value before hashing.
So it will look like http://foo.bar/abcdefg12.
Answer by M4N
You could take the first 5-10 alphanumeric characters of the MD5 hash.
Answer by Kevin Montrose
Answer by j?rg
I don't think URL shortening services use hashes. I think they just keep a running alphanumeric string that is incremented with every new URL and stored in a database. If you really need to use a hash function, have a look at this link: some hash functions. Also, a bit off-topic, but depending on what you are working on, this might be interesting: Coding Horror article
Answer by Colin
You can decrease the number of characters in the MD5 hash by re-encoding it as alphanumerics. Each MD5 character is usually represented as hex, so that's 16 possible values (4 bits). [a-zA-Z0-9] includes 62 possible values, so each alphanumeric character can carry almost 6 bits, roughly the information of one and a half hex digits.
EDIT:
Here's a function that takes a number in the range 0-63 (6 bits) and returns a character from [a-zA-Z0-9]. This should give you an idea of how to implement it. Note that there may be some issues with the types; I didn't test this code.
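// Maps a 6-bit value (0-63) to a character in [a-zA-Z0-9].
// 62 and 63 reuse '0' and '1', so the mapping is not reversible.
static char num2char(uint x)
{
    if (x < 26) return (char)('a' + (int)x);
    if (x < 52) return (char)('A' + (int)x - 26);
    if (x < 62) return (char)('0' + (int)x - 52);
    if (x == 62) return '0';
    if (x == 63) return '1';
    throw new ArgumentOutOfRangeException(nameof(x));
}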
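The idea of re-encoding an MD5 hash as alphanumerics and keeping only a few characters could be sketched as follows. This swaps in a different technique from the 6-bit function above: it treats the whole hash as one big number and converts it to base 62. The names `Md5Base62Sketch` and `Encode`, the alphabet order, and the use of UTF-8 are assumptions made for illustration.

```csharp
using System;
using System.Numerics;
using System.Security.Cryptography;
using System.Text;

public static class Md5Base62Sketch
{
    private const string Alphabet =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Returns `length` base-62 digits (lowest-order first) of the MD5 hash of `input`.
    public static string Encode(string input, int length)
    {
        byte[] hash;
        using (var md5 = MD5.Create())
        {
            hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
        }

        // Append a zero byte so BigInteger interprets the bytes as a non-negative number.
        var unsigned = new byte[hash.Length + 1];
        Array.Copy(hash, unsigned, hash.Length);
        var value = new BigInteger(unsigned);

        var sb = new StringBuilder();
        while (sb.Length < length)
        {
            sb.Append(Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }
}
```

Truncating to 5-10 characters keeps only part of the hash, so collisions become far more likely than with the full MD5 value; the caller would still need to check for them.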
Answer by Arjan
You cannot use a short hash, as you need a one-to-one mapping from the short version to the actual value. For a short hash the chance of a collision would be far too high. Normal, long hashes would not be very user-friendly (and even though the chance of a collision would probably be small enough then, it still wouldn't feel "right" to me).
TinyURL.com seems to use an incremented number that is converted to Base 36 (0-9, A-Z).
Answer by codymanix
You could encode your MD5 hash with base64 instead of hexadecimal. This way you get a shorter URL made up mostly of the characters [a-z][A-Z][0-9]. Note that standard base64 also emits '+', '/', and '=' padding, which would need to be replaced or stripped for use in a URL.
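A minimal sketch of the base64 approach, assuming the input is hashed as UTF-8 and that the non-alphanumeric base64 symbols are swapped for URL-safe ones. The names `Md5Base64Sketch` and `Encode` are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Base64Sketch
{
    // MD5 is 16 bytes, which base64 encodes as 24 characters including two '=' of padding.
    public static string Encode(string input)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
            return Convert.ToBase64String(hash)
                .TrimEnd('=')       // drop the padding
                .Replace('+', '-')  // URL-safe substitutes
                .Replace('/', '_');
        }
    }
}
```

The result is 22 characters long instead of 32 hex characters. It may still contain '-' and '_', so it does not strictly stay within [a-z][A-Z][0-9].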
Answer by Scott Wisniewski
Is your goal to create a URL shortener or to create a hash function?
If your goal is to create a URL shortener, then you don't need a hash function. In that case, you just want to pre-generate a sequence of cryptographically secure random numbers, and then assign each URL to be encoded a unique number from the sequence.
You can do this using code like:
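using System.Security.Cryptography;

const int numberOfNumbersNeeded = 100; // how many random numbers to pre-generate
const int numberOfBytesNeeded = 8;     // 64 bits of randomness per number
var randomGen = RandomNumberGenerator.Create();
for (int i = 0; i < numberOfNumbersNeeded; ++i)
{
    var bytes = new Byte[numberOfBytesNeeded];
    randomGen.GetBytes(bytes);
    // Store bytes here (for example in a list or database) so it can later be assigned to a URL.
}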
Using the cryptographic number generator will make it very difficult for people to predict the strings you generate, which I assume is important to you.
You can then convert the 8-byte random number into a string using the chars in your alphabet. This is basically a change-of-base calculation (from base 256 to base 62).
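The change of base described above could look roughly like this. It is a sketch, not the answerer's code: `RandomKeySketch`, `NewKey`, and the alphabet order are illustrative, and the 8 random bytes are read as one unsigned 64-bit integer before conversion.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class RandomKeySketch
{
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static string NewKey()
    {
        var bytes = new byte[8];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes); // cryptographically secure random bytes
        }

        // Interpret the 8 bytes as a single number, then rewrite it in base 62.
        ulong value = BitConverter.ToUInt64(bytes, 0);
        var sb = new StringBuilder();
        do
        {
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        } while (value > 0);
        return sb.ToString();
    }
}
```

A 64-bit value needs at most 11 base-62 characters. Random keys are not guaranteed to be unique, so a real shortener would still need to check each new key against the ones already stored.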
Answer by Norman Ramsey
There's a wonderful but ancient program called btoa, which converts binary to ASCII using upper- and lower-case letters, digits, and two additional characters. There's also the MIME base64 encoding; most Linux systems probably have a program called base64 or base64encode. Either one would give you a short, readable string from a 32-bit CRC.
Answer by KingNestor
Just take a Base36 (case-insensitive) or Base64 encoding of the ID of the entry.
So, let's say I wanted to use Base36:
(ID - Base36)
1 - 1
2 - 2
3 - 3
10 - A
11 - B
12 - C
...
10000 - 7PS
22000 - GZ4
34000 - Q8G
...
1000000 - LFLS
2345000 - 1E9EW
6000000 - 3KLMO
You could keep these even shorter if you went with base64, but then the URLs would be case-sensitive. As you can see, you still get a nice, neat alphanumeric key, with a guarantee that there will be no collisions!
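The ID-to-Base36 mapping in the table above can be reproduced with a short conversion routine. This is a minimal sketch; `Base36Sketch` and `Encode` are hypothetical names, and the ID is assumed to be a non-negative database key.

```csharp
using System;
using System.Text;

public static class Base36Sketch
{
    private const string Digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Converts a non-negative ID to its Base36 representation.
    public static string Encode(long id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        var sb = new StringBuilder();
        do
        {
            sb.Insert(0, Digits[(int)(id % 36)]);
            id /= 36;
        } while (id > 0);
        return sb.ToString();
    }
}
```

For example, `Base36Sketch.Encode(10000)` gives "7PS" and `Base36Sketch.Encode(1000000)` gives "LFLS". Because every ID maps to exactly one string, two entries can never share a key.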