Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/1323283/

Time: 2020-08-06 15:08:27  Source: igfitidea

How to match a URL in C#?

c# regex url

Asked by Tomasz Smykowski

I have found many examples of how to match particular types of URLs in PHP and other languages. I need to match any URL in my C# application. How can I do this? By URL I mean links to any sites, or to files on sites, subdirectories, and so on.

I have a text like this: "Go to my awsome website http:\www.google.pl\something\blah\?lang=5" or similar, and I need to extract the link from this message. Links can also start with just www.

Accepted answer by michele

If you need to test your regex for finding URLs, you can try this resource:

http://gskinner.com/RegExr/

It will test your regex while you're writing it.

In C# you can use a regex, for example as below:

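using System.Text.RegularExpressions;

// Named groups capture the protocol and the domain of each URL.
Regex r = new Regex(@"(?<Protocol>\w+):\/\/(?<Domain>[\w@][\w.:@]+)\/?[\w\.?=%&\-@/$,]*");
// Match the regular expression pattern against a text string.
Match m = r.Match(text);
while (m.Success)
{
    // Do things with your matching text, e.g. m.Value or m.Groups["Domain"].Value.
    m = m.NextMatch();
}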
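
For anyone who wants to try the loop above as a complete program, here is a minimal sketch. The class name `UrlFinder`, the helper `FindUrls`, and the sample text are my own assumptions for illustration, not part of the answer; the pattern is the one from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlFinder
{
    // Same pattern as in the answer above.
    private static readonly Regex UrlRegex = new Regex(
        @"(?<Protocol>\w+):\/\/(?<Domain>[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*");

    // Hypothetical helper: returns every substring of the input that the pattern matches.
    public static List<string> FindUrls(string text)
    {
        var urls = new List<string>();
        for (Match m = UrlRegex.Match(text); m.Success; m = m.NextMatch())
        {
            urls.Add(m.Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Hypothetical sample text with two links.
        string text = "Go to http://www.google.pl/something/blah/?lang=5 or https://example.com now";
        foreach (string url in FindUrls(text))
        {
            Console.WriteLine(url);
        }
    }
}
```

Note that this pattern requires a `protocol://` prefix, so a link that starts with just `www.` would not be picked up by it.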

Answer by DanDan

I am not sure exactly what you are asking, but a good start would be the Uri class, which will parse the URL for you.
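
A minimal sketch of what that could look like, assuming you already have a candidate string and only care about http/https links. The class name `UriDemo`, the helper `TryParseHttpUrl`, and the sample URL are made up for illustration; `Uri.TryCreate` and the `Uri` properties are the standard System.Uri API.

```csharp
using System;

public static class UriDemo
{
    // Hypothetical helper: true (and the parsed Uri) when the candidate
    // is an absolute http or https URL.
    public static bool TryParseHttpUrl(string candidate, out Uri uri)
    {
        return Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    public static void Main()
    {
        // Hypothetical sample URL.
        if (TryParseHttpUrl("http://www.contoso.com:8080/letters/readme.html?lang=5", out Uri uri))
        {
            Console.WriteLine(uri.Host);          // www.contoso.com
            Console.WriteLine(uri.Port);          // 8080
            Console.WriteLine(uri.AbsolutePath);  // /letters/readme.html
            Console.WriteLine(uri.Query);         // ?lang=5
        }
    }
}
```

Uri validates and splits a single candidate; it does not search inside a longer text, so it would probably be combined with a regex or a split on whitespace. A bare `www.` address has no scheme, so you would likely need to prepend one (e.g. `http://`) before parsing it as an absolute URI.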

Answer by almog.ori

Regex regx = new Regex(@"http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?", RegexOptions.IgnoreCase);

Answer by David

Here's one defined for URLs:

^http(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\+&%$#_]*)?$

http://msdn.microsoft.com/en-us/library/ms998267.aspx

Answer by apiguy

This will return a match collection of all matches found within "yourStringThatHasUrlsInIt":

var pattern = @"((ht|f)tp(s?)\:\/\/|~/|/)?([w]{2}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?";
var regex = new Regex(pattern);
var matches = regex.Matches(yourStringThatHasUrlsInIt);

The return value will be a "MatchCollection", which you can read more about here:

http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchcollection.aspx
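
Reading the results out of a MatchCollection could look roughly like this. The sketch uses a deliberately simple stand-in pattern (`https?://\S+`) rather than the one from the answer, and the names `MatchCollectionDemo`, `FindAll`, and the sample input are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchCollectionDemo
{
    // Hypothetical, deliberately simple pattern: http(s):// followed by non-whitespace.
    private static readonly Regex SimpleUrl = new Regex(@"https?://\S+");

    // Hypothetical helper: copies the matched text out of the MatchCollection.
    public static string[] FindAll(string input)
    {
        MatchCollection matches = SimpleUrl.Matches(input);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        string input = "See http://a.example/x and https://b.example/y?z=1 for details";

        // A MatchCollection can also be enumerated directly.
        foreach (Match match in SimpleUrl.Matches(input))
        {
            // Index is the position of the match within the input string.
            Console.WriteLine($"{match.Index}: {match.Value}");
        }
    }
}
```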

Answer by Chuck Rostance

Microsoft has a nice page of regular expressions... this is what they suggest (works pretty well too):

^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\+&%$#_]*)?$

http://msdn.microsoft.com/en-us/library/ff650303.aspx#paght000001_commonregularexpressions

Answer by AnonymousUser

// This code returns (protocol://)host:port from a URL.

// URLs with different protocols are commented out below. Just uncomment one to test.
//string url = "http://www.contoso.com:8080/letters/readme.html";
//string url = "ftp://www.contoso.com:8080/letters/readme.html";
//string url = "l2tp://1.5.8.6:8080/letters/readme.html";
string url = "l2tp://1.5.8.6:8080/letters/readme.html";

string host = ""; // will hold the host part extracted from the URL
// protocol, (IP/domain), port
host = Regex.Match(url, @"^(?<proto>\w+)://+?(?<host>[A-Za-z0-9\-\.]+)+?(?<port>:\d+)?/", RegexOptions.None, TimeSpan.FromMilliseconds(150)).Result("${proto}://${host}${port}");
// (IP/domain):port without the protocol, useful e.g. when an HTTPS board loads images from an HTTP host.
//host = Regex.Match(url, @"^(?<proto>\w+)://+?(?<host>[A-Za-z0-9\-\.]+)+?(?<port>:\d+)?/", RegexOptions.None, TimeSpan.FromMilliseconds(150)).Result("${host}${port}");

Console.WriteLine("url: " + url + "\nhost: " + host); // display the host

see https://rextester.com/PVSO54371
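
As a possible alternative to the regex, the built-in Uri class can return the same scheme://host:port part. This is only a sketch under the assumption that the input is already a well-formed absolute URL; the names `AuthorityDemo` and `GetAuthority` are made up for illustration, while `Uri.GetLeftPart` and `UriPartial.Authority` are the standard System.Uri API.

```csharp
using System;

public static class AuthorityDemo
{
    // Hypothetical helper: returns "scheme://host[:port]" for an absolute URL.
    public static string GetAuthority(string url)
    {
        var uri = new Uri(url); // throws UriFormatException on invalid input
        return uri.GetLeftPart(UriPartial.Authority);
    }

    public static void Main()
    {
        Console.WriteLine(GetAuthority("http://www.contoso.com:8080/letters/readme.html"));
        // prints http://www.contoso.com:8080
    }
}
```

One difference to keep in mind: Uri leaves out the port when it is the default for the scheme (for example 80 for http), whereas the regex keeps whatever was written in the URL.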
