嘟嘟社区

Any regex experts here? 30 yuan (a pack of cigarettes) for a working answer


Last edited by Far on 2022-7-14 22:44

The first person to post a correct result gets the red envelope.

https://www.有图比.com/watch?v=Bo3BdddaUo<br />http://baidu.com/hsjjs/xxxxx https://qq.com/v/xxxxx https://hostloc.com/thread-22895-1-1.htmlhttps://hostloc.com/thread-47070-1-1.html
https://www.有图比.com/watch?v=d0sfdsdhHca0<br />http://youku.com/5544 http://b.com/dddd
<br />
</td></tr></table>


python
30 yuan, the price of a pack of cigarettes. Thanks in advance.

I struggled all evening and still could not write the regex I want, so I am asking the experts here.

Requirements:
In the page content above, match every link except those on the youku.com and b.com domains.
Newlines and spaces delimit the links.

Expected result from the page above:
5 links:
https://www.有图比.com/watch?v=Bo3BdddaUo
http://baidu.com/hsjjs/xxxxx
https://qq.com/v/xxxxx
https://hostloc.com/thread-22895-1-1.htmlhttps://hostloc.com/thread-47070-1-1.html
https://www.有图比.com/watch?v=d0sfdsdhHca0

2 excluded links:
http://youku.com/5544
http://b.com/dddd

I don't know how hard this is; I can add another 10 yuan for ice cream!

Thank you.

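Is technical skill worth so little? Not that I could do it myself.
The exclusion should be done in a separate statement, not in the regex.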
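The "separate statement" suggestion can be sketched in Python as follows. This is a minimal sketch, not the method anyone in the thread posted: the names `SAMPLE`, `EXCLUDED_HOSTS`, `is_excluded`, and `extract_links` are made up for illustration, the sample text is retyped from the first post, and it assumes that `<`, `>`, and quotes end a link in addition to whitespace. It also assumes subdomains of an excluded domain should be dropped too. Following the stated rule that only whitespace separates links, the two glued hostloc.com URLs stay together as one item.

```python
import re
from urllib.parse import urlsplit

# Sample text retyped from the first post (assumed layout: one string per source line).
SAMPLE = (
    "https://www.有图比.com/watch?v=Bo3BdddaUo<br />"
    "http://baidu.com/hsjjs/xxxxx https://qq.com/v/xxxxx "
    "https://hostloc.com/thread-22895-1-1.html"
    "https://hostloc.com/thread-47070-1-1.html\n"
    "https://www.有图比.com/watch?v=d0sfdsdhHca0<br />"
    "http://youku.com/5544 http://b.com/dddd\n"
    "<br />\n"
    "</td></tr></table>\n"
)

EXCLUDED_HOSTS = {"youku.com", "b.com"}


def is_excluded(host, excluded=EXCLUDED_HOSTS):
    # Exact domain or any subdomain of it (the subdomain rule is an assumption).
    return any(host == d or host.endswith("." + d) for d in excluded)


def extract_links(text):
    # Step 1: a deliberately simple pattern that grabs every http(s) candidate,
    # stopping at whitespace, angle brackets, or quotes.
    candidates = re.findall(r"https?://[^\s<>\"']+", text)
    # Step 2: filter by host in plain Python instead of inside the regex.
    kept = []
    for url in candidates:
        host = urlsplit(url).hostname or ""
        if not is_excluded(host):
            kept.append(url)
    return kept


if __name__ == "__main__":
    for link in extract_links(SAMPLE):
        print(link)
```

Keeping the exclusion list outside the pattern means adding another domain later only changes `EXCLUDED_HOSTS`, not the regex.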
//([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?
(https?://(?:(?!(youku.com|b.com)))[^.]*(.([^<\s](?!http))+)+)


(https?)://(?!(b.com|youku.com))[\u4e00-\u9fa5-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]
[('http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">', '', '.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">', '>'), ('http://www.w3.org/1999/xhtml">', '', '.w3.org/1999/xhtml">', '>'), ('https://www.有图比.com/watch?v=Bo3BvhGdaU', '', '.有图比.com/watch?v=Bo3BvhGdaU', 'U'), ('https://www.有图比.com/watch?v=d0xRgvhHca', '', '.有图比.com/watch?v=d0xRgvhHca', 'a'), ('http://baidu.com/hsjjs/xxxxx', '', '.com/hsjjs/xxxxx', 'x'), ('https://qq.com/v/xxxxx', '', '.com/v/xxxxx', 'x'), ('https://hostloc.com', '', '.com', 'm'), ('https://www.有图比.com/watch?v=Bo3BvhGdaUo', '', '.有图比.com/watch?v=Bo3BvhGdaUo', 'o'), ('https://www.有图比.com/watch?v=d0xRgvhHca0', '', '.有图比.com/watch?v=d0xRgvhHca0', '0'), ('http://baidu.com/hsjjs/xxxxx', '', '.com/hsjjs/xxxxx', 'x'), ('https://qq.com/v/xxxxx', '', '.com/v/xxxxx', 'x'), ('https://hostloc.com/thread-22895-1-1.htm', '', '.com/thread-22895-1-1.htm', 'm'), ('https://hostloc.com/thread-47070-1-1.html', '', '.com/thread-47070-1-1.html', 'l'), ('https://www.有图比.com/watch?v=d0sfdsdhHca0&lt;br', '', '.有图比.com/watch?v=d0sfdsdhHca0&lt;br', 'r'), ('http://youku.com/5544', '', '.com/5544', '4'), ('http://www.discuz.net"', '', '.discuz.net"', '"')]


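Your regex is already very close. The output above is what I get when I run it against the page. Would you mind adding me on QQ so we can chat?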

linkey posted on 2022-7-14 22:40
(https?)://(?!(b.com|youku.com))[\u4e00-\u9fa5-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]

[('http', ''), ('http', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('https', ''), ('http', ''), ('http', ''), ('https', ''), ('https', ''), ('http', ''), ('https', ''), ('https', ''), ('https', ''), ('http', ''), ('https', ''), ('https', ''), ('http', '')]


The matches only contain the head of each link (the scheme).
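The ('http', '') tuples are what Python's `re.findall` returns when a pattern contains capture groups: it reports the groups, not the whole match. The sketch below assumes the quoted pattern with its backslashes restored and the dots in the excluded domains escaped; `text`, `BODY`, and the variable names are invented for the demonstration. Making the groups non-capturing, or reading `group(0)` from `re.finditer`, yields the full URL.

```python
import re

# Short made-up input containing two wanted links and one excluded link.
text = "http://baidu.com/hsjjs/xxxxx https://qq.com/v/xxxxx http://youku.com/5544"

# Character classes from the quoted pattern (backslashes restored by assumption).
BODY = r"[\u4e00-\u9fa5-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]"

# Two capture groups: findall returns (group 1, group 2) tuples.
capturing = r"(https?)://(?!(b\.com|youku\.com))" + BODY
# No capture groups: findall returns the whole match.
non_capturing = r"https?://(?!(?:b\.com|youku\.com))" + BODY

heads_only = re.findall(capturing, text)
full_urls = re.findall(non_capturing, text)
# Alternative that keeps the original groups: ask each match for group 0.
also_full = [m.group(0) for m in re.finditer(capturing, text)]

if __name__ == "__main__":
    print(heads_only)
    print(full_urls)
```

The second element of each tuple is empty because the group sits inside a negative lookahead and never takes part in a successful match.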

Last edited by buggysoul on 2022-7-14 22:53
(https?:/(?:(?!/(youku.com|b.com)))(/[^./<\s]*(.?[^./<\s](?:(?!ttps?:)))+)+)


It was missing the l.
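The dropped l comes from where the lookahead sits. With `[^<\s](?!http)` the check runs after a character is consumed, so the character directly in front of the next "http" is rejected. Putting the lookahead before the character stops the match exactly where the next URL begins. The sketch below assumes two URLs glued together without whitespace should be split into separate links, as the replies do; `glued`, `after`, and `before` are names made up for the demonstration.

```python
import re

# The two hostloc.com URLs from the first post, glued together with no separator.
glued = (
    "https://hostloc.com/thread-22895-1-1.html"
    "https://hostloc.com/thread-47070-1-1.html"
)

# Lookahead after the character: the "l" right before "https" fails the check.
after = re.findall(r"https?://(?:[^<\s](?!http))+", glued)

# Lookahead before the character: stop only when a new URL starts here.
before = re.findall(r"https?://(?:(?!https?://)[^<\s])+", glued)

if __name__ == "__main__":
    print(after)
    print(before)
```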

[('', '/xhtml1-transitional.dtd">', '>'), ('', '/xhtml">', '>'), ('', '/watch?v=Bo3BvhGdaUohttps:', ':'), ('', '/xxxxx', 'x'), ('', '/xxxxx', 'x'), ('', '/hostloc.com', 'm'), ('', '/watch?v=Bo3BvhGdaUo', 'o'), ('', '/watch?v=d0xRgvhHca0', '0'), ('', '/xxxxx', 'x'), ('', '/xxxxx', 'x'), ('', '/thread-22895-1-1.html', 'l'), ('', '/thread-47070-1-1.html', 'l'), ('', '/watch?v=d0sfdsdhHca0&lt;br', 'r'), ('', '/5544', '4'), ('', '/www.discuz.net"', '"')]


This one actually works worse than the previous one!
I have attached the HTML source below so you can copy it and test.

test.rar (9.5 KB, downloads: 0)

Uploaded yesterday at 22:59
