[Personal regex] URL matching and cleanup (incomplete)


//[a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+
\w(?!w)([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}
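
To see what the first pattern actually captures, here is a small Python check. It is only a sketch: the sample URL and the helper name `raw_match` are made up for illustration, and the pattern is copied verbatim from above. The match keeps the leading `//` and any `www.` prefix, so further cleanup is still needed.

```python
import re

# First pattern from above, copied verbatim: "//" followed by a dotted host name.
HOST_WITH_SLASHES = re.compile(
    r"//[a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+"
)


def raw_match(url):
    """Return the text matched by the first pattern, or None if nothing matches."""
    m = HOST_WITH_SLASHES.search(url)
    return m.group(0) if m else None


# Hypothetical input URL, not taken from the original data set.
sample = "https://www.octubre.com/index.html"
print(raw_match(sample))  # //www.octubre.com
```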

Goal: clean the URLs down to the list below, with no http(s) and no www. Not finished yet:
proctorgallagherinstitute.s3.amazonaws.com
sb.creatingfaces.at
octubre.com
cgttrucks.it
cdn.advocadoapp.com
cdn.airsidemobile.com
gcicentraldecompras.es
phactum.at
elbisco.gr
factsrebrand.com
2spaintransfers.com
jwoodruff-cdn.s3.amazonaws.com
andersonhouse.it
cdn.biogenesi.it
dfi.ch
armg-webassets.s3.amazonaws.com
veronamarathon.it
raasveldbennis.nl
howtosection.com
ngda.de
gobiztravels.com
recom.fr
blogs.tcc.fl.edu
bredelhomes.com
rointe.eu
0317.syzefxis.gov.gr
aglaiakyriakou.gr
aglaiakyriakou.gr
grimms.fr
v-f-a.de
com-sit.com
grapevineofsetx.com
welshisc.co.uk
kanzlei-braeu.de
d276mjqqnq755s.cloudfront.net
classichomesofmaryland.com
qa.triangle.eu.com
triangle.eu.com
a-rworks.grapica.fi
fiberbar.grapica.fi
bw.grapica.fi
stenbacka.grapica.fi
sigtunalitteraturfestival.se
hallbardestination.se
factselevate.com
proctorgallagherinstitute.s3.amazonaws.com
teste.grow.josedemello.pt
195.22.14.50
fundacaoameliademello.org.pt
neoharbor.com
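
One possible way to get from full URLs to a list like the one above, sketched in Python. This is not the finished regex from this post, just an assumption about how the cleanup could work: the pattern `HOST`, the function `clean_url`, and the input URLs are all hypothetical. It skips an optional `http(s):`, an optional `//`, and an optional `www.`, then captures the host.

```python
import re

# Hypothetical pattern: optional scheme, optional "//", optional "www.",
# then a captured host made of dot-separated labels (also matches IPv4 addresses).
HOST = re.compile(
    r"^(?:https?:)?(?://)?(?:www\.)?"
    r"([a-z0-9][-a-z0-9]{0,62}(?:\.[a-z0-9][-a-z0-9]{0,62})+)",
    re.IGNORECASE,
)


def clean_url(url):
    """Return the bare lowercase host of a URL, or None if no host is found."""
    m = HOST.match(url.strip())
    return m.group(1).lower() if m else None


# Made-up inputs whose hosts appear in the target list.
urls = [
    "https://www.octubre.com/index.html",
    "http://cdn.biogenesi.it/a.js",
    "//d276mjqqnq755s.cloudfront.net/x.png",
    "http://195.22.14.50/",
]
for u in urls:
    print(clean_url(u))
```

The sketch does not handle user info (`user@host`) or non-http schemes, and it does not validate the top-level domain.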
