(Regex) MatchCollection without a For Each loop

  • VB.NET
  • .NET (FX) 4.5–4.8

There are 6 replies in this thread. The last post is by Baa$.

    Good day,
    I'm currently building a scraper and have a few questions.
    I've already added threading and the like to speed the code up, but everything still runs too slowly for me.

    VB.NET source code

    Public Sub ScrapeProxyDo(address As String)
        Dim wc As New Net.WebClient
        Dim matchCollection As MatchCollection
        Try
            Dim input As String = wc.DownloadString(address)
            matchCollection = REGEX.Matches(input)
            ' nothing
            For Each obj As Object In matchCollection
                Dim match As Match = CType(obj, Match)
                Dim item As String = match.ToString()
                RichTextBox2.AppendText(item & Environment.NewLine)
            Next
        Catch ex As Exception
            ' nothing (errors are ignored)
        End Try
    End Sub


    The code is fairly simple: it checks whether the page contains an IP address with a port. The regex for that is:

    VB.NET source code

    Dim REGEX As Regex = New Regex("\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:[0-9]{1,5}\b")
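
    To check what this pattern actually accepts, it can be run against a short test string outside the form. The following is only a sketch: the module name, the FindProxies helper and the sample text are made up for illustration, only the pattern itself is taken from the post.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ProxyPatternDemo
    ' Same pattern as in the post: four octets (0-255), a colon, then 1-5 digits.
    Public ReadOnly ProxyRegex As New Regex(
        "\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:[0-9]{1,5}\b")

    ' Hypothetical helper: returns every ip:port match in order of appearance.
    Public Function FindProxies(text As String) As String()
        Dim matches As MatchCollection = ProxyRegex.Matches(text)
        Dim result(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            result(i) = matches(i).Value
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up sample input; the last address has no port and is not matched.
        Dim sample As String = "first 10.0.0.1:8080, second 192.168.1.20:3128, invalid 10.0.0.1"
        For Each proxy As String In FindProxies(sample)
            Console.WriteLine(proxy)
        Next
    End Sub
End Module
```

    An address without a port, like the last one in the sample, is skipped because the pattern requires the colon and at least one digit after it.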


    The problem is that it pulls each proxy out of the downloaded string one at a time, and that naturally takes a while. Is there a way to change it so that it filters out all proxies at once and inserts them into the RichTextBox directly?
    That would be much faster than slowly working through the text from bottom to top.

    Regards

    Kashed wrote:

    The problem is that it pulls each proxy out of the downloaded string one at a time, and that naturally takes a while
    I don't think that takes much time.
    I also don't understand what you mean by "pulling proxies out one at a time" as opposed to "filtering out all proxies and inserting them into the RichTextBox directly".

    But put a Stopwatch in and measure how long it takes.
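
    A measurement along those lines could look roughly like this. It is a sketch under assumptions: the Measure helper, the generated 100,000-line input and the simplified pattern are invented for the example, and it times only the regex step so that it can be compared with the time spent updating the RichTextBox.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module TimingDemo
    ' Hypothetical helper: runs the action once and returns the elapsed wall-clock time.
    Public Function Measure(action As Action) As TimeSpan
        Dim sw As Stopwatch = Stopwatch.StartNew()
        action()
        sw.Stop()
        Return sw.Elapsed
    End Function

    Sub Main()
        ' Made-up stand-in for the downloaded page: 100,000 lines of ip:port.
        Dim lines(99999) As String
        For i As Integer = 0 To lines.Length - 1
            lines(i) = "10.0." & ((i \ 256) Mod 256) & "." & (i Mod 256) & ":8080"
        Next
        Dim input As String = String.Join(Environment.NewLine, lines)

        ' Simplified pattern, only for this timing test.
        Dim pattern As New Regex("\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}\b")

        Dim count As Integer = 0
        Dim elapsed As TimeSpan = Measure(Sub() count = pattern.Matches(input).Count)
        Console.WriteLine(count & " matches in " & elapsed.TotalMilliseconds & " ms")
    End Sub
End Module
```

    Wrapping the AppendText loop in the same Measure call would show which of the two parts accounts for the time.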
    Side note:

    VB.NET source code

    For Each obj As Object In matchCollection
        Dim match As Match = CType(obj, Match)
        Dim item As String = match.ToString()
        RichTextBox2.AppendText(item & Environment.NewLine)
    Next

    The two variables match and item are redundant.

    Better, because it is shorter and therefore more readable:

    VB.NET source code

    For Each match As Match In matchCollection
        RichTextBox2.AppendText(match.ToString() & Environment.NewLine)
    Next
    Full quote of a previous post replaced with the mention function ~VaporiZed

    @ErfinderDesRades:
    Good afternoon,
    with 100 thousand lines it takes about 6 minutes, and it should finish in a few seconds.
    If you try the code, you will see what I mean by each item being pulled out individually.
    I want the regex to filter all matches out of the string in one go and insert them directly, instead of working through the text from bottom to top.

    Regards
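
    One way to get the "all at once" behaviour asked for here is to join all matches into a single string and hand that to the RichTextBox in one call. This is a sketch resting on an assumption that the thread does not confirm: that the 100,000 separate AppendText calls, not the regex, cost most of the time. JoinMatches and the simplified pattern are invented for the example.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module BatchAppendDemo
    ' Hypothetical helper: joins all matches into one string, one match per line.
    ' MatchCollection is non-generic in .NET Framework, hence Cast(Of Match)().
    Public Function JoinMatches(pattern As Regex, input As String) As String
        Dim values = pattern.Matches(input).Cast(Of Match)().Select(Function(m) m.Value)
        Return String.Join(Environment.NewLine, values)
    End Function

    Sub Main()
        ' Simplified pattern and made-up input, only for this example.
        Dim pattern As New Regex("[0-9]{1,3}(?:\.[0-9]{1,3}){3}:[0-9]{1,5}")
        Dim text As String = JoinMatches(pattern, "x 10.0.0.1:80 y 10.0.0.2:81 z")

        ' In the form this would be a single UI update, for example:
        ' RichTextBox2.AppendText(text & Environment.NewLine)
        Console.WriteLine(text)
    End Sub
End Module
```

    The matches are still enumerated one by one internally; what changes is that the control is updated once instead of once per match.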


    Hi,
    approach it a bit differently: write a scraper function that returns a List(Of String). Here is an old scraper source of mine.
    It was very fast...

    VB.NET source code

    Imports System.Threading
    Imports System.Threading.Tasks
    Imports System.IO
    Imports System.Net
    Imports System.Text
    Imports System.Text.RegularExpressions

    Public Class Form1
        Dim toolTip1 As New ToolTip()
        Private cts As CancellationTokenSource
        Private SourceList As New List(Of String), ScraperFound As New List(Of String)
        Dim WorkerBool As Boolean, SitesWithFinds As New List(Of String)
        ' Pattern for ip:port; every octet needs at least one digit.
        Public SearchString As String = "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:[0-9]+", Remove_DP_Source As Boolean

        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            Progressbar_.Value = 0
            toolTip1.AutoPopDelay = 5000
            toolTip1.InitialDelay = 1000
            toolTip1.ReshowDelay = 500
            toolTip1.ShowAlways = True
            toolTip1.SetToolTip(Me.Start_Button, "Start")
            toolTip1.SetToolTip(Me.Load_Button, "Load")
            toolTip1.SetToolTip(Me.Clear_Button, "Clear")
            toolTip1.SetToolTip(Me.Save_Button, "Save")
            toolTip1.SetToolTip(Me.Settings_Button, "Settings")
            toolTip1.SetToolTip(Me.About_Button, "Developer")
            Button_Enable(True, False, False, True)
            Remove_DP_Source = True
        End Sub

        Private Sub Start_Button_Click(sender As Object, e As EventArgs) Handles Start_Button.Click
            If ListViewDB1.Items.Count <> 0 Then
                If Start_Button.ImageIndex = 5 Then
                    ' Idle: switch the button to "Stop" and start one task per source link.
                    Start_Button.ImageIndex = 9
                    toolTip1.SetToolTip(Me.Start_Button, "Stop")
                    cts = New CancellationTokenSource()
                    Dim factory As New TaskFactory(cts.Token,
                                                   TaskCreationOptions.PreferFairness,
                                                   TaskContinuationOptions.ExecuteSynchronously,
                                                   TaskScheduler.Current)
                    Dim Progress As Integer = 0
                    ' Read the item count once on the UI thread; the workers must not touch the control.
                    Dim Total As Integer = ListViewDB1.Items.Count
                    Button_Enable(False, False, False, False)
                    Me.Update_Status("Scraper started...")
                    ScraperFound.Clear()
                    SitesWithFinds.Clear()
                    Progressbar_.Value = 0
                    WorkerBool = True
                    Try
                        For Each iTEM As ListViewItem In ListViewDB1.Items
                            factory.StartNew(New System.Action(
                                Sub()
                                    If WorkerBool Then
                                        Dim found As List(Of String) = Me.SearchWebsiteForString(iTEM, Me.SearchString)
                                        Dim foundCount As Integer = If(found Is Nothing, 0, found.Count)
                                        If foundCount <> 0 Then
                                            ' The list is shared between tasks, so guard the write.
                                            SyncLock SitesWithFinds
                                                SitesWithFinds.Add(iTEM.Text)
                                            End SyncLock
                                            Me.Update_Listview(iTEM, "Found: " & foundCount, Color.SeaGreen, Color.Lavender)
                                        Else
                                            Me.Update_Listview(iTEM, "Found: " & foundCount, Color.Crimson, Color.Lavender)
                                        End If
                                        ' Several tasks count at the same time, so increment atomically.
                                        Dim done As Integer = Interlocked.Increment(Progress)
                                        Me.Update_Progressbar(done, Total)
                                        Me.Update_Status("Total Found: " & ScraperFound.Count & " | Remaining Sites: " & (Total - done))
                                    Else
                                        cts.Cancel()
                                        Thread.Sleep(500)
                                        Me.Update_Status("Scraper cancelled")
                                    End If
                                End Sub))
                        Next
                        ' No tasks are passed here, so this call does not wait for the workers.
                        Task.WaitAny()
                    Catch ex As OperationCanceledException
                        Me.Update_Status("Scraper cancelled")
                    End Try
                Else
                    ' Running: tell the workers to stop and reset the button.
                    WorkerBool = False
                    Start_Button.ImageIndex = 5
                    toolTip1.SetToolTip(Me.Start_Button, "Start")
                    Button_Enable(True, True, True, True)
                End If
            Else
                MessageBox.Show("Please load a source list first.")
            End If
        End Sub

        Private Sub Progressbar__ProgressChanged(sender As Object, Value As Integer) Handles Progressbar_.ProgressChanged
            ' The run is finished once the bar reaches its maximum.
            If Progressbar_.Value = Progressbar_.Maximum Then
                Start_Button.ImageIndex = 5
                Me.Update_Status("Scraper finished, with: " & ScraperFound.Count)
                MessageBox.Show("Scraper finished, with: " & ScraperFound.Count)
                Button_Enable(True, True, True, True)
            End If
        End Sub

        Private Sub Button_Enable(ByVal Load As Boolean, Clear As Boolean, Save As Boolean, Settings As Boolean)
            Load_Button.Enabled = Load
            Clear_Button.Enabled = Clear
            Save_Button.Enabled = Save
            Settings_Button.Enabled = Settings
        End Sub

        Private Sub Load_Button_Click(sender As Object, e As EventArgs) Handles Load_Button.Click
            ListsLoader()
        End Sub

        Private Sub Clear_Button_Click(sender As Object, e As EventArgs) Handles Clear_Button.Click
            ' Compare both counts explicitly; "a And b <> 0" would be evaluated as "a And (b <> 0)".
            If SourceList.Count <> 0 AndAlso ListViewDB1.Items.Count <> 0 Then
                SourceList.Clear()
                ScraperFound.Clear()
                ListViewDB1.Items.Clear()
                Me.Update_Status("All cleared")
                Progressbar_.Value = 0
            End If
            Button_Enable(True, False, False, True)
        End Sub

        Private Sub ListsLoader()
            SourceList.Clear()
            Dim InterneList As New List(Of String)
            Using ofd As New OpenFileDialog With {
                    .Title = "Select Proxy Scraping List",
                    .Filter = "TXT (*.txt)|*.txt",
                    .FilterIndex = 1,
                    .InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments),
                    .Multiselect = True}
                If ofd.ShowDialog <> Windows.Forms.DialogResult.OK Then Exit Sub
                For Each filename As String In ofd.FileNames
                    InterneList.AddRange(IO.File.ReadAllLines(filename))
                Next
            End Using
            SourceList = Me.RemoveDuplicate(InterneList)
            ' Let the user decide whether the duplicates are dropped.
            Dim removeDuplicates As Boolean =
                MessageBox.Show("Remove " & InterneList.Count - SourceList.Count & " duplicated links?",
                                "Question", MessageBoxButtons.YesNo) = Windows.Forms.DialogResult.Yes
            For Each Link In If(removeDuplicates, SourceList, InterneList)
                ListViewDB1.Items.Add(Link)
            Next
            Button_Enable(False, True, False, True)
            Me.Update_Status("Added " & ListViewDB1.Items.Count & " source links to list.")
        End Sub

        Private Sub Save_Button_Click(sender As Object, e As EventArgs) Handles Save_Button.Click
            If ScraperFound.Count = 0 Then Exit Sub
            Using s As New SaveFileDialog With {.Filter = "text|*.txt", .Title = "Save All Finds!"}
                If s.ShowDialog = Windows.Forms.DialogResult.OK Then
                    ' One proxy per line.
                    File.WriteAllLines(s.FileName, ScraperFound)
                    Status_Label.Text = "Proxies saved!"
                End If
            End Using
        End Sub

        Private Sub Settings_Button_Click(sender As Object, e As EventArgs) Handles Settings_Button.Click
            Setform.Show()
        End Sub

        Private Sub About_Button_Click(sender As Object, e As EventArgs) Handles About_Button.Click
            MessageBox.Show("Hi i Call myself DevilEggs" & vbCrLf & "and am a German Programmer, Designer etc." & vbCrLf & "Greetz to All boyz in the Hood BK-CRIME591 & EINHEIT591")
        End Sub

        Private Sub SaveEveryoneWithFindsToolStripMenuItem_Click(sender As Object, e As EventArgs) Handles SaveEveryoneWithFindsToolStripMenuItem.Click
            If SitesWithFinds.Count = 0 Then Exit Sub
            Using s As New SaveFileDialog With {.Filter = "text|*.txt", .Title = "Save All Links with Finds!"}
                If s.ShowDialog = Windows.Forms.DialogResult.OK Then
                    ' One source link per line.
                    File.WriteAllLines(s.FileName, SitesWithFinds)
                    Status_Label.Text = "Saved!"
                End If
            End Using
        End Sub

    #Region "function"
        ' Returns the list without duplicates, keeping the first occurrence of each entry.
        Function RemoveDuplicate(ByVal TheList As List(Of String)) As List(Of String)
            Dim Result As New List(Of String)
            Dim Seen As New HashSet(Of String)
            For Each ElementString As String In TheList
                ' HashSet.Add returns False if the entry is already present.
                If Seen.Add(ElementString) Then Result.Add(ElementString)
            Next
            Return Result
        End Function

        ' Downloads the page behind Url.Text and returns every match of Pattern.
        Public Function SearchWebsiteForString(ByVal Url As ListViewItem, ByVal Pattern As String) As List(Of String)
            If Url Is Nothing Then Throw New ArgumentNullException(NameOf(Url))
            If Pattern Is Nothing Then Throw New ArgumentNullException(NameOf(Pattern))
            Dim result As New List(Of String)
            If String.IsNullOrEmpty(Url.Text) Then Return result
            Try
                Dim request = CType(WebRequest.Create(Url.Text), HttpWebRequest)
                ' Configure the request before sending it; settings applied after GetResponse have no effect.
                request.Timeout = 1600
                request.AllowAutoRedirect = True
                Using webResponse = CType(request.GetResponse(), HttpWebResponse)
                    Using SR As New StreamReader(webResponse.GetResponseStream())
                        Dim content As String = SR.ReadToEnd()
                        Dim r As New Regex(Pattern, RegexOptions.IgnoreCase)
                        Dim match As Match = r.Match(content)
                        While match.Success
                            result.Add(match.Value)
                            match = match.NextMatch()
                        End While
                    End Using
                End Using
                ' ScraperFound is shared between the worker tasks, so guard the write.
                SyncLock ScraperFound
                    ScraperFound.AddRange(result)
                End SyncLock
            Catch ex As Exception
                ' Unreachable or invalid pages are skipped; the caller gets what was found so far.
            End Try
            Return result
        End Function
    #End Region
    #Region "invoke subs"
        Private Sub Update_Listview(ByVal iTEM As ListViewItem, ByVal Nachricht As String, ByVal Forer As Color, ByVal Back As Color)
            ' All ListView access has to happen on the UI thread.
            If Me.ListViewDB1.InvokeRequired Then
                Me.ListViewDB1.Invoke(Sub() Me.Update_Listview(iTEM, Nachricht, Forer, Back))
                Return
            End If
            Dim ListviewItemSub As New ListViewItem.ListViewSubItem With {
                .Text = Nachricht, .ForeColor = Forer, .BackColor = Back}
            Try
                ' Replace the status column left over from a previous run.
                If iTEM.SubItems.Count > 1 Then iTEM.SubItems.RemoveAt(1)
                iTEM.UseItemStyleForSubItems = False
                iTEM.SubItems.Add(ListviewItemSub)
            Catch ex As Exception
                ' The item may have been removed in the meantime; ignore.
            End Try
        End Sub

        Private Sub Update_Status(ByVal News As String)
            If Me.Status_Label.InvokeRequired Then
                Me.Status_Label.Invoke(Sub() Me.Status_Label.Text = News)
            Else
                Me.Status_Label.Text = News
            End If
        End Sub

        Private Sub Update_Progressbar(paramValue As Integer, paramMaximum As Integer)
            If Me.Progressbar_.InvokeRequired Then
                Me.Progressbar_.Invoke(Sub() Me.Update_Progressbar(paramValue, paramMaximum))
            Else
                Me.Progressbar_.Maximum = paramMaximum
                Me.Progressbar_.Value = paramValue
                Me.Progressbar_.Update()
            End If
        End Sub
    #End Region
    End Class
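
    The core idea of the source above, a function that returns a List(Of String) and is fed from several workers at once, can also be shown without any UI or network code. This is a sketch and not the original program: CollectProxies, the simplified pattern and the sample pages are invented, and it uses Parallel.ForEach with a ConcurrentBag instead of the TaskFactory and shared List from the source.

```vbnet
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports System.Threading.Tasks

Module ParallelCollectDemo
    ' A Regex instance can be shared between threads for matching.
    Private ReadOnly ProxyPattern As New Regex("[0-9]{1,3}(?:\.[0-9]{1,3}){3}:[0-9]{1,5}", RegexOptions.Compiled)

    ' Hypothetical helper: extracts matches from every page in parallel and returns one list.
    ' The order of the returned entries is not guaranteed.
    Public Function CollectProxies(pages As IEnumerable(Of String)) As List(Of String)
        Dim found As New ConcurrentBag(Of String)()
        Parallel.ForEach(pages,
                         Sub(page)
                             For Each m As Match In ProxyPattern.Matches(page)
                                 ' ConcurrentBag.Add is safe to call from several threads.
                                 found.Add(m.Value)
                             Next
                         End Sub)
        Return New List(Of String)(found)
    End Function

    Sub Main()
        ' Made-up page contents standing in for downloaded HTML.
        Dim pages As String() = {"a 10.0.0.1:80 b", "c 10.0.0.2:81 d 10.0.0.3:82"}
        Dim result As List(Of String) = CollectProxies(pages)
        Console.WriteLine(result.Count & " proxies found")
    End Sub
End Module
```

    A concurrent collection means the workers need no explicit locking, at the price of losing the original order of the matches.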
