简单的 PHP 网络爬虫的示例

何庆 · 发表于 2023-1-18 20:56:55

要创建 PHP 网络爬虫，您需要使用一些不同的库和工具，例如 cURL 和正则表达式。cURL 库允许您向 Web 服务器发送 HTTP 请求并检索响应，而正则表达式用于从响应中搜索和提取特定模式的文本。
这是一个简单的 PHP 网络爬虫的示例：
<?php
// Initialize the cURL session
$curl = curl_init();

// Set the URL to crawl
$url = &#34;https://en.wikipedia.org/wiki/Web_crawler&#34;;

// Set the cURL options
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

// Send the request and retrieve the response
$response = curl_exec($curl);

// Check for errors
if (curl_errno($curl)) {
  // Handle error
} else {
  // Extract the title of the page
  $regex = &#34;/<title>(.*)<\/title>/&#34;;
  preg_match($regex, $response, $matches);
  $title = $matches[1];
  echo &#34;Title: $title\n&#34;;

  // Extract all the links on the page
  $regex = &#34;/<a href=\&#34;(.*)\&#34;>/&#34;;
  preg_match_all($regex, $response, $matches);
  $links = $matches[1];
  foreach ($links as $link) {
echo &#34;$link\n&#34;;
  }
}

// Close the cURL session
curl_close($curl);
这个 PHP 网络爬虫向指定的 URL 发送 HTTP GET 请求，检索响应，然后使用正则表达式提取页面标题和页面上的所有链接。当然，这只是一个简单的示例，真实世界中的 PHP 网络爬虫可能会更复杂并执行更高级的任务。

		自动登录	找回密码
密码			立即注册