Writing a first web crawler script with Python
First web crawler
Development
Install Python
First crawler program
Install dependencies
pip install requests
First crawler code
- request
import requests
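As a minimal sketch of a first request, the snippet below wraps `requests.get` in a small function and returns the HTTP status code. The URL `https://example.com/` and the helper name `fetch_status` are placeholders chosen for illustration; they are not taken from the original listing.

```python
import requests

# Placeholder URL (an assumption; substitute the page you want to fetch).
URL = "https://example.com/"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET request and return the HTTP status code."""
    response = requests.get(url, timeout=timeout)
    return response.status_code


# Usage (needs network access):
#   print(fetch_status(URL))  # 200 indicates success
```

The network call is kept inside a function so that importing or pasting the snippet does not send a request by itself.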
2.head
Fetching Douban page information
1. Send a request
import requests
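A plain request might look like the sketch below. The target URL (the Douban Top 250 movie list) is an assumption, since the post does not show it, and `send_plain_request` is a hypothetical helper name. The snippet also prints the User-Agent that `requests` sends by default, which a site can use to recognise a script.

```python
import requests

# Assumed target: the Douban Top 250 movie list (not confirmed by the post).
URL = "https://movie.douban.com/top250"

# With no headers argument, requests identifies itself with this User-Agent.
DEFAULT_UA = requests.utils.default_user_agent()
print(DEFAULT_UA)  # a string of the form "python-requests/<version>"


def send_plain_request(url: str = URL) -> requests.Response:
    """Send a GET request with the library's default headers."""
    return requests.get(url, timeout=10)


# Usage (needs network access):
#   response = send_plain_request()
#   print(response.status_code)  # a non-200 code suggests the request was rejected
```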
2. Disguise the request as a browser
import requests
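One common way to make a request look like it comes from a browser is to send a browser-style `User-Agent` header. The sketch below assumes the Douban Top 250 URL and uses an illustrative User-Agent string; both are assumptions rather than values from the post. It prepares the request without sending it, so the outgoing header can be inspected offline.

```python
import requests

# Assumed target URL (not confirmed by the post).
URL = "https://movie.douban.com/top250"

# Illustrative User-Agent string in the style of a desktop browser.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    )
}

# Prepare the request without sending it, to see which header would go out.
prepared = requests.Request("GET", URL, headers=HEADERS).prepare()
print(prepared.headers["User-Agent"])


def fetch_as_browser(url: str = URL) -> requests.Response:
    """Send the GET request with the browser-style headers attached."""
    return requests.get(url, headers=HEADERS, timeout=10)
```

Copying the User-Agent from your own browser's developer tools is usually the simplest way to get a realistic value.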
3. Print the HTML source
import requests
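Printing the page source comes down to reading `response.text`. The following is a sketch under assumptions: the URL and the shortened User-Agent are illustrative, and `html_from_response` and `fetch_html` are hypothetical helper names.

```python
import requests

URL = "https://movie.douban.com/top250"  # assumed target URL
HEADERS = {"User-Agent": "Mozilla/5.0"}  # shortened illustrative value


def html_from_response(response: requests.Response) -> str:
    """Return the decoded page source, raising on 4xx/5xx status codes."""
    response.raise_for_status()
    return response.text


def fetch_html(url: str = URL) -> str:
    """Download a page and return its HTML source as a string."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    return html_from_response(response)


# Usage (needs network access):
#   print(fetch_html())
```

Calling `raise_for_status()` first means a rejected request fails loudly instead of printing an error page as if it were the real content.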
4. Install the third-party library bs4
pip install bs4
5. Print the title tag text
import requests
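To illustrate locating the title tags with bs4, the sketch below parses a small made-up HTML fragment instead of the live page, so it runs offline. The assumption that titles sit in `<span class="title">` elements is mine; the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Made-up HTML that imitates a movie list; the real page markup may differ.
SAMPLE_HTML = """
<ol>
  <li>
    <span class="title">Movie One</span>
    <span class="title"> / Movie One Original</span>
    <span class="other"> / Alias</span>
  </li>
  <li>
    <span class="title">Movie Two</span>
  </li>
</ol>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Collect every <span> whose class attribute is "title".
title_tags = soup.find_all("span", attrs={"class": "title"})
for tag in title_tags:
    print(tag)  # prints the whole element, markup included
```

With a live page, `SAMPLE_HTML` would be replaced by the `response.text` of the request.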
6. Print the string
import requests
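A tag's `.string` attribute gives its text without the surrounding markup. The sketch below uses a made-up fragment (an assumption, not the real page) and also shows a caveat: `.string` is `None` when a tag has more than one child node.

```python
from bs4 import BeautifulSoup

# Made-up fragment; the third span has nested markup on purpose.
SAMPLE_HTML = """
<span class="title">Movie One</span>
<span class="title"> / Movie One Original</span>
<span class="title"><b>Movie</b> Two</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# .string yields the text of a tag that contains exactly one text node.
strings = [tag.string for tag in soup.find_all("span", attrs={"class": "title"})]
for text in strings:
    print(text)  # the third entry is None because that span has two children
```

When nested markup is possible, `tag.get_text()` is a more forgiving alternative.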
7. Print only the text without slashes
import requests
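Filtering out the entries that contain a slash can be done with a simple membership test. This sketch assumes that alternate titles are the ones containing `/`, and it runs on a made-up fragment rather than the live page.

```python
from bs4 import BeautifulSoup

# Made-up fragment: alternate titles start with " / ".
SAMPLE_HTML = """
<span class="title">Movie One</span>
<span class="title"> / Movie One Original</span>
<span class="title">Movie Two</span>
<span class="title"> / Movie Two Original</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Keep only the strings that do not contain a slash.
main_titles = [
    tag.string
    for tag in soup.find_all("span", attrs={"class": "title"})
    if tag.string is not None and "/" not in tag.string
]
print(main_titles)
```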
8. Print all pages
import requests
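Crawling every page usually means generating one URL per page and repeating the fetch-and-parse steps. The sketch below assumes the list is paginated with a `start` query parameter, 25 items per page and 250 items in total; these values, the URL, and the helper names are all assumptions, so check them against the site's actual pagination links.

```python
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://movie.douban.com/top250"  # assumed target URL
HEADERS = {"User-Agent": "Mozilla/5.0"}       # shortened illustrative value
PAGE_SIZE = 25                                # assumed items per page
TOTAL_ITEMS = 250                             # assumed total number of items


def page_urls() -> list[str]:
    """Build one URL per page using the assumed `start` offset parameter."""
    return [f"{BASE_URL}?start={start}" for start in range(0, TOTAL_ITEMS, PAGE_SIZE)]


def titles_from_html(html: str) -> list[str]:
    """Extract slash-free title strings from one page of HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        tag.string
        for tag in soup.find_all("span", attrs={"class": "title"})
        if tag.string is not None and "/" not in tag.string
    ]


def crawl_all() -> list[str]:
    """Fetch every page in turn and collect the titles (needs network access)."""
    titles = []
    for url in page_urls():
        response = requests.get(url, headers=HEADERS, timeout=10)
        response.raise_for_status()
        titles.extend(titles_from_html(response.text))
    return titles


# Usage (needs network access):
#   for title in crawl_all():
#       print(title)
```

Adding a short `time.sleep` between requests is a reasonable courtesy when looping over many pages.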
All articles in this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.
