Scraping a NetEase Cloud Music playlist with Python

Scraping NetEase Cloud Music with Python

This is a summary of a Bilibili tutorial video on scraping NetEase Cloud Music, written partly as notes for myself.

Required libraries
requests (sends HTTP requests)
lxml (parses HTML and evaluates XPath expressions)
Determine the URL
Using Google Chrome:

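Open the playlist you want, click any song, then right-click the page and choose Inspect.

Go to Network > XHR. If the list is empty, refresh the page; the requests should then appear, as in the screenshot.

[Screenshot: Network panel, XHR tab]
The URL of the song is visible there: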
https://m801.music.126.net/20191203211801/8a4fe99967c7a8b03ef13992bed3e408/jdyyaac/075b/560e/515a/01d7ceed01adc38a2402f0bce5efa4fa.m4a

Opening this URL directly in the browser plays the song.

That means Python can request the same URL.

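# Import the library
import requests
# 1. Determine the URL of the audio file
url = "https://m801.music.126.net/20191203211801/8a4fe99967c7a8b03ef13992bed3e408/jdyyaac/075b/560e/515a/01d7ceed01adc38a2402f0bce5efa4fa.m4a"
# 2. Send the request and read the raw bytes of the response
music = requests.get(url).content
# 4. Save (step 3, filtering the data, is not needed here); 'wb' opens the file for binary writing
with open('music.m4a', 'wb') as file:
    file.write(music)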

For how the with ... as statement works, see https://www.jianshu.com/p/c00df845323c

The download succeeds.
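
The script above loads the whole response into memory and saves whatever comes back, even an error page. As a minimal sketch, the same download can be streamed to disk with a status check. The function name download and its parameters are my own choices for illustration, and session is assumed to be any object with a requests-style get method, such as requests.Session().

```python
def download(session, url, path, chunk_size=8192, timeout=10):
    """Stream url to path and return path.

    session is assumed to expose a requests-style get(), for example
    requests.Session(). The name and signature are illustrative only.
    """
    with session.get(url, stream=True, timeout=timeout) as resp:
        # Raise on 4xx/5xx instead of silently saving an error page.
        resp.raise_for_status()
        with open(path, "wb") as f:
            # Write the body piece by piece rather than holding it all in memory.
            for chunk in resp.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return path


# Usage (performs a real network request, so it is left commented out):
#   import requests
#   download(requests.Session(), url, "music.m4a")
```

Passing the session in keeps the function easy to try with a stand-in object, and reusing one session across many songs avoids opening a new connection for each request.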

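This approach clearly does not work for a whole playlist, since every song URL would have to be found by hand.
Scraping is a contest between crawlers and anti-crawling measures.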
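
One detail worth noticing in the song URL above is its first path segment, 20191203211801, which has the shape of a YYYYMMDDHHMMSS timestamp. The sketch below only parses that segment. Reading it as an expiry time for the link is my assumption, not something the page or the video confirms, and embedded_timestamp is a hypothetical helper name.

```python
from datetime import datetime
from urllib.parse import urlsplit


def embedded_timestamp(url):
    """Return the first path segment parsed as YYYYMMDDHHMMSS, or None."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if not segments:
        return None
    try:
        return datetime.strptime(segments[0], "%Y%m%d%H%M%S")
    except ValueError:
        # The segment is not a 14-digit timestamp.
        return None


url = ("https://m801.music.126.net/20191203211801/"
       "8a4fe99967c7a8b03ef13992bed3e408/jdyyaac/075b/560e/515a/"
       "01d7ceed01adc38a2402f0bce5efa4fa.m4a")
print(embedded_timestamp(url))  # 2019-12-03 21:18:01
```

If that reading is right, a URL copied from the Network panel would stop working after a while, which is one more reason not to hard-code it.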

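Look at the request headers.
The song URL is fetched with a POST request, so its parameters are almost certainly encrypted.
The URL shown in the screenshot is an external-link address.
[Screenshot: external-link address]
Fortunately, there is an external-link conversion tool that takes care of this for us.

[Screenshot: external-link conversion tool]

Let's get to work.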

# Import the libraries
import os
import requests
from lxml import etree
# 1. Determine the URLs: the playlist page and the base of the external-link converter
url = "https://music.163.com/playlist?id=10702884"
base_url = 'https://link.hhtjim.com/163/'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'}
# 2. Request the playlist page
result = requests.get(url, headers=headers).text
# 3. Filter the data: use XPath to collect the href of every song link
dom = etree.HTML(result)
ids = dom.xpath('//a[contains(@href,"/song?")]/@href')
# Target form of each download URL: https://link.hhtjim.com/163/347230.mp3
os.makedirs('./music_wangyi', exist_ok=True)  # open() fails if the folder is missing
for songid in ids:
    # Take the text after "id=" (str.strip removes a set of characters, not a prefix)
    count_id = songid.split('id=')[-1]
    if '$' not in count_id:  # skip entries that are not real song ids
        music_url = base_url + '%s' % count_id + '.mp3'
        print(music_url)
        music = requests.get(music_url).content
        # 4. Save
        with open('./music_wangyi/%s.mp3' % count_id, 'wb') as file:
            file.write(music)
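
The id extraction and URL building in the loop can also be written as small functions that parse the query string, which makes it explicit that only numeric ids are accepted. This is a sketch under assumptions: the helper names are hypothetical, the sample href containing ${x.id} is an invented stand-in for the non-numeric entries that the '$' check filters out, and removing duplicates is my addition rather than part of the original script.

```python
from urllib.parse import parse_qs, urlsplit

# Converter base taken from the script above.
BASE_URL = "https://link.hhtjim.com/163/"


def song_id_from_href(href):
    """Return the numeric id from an href like '/song?id=347230', else None."""
    values = parse_qs(urlsplit(href).query).get("id", [])
    if values and values[0].isdigit():
        return values[0]
    return None


def unique_song_ids(hrefs):
    """Extract ids in page order, dropping invalid entries and duplicates."""
    ids = (song_id_from_href(h) for h in hrefs)
    return list(dict.fromkeys(i for i in ids if i is not None))


def external_link(song_id):
    """Build the converter URL for one song id."""
    return f"{BASE_URL}{song_id}.mp3"


# Hypothetical sample hrefs; the second is a made-up non-numeric entry.
hrefs = ["/song?id=347230", "/song?id=${x.id}", "/song?id=347230"]
for song_id in unique_song_ids(hrefs):
    print(external_link(song_id))  # https://link.hhtjim.com/163/347230.mp3
```

In the real script, hrefs would be the list returned by dom.xpath(...), and each printed URL would be passed to requests.get as before.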