Fetching Public Information with Python

Dianping caps any shop listing at 50 result pages per query, so reasonably complete coverage means splitting the crawl by area and shop category until every query fits under that cap. Taking Chengdu restaurants as the example: each category is first narrowed to its finest subcategory, and the area is then refined step by step from the whole city to districts and counties and finally to streets. If a category already yields fewer than 50 pages at some area level it can be crawled as-is; otherwise the area is subdivided further.
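Stripped of database details, the splitting logic is just a recursion. The sketch below is only an outline of the idea; page_count, children_of and save are hypothetical stand-ins for the real page counting, area-table lookup and bookkeeping implemented in the full script further down.

def split_until_small(url, area_id):
    pages = page_count(url)                  # how many listing pages this query returns
    if pages < 50:
        save(url, pages)                     # fits under the cap: record it for crawling
    else:
        for child in children_of(area_id):   # city -> district/county -> street
            split_until_small(url + "r" + str(child), child)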

Dianping sometimes obfuscates page data by rendering it in a custom web font. Downloading the corresponding woff file and mapping its glyphs back to real characters undoes this; when a page happens to be served without the obfuscation, this step can simply be skipped.
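A minimal sketch of that decoding step, assuming (as the full scripts below do) that the font's glyph order begins with the two placeholder glyphs .notdef and x followed by the 601 real characters:

from fontTools.ttLib import TTFont

def build_font_map(woff_path, charset_601):
    # glyph order is ['.notdef', 'x', 'uniE123', ...]; skip the two placeholders
    glyph_names = TTFont(woff_path).getGlyphOrder()[2:]
    # map the bare hex code ('E123') to the real character it renders as
    return {name.replace('uni', ''): char
            for name, char in zip(glyph_names, charset_601)}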

The pipeline has three stages: first split the queries by area and category until each is under 50 pages and store them in a database; then crawl the listing pages one by one for basic shop information; finally hit the AJAX endpoints observed in the browser for details such as the full address, latitude/longitude, the individual scores, and review counts.
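The original post never shows the table schemas, so here is a plausible minimal setup inferred from the columns the scripts read and write; every type and length below is an assumption:

# -*- coding: utf-8 -*-
# One-off setup: plausible minimal schemas inferred from the SQL in the scripts below.
# All column types and lengths are assumptions; the original post never shows them.
import pymysql

conn = pymysql.connect(host='localhost', user='root', passwd='your password', db='大众点评')
cur = conn.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS chengduareacode (
    countryid INT PRIMARY KEY,        -- Dianping area code
    countryname VARCHAR(64),          -- area name (district/county/street)
    parentid INT                      -- parent area, used for drilling down
)""")
cur.execute("""CREATE TABLE IF NOT EXISTS dazhong_paging_restaurant (
    mainParentCategoryId INT,         -- food category id
    pageCount INT,                    -- result pages for this query (NULL = not measured yet)
    countryid INT,
    url VARCHAR(255) PRIMARY KEY,     -- the listing query URL
    islast TINYINT,                   -- 1 = finest subdivision still hits 50 pages
    hasGet INT,                       -- last listing page already crawled
    finish TINYINT                    -- 1 = all pages of this query crawled
)""")
cur.execute("""CREATE TABLE IF NOT EXISTS shopdetail_restaurant (
    shopid VARCHAR(64) PRIMARY KEY,
    shopAllname VARCHAR(128), shopGroupId VARCHAR(64),
    defaultReviewCount INT, avgPrice INT,
    mainParentCategoryId INT, categoryName VARCHAR(64),
    countryid INT, countryname VARCHAR(64),
    status VARCHAR(32), recommend VARCHAR(255),
    fivescore FLOAT, scoreTaste FLOAT, scoreEnvironment FLOAT, scoreService FLOAT,
    shopName VARCHAR(128), branchName VARCHAR(64),
    address VARCHAR(255), phoneNo VARCHAR(64),
    glat DOUBLE, glng DOUBLE
)""")
conn.commit()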

# -*- coding: utf-8 -*-
import json
import requests
import pymysql
import time
from fontTools.ttLib import TTFont

def woff_dict(key):
    if key == 'address':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\address.woff')  # load the address font
    elif key == 'num':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\num.woff')  # load the number font
    # the 601 characters behind glyph IDs 2~602 in the woff file
    woff_str_601 = '1234567890店中美家馆小车大市公酒行国品发电金心业商司超生装园场食有新限天面工服海华水房饰城乐汽香部利子老艺花专东肉菜学福饭人百餐茶务通味所山区门药银农龙停尚安广鑫一容动南具源兴鲜记时机烤文康信果阳理锅宝达地儿衣特产西批坊州牛佳化五米修爱北养卖建材三会鸡室红站德王光名丽油院堂烧江社合星货型村自科快便日民营和活童明器烟育宾精屋经居庄石顺林尔县手厅销用好客火雅盛体旅之鞋辣作粉包楼校鱼平彩上吧保永万物教吃设医正造丰健点汤网庆技斯洗料配汇木缘加麻联卫川泰色世方寓风幼羊烫来高厂兰阿贝皮全女拉成云维贸道术运都口博河瑞宏京际路祥青镇厨培力惠连马鸿钢训影甲助窗布富牌头四多妆吉苑沙恒隆春干饼氏里二管诚制售嘉长轩杂副清计黄讯太鸭号街交与叉附近层旁对巷栋环省桥湖段乡厦府铺内侧元购前幢滨处向座下臬凤港开关景泉塘放昌线湾政步宁解白田町溪十八古双胜本单同九迎第台玉锦底后七斜期武岭松角纪朝峰六振珠局岗洲横边济井办汉代临弄团外塔杨铁浦字年岛陵原梅进荣友虹央桂沿事津凯莲丁秀柳集紫旗张谷的是不了很还个也这我就在以可到错没去过感次要比觉看得说常真们但最喜哈么别位能较境非为欢然他挺着价那意种想出员两推做排实分间甜度起满给热完格荐喝等其再几只现朋候样直而买于般豆量选奶打每评少算又因情找些份置适什蛋师气你姐棒试总定啊足级整带虾如态且尝主话强当更板知己无酸让入啦式笑赞片酱差像提队走嫩才刚午接重串回晚微周值费性桌拍跟块调糕'
    # ['cmap'] holds the character-to-code-point mapping
    woff_unicode = woff['cmap'].tables[0].ttFont.getGlyphOrder()  # unicode names of all 603 glyphs
    woff_character = ['.notdef', 'x'] + list(woff_str_601)  # the two special glyphs with IDs 0 and 1 come first
    woff_dict = dict(zip(woff_unicode, woff_character))
    return woff_dict

def decodestr(firststr):
    strlist = firststr.split("<")
    laststr = ""
    for single in strlist:
        single = single.replace("/d>", "").replace("/e>", "")
        if single.find("address") > 0:
            single = single[-5:-1]  # the 4 hex digits of the &#x....; entity
            laststr += addressdict[single]
        elif single.find("num") > 0:
            single = single[-5:-1]
            laststr += numdict[single]
        elif single != "":
            laststr += single
    return laststr

# get the number of result pages for the given query URL
def getpagecount(URLstr, countryname):
    try:
        res = requests.get(URLstr, headers=headers).text
    except:
        time.sleep(120)
        return getpagecount(URLstr, countryname)
    # if the crawl was throttled, sleep and retry
    if res.find("403 Forbidden") > 0:
        time.sleep(60)
        print(URLstr + "  " + "403 forbidden   " + countryname)
        return getpagecount(URLstr, countryname)
    if res.find("没有找到符合条件的商户") > 0:  # "no merchants match the criteria"
        pageCount = 0
    elif res.find("div class=\"page\"") < 0:
        # no pagination bar means there is only one page
        pageCount = 1
        print(URLstr + " 1 page  " + countryname)
    else:
        pagestr = res[res.find("div class=\"page\""):]
        pagestr = pagestr[:pagestr.find("</div>")].replace("title=\"下一页\">下一页", "")
        pagestr = pagestr.split("</a>")
        pagestr.reverse()  # the last numbered link in the bar is the page count
        for page in pagestr:
            if page.find("title=\"") > 0:
                pageCount = page[page.find("title=\"") + 7:]
                pageCount = pageCount[:pageCount.find("\"")]
                print(URLstr + " " + pageCount + " pages  " + countryname)
                pageCount = int(pageCount)
                break
    return pageCount

if __name__ == '__main__':
    # re-key the glyph dicts by the bare hex code ('E123') instead of 'uniE123'
    woffnum = str(woff_dict('num')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    woffaddress = str(woff_dict('address')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    numdict = {}
    for d in woffnum.split(","):
        numdict.update(eval('{' + d + '}'))
    addressdict = {}
    for d in woffaddress.split(","):
        addressdict.update(eval('{' + d + '}'))
    baseURL = "https://www.dianping.com/chengdu/ch10"
    requeststr1 = baseURL
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
        "Cookie": "your own cookie",
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7'
    }
    # open the database connection
    conn = pymysql.connect(host='localhost', user="root", passwd="your password", db="大众点评")
    cur = conn.cursor()
    querysql = "SELECT mainParentCategoryId,pageCount,countryid,url,islast FROM dazhong_paging_restaurant"
    cur.execute(querysql)
    if cur.rowcount < 1:
        print("the paging table needs to be initialized first")
    else:
        rows = cur.fetchall()
        for row in rows:
            mainParentCategoryId = row[0]
            pageCount = row[1]
            countryid = row[2]
            URLstr = row[3]
            islast = row[4]
            # queries at the full 50 pages (and not yet marked final) must be subdivided further
            if pageCount is None or (pageCount == 50 and islast != 1):
                pageCount = getpagecount(URLstr, "")
                if pageCount == 0:
                    continue
                # under 50 pages: just record the query
                if pageCount < 50:
                    insertSQLStrings = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`) VALUES ({},{},{},'{}')".format(mainParentCategoryId, pageCount, countryid, URLstr)
                    cur.execute(insertSQLStrings)
                # over 50 pages: subdivide into districts/counties, and if those still exceed 50 pages, down to streets
                else:
                    querysql = "SELECT countryid,countryname,parentid FROM chengduareacode WHERE parentid = {}".format(countryid)
                    cur.execute(querysql)
                    # already at the finest level and still over 50 pages: record the fact and move on
                    if cur.rowcount < 1:
                        insertSQLStrings = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`,`islast`) VALUES ({},50,{},'{}',1)".format(mainParentCategoryId, countryid, URLstr)
                        cur.execute(insertSQLStrings)
                    else:
                        countryids = cur.fetchall()
                        for countryrow in countryids:
                            time.sleep(11)
                            countryname = countryrow[1]
                            countryid = countryrow[0]
                            # a handful of special areas use the "r" path segment instead of "c"
                            if countryid in (10, 35, 36, 37, 38, 39, 4956):
                                URLstrnew = URLstr + "r" + str(countryid)
                            else:
                                URLstrnew = URLstr + "c" + str(countryid)
                            pageCount = getpagecount(URLstrnew, countryname)
                            if pageCount == 0:
                                continue
                            # once subdivided, drop the city-level entry before writing district-level ones
                            insertSQLString1 = "DELETE from `大众点评`.`dazhong_paging_restaurant` where url='{}'".format(URLstr)
                            cur.execute(insertSQLString1)
                            if pageCount < 50:
                                insertSQLString2 = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`) VALUES ({},{},{},'{}')".format(mainParentCategoryId, pageCount, countryid, URLstrnew)
                                cur.execute(insertSQLString2)
                                URLstrnew = URLstr
                            else:
                                # keep subdividing down to street level
                                querysql = "SELECT countryid,countryname,parentid FROM chengduareacode WHERE parentid = {}".format(countryid)
                                cur.execute(querysql)
                                # already at the finest level and still over 50 pages: record the fact
                                if cur.rowcount < 1:
                                    insertSQLStrings = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`,`islast`) VALUES ({},50,{},'{}',1)".format(mainParentCategoryId, countryid, URLstr)
                                    cur.execute(insertSQLStrings)
                                else:
                                    streetids = cur.fetchall()
                                    for streetrow in streetids:
                                        time.sleep(11)
                                        countryid = streetrow[0]
                                        URLstrnew = URLstr + "r" + str(countryid)
                                        pageCount = getpagecount(URLstrnew, "")
                                        if pageCount == 0:
                                            continue
                                        if pageCount < 50:
                                            # at street level the district entry would be deleted first (left commented out in the original)
                                            # insertSQLString1 = "DELETE from `大众点评`.`dazhong_paging_restaurant` where url='{}'".format(URLstr)
                                            insertSQLString2 = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`) VALUES ({},{},{},'{}')".format(mainParentCategoryId, pageCount, countryid, URLstrnew)
                                            # cur.execute(insertSQLString1)
                                            cur.execute(insertSQLString2)
                                            URLstrnew = URLstr
                                        # still 50 pages at the finest subdivision: record it as final
                                        elif pageCount == 50:
                                            insertSQLStrings = "REPLACE INTO `大众点评`.`dazhong_paging_restaurant`(`mainParentCategoryId`, `pageCount`, `countryid`, `url`,`islast`) VALUES ({},50,{},'{}',1)".format(mainParentCategoryId, countryid, URLstrnew)
                                            cur.execute(insertSQLStrings)
                                            URLstrnew = URLstr
                                            print("finest subdivision still at the full 50 pages:")
                                            print(insertSQLStrings)
            conn.commit()
    conn.commit()
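A side effect of this design is that the script is safely restartable: REPLACE INTO keeps the paging table idempotent, rows with a NULL pageCount are the ones still waiting to be measured, and islast=1 flags queries that hit 50 pages even at street level, where anything beyond page 50 is simply unreachable.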


Once this step is done, the basic listing information is crawled from these pre-split query URLs:

# -*- coding: utf-8 -*-
import json
import requests
from fontTools.ttLib import TTFont
import pymysql
import time

def woff_dict(key):
    if key == 'address':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\address.woff')  # load the address font
    elif key == 'num':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\num.woff')  # load the number font
    # the 601 characters behind glyph IDs 2~602 in the woff file
    woff_str_601 = '1234567890店中美家馆小车大市公酒行国品发电金心业商司超生装园场食有新限天面工服海华水房饰城乐汽香部利子老艺花专东肉菜学福饭人百餐茶务通味所山区门药银农龙停尚安广鑫一容动南具源兴鲜记时机烤文康信果阳理锅宝达地儿衣特产西批坊州牛佳化五米修爱北养卖建材三会鸡室红站德王光名丽油院堂烧江社合星货型村自科快便日民营和活童明器烟育宾精屋经居庄石顺林尔县手厅销用好客火雅盛体旅之鞋辣作粉包楼校鱼平彩上吧保永万物教吃设医正造丰健点汤网庆技斯洗料配汇木缘加麻联卫川泰色世方寓风幼羊烫来高厂兰阿贝皮全女拉成云维贸道术运都口博河瑞宏京际路祥青镇厨培力惠连马鸿钢训影甲助窗布富牌头四多妆吉苑沙恒隆春干饼氏里二管诚制售嘉长轩杂副清计黄讯太鸭号街交与叉附近层旁对巷栋环省桥湖段乡厦府铺内侧元购前幢滨处向座下臬凤港开关景泉塘放昌线湾政步宁解白田町溪十八古双胜本单同九迎第台玉锦底后七斜期武岭松角纪朝峰六振珠局岗洲横边济井办汉代临弄团外塔杨铁浦字年岛陵原梅进荣友虹央桂沿事津凯莲丁秀柳集紫旗张谷的是不了很还个也这我就在以可到错没去过感次要比觉看得说常真们但最喜哈么别位能较境非为欢然他挺着价那意种想出员两推做排实分间甜度起满给热完格荐喝等其再几只现朋候样直而买于般豆量选奶打每评少算又因情找些份置适什蛋师气你姐棒试总定啊足级整带虾如态且尝主话强当更板知己无酸让入啦式笑赞片酱差像提队走嫩才刚午接重串回晚微周值费性桌拍跟块调糕'
    # ['cmap'] holds the character-to-code-point mapping
    woff_unicode = woff['cmap'].tables[0].ttFont.getGlyphOrder()  # unicode names of all 603 glyphs
    woff_character = ['.notdef', 'x'] + list(woff_str_601)  # the two special glyphs with IDs 0 and 1 come first
    woff_dict = dict(zip(woff_unicode, woff_character))
    return woff_dict

def decodestr(firststr):
    strlist = firststr.split("<")
    laststr = ""
    for single in strlist:
        single = single.replace("/d>", "").replace("/e>", "")
        if single.find("address") > 0:
            single = single[-5:-1]  # the 4 hex digits of the &#x....; entity
            laststr += addressdict[single]
        elif single.find("num") > 0:
            single = single[-5:-1]
            laststr += numdict[single]
        elif single != "":
            laststr += single
    return laststr

if __name__ == '__main__':
    # build the glyph lookup dicts exactly as in the paging script
    woffnum = str(woff_dict('num')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    woffaddress = str(woff_dict('address')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    numdict = {}
    for d in woffnum.split(","):
        numdict.update(eval('{' + d + '}'))
    addressdict = {}
    for d in woffaddress.split(","):
        addressdict.update(eval('{' + d + '}'))
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
        "Cookie": "your own cookie",
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7'
    }
    conn = pymysql.connect(host='localhost', user="root", passwd="your password", db="大众点评")
    cur = conn.cursor()
    querysql = "SELECT url,hasGet,finish FROM dazhong_paging_restaurant"
    cur.execute(querysql)
    rows = cur.fetchall()
    for row in rows:
        url = row[0]
        hasGet = row[1]
        finish = row[2]
        if hasGet is None:
            hasGet = 0
        hasGet += 1
        if finish != 1:
            url += "o3p"  # sort flag plus page-number prefix
            for i in range(1, 51):
                if hasGet > i:
                    print("already fetched, skipping page " + str(i))
                    continue
                urlnew = url + str(i)
                requeststr0 = urlnew
                try:
                    res = requests.get(requeststr0, headers=headers, timeout=100).text
                except:
                    time.sleep(80)
                    res = requests.get(requeststr0, headers=headers, timeout=100).text
                # if the crawl was blocked, stop entirely
                if res.find("403 Forbidden") > 0:
                    print("403: access blocked, exiting")
                    exit()
                # past the last page: move on to the next query
                if res.find("没有找到符合条件的商户") > 0:  # "no merchants match the criteria"
                    break
                res = res[res.find("shop-all-list"):res.find("商户没有被收录")]
                items = res.split("<li class=\"\" >")
                for item in items:
                    if len(item) < 50:
                        continue
                    shopid = item[item.find("data-shopid=\"") + 13:]
                    shopid = shopid[:shopid.find("\"")]
                    shopAllname = item[item.find("<h4>") + 4:item.find("</h4>")].replace("'", "\\'")
                    if item.find("https://www.dianping.com/brands/") > 0:
                        shopGroupId = item[item.find("https://www.dianping.com/brands/") + 32:item.find("\" module=\"list-branch\"")]
                    else:
                        shopGroupId = ""
                    # "我要评价" ("write a review") shows up when a shop has no reviews yet
                    if item.find("我要评价") > 0:
                        defaultReviewCount = 0
                    else:
                        defaultReviewCount = item[item.find("<b>") + 3:item.find("</b>")]
                    avgPrice = item[item.find("人均"):]  # "per-capita price"
                    if avgPrice.find("-") == 13:
                        avgPrice = 0
                    else:
                        avgPrice = avgPrice[avgPrice.find("<b>") + 4:avgPrice.find("</b>")]
                    if item.find("istopTrade") > 0:
                        status = item[item.find("istopTrade") + 12:]
                        status = status[:status.find("</span>")]
                    else:
                        status = ""
                    countryAndtype = item[item.find("tag-addr"):]
                    mainParentCategoryId = countryAndtype[countryAndtype.find("/g") + 2:countryAndtype.find("\" data-click-name")]
                    categoryName = countryAndtype[countryAndtype.find("class=\"tag\">") + 12:countryAndtype.find("</span>")]
                    countryAndtype = countryAndtype[countryAndtype.find("\"sep\""):]
                    countryid = countryAndtype[countryAndtype.find("/r") + 2:countryAndtype.find("\" data-click-name")]
                    countryname = countryAndtype[countryAndtype.find("class=\"tag\">") + 12:countryAndtype.find("</span>")]
                    if countryid.find("|") > 0:
                        print("malformed shop entry skipped: " + shopid)
                        continue
                    if item.find("class=\"recommend\"") > 0:
                        recommendstr = item[item.find("class=\"recommend\"") + 16:]
                        recommendstr = recommendstr[:recommendstr.find("</div>")]
                        recommendstr = recommendstr.split("\">")
                        recommend = ""
                        for recommendtemp in recommendstr:
                            if recommendtemp.find("</a>") > 0:
                                recommendtemp = recommendtemp[:recommendtemp.find("</a>")]
                                recommend = recommend + recommendtemp + " "
                    else:
                        recommend = ""
                    print(shopid + " " + shopAllname + " " + shopGroupId + " " + str(defaultReviewCount) + " " + str(avgPrice) + " " + mainParentCategoryId + " " + categoryName + " " + countryid + " " + countryname + " " + status + " " + recommend)
                    insertSQLStrings = "REPLACE INTO `大众点评`.`shopdetail_restaurant`(`shopid`, `shopAllname`, `shopGroupId`, `defaultReviewCount`,`avgPrice`,`mainParentCategoryId`,`categoryName`,`countryid`,`countryname`,`status`,`recommend`) VALUES ('{}','{}','{}',{},{},{},'{}',{},'{}','{}','{}')".format(shopid, shopAllname, shopGroupId, defaultReviewCount, avgPrice, mainParentCategoryId, categoryName, countryid, countryname, status, recommend)
                    cur.execute(insertSQLStrings)
                print("page " + str(i) + " fetched")
                updatesql1 = "UPDATE dazhong_paging_restaurant SET hasGet={} WHERE url='{}'".format(i, row[0])
                cur.execute(updatesql1)
                conn.commit()
                time.sleep(15)
            updatesql2 = "UPDATE dazhong_paging_restaurant SET finish=1 WHERE url='{}'".format(row[0])
            cur.execute(updatesql2)
            conn.commit()
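For reference, the listing URLs this script builds follow the path-parameter scheme already visible in the code: the g segment carries the category id, r (or c, for a handful of special areas) the area id, and the appended o3p{n} requests page n of the sorted results, giving URLs shaped like https://www.dianping.com/chengdu/ch10/g101r34o3p2 (an illustrative example, not one taken from the original post).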

Finally, the richer details are fetched through the AJAX endpoints:

# -*- coding: utf-8 -*-
import json
import requests
from fontTools.ttLib import TTFont
import pymysql
import time

def woff_dict(key):
    if key == 'address':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\address.woff')  # load the address font
    elif key == 'num':
        woff = TTFont('C:\\Users\\Administrator\\Desktop\\num.woff')  # load the number font
    # the 601 characters behind glyph IDs 2~602 in the woff file
    woff_str_601 = '1234567890店中美家馆小车大市公酒行国品发电金心业商司超生装园场食有新限天面工服海华水房饰城乐汽香部利子老艺花专东肉菜学福饭人百餐茶务通味所山区门药银农龙停尚安广鑫一容动南具源兴鲜记时机烤文康信果阳理锅宝达地儿衣特产西批坊州牛佳化五米修爱北养卖建材三会鸡室红站德王光名丽油院堂烧江社合星货型村自科快便日民营和活童明器烟育宾精屋经居庄石顺林尔县手厅销用好客火雅盛体旅之鞋辣作粉包楼校鱼平彩上吧保永万物教吃设医正造丰健点汤网庆技斯洗料配汇木缘加麻联卫川泰色世方寓风幼羊烫来高厂兰阿贝皮全女拉成云维贸道术运都口博河瑞宏京际路祥青镇厨培力惠连马鸿钢训影甲助窗布富牌头四多妆吉苑沙恒隆春干饼氏里二管诚制售嘉长轩杂副清计黄讯太鸭号街交与叉附近层旁对巷栋环省桥湖段乡厦府铺内侧元购前幢滨处向座下臬凤港开关景泉塘放昌线湾政步宁解白田町溪十八古双胜本单同九迎第台玉锦底后七斜期武岭松角纪朝峰六振珠局岗洲横边济井办汉代临弄团外塔杨铁浦字年岛陵原梅进荣友虹央桂沿事津凯莲丁秀柳集紫旗张谷的是不了很还个也这我就在以可到错没去过感次要比觉看得说常真们但最喜哈么别位能较境非为欢然他挺着价那意种想出员两推做排实分间甜度起满给热完格荐喝等其再几只现朋候样直而买于般豆量选奶打每评少算又因情找些份置适什蛋师气你姐棒试总定啊足级整带虾如态且尝主话强当更板知己无酸让入啦式笑赞片酱差像提队走嫩才刚午接重串回晚微周值费性桌拍跟块调糕'
    # ['cmap'] holds the character-to-code-point mapping
    woff_unicode = woff['cmap'].tables[0].ttFont.getGlyphOrder()  # unicode names of all 603 glyphs
    woff_character = ['.notdef', 'x'] + list(woff_str_601)  # the two special glyphs with IDs 0 and 1 come first
    woff_dict = dict(zip(woff_unicode, woff_character))
    return woff_dict

def decodestr(firststr):
    strlist = firststr.split("<")
    laststr = ""
    for single in strlist:
        single = single.replace("/d>", "").replace("/e>", "")
        if single.find("address") > 0:
            single = single[-5:-1]  # the 4 hex digits of the &#x....; entity
            laststr += addressdict[single]
        elif single.find("num") > 0:
            single = single[-5:-1]
            laststr += numdict[single]
        elif single != "":
            laststr += single
    return laststr

if __name__ == '__main__':
    # build the glyph lookup dicts exactly as in the previous scripts
    woffnum = str(woff_dict('num')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    woffaddress = str(woff_dict('address')).replace("{", "").replace("}", "").replace(" ", "").replace("'uni", "'")
    numdict = {}
    for d in woffnum.split(","):
        numdict.update(eval('{' + d + '}'))
    addressdict = {}
    for d in woffaddress.split(","):
        addressdict.update(eval('{' + d + '}'))
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
        # "Cookie": "your own cookie",
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7'
    }
    conn = pymysql.connect(host='localhost', user="root", passwd="your password", db="大众点评")
    cur = conn.cursor()
    # only shops whose detail fields are still empty
    querysql = "SELECT shopid FROM shopdetail_restaurant where fivescore is NULL"
    cur.execute(querysql)
    rows = cur.fetchall()
    for row in rows:
        shopid = row[0]
        requeststr1 = "https://www.dianping.com/ajax/json/shopDynamic/reviewAndStar?shopId={}&cityId=1&mainCategoryId=10".format(shopid)
        requeststr2 = "https://www.dianping.com/ajax/json/shopDynamic/basicHideInfo?shopId=" + shopid
        requeststr3 = "https://www.dianping.com/ajax/json/shopDynamic/shopAside?shopId=" + shopid
        # the ajax endpoints are fetched with a plain User-Agent and no cookie
        headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36"}
        res = requests.get(requeststr1, headers=headers).json()
        avgPrice = decodestr(res['avgPrice'])
        defaultReviewCount = decodestr(res['defaultReviewCount'])
        try:
            fivescore = res['fiveScore']
        except:
            fivescore = '-'
        if fivescore == "-":
            fivescore = 0
        scoreTaste = decodestr(res['shopRefinedScoreValueList'][0])
        if scoreTaste == "-":
            scoreTaste = 0
        scoreEnvironment = decodestr(res['shopRefinedScoreValueList'][1])
        if scoreEnvironment == "-":
            scoreEnvironment = 0
        scoreService = decodestr(res['shopRefinedScoreValueList'][2])
        if scoreService == "-":
            scoreService = 0
        res = requests.get(requeststr2, headers=headers).json()
        shopName = res['msg']['shopInfo']['shopName'].replace("'", "\\'")
        branchName = res['msg']['shopInfo']['branchName']
        address = decodestr(res['msg']['shopInfo']['address']).replace("'", "\\'")
        phoneNo = decodestr(res['msg']['shopInfo']['phoneNo'])
        shopGroupId = res['msg']['shopInfo']['shopGroupId']
        if shopGroupId == shopid:
            shopGroupId = ""
        res = requests.get(requeststr3, headers=headers).json()
        glat = res['shop']['glat']
        glng = res['shop']['glng']
        categoryName = res['category']['categoryName']
        if branchName is None:
            branchName = ""
        print(avgPrice + " " + defaultReviewCount + " " + str(fivescore) + " " + str(scoreTaste) + " " + str(scoreEnvironment) + " " + str(scoreService) + " " + shopName + " " + branchName + " " + address + " " + phoneNo + " " + shopGroupId + " " + str(glat) + " " + str(glng) + " " + categoryName)
        insertSQLStrings = "update `大众点评`.`shopdetail_restaurant` SET `fivescore` = {},`scoreTaste`={},`scoreEnvironment`={},`scoreService`={},`avgPrice`={},`defaultReviewCount`={},`shopName`='{}',`branchName`='{}',`address`='{}',`phoneNo`='{}',`shopGroupId`='{}',`glat`={},`glng`={} WHERE shopid = '{}'".format(fivescore, scoreTaste, scoreEnvironment, scoreService, avgPrice, defaultReviewCount, shopName, branchName, address, phoneNo, shopGroupId, glat, glng, shopid)
        cur.execute(insertSQLStrings)
        conn.commit()
        time.sleep(2)

Once all three scripts have run, the shopdetail_restaurant table holds the finished dataset.
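A quick spot-check of the finished table might look like this (a hypothetical example, reusing the pymysql connection from the last script):

# print a few fully-populated rows as a sanity check
cur.execute("SELECT shopName, avgPrice, fivescore, glat, glng, address "
            "FROM shopdetail_restaurant WHERE fivescore IS NOT NULL LIMIT 5")
for row in cur.fetchall():
    print(row)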
