
How to use BeautifulSoup to extract tags from a web page's source code and write the extracted data into an Excel file one after another

2022-08-06 16:03:51 CSDN Q&A

How can I use BeautifulSoup to extract the tags in a web page's source code and write the extracted data into an Excel file one after another?

Code related to the question

import xlwt
from bs4 import BeautifulSoup

if __name__=="__main__":
    wookbook  = xlwt.Workbook() # create the workbook
    sheet1 = wookbook.add_sheet('Sheet_one',cell_overwrite_ok=True)  # create a sheet named Sheet_one
    headlist = ['序号','列名','英文名'] # header row: No., column name, English name
    row = 0
    col = 0
    # write the header row
    for head in headlist:
        sheet1.write(row, col, head)
        col = col + 1

html = """<thead><tr><th class="bs-checkbox " style="width: 36px; " data-field="ck" tabindex="0"><div class="th-inner "><input name="btSelectAll" type="checkbox"></div><div class="fht-cell"></div></th><th style="" data-field="isInside" tabindex="0"><div class="th-inner ">Belong to the mobile inside or outside the network number</div><div class="fht-cell"></div></th><th style="" data-field="businessCategory" tabindex="0"><div class="th-inner ">业务类别(Inbound access)</div><div class="fht-cell"></div></th>"""
soup = BeautifulSoup(html, 'lxml')
tr_list = soup.find_all('tr')[1:]
for th in soup.select('th'):
    print(th['data-field'])
headlist=th['data-field']
row = 1  # start writing data from the second row
for c, top in enumerate(headlist):
    sheet1.write(row, 2, top)  # row and col select the cell, top is the value written
row += 1
#wookbook.save(r'D:\test.xls')

Output and error messages

The data I need is printed correctly, but writing it into the Excel file fails.


My approach and what I have tried

I tried several ways of pulling the data out and putting it into a list, but each value ends up in its own list. When I then write to Excel, only one value is written; I cannot get all of them written one after another.

This is the version that converts the data into a list before writing it:

import xlwt
from bs4 import BeautifulSoup

if __name__=="__main__":
    wookbook  = xlwt.Workbook() # create the workbook
    sheet1 = wookbook.add_sheet('Sheet_one',cell_overwrite_ok=True)  # create a sheet named Sheet_one
    headlist = ['序号','列名','英文名'] # header row: No., column name, English name
    row = 0
    col = 0
    # write the header row
    for head in headlist:
        sheet1.write(row, col, head)
        col = col + 1

html = """<thead><tr><th class="bs-checkbox " style="width: 36px; " data-field="ck" tabindex="0"><div class="th-inner "><input name="btSelectAll" type="checkbox"></div><div class="fht-cell"></div></th><th style="" data-field="isInside" tabindex="0"><div class="th-inner ">Belong to the mobile inside or outside the network number</div><div class="fht-cell"></div></th><th style="" data-field="businessCategory" tabindex="0"><div class="th-inner ">业务类别(Inbound access)</div><div class="fht-cell"></div></th>"""
soup = BeautifulSoup(html, 'lxml')
for th in soup.select('th'):
    headlist=th['data-field']
    A = headlist.split()
    print(A)
row = 1  # start writing data from the second row
for c, top in enumerate(A):
    sheet1.write(row, 2, top)  # row and col select the cell, top is the value written
row += 1
#wookbook.save(r'D:\test.xls')


The values did end up in lists, but it looks to me like three separate lists rather than one list.
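
One plausible reading of both attempts, sketched in plain Python with no Excel library involved. The list `data_fields` is hard-coded here to stand in for the values the loop prints; the names `chars` and `collected` are made up for this illustration. The sketch shows why a name assigned inside the loop holds only the last value afterwards, and why iterating over a single string yields characters instead of tag values.

```python
# Values the question's loop prints, hard-coded here for illustration.
data_fields = ["ck", "isInside", "businessCategory"]

# Second attempt: split() on a string without whitespace returns a
# one-element list, and A is rebound on every pass of the loop.
for value in data_fields:
    A = value.split()
print(A)  # only the last one-element list is left

# First attempt: headlist ends up as the last string, and enumerate()
# over a string walks its characters rather than the tag values.
headlist = data_fields[-1]
chars = [ch for _, ch in enumerate(headlist)]

# Appending to one list created before the loop keeps every value.
collected = []
for value in data_fields:
    collected.append(value)
print(collected)
```

If this reading is right, the three printed lists are three different objects created one after another, and only the final one is still bound to `A` when the write loop starts.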

The result I want

All of the extracted data-field values written into the Excel file one after another.
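
A minimal sketch of one way to get every `data-field` value into its own row, under several assumptions: the HTML is a trimmed copy of the snippet in the question, the values belong in the third column (index 2, under 英文名) as in the original code, and the helper names `extract_data_fields`, `write_column`, and `save_workbook` are hypothetical. It uses the built-in `html.parser` instead of `lxml` only to avoid an extra dependency. In the posted code `row += 1` appears to sit outside the write loop (the indentation is ambiguous), so here the row index is computed inside the loop.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the header row from the question (assumption: same structure).
HTML = """<thead><tr>
<th class="bs-checkbox " data-field="ck"><div class="th-inner "><input name="btSelectAll" type="checkbox"></div></th>
<th data-field="isInside"><div class="th-inner ">Belong to the mobile inside or outside the network number</div></th>
<th data-field="businessCategory"><div class="th-inner ">业务类别(Inbound access)</div></th>
</tr></thead>"""


def extract_data_fields(html):
    """Return the data-field attribute of every <th> that has one, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    return [th["data-field"] for th in soup.select("th[data-field]")]


def write_column(sheet, values, col=2, first_row=1):
    """Write each value into its own row of one column, starting at first_row."""
    for offset, value in enumerate(values):
        # The row advances with every value, so no cell is overwritten.
        sheet.write(first_row + offset, col, value)


def save_workbook(values, path):
    """Create the workbook from the question, fill the column, and save it."""
    import xlwt  # imported here so the helpers above work without xlwt installed

    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("Sheet_one", cell_overwrite_ok=True)
    # Header row: No., column name, English name.
    for col, head in enumerate(["序号", "列名", "英文名"]):
        sheet.write(0, col, head)
    write_column(sheet, values)
    workbook.save(path)
```

Calling `save_workbook(extract_data_fields(HTML), "test.xls")` should then produce a sheet with the header row followed by one data-field value per row. The other columns could be filled the same way, for example with `th.get_text(strip=True)` for the column name, if that is what the missing screenshot showed.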

Copyright notice
Author: CSDN Q&A. Please include the original link when reprinting, thank you.
https://en.primo.wiki/2022/218/202207310003021564.html
