current position:Home>Python crawler obtains the data of Douban movie top250 fantasy class
Python crawler obtains the data of Douban movie top250 fantasy class
2022-02-03 00:55:02 【CSDN Q & A】
# There is an error :ValueError: Length mismatch: Expected axis has 0 elements, new values have 12 elements
# Code :
Get the ranking of Douban films - Fantasy Film
import requests
import pandas as pd
import time
Grab from web page
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
}
quest_data= pd.DataFrame()
n = 0
index = []
N=20# Number of movies per acquisition
for interval_id in range(10,0,-1): # In the range of interval_id
for start in range(9999): # Start sequence number start
url = 'https://movie.douban.com/j/chart/top_list?type=23&interval_id={}%3A{}&action=&start={}&limit={}'.format(interval_id * 10, (interval_id - 1) * 10, start * N, N)
html1 = requests.post(url, headers=headers)
try:
film_data = eval(html1.text.replace('true', 'True').replace('false', 'False'))
except:
continue
for film in film_data:
df = pd.DataFrame(
data=[film['rank'], film['title'], film['cover_url'], film['actors'], film['is_playable'],
film['id'], film['types'],
film['regions'], film['release_date'], film['score'], film['vote_count'], film['url']])
quest_data= quest_data.append(df.T)
n += 1
index.append(n)
time.sleep(6) if len(film_data) < N: break
quest_data.columns = [' ranking ',' Film name ',' Movie picture link ',' actor ', ' Playable ',' douban ID',' Film type ',' Producer country ',' Time of issue ', ' score ', ' Number of evaluators ',' link ']
quest_data.index = index
Store data
quest_data.to_excel(r'E:\ Douban film category ranking - Fantasy Film .xls') # r'D:\ Douban film category ranking -xx slice .xls'
Refer to the answer 1:
Please sort out the code format , Note that the indentation
Refer to the answer 2:
copyright notice
author[CSDN Q & A],Please bring the original link to reprint, thank you.
https://en.primo.wiki/2022/02/202202030055005284.html
The sidebar is recommended
- SQL. Why not attach it, as shown in the figure
- How does robotjs set automatic rebuild in electronic build
- Python final exam SOS
- I admit that I have degenerated, but I still want to know how to solve Java
- I have encountered several array problems in C + +. I hope the big God can solve the puzzle. It's best to have comments
- Why is it double? don't get it
- C language novice village, high adoption rate
- The usage and law of this function are expected to be explained.
- Let's see what's wrong. My code will output one less country
- Why is the average 0
guess what you like
-
The replacement of tab key in hive leads to a problem in the file.
-
Now to realize a function, use the El tree component of element UI, select the relevant node, and then copy a new tree with only relevant parent-child nodes below.
-
Sqlyog database import failed with the following error. Do you know
-
How do two lists compare in the same location
-
How to group the following data by film and television, education, music, E-sports and other sectors
-
Server socket + Android socket can only send a message once after connecting. The second sending does not respond. You need to reconnect the socket and solve it
-
Questions about student information management system C++
-
Qtable widget of QT double click to edit and display the problem
-
Separate strings and insert a comma for each 5 characters
-
QT qtablewidget double click to modify the content. What is the problem
Random recommended
- MySQL reports an error in Navicat. How to solve it?
- Router link cannot jump into router view (Vue)
- Why is the camera preview frame obtained by Android like this
- How to implement JS? The address bar changes the page load and refresh, and the selected conditions of the current page remain unchanged as all by default
- Why did 0.3 become 0.2999999999
- What is the reason for the warning when Vue project is packaged
- Yolov5 deployment final inference report cannot find picture
- How to extract the characters of a paragraph in Notepad?
- Computer network technology IP questions
- MySQL group query gets the data of the first 24 hours in each group
- What's wrong with this and need to be changed?
- Write a program to find the number of integers between 200 and 300 that meet the following conditions. Condition: the product of hundreds, tens and individuals is equal to the sum of hundreds, tens and individuals
- In Python, the condition setting and interpretation of while loop
- How can jupyter notebook run on the next line of code followed by the above code
- The page has been loaded. Press F12 at the top to show it. What's going on
- This CompareTo method is not called. How do you sort the elements in the collection
- Using pointer method: input three integers ABC and output them in order of size
- C # new ADO Net data model, error, failed to load file or assembly
- Input n integers from the keyboard and store them in a pile of arrays. Put all positive numbers in the array in front of the array and negative numbers in the back to output the array
- Why can't the small gourd be installed.
- Java byte array and string conversion
- Definition of two-dimensional array in C language
- Why is there an extra 0 here?
- Python code programs don't
- Measure the distance with sound wave, and then send the result to the serial port. Do you want to interrupt twice, language interrupt, and ask for an idea
- An error occurs when starting Hadoop. How to solve it?
- What about the problem of remote access server?
- pywintypes. com_ Error: (- 2147023728, 'element not found', none, none)
- Global variables don't work in this program? Why is the answer a
- In Python, there is something wrong with the program and an error is reported
- Simple OJ questions for Freshmen
- Butterfly effect problem, insufficient number of input parameters
- Use of extractall function of Python zipfile module
- After spring imports the database and c3p0 dependency package in idea, it will prompt the database for abnormal error
- Python from introduction to practice 5-10
- 1703792 what does that mean? What's wrong?
- C language and C + are easy to search dogs
- This interface is always used when I use CONDA to install tensorflow? About half an hour. How?
- React summernote rich text editor!! Why do some function icons of rich text editor fail after I pack them
- How does silly girl robot log in with mobile phone