What about this dynamic_finder(url) function, my user-defined Page class, and PyQt5.QtWebEngineWidgets.QWebEnginePage prevents me from running it more than once?
    from bs4 import BeautifulSoup
    from colorama import Fore

    # Modules for dynamic JS websites
    from PyQt5.QtWebEngineWidgets import QWebEnginePage
    from PyQt5.QtWidgets import QApplication
    from PyQt5.QtCore import QUrl
    import sys


    def dynamic_finder(url_path):
        class Page(QWebEnginePage):
            def __init__(self, url):
                self.app = QApplication(sys.argv)
                QWebEnginePage.__init__(self)
                self.html = ''
                self.loadFinished.connect(self._on_load_finished)
                self.load(QUrl(url))
                self.app.exec_()

            def _on_load_finished(self):
                self.toHtml(self.callable)
                print(Fore.YELLOW + 'Dynamic Load finished')

            def callable(self, html_str):
                self.html = html_str
                self.app.quit()

        page = Page(url_path)
        soupy = BeautifulSoup(page.html, 'html.parser')
        tag = soupy.a
        output = tag.text
        return output, tag.attrs['href']


    print(dynamic_finder("https://www.fb.com"))
    print(dynamic_finder("https://www.apple.com"))
    print(dynamic_finder("https://www.a.co"))
    print(dynamic_finder("https://www.netflix.com"))
    print(dynamic_finder("https://www.google.com"))
The first call works. However, whenever I try to create a second instance of Page (by calling dynamic_finder more than once) in PyCharm, the process crashes with "Process finished with exit code -1073741819 (0xC0000005)".
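For anyone reading that message: the two numbers are the same value, since PyCharm shows the exit code as a signed 32-bit integer followed by its unsigned hex form. 0xC0000005 is the Windows access-violation status, so the interpreter is crashing in native code rather than raising a Python exception. A small sketch of the conversion (the helper name to_ntstatus is only for illustration):

```python
def to_ntstatus(exit_code: int) -> str:
    """Format a signed 32-bit process exit code as an unsigned hex value."""
    # Masking with 0xFFFFFFFF reinterprets the negative number as unsigned.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073741819))  # 0xC0000005
```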
I found one potential solution in this question on Stack Overflow, but even after applying the suggested change to my settings, I still face the same issue. I have also tried using a different variable name (page1, page2, etc.) for each function call, to no avail.
I'm not sure how Stack Overflow handles package requirements (this is only my second time posting a question), but for those who want to run the code, see the requirements.txt and the Anaconda Prompt (Windows) commands below.
    beautifulsoup4==4.9.1
    colorama==0.4.1
    ipython==7.18.1
    PyQt5==5.15.2
    PyQt5-sip==12.8.1
    PyQtWebEngine==5.15.2
    requests==2.22.0
    virtualenv==20.0.31
    widgetsnbextension==3.5.1
    pip install virtualenv
    virtualenv venv
    .\venv\Scripts\activate
    pip install -r requirements.txt
    ipython
I believe the problem lies within the Page class definition which I adapted from this YouTube video.
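One way to narrow this down, offered as a rough sketch and not as a fix: run each lookup in a fresh Python process, so that Page and its QApplication are only ever created once per process. If the crash disappears under that setup, the repeated creation inside a single process is the likely trigger; that is an assumption to test, not something confirmed here. The helper name run_isolated is hypothetical, and the sketch uses only the standard library.

```python
import subprocess
import sys


def run_isolated(source: str) -> str:
    """Run Python source in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    # A native crash such as 0xC0000005 shows up as a non-zero return code.
    if result.returncode != 0:
        raise RuntimeError(
            f"child exited with code {result.returncode}: {result.stderr}"
        )
    return result.stdout.strip()
```

Assuming the code above were saved as a module named finder.py (a hypothetical name) with the trailing print calls removed, each URL could be checked with something like run_isolated('from finder import dynamic_finder; print(dynamic_finder("https://www.apple.com"))'). The returned text would also contain the 'Dynamic Load finished' line, since the child prints it to stdout.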