<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Blogs on Ke's Notes and Blogs</title><link>https://kxue43.github.io/notes-and-blogs/blogs/</link><description>Recent content in Blogs on Ke's Notes and Blogs</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 05 Feb 2022 17:31:35 -0500</lastBuildDate><atom:link href="https://kxue43.github.io/notes-and-blogs/blogs/index.xml" rel="self" type="application/rss+xml"/><item><title>Selenium vs. Playwright</title><link>https://kxue43.github.io/notes-and-blogs/blogs/selenium-vs-playwright/</link><pubDate>Sat, 05 Feb 2022 17:31:35 -0500</pubDate><guid>https://kxue43.github.io/notes-and-blogs/blogs/selenium-vs-playwright/</guid><description>&lt;p&gt;Recently I developed a Python program that scrapes a Single-Page Application with 
 &lt;a href="https://selenium-python.readthedocs.io/"&gt;Selenium&lt;/a&gt;,
and then reimplemented it with 
 &lt;a href="https://playwright.dev/python/"&gt;Playwright&lt;/a&gt;. This gave me firsthand knowledge of the differences
between the two. In this article, I explain why I prefer Playwright over Selenium and share some general
lessons from scraping single-page applications.&lt;/p&gt;





&lt;h2 id="the-reason-and-result-of-the-reimplementation" class="heading "&gt;The reason and result of the reimplementation&lt;a href="#the-reason-and-result-of-the-reimplementation" aria-labelledby="the-reason-and-result-of-the-reimplementation"&gt;








 &lt;svg class="svg-inline--fa fas fa-link anchor" fill="currentColor" aria-hidden="true" role="img" viewBox="0 0 576 512"&gt;&lt;use href="#fas-link"&gt;&lt;/use&gt;&lt;/svg&gt;&amp;nbsp;
 &lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;My scraping program needs to click an anchor element on the target webpage to trigger a file download in the browser,
wait for the download to finish, and then work with the downloaded file. With Selenium, surprising as
it may seem, waiting for a file download to finish is not straightforward. According to its
 &lt;a href="https://www.selenium.dev/documentation/test_practices/discouraged/file_downloads/"&gt;official documentation&lt;/a&gt;:&lt;/p&gt;</description></item></channel></rss>