Newsletter Subject

Navigating ChromeDriver crashes in Kubernetes: A tale of test automation resilience

From

ministryoftesting.com

Email Address

hello@ministryoftesting.com

Sent On

Tue, Jul 9, 2024 01:17 PM

Email Preheader Text

Overcome ChromeDriver crashes and resource limitations by testing on Kubernetes by Dan Burlacu | Why

Overcome ChromeDriver crashes and resource limitations by testing on Kubernetes, by Dan Burlacu. [Read online at Ministry of Testing](

Why test on Kubernetes?

In my day-to-day work as a Software Developer in Test, I was tasked with developing automated UI (user interface) tests for a complex web application with dynamically generated content. The web elements I need to interact with often lack static attributes that can be easily referenced, so I have to use more complex strategies to locate and interact with them. After some web element localisation gymnastics to make sure I could click the right buttons and access the right iframes, the tests were complete.

The web application runs as a separate K8S (Kubernetes) instance for each client company, on a different K8S cluster, where the resources for that particular instance are grouped inside namespaces. The UI tests are triggered automatically before a major version update of the web application, and again immediately after, to make sure the update has not negatively impacted its usability. This was achieved by containerising the UI test code in a Docker image, pushing it to an organisational repository, and using a K8S job to deploy the tests to the specific instance namespace whenever an update was pending, and immediately after. Because it was deployed to hundreds of instances, the Docker image needed to be light, so the tests ran in a Linux environment. Running in Linux with no display support meant the tests could not open a normal browser window and had to use the browser's headless mode instead. All development and testing is done with the Chrome web browser company-wide, so naturally I used ChromeDriver to drive the Chromium browser set up on container deployment.
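The article does not show the K8S job itself, so here is a minimal hypothetical sketch of the deployment pattern it describes; every name, the namespace, and the image path are placeholders, not details from the article:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ui-tests                  # placeholder job name
  namespace: client-acme          # per-client instance namespace (placeholder)
spec:
  backoffLimit: 0                 # a failing suite should report, not retry blindly
  ttlSecondsAfterFinished: 3600   # the ephemeral job is discarded a while after it completes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: ui-tests
          image: registry.example.com/qa/ui-tests:latest  # placeholder registry path
```

The `ttlSecondsAfterFinished` field matches the article's note that the job is discarded some time after completion.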
A server that kept track of the upgrade schedule triggered the tests for a particular instance of the web application, and the tests reported a JSON summary of the results back to the server.

[Diagram: the server triggers the tests, the K8S environment runs them, and the tests return a JSON summary of results.]

Starting your journey

The first try at running the tests in a containerised fashion, using Docker Desktop, produced the following error:

selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally. (unknown error: DevToolsActivePort file doesn't exist) (The process started from chrome location /usr/lib/chromium/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)

Double-check your web results

Upon searching for the error online [1], I came across a solution that seemed to work: passing the --no-sandbox flag in the WebDriver options. My tests worked, so I could have moved on, but I wondered what that flag actually does and whether using it could negatively impact the tests or the environment they run in. It turns out that "The sandbox removes unnecessary privileges from the processes that don't need them in Chrome for security purposes. Disabling the sandbox makes your PC more vulnerable to exploits via web pages, so Google doesn't recommend it." This explanation was found here [1], and the flag appears to be needed to run Chrome headless on Unix-like systems. Given that my tests ran in an ephemeral K8S job that is discarded some time after it completes, I did not need to worry about environment security concerns.

This, however, introduced a new issue when I ran the tests from my development environment, which is not discarded after use and runs Windows 11 Pro, version 22H2. In about 50% of cases, after the tests finished and the WebDriver.quit() line was executed, Chrome left two lingering background processes.
These needed to be killed manually from the task manager, or else running the tests multiple times would drive CPU usage to 100%, making the PC unusable without a restart. The testing community on the web again came to the rescue, as this problem is documented in this GitHub issue [2].

Try running in the final environment

Everything seemed to work fine on Docker Desktop, so I could have published the final image with the testing code and been done with it, chanting the infamous "works on my machine". I was even delivering "my machine", the container environment where the tests worked, to the location they were supposed to run in, the K8S cluster, so no issues there, right? Wrong. When running the tests in their target environment, namely K8S, only some tests from the whole suite executed correctly; all the others failed.

Be creative

Having already worked on these tests for quite a long time, I had to deliver. With no time to investigate further why the UI tests failed without any apparent set-up error, I got the idea of thinking outside the company box by using Firefox. That solved the issue, and the tests finally worked in the environment they were supposed to. The code was delivered and it did what it was supposed to do, but the question remained: why did the UI tests fail in a K8S environment?

Be curious

After delivering the code, I could have patted myself on the back for a job well done and closed the chapter, but the question of why some of the tests work on Firefox and not on Chrome kept bugging me. So I started investigating, without the pressure of delivering results. For debugging purposes, I set the Dockerfile to run a sleep infinity command after creating the container, instead of the usual command that triggered the UI tests.
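The article's actual debugging Dockerfile was shown only as an image, so here is a hypothetical sketch of the idea; the base image, package names, and paths are all assumptions, not taken from the article:

```dockerfile
# Same contents as the normal test image, but the command just sleeps so the
# pod stays alive for interactive debugging (kubectl exec / kubectl cp).
FROM python:3.11-alpine
RUN apk add --no-cache chromium chromium-chromedriver
WORKDIR /tests
COPY . .
RUN pip install --no-cache-dir selenium
# The production image would launch the test suite here; for debugging:
CMD ["sleep", "infinity"]
```

With the pod idling, commands can be run inside it with kubectl exec, and files copied out with kubectl cp.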
This kept the container alive as I executed commands inside it to run the tests. It also provided a way to transfer files between the K8S pod that was running my tests and my local machine.

[Image: the full Dockerfile used for debugging.]

I wanted to see how the UI looked when the tests failed, so I wrote some logic to take screenshots on test failure. As I could not view the images in the Linux server environment inside the K8S pod, I downloaded the files to my PC. It turned out that during one of the failing tests, the web page was trying to load an iframe, and the iframe did not load when the page was accessed from inside the K8S pod. This issue did not occur when running the tests headless from localhost, nor when running them inside a Docker Desktop container based on the same image used in the Kubernetes environment. I now knew why the tests had failed, the iframes did not load, but what could be the cause of that?

Be methodical

I thought about the processes going on during the tests and imagined what could go wrong such that the iframe would not load. The first thing that came to mind was that the browser could not reach the iframe URL from inside the K8S pod. I copied the iframe URL from DevTools in my local browser and pinged it from inside the K8S pod, and it responded fine. This ruled out a connection issue. If the browser reached the iframe, maybe the WebDriver wasn't handling the response correctly, so some logs would be helpful. I saved the WebDriver logs and went through them line by line, comparing them to the WebDriver logs from the Docker Desktop container. They were identical, so the WebDriver's handling of the connection to the iframe was not the problem. Was it a problem with the way the browser displayed the information it got from the iframe? Were there any problems in the DevTools console logs?
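As an aside, the screenshot-on-failure logic mentioned in the debugging section above is not shown in the article; a minimal sketch might look like this (the directory name is a hypothetical placeholder, and `driver` can be any object exposing Selenium's save_screenshot()):

```python
import os
import time


def run_with_screenshot(driver, test_fn, shot_dir="/tmp/failshots"):
    """Run test_fn(driver); on failure, save a PNG before re-raising.

    The saved files can then be copied out of the pod for inspection
    (e.g. with kubectl cp). Returns True when test_fn succeeds.
    """
    try:
        test_fn(driver)
        return True
    except Exception:
        os.makedirs(shot_dir, exist_ok=True)
        name = "failure_{}.png".format(int(time.time() * 1000))
        driver.save_screenshot(os.path.join(shot_dir, name))
        raise
```

Re-raising keeps the test runner's own failure reporting intact while still leaving a screenshot behind.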
I wrote some logic in the code to fetch the DevTools console logs and save them to a file while running the tests. Going through the file, I found the culprit:

{'level': 'SEVERE', 'message': ' - Failed to load resource: net::ERR_INSUFFICIENT_RESOURCES', 'source': 'network', 'timestamp': 1704886353348}

The browser did not have sufficient resources to load a simple iframe? I checked the resources allocated to the pod, and they were more than enough. I then turned to Stack Overflow for answers, which pointed to a 2011 Chromium bug that forced the browser to hit a memory limit on Linux [3]. Still no solution in sight, as the bug was not resolved and the last reply was from 2018.

Be resilient

Not wanting to give up, I researched what each of the Selenium ChromeOptions means, to see if one of them could fix my problem. I stumbled upon the --disable-dev-shm-usage option in this post [4], which stated: "The /dev/shm partition is too small in certain VM environments, causing Chrome to fail or crash. Use this flag to work-around this issue (a temporary directory will always be used to create anonymous shared memory files)." It appears to be related to another Chromium bug, as linked in the post. The way the Chrome browser used resources in a K8S environment was different from Docker Desktop and from my local machine.

Be successful

I initialised my ChromeDriver with this flag and ran the tests. After such a long journey, it finally worked. I was now running UI tests in a Chromium browser in a Linux environment on a K8S cluster. I changed the code back to use the Chromium browser for running the tests and deployed the new image to the company container registry. I then checked for an instance with an upcoming update and waited for the test results to roll in. Everything worked, and this journey brought a sense of accomplishment and enriched my testing experience.
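The console-log capture described above can be sketched as follows. The article does not show its code, so this is an assumption about the setup: capturing the "browser" log requires the driver to have been created with the goog:loggingPrefs capability set to {'browser': 'ALL'}. The sample entry in the test is the actual culprit line from the article.

```python
def severe_entries(console_log):
    """Keep only SEVERE entries from a Selenium 'browser' log.

    Each entry is a dict of the shape the article found, e.g.
    {'level': 'SEVERE', 'message': '... net::ERR_INSUFFICIENT_RESOURCES', ...}.
    """
    return [entry for entry in console_log if entry.get("level") == "SEVERE"]


def dump_console_log(driver, path):
    """Fetch the DevTools console log from a live driver and save it to a file.

    Assumes the driver was built with goog:loggingPrefs = {'browser': 'ALL'}
    so that console messages are recorded. One repr'd entry per line.
    """
    entries = driver.get_log("browser")
    with open(path, "w") as fh:
        for entry in entries:
            fh.write(repr(entry) + "\n")
    return entries
```

Filtering to SEVERE first makes the net::ERR_INSUFFICIENT_RESOURCES line easy to spot among routine console noise.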
To wrap up

If you ever feel the gentle nudge of the nagging question "Why?", take the time and resources to pursue an answer. It might take you where few others have gone before and prove to be an advantage in navigating the ever-changing world of software testing. If you do find something interesting, take the time to share it with others, because together we test the world, one bug at a time. For those facing the challenge of testing within a K8S environment, be sure to use the two options that made all this work possible:

chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')

In summary: tasked with UI test development for a dynamically generated web app, initial runs in Docker Desktop hit Chrome crashes, resolved with the --no-sandbox flag, while lingering processes on Windows required further community solutions. Deploying the tests in Kubernetes revealed further failures, prompting Firefox adoption as a workaround. Curiosity drove a post-delivery investigation that uncovered iframe loading issues. Resource checks, bug searches, and ChromeOptions exploration led to success with --disable-dev-shm-usage, ensuring the UI tests run smoothly in Kubernetes. The journey underscores the importance of curiosity, resilience, and resourcefulness in overcoming testing challenges, with lessons shared for navigating similar hurdles.
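Putting the pieces together, a full driver setup using the article's two flags plus headless mode might look like the sketch below. The flags are kept as plain data so they can be reused without Selenium present; the Chromium binary path is the one from the article's crash message and may differ per base image:

```python
# The two flags the article arrived at, plus headless mode, as plain data.
K8S_CHROME_FLAGS = [
    "--headless",              # no display server inside the pod
    "--no-sandbox",            # Chrome will not start its sandbox as root in a container
    "--disable-dev-shm-usage", # /dev/shm is too small in many container runtimes
]


def build_chrome_options(binary_path="/usr/lib/chromium/chrome"):
    """Assemble ChromeOptions for a headless run inside a K8S pod.

    Requires the `selenium` package (imported lazily so the flag list
    above stays usable without it). Pass the result to webdriver.Chrome().
    """
    from selenium.webdriver.chrome.options import Options

    opts = Options()
    opts.binary_location = binary_path
    for flag in K8S_CHROME_FLAGS:
        opts.add_argument(flag)
    return opts
```

Usage would then be `driver = webdriver.Chrome(options=build_chrome_options())` inside the container.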
References:
1. [What does the Chromium option `--no-sandbox` mean?](
2. [Chrome process still running in background after driver.quit()](
3. [Chrome fails to load more than ~440 images on one page.](
4. [Meaning of Selenium ChromeOptions](

For more information:
- [Tooling for Automated Testing with Butch Mayhew](
- [Approach to Comparing Tools with Matthew Churcher](
- [Digging In: Getting Familiar With The Code To Be A Better Tester]( by Hilary Weaver
- [Community Guide on UI Automation]( (a Ministry of Testing collection)

Copyright © 2024 Ministry of Testing, All rights reserved. Ministry of Testing, 19 New Road, Brighton, East Sussex BN1 1UF, United Kingdom.


