QA Graphic
January 22, 2025

Determine Installed Python Packages and Their Disk Usage

Useful for Disk Cleanup

When working with Python, it's common to install numerous packages over time. Some of these packages might no longer be needed, and identifying them can help free up valuable disk space. In this blog post, we'll explore a practical way to list all installed Python packages, locate their directories, and estimate their size on disk.

The Command Breakdown

The following command pipeline is a powerful one-liner that accomplishes our goal:

pip list |
  tail -n +3 |
  awk '{print $1}' |
  xargs pip show |
  grep -E 'Location:|Name:' |
  cut -d ' ' -f 2 |
  paste -d ' ' - - |
  awk '{print $2 "/" tolower($1)}' |
  xargs du -sh 2> /dev/null |
  sort -hr

Let's break this command into digestible pieces:

  1. pip list: Lists all installed Python packages and their versions.

  2. tail -n +3: Removes the first two lines of the output (the header row and the separator line beneath it) to leave only the package names and versions.

  3. awk '{print $1}': Extracts the first column, which contains the package names.

  4. xargs pip show: Feeds the package names to the pip show command to retrieve details about each package.

  5. grep -E 'Location:|Name:': Filters the output to include only the Location and Name fields.

  6. cut -d ' ' -f 2: Splits each line by spaces and extracts the second field, which is the value of the Location and Name fields.

  7. paste -d ' ' - -: Combines the Name and Location outputs into a single line per package.

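  8. awk '{print $2 "/" tolower($1)}': Constructs a path for each package by appending the lowercased package name to its location. This is an approximation: a package whose import directory differs from its distribution name (Pillow installs as PIL, for example) will not be found.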

  9. xargs du -sh 2> /dev/null: Calculates the disk usage of each package directory and suppresses error messages (e.g., for paths that do not exist or directories that are inaccessible).

  10. sort -hr: Sorts the packages by size in descending order.

Example Output

Running this command produces output similar to:

12M /path/to/python/site-packages/numpy
8.5M /path/to/python/site-packages/pandas
3.4M /path/to/python/site-packages/scipy
...

This shows the size of each installed package, helping you identify large ones that might no longer be necessary.

Use Cases

  1. Disk Space Cleanup: Remove large, unused packages to free up space.

  2. Environment Management: Understand which packages are installed and ensure that only necessary ones are present in your environment.

Pro Tip: Automating Cleanup

Once you identify packages you no longer need, you can remove them using:

pip uninstall <package_name>

Considerations

  • Virtual Environments: Always run this command within the virtual environment you want to inspect to avoid confusion with global packages.
  • Dependencies: Be cautious when uninstalling packages as they may be dependencies for others.
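
Before uninstalling, it helps to know which installed packages depend on the one you want to remove. pip show <package_name> already prints a Required-by field; the sketch below does a similar lookup in Python. The function names normalize and dependents_of are hypothetical, and the requirement parsing is deliberately simple (it also counts optional "extra" requirements).

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name for comparison (lowercase, dashes)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def dependents_of(package):
    """Return installed distributions that declare `package` as a requirement."""
    target = normalize(package)
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue
        for requirement in dist.requires or []:
            # A requirement looks like "requests>=2.0; extra == 'docs'";
            # the leading run of name characters is the project name.
            match = re.match(r"[A-Za-z0-9._-]+", requirement)
            if match and normalize(match.group(0)) == target:
                found.add(name)
    return sorted(found)
```

An empty list from dependents_of("some-package") suggests nothing else in the environment declares it as a requirement.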

By understanding your Python environment, you can keep it clean, efficient, and ready for action. Try out the command and see how much disk space you can reclaim!

January 15, 2025

Python expressions categorized by their type

Comparison and Assignment Operators

This is a comprehensive table of various Python expressions categorized by their type. Each expression is accompanied by a sample value to illustrate its usage.

Category            Expression                  Sample Value
Arithmetic          2 + 3                       5
                    5 - 2                       3
                    4 * 3                       12
                    10 / 2                      5.0
                    10 // 3                     3
                    10 % 3                      1
                    2 ** 3                      8
Comparison          5 > 3                       True
                    5 < 3                       False
                    5 == 5                      True
                    5 != 4                      True
                    5 >= 5                      True
                    5 <= 4                      False
Logical             True and False              False
                    True or False               True
                    not True                    False
Bitwise             0b1010 & 0b0110             0b0010 (2)
                    0b1010 | 0b0110             0b1110 (14)
                    0b1010 ^ 0b0110             0b1100 (12)
                    ~0b1010                     -0b1011 (-11)
                    0b1010 << 2                 0b101000 (40)
                    0b1010 >> 2                 0b10 (2)
Assignment          x = 5                       x is 5
                    x += 3                      x becomes 8
                    x -= 2                      x becomes 6
                    x *= 4                      x becomes 24
                    x /= 3                      x becomes 8.0
                    x %= 5                      x becomes 3.0
                    x **= 2                     x becomes 9.0
Membership          'a' in 'apple'              True
                    'b' not in 'apple'          True
Identity            x is y                      True or False
                    x is not y                  True or False
Ternary             a if condition else b       Depends on condition
Lambda              lambda x: x + 1             Function adding 1 to input
List Comprehension  [x*2 for x in range(3)]     [0, 2, 4]
Function Call       max(1, 2, 3)                3
Attribute Access    object.attribute            Depends on object
Indexing            my_list[0]                  First element of my_list
Slicing             my_list[1:3]                Sublist from index 1 to 2
String Formatting   f"Hello, {name}"            e.g., "Hello, Alice"
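
A few of the rows are easiest to trust once you run them. The short script below replays the assignment chain and some of the bitwise rows from the table; the variable names quotient and remainder are added only for illustration.

```python
# Augmented assignment chain from the table
x = 5
x += 3    # 8
x -= 2    # 6
x *= 4    # 24
x /= 3    # 8.0: true division always produces a float
x %= 5    # 3.0
x **= 2   # 9.0
print(x)  # 9.0

# Bitwise results, shown in binary with bin()
print(bin(0b1010 & 0b0110))  # 0b10
print(bin(0b1010 ^ 0b0110))  # 0b1100
print(~0b1010)               # -11

# Floor division and modulo together
quotient, remainder = divmod(10, 3)
print(quotient, remainder)   # 3 1
```

Note how x stays a float after the /= step, which is why the last three assignment rows show 8.0, 3.0, and 9.0 rather than integers.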

January 8, 2025

Harnessing Python and WGet for Efficient Web Scraping

Basic Usage of WGet in Python

Welcome to our exploration of Python and WGet, two powerful tools that can enhance your web scraping capabilities. Whether you're an experienced programmer or just starting out, this post will guide you through integrating these tools to streamline your data retrieval tasks.

What is WGet?

WGet is a free utility for non-interactive download of files from the web. It supports HTTP, HTTPS, and FTP protocols, making it a good tool for retrieving content from trusted sources. WGet can resume broken downloads, handle recursive downloads, convert links for local viewing, and much more, which makes it an excellent companion for web scraping projects.

Why Use Python with WGet?

Python, with its simplicity and extensive libraries, is perfect for scripting and automating tasks. When combined with WGet, you harness:

  • Simplicity: Python's syntax is easy to read and write, reducing development time.
  • Automation: Schedule downloads, manage files, and process data all within one script.
  • Flexibility: Handle data post-download with Python's data manipulation libraries like Pandas.

Setting Up

Before we dive into the example, ensure you have Python and WGet installed:

  • Python: Available on python.org.
  • WGet: On Unix-like systems, it's usually pre-installed or available via package managers like apt or brew. For Windows, you might need to download it from the GNU WGet site.

Example: Download and Process a Website

Let's create a simple Python script that uses WGet to download a website and then processes the downloaded content:


import os
import subprocess


def download_website(url, directory="downloaded_site"):
    """
    Download a website using WGet and save it to the specified directory.
    :param url: URL of the site to download
    :param directory: Directory to save the downloaded site
    """
    # Create directory if it doesn't exist
    os.makedirs(directory, exist_ok=True)
    # Build the WGet command as an argument list so the URL and directory
    # are never interpreted by a shell
    command = [
        "wget",
        "--recursive",
        "--no-clobber",
        "--page-requisites",
        "--html-extension",
        "--convert-links",
        "--restrict-file-names=windows",
        f"--directory-prefix={directory}",
        url,
    ]
    subprocess.run(command, check=True)
    print(f"Successfully downloaded {url} to {directory}")


def process_files(directory):
    """
    Placeholder function to process files after download.
    Here you could analyze content, extract information, etc.
    """
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.html'):
                # Example: You could open and read HTML files here
                pass


# URL of the site to scrape
url_to_download = "http://example.com"
# Download the site
download_website(url_to_download)
# Process the downloaded files
process_files("downloaded_site")

Explanation

wget Command: We use WGet with specific flags:

  • --recursive for recursive downloading.
  • --no-clobber to avoid re-downloading existing files.
  • --page-requisites to download all files necessary for the page display.
  • --html-extension appends .html to the names of downloaded HTML files that don't already end in .html (newer WGet versions call this option --adjust-extension).
  • --convert-links modifies links for local viewing.
  • --restrict-file-names=windows for Windows-compatible file names.

Subprocess: This module allows Python to run WGet as an external command.

File Processing: A basic example where we could implement parsing, data extraction, or any other processing.
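
As one possible way to fill in the file-processing placeholder, the sketch below pulls the page title out of each downloaded HTML file using only the standard library. The names TitleParser, extract_title, and collect_titles are hypothetical and chosen for this example.

```python
import os
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html_text):
    """Return the stripped <title> text of an HTML document ('' if none)."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


def collect_titles(directory):
    """Map each .html file under directory to its page title."""
    titles = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(".html"):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    titles[path] = extract_title(fh.read())
    return titles
```

Calling collect_titles("downloaded_site") after a download returns a dictionary of file paths and titles, which is a convenient starting point for further analysis.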

Conclusion

Combining Python with WGet gives you a potent tool for web scraping and data collection. This example just scratches the surface; you can extend this script to handle authentication, deal with specific formats, or integrate with other Python libraries for data analysis. Remember, with great power comes great responsibility - always respect the terms of service of the websites you scrape and consider the legal and ethical implications.

December 21, 2023

How pyperclip Can Supercharge Your Python Automation

Say Goodbye to Temporary Files


I have several Python scripts that make my life easier. Sometimes I create temporary files to view their output, which is fine, but there has to be a better way. Enter pyperclip, a cross-platform gem that turned my automation paradigm upside down.

Imagine this: your script crunches through data, generates a beautiful report, and poofs it into your clipboard. Gone are the days of saving messy filenames or having files clutter the desktop. With pyperclip, your output lands directly on the clipboard, ready to paste into your preferred text editor for further editing or wherever else it needs to go.

Here's how pyperclip works its magic:

  • Simple Installation: Just pip install pyperclip and you're good to go. On Windows and macOS it needs nothing else; on Linux it relies on a clipboard tool such as xclip or xsel being installed.
  • Copy in a snap: Use pyperclip.copy(my_report) to send any string, be it plain text, HTML, or even Markdown, straight to your clipboard.
  • Paste anywhere: Open your favorite text editor, hit Ctrl+V (Cmd+V on macOS), and voila! Your script's output is there, ready to be polished or shared.
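
The steps above can be sketched as a small script. This is only an illustration: build_report and copy_report are hypothetical helper names, and the copy_func parameter is an addition that lets you swap in another function (for example in tests or on a machine without a clipboard). By default it uses pyperclip.copy.

```python
def build_report(results):
    """Format (name, status) pairs as a plain-text report."""
    return "\n".join(f"{name}: {status}" for name, status in results)


def copy_report(text, copy_func=None):
    """Send text to the clipboard and return it.

    copy_func is optional; when omitted, pyperclip.copy is used.
    """
    if copy_func is None:
        import pyperclip  # third-party: pip install pyperclip
        copy_func = pyperclip.copy
    copy_func(text)
    return text
```

A typical call would be copy_report(build_report([("login", "PASS"), ("checkout", "FAIL")])), after which the report is ready to paste.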

But pyperclip isn't just about convenience. It offers several advantages:

  • Reduced disk I/O: No more temporary files cluttering your disk. Your script runs leaner and meaner.
  • Improved workflow: Paste directly into reports, emails, or anything else, saving precious time and context switching.
  • Platform independence: Works seamlessly across Windows, macOS, and Linux, making your scripts truly portable.

Of course, there are limitations. pyperclip deals primarily with plain text, so complex data structures or images might require alternative approaches. And, like any good tool, it's best used judiciously. Sensitive information shouldn't be carelessly chucked into the clipboard.

But for everyday automation tasks, pyperclip is a game-changer. It streamlines workflows, reduces complexity, and adds a touch of magic to your Python scripts. So, the next time you're tempted to create a temporary file, remember: with pyperclip, your output can be just a Ctrl+V away.

Go forth, fellow automators, and embrace the clipboard revolution!

Bonus Tip: Combine pyperclip with other libraries like pandas or beautifulsoup4 to scrape data, generate reports, and send them directly to your favorite text editor. The possibilities are endless!

I hope this blog post inspires you to explore the power of pyperclip and unlock a new level of efficiency in your Python automation efforts. Happy coding!

December 7, 2023

Using FFmpeg with Python

Cool Tricks with FFmpeg

With a background of five years in Quality Assurance (QA), I've had the opportunity to delve deep into the world of automation programming using Python. In this journey, one tool that has stood out for its versatility and power is FFmpeg, a comprehensive multimedia framework. This blog aims to share insights and practical advice on leveraging FFmpeg in Python for various automation tasks.

What is FFmpeg?

FFmpeg is an open-source software suite that can record, convert, and stream digital audio and video in various formats. It includes libavcodec, a leading audio/video codec library that is used by many other projects.

Why Use FFmpeg with Python?

Python, known for its simplicity and readability, is an excellent choice for automating tasks. When paired with FFmpeg's capabilities, it becomes a powerhouse for handling media files. Python's vast ecosystem offers libraries like moviepy, imageio, and ffmpeg-python that act as wrappers for FFmpeg, making it more accessible and easier to use within Python scripts.

Getting Started with FFmpeg in Python

Installation

  1. Install FFmpeg: Ensure FFmpeg is installed on your system. It's available for Windows, Mac, and Linux.

  2. Python Libraries: Install a Python wrapper for FFmpeg. You can use pip to install libraries like ffmpeg-python:

    pip install ffmpeg-python

Basic Operations

Video Conversion

Convert a video from one format to another:

import ffmpeg

input_video = 'input.mp4'
output_video = 'output.avi'

ffmpeg.input(input_video).output(output_video).run()

Extracting Audio

Extract audio from a video file:

import ffmpeg

input_video = 'input.mp4'
output_audio = 'output.mp3'

# The .mp3 output can only hold audio, so FFmpeg drops the video stream
ffmpeg.input(input_video).output(output_audio).run()

Advanced Usage

Video Editing

Combine multiple video clips into one:

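import ffmpeg

input1 = ffmpeg.input('input1.mp4')
input2 = ffmpeg.input('input2.mp4')
# concat expects the streams of each clip in order: video, audio, video, audio
joined = ffmpeg.concat(input1.video, input1.audio, input2.video, input2.audio, v=1, a=1).node
output = ffmpeg.output(joined[0], joined[1], 'output.mp4')
ffmpeg.run(output)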

Automated Testing

Create automated tests for video/audio quality, format compatibility, and performance testing.

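import ffmpeg

# Example: Verify video resolution
video_info = ffmpeg.probe('video.mp4')
# streams[0] may be an audio track, so pick the first video stream explicitly
video_stream = next(s for s in video_info['streams'] if s['codec_type'] == 'video')
width = video_stream['width']
height = video_stream['height']

assert width == 1920 and height == 1080, "Resolution check failed"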

Best Practices and Tips

  • Scripting: Automate repetitive tasks with scripts. FFmpeg commands can be integrated into Python scripts for batch processing.
  • Error Handling: Implement robust error handling to manage exceptions and unexpected inputs.
  • Performance Optimization: Use appropriate codecs and settings to balance quality and performance.
  • Documentation and Community: Leverage the extensive documentation and active community for troubleshooting and advanced techniques.
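
To make the scripting and error-handling tips concrete, here is a minimal batch-conversion sketch that calls the ffmpeg command-line tool through subprocess rather than a wrapper library. The function names build_convert_commands and run_all are hypothetical, and the choice of the -n flag (never overwrite existing output) is an assumption about what a cautious batch job would want.

```python
import subprocess
from pathlib import Path


def build_convert_commands(sources, target_ext=".avi"):
    """Return one ffmpeg argument list per source file."""
    commands = []
    for src in sources:
        src = Path(src)
        dst = src.with_suffix(target_ext)
        # -n: do not overwrite an output file that already exists
        commands.append(["ffmpeg", "-n", "-i", str(src), str(dst)])
    return commands


def run_all(commands):
    """Run each command and return the list of commands that failed."""
    failed = []
    for command in commands:
        try:
            subprocess.run(command, check=True, capture_output=True)
        except (subprocess.CalledProcessError, FileNotFoundError):
            # Non-zero exit from ffmpeg, or ffmpeg is not installed
            failed.append(command)
    return failed
```

Separating command construction from execution keeps the batch logic easy to test without touching any media files.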

Conclusion

Integrating FFmpeg with Python offers a powerful solution for automating a wide range of multimedia processing tasks. Whether it's for QA, development, or content creation, the combination of FFmpeg's capabilities and Python's ease of use opens up endless possibilities. Embrace this toolset, experiment with its features, and watch your productivity soar!

November 30, 2023

cURL in Python

Quick Tip on using cURL in Python

As a QA professional, I've encountered various tools and methodologies that enhance testing processes. Among these, Unix cURL stands out for its utility in handling network requests. In this blog, I'll delve into how we can leverage cURL within Python to streamline our testing and automation tasks.

What is cURL?

cURL, short for 'Client for URLs', is a command-line tool used to transfer data to or from a server. It supports a variety of protocols, including HTTP, HTTPS, FTP, and more. While cURL originated on Unix-like systems, it now runs on most platforms, and its functionality is crucial for testing APIs, web applications, and automating network requests.

Why Use cURL in Python?

Python's extensive libraries like requests are commonly used for handling HTTP requests. However, cURL offers a different approach with its command-line driven method, which is sometimes more flexible and powerful, especially when dealing with complex scenarios or when reproducing requests copied from browsers' developer tools.

Integrating cURL with Python

There are several ways to integrate cURL commands into Python, with the most common method being the use of the subprocess module. Here's a basic example:

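import subprocess


def curl_command(url):
    # Pass the arguments as a list (no shell) so the URL is never
    # interpreted by a shell, and capture stderr so errors are visible
    process = subprocess.Popen(
        ["curl", "--silent", "--show-error", url],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    output, error = process.communicate()
    if process.returncode != 0:
        print(f"Error: {error.decode('utf-8').strip()}")
        return None
    return output.decode('utf-8')


# Example Usage
response = curl_command("https://api.example.com/data")
print(response)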

This script uses subprocess.Popen to execute a cURL command and fetch data from a given URL. It's a straightforward way to integrate cURL's capabilities into a Python script.

Handling Complex cURL Commands

For more complex scenarios, like sending data, handling headers, or dealing with authentication, the cURL command string can be modified accordingly. For instance:

def complex_curl_command(url, data, headers):
    command = f"curl -X POST -d '{data}' -H '{headers}' {url}"
    # Rest of the code remains similar
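
A fuller version of this idea might look like the sketch below. It is one possible shape rather than the only one: build_curl_args is a hypothetical helper, and it assumes headers is passed as a dictionary so that each header becomes its own -H argument. Building an argument list also sidesteps the quoting problems of a single command string.

```python
import subprocess


def build_curl_args(url, data, headers):
    """Build a curl argument list for a POST request.

    headers is assumed to be a dict such as {"Content-Type": "application/json"}.
    """
    args = ["curl", "--silent", "--show-error", "-X", "POST", "-d", data]
    for name, value in headers.items():
        args += ["-H", f"{name}: {value}"]
    args.append(url)
    return args


def complex_curl_command(url, data, headers):
    """Run the request and return the response body as text."""
    result = subprocess.run(
        build_curl_args(url, data, headers),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if curl exits non-zero
    )
    return result.stdout
```

Because the payload and headers travel as separate list items, quotes or spaces inside them reach cURL unchanged.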

Advantages and Limitations

Advantages:

  1. Direct Transfer from Browsers: cURL commands can often be copied as-is from browsers' developer tools, which is handy for reproducing and automating specific requests.

  2. Support for Multiple Protocols: cURL's versatility across different protocols makes it a powerful tool in a QA's arsenal.

Limitations:

  1. Security Concerns: Running shell commands from within Python can pose security risks, especially when dealing with untrusted input.

  2. Complexity: For simple HTTP requests, using Python libraries like requests is more straightforward and Pythonic.

Conclusion

Integrating Unix cURL with Python scripts provides a robust method for handling complex network requests. It's particularly useful for QA professionals looking to automate and test applications with specific network interactions. However, it's essential to weigh its benefits against potential security risks and complexity.

For those interested in exploring further, I recommend experimenting with different cURL options and understanding how they translate into the Python context. Happy testing!

November 23, 2023

Giving Thanks to Python

A QA Professional's Perspective


As a Quality Assurance (QA) professional with five years of hands-on experience in automation programming, I've developed a deep appreciation for the tools and languages that make my work both possible and enjoyable. With Thanksgiving around the corner, it feels like the perfect time to reflect on what makes Python such an invaluable asset in our field. Let's delve into the reasons why we, as QA professionals, are especially thankful for Python.

Simplifying Complexity: Python's Readability

One of the most immediate aspects of Python that we're thankful for is its readability. Python's syntax is clear, concise, and almost English-like, making it an excellent language for beginners and experts alike. This readability not only makes writing code more straightforward but also simplifies the process of reviewing and maintaining code over time - a crucial factor in QA where clarity is king.

Wide-Ranging Libraries and Frameworks

Python's extensive libraries and frameworks are a boon for QA automation. Selenium for web automation, PyTest for writing test scripts, and Behave for behavior-driven development, to name just a few, are all powerful tools that help streamline our testing processes. These libraries save us from reinventing the wheel and allow us to focus on creating more sophisticated and effective test cases.

Cross-Platform Compatibility

Python's ability to run on various platforms (Windows, macOS, Linux) is a significant advantage. This compatibility ensures that our test scripts are versatile and adaptable, mirroring the diverse environments in which the software we test operates. For a QA professional, this universality is invaluable.

Strong Community and Support

The Python community is a vibrant and supportive ecosystem. From forums and discussion boards to conferences and meetups, the wealth of shared knowledge and resources is something to be truly thankful for. This community support makes problem-solving more collaborative and learning continuous.

Automation Made Easy

Python excels in automating repetitive tasks, a core part of QA work. With its simplicity and powerful scripting capabilities, Python makes it easier to automate test cases, data generation, and even setup and teardown processes in test environments. This efficiency is something every QA professional appreciates.

Integration Capabilities

Python's ability to integrate with other tools and technologies is another reason for gratitude. Whether it's integrating with CI/CD pipelines, cloud services, or other programming languages, Python's flexibility makes it a Swiss Army knife in the QA toolkit.

The Joy of Learning and Growing

Lastly, Python makes the journey of learning and professional growth enjoyable. Its welcoming community, vast resources, and the satisfaction of writing efficient code make Python not just a tool, but a path to continuous learning and improvement.

Conclusion

As we gather around the Thanksgiving table, it's worth reflecting on the tools and technologies that enrich our professional lives. Python, with its versatility, ease of use, and strong community support, is certainly high on the list for many QA professionals. It has not just made our jobs easier but also more enjoyable.

Call to Action

Are you a QA professional who uses Python? What aspects of Python are you most thankful for? Share your thoughts and experiences in the comments below!

November 16, 2023

Duck Typing using Python

Duck Typing in action


As a QA professional with five years of experience in automation programming using Python, I've come to appreciate the language's flexibility and expressiveness. One of the concepts that stand out in Python is Duck Typing. It's not just a programming principle; it's a philosophy that Python embraces wholeheartedly. In this blog, I'll share some fun tricks and insights into Duck Typing, showing how it can make your Python code more flexible and intuitive.

What is Duck Typing?

Duck Typing is a concept derived from the saying, "If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck." In Python, this means that you don't check the type of an object; you check for the presence of a specific method or attribute.

Why is Duck Typing Fun?

  • Flexibility: Duck Typing allows for more generic and flexible code. You can write functions that accept any object, as long as it has the methods or attributes you need.
  • Simplicity: It simplifies the code. You don't have to write complex type-checking code.
  • Surprise Factor: It's always fun to see an object work seamlessly in a place where you wouldn't traditionally expect it to.

Duck typing is a concept in Python that allows you to use objects without knowing their specific type. This can be helpful when you're working with objects from different libraries or third-party code. For example, you could use the len() function to get the length of a list, a string, or even a tuple:


print(len([1, 2, 3]))
print(len("Hello, world!"))
print(len((1, 2, 3)))

All of these statements will work because the len() function doesn't care about the specific type of object it's given. It only cares that the object has a __len__() method defined.
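
The same principle applies to your own classes. In the sketch below (all class and function names are invented for illustration), Playlist works with len() simply because it defines __len__(), and announce() accepts any object that has a speak() method.

```python
class Playlist:
    """A custom container; it is not a list, but it knows its own length."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        return len(self._songs)


class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def announce(thing):
    # No isinstance() check: anything with a speak() method is accepted
    return thing.speak()


print(len(Playlist(["a", "b", "c"])))            # 3
print([announce(t) for t in (Duck(), Robot())])  # ['Quack', 'Beep']
```

Passing an object without a speak() method raises AttributeError at the call site, which is the trade-off duck typing makes for its flexibility.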

November 9, 2023

Getting the Current URL Using Python in MacOS

Example using Chrome or Safari

Welcome to my latest blog post! As a QA professional with five years of experience in automation programming using Python, I've often encountered scenarios where I needed to fetch the current URL from a browser - be it for testing, automation, or data extraction. Today, I'm going to walk you through a simple yet effective way to get the current URL from Chrome or Safari on MacOS using Python.

Why Fetch URLs Programmatically?

Fetching URLs programmatically can be useful in various scenarios, such as:

  • Automating tests that require validating the current page in a browser.
  • Monitoring web usage or collecting data for analysis.
  • Building extensions or integrations that react to URL changes.


Prerequisites

Before we dive in, ensure you have Python installed on your MacOS. You can check this by running python3 --version in your terminal. Additionally, install the required library using pip:

pip install pyobjc-framework-ScriptingBridge

or

python3 -m pip install pyobjc-framework-ScriptingBridge --user

This library allows Python to interact with MacOS applications via the ScriptingBridge framework.

Fetching URL from Chrome

Let's start with Google Chrome. Chrome, like many modern browsers, exposes its current state through AppleScript, which we can access in Python using the ScriptingBridge framework.

The Python Script


import ScriptingBridge


def get_chrome_url():
    chrome = ScriptingBridge.SBApplication.applicationWithBundleIdentifier_("com.google.Chrome")
    if not chrome.windows():
        return "No active window"

    window = chrome.windows()[0]  # Get the first window
    tab = window.activeTab()  # Get the active tab in the window
    return tab.URL()  # Return the URL of the active tab


print(get_chrome_url())

This script initializes a connection to Chrome, checks if there are any open windows, and then fetches the URL of the active tab in the first window.


Fetching URL from Safari

The process for Safari is quite similar. However, Safari's bundle identifier differs.

The Python Script for Safari


import ScriptingBridge


def get_safari_url():
    safari = ScriptingBridge.SBApplication.applicationWithBundleIdentifier_("com.apple.Safari")
    if not safari.windows():
        return "No active window"

    window = safari.windows()[0]
    return window.currentTab().URL()


print(get_safari_url())

Here, we connect to Safari and fetch the URL from the current tab of the first window.


Handling Multiple Windows or Tabs

If you need to handle multiple windows or tabs, you can loop through the windows array and fetch URLs from each tab. This can be particularly useful for comprehensive testing or data extraction tasks.
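
A loop like that might be sketched as follows. The helper name collect_tab_urls is hypothetical, and the sketch assumes the browser object exposes windows(), each window exposes tabs(), and each tab exposes URL(), which matches how the scripts above talk to Chrome and Safari.

```python
def collect_tab_urls(app):
    """Return the URL of every tab in every window of a scriptable browser."""
    urls = []
    for window in app.windows():
        for tab in window.tabs():
            urls.append(tab.URL())
    return urls


# Example usage on macOS (requires pyobjc-framework-ScriptingBridge):
# import ScriptingBridge
# chrome = ScriptingBridge.SBApplication.applicationWithBundleIdentifier_("com.google.Chrome")
# for url in collect_tab_urls(chrome):
#     print(url)
```

Because the function only relies on those three method names, the same code can be pointed at either browser object.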

Security and Permissions

Since MacOS Mojave, apps need explicit permissions to control other apps. The first time you run these scripts, MacOS should prompt you to grant Terminal (or your Python IDE) access to control the browser. Make sure to allow this for the script to function correctly.

Conclusion

Fetching the current URL from browsers like Chrome and Safari is straightforward with Python and ScriptingBridge in MacOS. This technique opens up a range of possibilities for automation, testing, and data collection. Experiment with it, and you'll find it a valuable addition to your Python automation toolkit.

October 26, 2023

Ghost Writing in Python

Some Practical Techniques in Python


As a QA Engineer, you're always looking for ways to improve your testing process. Python is a powerful language that can be used to automate many QA tasks, but it can also be used to generate text. This can be useful for creating test cases, writing reports, and even ghostwriting blog posts.

This Halloween, let's take a look at how to use Python to do ghostwriting.

What is ghost writing?

Ghost writing is the practice of writing content for someone else, but not taking credit for it. This is often done for clients who need help writing blog posts, articles, or even books.

Why use Python for ghost writing?

There are a few reasons why Python is a good choice for ghost writing:

  • It's a powerful language that can be used to generate text in a variety of formats.
  • It's relatively easy to learn, especially for QA Engineers who are already familiar with programming.
  • There are a number of libraries and tools available that can make it easier to generate text with Python.

How to do ghost writing in Python

Here is a simple example of how to do ghost writing in Python:


import random


def generate_sentence():
    """Generates a random sentence."""
    words = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]
    # Pick 10 words at random (with repetition) and join them with spaces
    return " ".join(random.choices(words, k=10))


# Generate 10 random sentences
sentences = [generate_sentence() for _ in range(10)]

# Write the sentences to a file, one per line
with open("ghost_written_sentences.txt", "w") as f:
    for sentence in sentences:
        f.write(sentence + "\n")

This code will generate 10 random sentences and write them to a file called ghost_written_sentences.txt.

Generating text with Python for QA purposes

Of course, you can use Python to generate more sophisticated text than just random sentences. For example, you could use it to generate test cases, write reports, or even ghostwrite blog posts.

Here are a few ideas:

  • Generate test cases: You could use Python to generate test cases for your QA testing. This could be especially useful for generating complex or data-driven test cases.
  • Write reports: You could use Python to generate reports from your QA testing results. This could save you time and help you to communicate your results more effectively.
  • Ghostwrite blog posts: If you're a QA Engineer with expertise in a particular area, you could use Python to ghostwrite blog posts for other people. This is a great way to share your knowledge and build your reputation.
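
As a small illustration of the first idea, the sketch below generates data-driven test case descriptions from every combination of a few inputs. The function name generate_test_cases, the TC-001 numbering scheme, and the sample values are all made up for this example.

```python
from itertools import product


def generate_test_cases(browsers, roles, actions):
    """Return one readable test-case description per input combination."""
    cases = []
    combinations = product(browsers, roles, actions)
    for number, (browser, role, action) in enumerate(combinations, start=1):
        cases.append(
            f"TC-{number:03d}: As {role} on {browser}, verify that '{action}' works."
        )
    return cases


for case in generate_test_cases(["Chrome", "Safari"], ["admin"], ["login", "logout"]):
    print(case)
```

Two browsers, one role, and two actions produce four test cases; adding a value to any list grows the set automatically.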
