杰瑞科技汇

How Does Python TouchAction Implement Mobile Gestures?

This guide takes a close look at TouchAction in Python, specifically in the context of Appium, the most popular mobile automation framework.


What is TouchAction?

TouchAction is a class in the Appium Python Client library that lets you chain individual touch actions together to create complex gestures on a mobile device. Think of it as giving a script a "finger" that can tap, swipe, scroll, drag, and long-press. A single TouchAction drives one finger; gestures that need several fingers combine multiple TouchAction objects through the separate MultiAction class.

Instead of just sending a single command, you build a sequence of actions and then perform them all at once. This is how you simulate realistic, human-like gestures.


Core Concepts: The Building Blocks

A TouchAction is built from a series of methods, where each method represents one part of the gesture. The most common building blocks are:

| Method | Description | Corresponds to... |
| --- | --- | --- |
| .press() | Starts the gesture by touching the screen at a specific element or coordinates. | Putting your finger down. |
| .move_to() | Moves the finger from the current position to a new element or coordinates. | Sliding your finger. |
| .wait() | Pauses the gesture for a specified duration (in milliseconds). | Holding your finger in place. |
| .release() | Lifts the finger off the screen, ending the gesture. | Lifting your finger. |
| .tap() | A shortcut for a quick press and release. | A single tap. |
| .perform() | Executes the entire sequence of actions you've built. | The final "do it" command. |
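
To make the chain-then-perform idea concrete, here is a minimal, library-free sketch. GestureRecorder is a hypothetical stand-in written for this illustration, not the Appium class, and the step dictionaries use an assumed format chosen for readability. It shows why each method can be chained (each one returns the object itself) and why nothing is sent until .perform() is called.

```python
class GestureRecorder:
    """Hypothetical stand-in that mimics the chain-then-perform pattern."""

    def __init__(self, send):
        self._send = send   # callable that receives the finished step list
        self._steps = []    # steps queued so far, nothing has been sent yet

    def _add(self, name, **options):
        self._steps.append({"action": name, "options": options})
        return self         # returning self is what makes chaining possible

    def press(self, x, y):
        return self._add("press", x=x, y=y)

    def wait(self, ms):
        return self._add("wait", ms=ms)

    def move_to(self, x, y):
        return self._add("moveTo", x=x, y=y)

    def release(self):
        return self._add("release")

    def perform(self):
        self._send(list(self._steps))  # hand over the whole gesture at once
        self._steps.clear()            # start fresh for the next gesture
        return self


# "Sending" here just appends the gesture to a list so it can be inspected.
sent = []
GestureRecorder(sent.append).press(10, 20).wait(500).move_to(10, 200).release().perform()
print(sent[0])
```

The real class sends the queued steps to the Appium server instead of a Python list, but the queue-and-flush shape is the part worth remembering.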

How to Use TouchAction: A Step-by-Step Guide

First, you need to import the class and initialize it with your driver instance.

from appium.webdriver.common.touch_action import TouchAction
from appium import webdriver
# Assume 'driver' is your initialized Appium driver
# driver = webdriver.Remote('http://localhost:4723/wd/hub', desired_caps)
# Initialize the TouchAction object
actions = TouchAction(driver)

The Basic Tap

Tapping is the most common action. The .tap() method is a convenient shortcut.

# --- Option A: Tap on an element ---
element = driver.find_element("id", "some_button_id")
actions.tap(element).perform()
# --- Option B: Tap on specific coordinates (x, y) ---
# This is useful if there's no easily locatable element.
actions.tap(x=100, y=200).perform()
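
When you tap by coordinates, you usually want the center of something rather than hand-picked numbers. The sketch below assumes a rect dictionary with x, y, width and height keys, which is the shape Selenium's WebElement.rect returns; center_of is a hypothetical helper name and the sample values are made up.

```python
def center_of(rect):
    """Return the (x, y) center of a rect dict with x, y, width, height keys."""
    return (rect["x"] + rect["width"] // 2, rect["y"] + rect["height"] // 2)


# Made-up rectangle standing in for element.rect from a live session.
button_rect = {"x": 40, "y": 600, "width": 200, "height": 80}
cx, cy = center_of(button_rect)
print(cx, cy)
```

With a live driver, the result would feed straight into the coordinate form of the tap, for example actions.tap(x=cx, y=cy).perform().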

The Swipe (or Drag-and-Drop)

A swipe is a combination of pressing, moving, and releasing.

Let's say you want to swipe an element from its current position to another position.

# Find the source and destination elements
source_element = driver.find_element("id", "drag_item")
destination_element = driver.find_element("id", "drop_zone")
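# Build the swipe action. Wrapping the chain in parentheses lets each step
# carry a comment; a comment placed before a backslash breaks the continuation.
(
    actions
    .press(source_element)
    .wait(500)  # Hold for 500 milliseconds
    .move_to(destination_element)
    .release()
    .perform()
)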

Explanation:

  1. .press(source_element): Place your "finger" on the drag_item.
  2. .wait(500): Keep the finger pressed for half a second.
  3. .move_to(destination_element): Slide the finger to the drop_zone.
  4. .release(): Lift the finger off the screen.
  5. .perform(): Execute the entire sequence.

The Scroll

Scrolling is similar to a swipe, but it's often used to navigate a long list. You can scroll to a specific element or by a specific amount.

A) Scrolling to an Element (by UI Automator)

This is usually the most reliable way to scroll on Android. You hand the UiAutomator2 driver a UiScrollable expression, and it scrolls the container until the target element is visible.

# This uses the UiAutomator2 driver's UiScrollable support (Android only).
# It's not a TouchAction, but it's the best practice for scrolling.
from appium.webdriver.common.appiumby import AppiumBy
element_to_find = driver.find_element(
    AppiumBy.ANDROID_UIAUTOMATOR,
    'new UiScrollable(new UiSelector().scrollable(true))'
    '.scrollIntoView(new UiSelector().text("Settings"))',
)
element_to_find.click()

B) Scrolling by Coordinates (using TouchAction)

If you need to perform a generic scroll without a specific target, you can use coordinates.

# Get the size of the screen
screen_size = driver.get_window_size()
width = screen_size['width']
height = screen_size['height']
# Define the start and end points for the scroll
# Start from 80% down the screen and scroll up to 20% down the screen
start_y = int(height * 0.8)
end_y = int(height * 0.2)
start_x = end_x = width // 2 # Scroll vertically in the middle
# Build the scroll action
actions \
    .press(x=start_x, y=start_y) \
    .wait(1000) \
    .move_to(x=end_x, y=end_y) \
    .release() \
    .perform()
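
The percentage arithmetic above generalizes to all four directions. Here is a small sketch of that idea; swipe_points and its margin_pct parameter are hypothetical names introduced for this example, and the screen size is a made-up 1080x1920 device. It uses integer math only, so the results are exact pixel values.

```python
def swipe_points(width, height, direction, margin_pct=20):
    """Return ((start_x, start_y), (end_x, end_y)) for a finger moving in `direction`.

    margin_pct keeps both points away from the screen edges.
    """
    lo_x = width * margin_pct // 100
    hi_x = width * (100 - margin_pct) // 100
    lo_y = height * margin_pct // 100
    hi_y = height * (100 - margin_pct) // 100
    mid_x, mid_y = width // 2, height // 2

    routes = {
        "up": ((mid_x, hi_y), (mid_x, lo_y)),     # finger moves up, list scrolls down
        "down": ((mid_x, lo_y), (mid_x, hi_y)),   # finger moves down, list scrolls up
        "left": ((hi_x, mid_y), (lo_x, mid_y)),
        "right": ((lo_x, mid_y), (hi_x, mid_y)),
    }
    if direction not in routes:
        raise ValueError(f"unknown direction: {direction!r}")
    return routes[direction]


start, end = swipe_points(1080, 1920, "up")
print(start, end)
```

In a real script, width and height would come from driver.get_window_size(), and start and end would be passed to .press(x=..., y=...) and .move_to(x=..., y=...).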

Long Press

A long press is useful for opening context menus or selecting text.

element_to_long_press = driver.find_element("id", "long_press_me")
# TouchAction also offers .long_press(element, duration=2000) as a shorthand.
(
    actions
    .press(element_to_long_press)
    .wait(2000)  # Hold for 2 seconds
    .release()
    .perform()
)

Complete, Runnable Example

Here's a full example using the Android Calculator app. Make sure an Android emulator or device is running and the Appium server has been started.

import time
from appium import webdriver
from appium.webdriver.common.touch_action import TouchAction
# Desired Capabilities for Android Emulator
desired_caps = {
    "platformName": "Android",
    "deviceName": "Pixel_API_30", # Change to your device/emulator name
    "appPackage": "com.android.calculator2",
    "appActivity": "com.android.calculator2.Calculator",
    "automationName": "UiAutomator2",
    "noReset": True
}
# Initialize the driver
driver = webdriver.Remote('http://localhost:4723/wd/hub', desired_caps)
time.sleep(2) # Wait for the app to launch
# --- Example 1: Simple Tap ---
print("Performing a simple tap...")
button_5 = driver.find_element("id", "digit_5")
actions = TouchAction(driver)
actions.tap(button_5).perform()
# --- Example 2: Swipe Gesture ---
print("Performing a swipe gesture...")
# Let's swipe the '5' button to the '6' button
button_6 = driver.find_element("id", "digit_6")
actions \
    .press(button_5) \
    .wait(500) \
    .move_to(button_6) \
    .release() \
    .perform()
# --- Example 3: Long Press ---
print("Performing a long press...")
# Let's long press the '=' button to see if it does anything (it might not in this app)
button_equals = driver.find_element("id", "eq")
actions \
    .press(button_equals) \
    .wait(1500) \
    .release() \
    .perform()
time.sleep(3) # Pause to observe the results
# Quit the driver
driver.quit()

Modern Alternatives: W3C Actions

TouchAction is simple, but it is a legacy, Appium-specific API: it was deprecated in the 2.x releases of the Appium Python Client and removed from later major versions, so check the changelog of the version you have installed. The official W3C WebDriver protocol includes a more powerful and standardized way to handle gestures: the W3C Actions API.

Appium supports this API, and it is the modern approach. It is more verbose to write, but it handles multi-touch gestures (pinch, zoom) and is the standard going forward.

Example of a W3C Actions swipe (equivalent to the TouchAction swipe):

# You need to import the W3C actions classes (they ship with Selenium)
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.actions import interaction
from selenium.webdriver.common.actions.action_builder import ActionBuilder
from selenium.webdriver.common.actions.pointer_input import PointerInput
# Get the source and destination elements
source_element = driver.find_element("id", "drag_item")
destination_element = driver.find_element("id", "drop_zone")
# Use the element centers; .location alone is the top-left corner
src, dst = source_element.rect, destination_element.rect
src_x, src_y = src['x'] + src['width'] // 2, src['y'] + src['height'] // 2
dst_x, dst_y = dst['x'] + dst['width'] // 2, dst['y'] + dst['height'] // 2
# Create a pointer action (touch) and attach it to the action chain
pointer = PointerInput(interaction.POINTER_TOUCH, "touch")
actions = ActionChains(driver)
actions.w3c_actions = ActionBuilder(driver, mouse=pointer)
# Build the sequence
finger = actions.w3c_actions.pointer_action
finger.move_to_location(src_x, src_y)
finger.pointer_down()
finger.pause(0.5)  # pause() takes seconds, not milliseconds
finger.move_to_location(dst_x, dst_y)
finger.release()
actions.perform()
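
It can help to see what a W3C swipe looks like on the wire. The sketch below builds the JSON body that the W3C WebDriver specification defines for the actions endpoint, using plain dictionaries and no driver. touch_swipe_payload, the "finger1" id and the coordinates are assumptions made for this illustration.

```python
import json


def touch_swipe_payload(start, end, hold_ms=500, move_ms=500, pointer_id="finger1"):
    """Build a W3C Actions body for a one-finger press, hold, move, release."""
    return {
        "actions": [
            {
                "type": "pointer",
                "id": pointer_id,
                "parameters": {"pointerType": "touch"},
                "actions": [
                    # Position the finger first, then put it down.
                    {"type": "pointerMove", "duration": 0, "x": start[0], "y": start[1]},
                    {"type": "pointerDown", "button": 0},
                    {"type": "pause", "duration": hold_ms},
                    # The move duration (ms) controls how fast the finger slides.
                    {"type": "pointerMove", "duration": move_ms, "x": end[0], "y": end[1]},
                    {"type": "pointerUp", "button": 0},
                ],
            }
        ]
    }


payload = touch_swipe_payload((540, 1536), (540, 384))
print(json.dumps(payload, indent=2))
```

Client libraries generate a body of this shape for you; reading it once makes the five steps of the swipe easy to map back to the press, wait, move_to and release calls of TouchAction.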

Summary: TouchAction vs. W3C Actions

| Feature | TouchAction | W3C Actions API |
| --- | --- | --- |
| Simplicity | Easier to write and read for basic gestures. | More verbose and complex. |
| Standardization | Appium-specific implementation. | W3C Standard, future-proof. |
| Power | Good for single-finger gestures. | More powerful, handles multi-touch (pinch, zoom) easily. |
| Recommendation | Perfect for beginners and most use cases. | Recommended for complex gestures and future-proof projects. |

For most day-to-day automation tasks, TouchAction is your best friend, provided the client version you use still ships it. It's simple, effective, and gets the job done. If you find yourself needing pinch, zoom, or other advanced multi-touch gestures, then it's time to learn the W3C Actions API.
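
As a taste of what multi-touch looks like in the W3C model, here is a hedged sketch of a two-finger pinch expressed as a raw actions body. Each finger is its own pointer source, and the specification runs the sources in parallel, step by step. finger_track and pinch_payload are hypothetical helper names, and the coordinates assume a made-up 1080x1920 screen.

```python
def finger_track(pointer_id, start, end, move_ms):
    """One touch pointer that goes down at `start`, slides to `end`, then lifts."""
    return {
        "type": "pointer",
        "id": pointer_id,
        "parameters": {"pointerType": "touch"},
        "actions": [
            {"type": "pointerMove", "duration": 0, "x": start[0], "y": start[1]},
            {"type": "pointerDown", "button": 0},
            {"type": "pointerMove", "duration": move_ms, "x": end[0], "y": end[1]},
            {"type": "pointerUp", "button": 0},
        ],
    }


def pinch_payload(center, start_gap, end_gap, move_ms=600):
    """Two fingers on a horizontal line through `center`.

    end_gap > start_gap spreads the fingers (zoom in);
    end_gap < start_gap brings them together (zoom out).
    """
    cx, cy = center
    left = finger_track("finger1", (cx - start_gap // 2, cy), (cx - end_gap // 2, cy), move_ms)
    right = finger_track("finger2", (cx + start_gap // 2, cy), (cx + end_gap // 2, cy), move_ms)
    return {"actions": [left, right]}


zoom_in = pinch_payload((540, 960), start_gap=100, end_gap=600)
for source in zoom_in["actions"]:
    first, last = source["actions"][0], source["actions"][2]
    print(source["id"], (first["x"], first["y"]), "->", (last["x"], last["y"]))
```

Both tracks have the same number of steps, so the two fingers go down, move and lift together, which is what makes the gesture read as a pinch rather than two separate swipes.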
