杰瑞科技汇

How Do You Work Efficiently with Multi-Dimensional Arrays Using Python's multiarray?

The term "multiarray" in Python almost always refers to the numpy.ndarray object from the NumPy library.

Python lists can hold other lists (a "list of lists" that acts like a 2D array), but they are not true multi-dimensional arrays. Every element is a separate Python object, so numerical work requires explicit loops, uses more memory, and slows down noticeably on large data. Lists also have no built-in mathematical functions.

NumPy solves this by providing a powerful, high-performance, multi-dimensional array object called ndarray.

Here’s a comprehensive guide to multi-arrays in Python using NumPy.


What is a NumPy ndarray?

An ndarray (N-dimensional array) is a grid of values, all of the same data type, indexed by a tuple of non-negative integers. The key features are:

  • Homogeneous Data: All elements in an array must be of the same type (e.g., all integers, all floats). This is what allows for efficient memory usage and fast operations.
  • Fixed Size: An array's size is fixed at creation. Functions that appear to grow an array, such as np.append, allocate and return a new array rather than extending the existing one.
  • Multi-dimensional: It can represent data in 1D (vector), 2D (matrix), 3D (tensor), or even higher dimensions.
  • Vectorized Operations: You can perform mathematical operations on entire arrays without writing slow Python loops. This is the core reason for NumPy's speed.
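
The features listed above can be seen in a few lines. This is a minimal sketch, and the variable names (mixed, base, grown, values) are only illustrative.

```python
import numpy as np

# Homogeneous data: mixing ints and a float upcasts every element to float64.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Fixed size: np.append does not grow `base` in place; it returns a new array.
base = np.array([1, 2, 3])
grown = np.append(base, 4)
print(base.size, grown.size)  # 3 4

# Vectorized operations: one array expression replaces an explicit Python loop.
values = np.arange(5)
squares_loop = [int(v) ** 2 for v in values]
squares_vec = values ** 2
print(squares_loop)          # [0, 1, 4, 9, 16]
print(squares_vec.tolist())  # [0, 1, 4, 9, 16]
```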

Installation and Basic Setup

First, you need to install NumPy if you haven't already.

pip install numpy

Then, import it in your Python script. The standard convention is to import it as np.

import numpy as np

Creating Multi-Arrays (ndarray)

There are several ways to create NumPy arrays.

a) From a Python List

This is the most common way to start.

# 1D Array (Vector)
list_1d = [1, 2, 3, 4, 5]
arr_1d = np.array(list_1d)
print(f"1D Array: {arr_1d}")
print(f"Shape: {arr_1d.shape}") # (5,) means 5 elements in one dimension
print("-" * 20)
# 2D Array (Matrix)
list_2d = [[1, 2, 3], [4, 5, 6]]
arr_2d = np.array(list_2d)
print(f"2D Array:\n{arr_2d}")
print(f"Shape: {arr_2d.shape}") # (2, 3) means 2 rows, 3 columns
print("-" * 20)
# 3D Array (Tensor)
list_3d = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
arr_3d = np.array(list_3d)
print(f"3D Array:\n{arr_3d}")
print(f"Shape: {arr_3d.shape}") # (2, 2, 2) means 2 matrices, each with 2 rows and 2 columns

b) Built-in Creation Functions

NumPy provides functions to create arrays of specific patterns.

# Array of zeros
zeros_arr = np.zeros((3, 4)) # Shape (3 rows, 4 columns)
print(f"Zeros Array:\n{zeros_arr}")
print("-" * 20)
# Array of ones
ones_arr = np.ones((2, 2))
print(f"Ones Array:\n{ones_arr}")
print("-" * 20)
# Array of a constant value
full_arr = np.full((2, 3), 99) # Shape (2, 3) filled with 99
print(f"Full Array:\n{full_arr}")
print("-" * 20)
# Identity matrix (square matrix with 1s on the diagonal)
identity_arr = np.eye(3)
print(f"Identity Array:\n{identity_arr}")
print("-" * 20)
# Array with a range of values
range_arr = np.arange(10) # Similar to Python's range()
print(f"Range Array: {range_arr}")
print("-" * 20)
# Array with evenly spaced numbers over an interval
linspace_arr = np.linspace(0, 1, 5) # 5 numbers from 0 to 1 (inclusive)
print(f"Linspace Array: {linspace_arr}")
print("-" * 20)
# Array of random numbers
random_arr = np.random.rand(3, 2) # 3x2 array of random floats in the half-open interval [0, 1)
print(f"Random Array:\n{random_arr}")
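
As a complement to the creation functions above, the sketch below shows two details that often matter in practice: how arange and linspace treat the end point, and how to get repeatable random values. It uses np.random.default_rng, which is NumPy's newer random interface and is not used in the original examples. The seed value 42 is arbitrary.

```python
import numpy as np

# arange takes a step and excludes the stop value;
# linspace takes a count and includes the stop value by default.
stepped = np.arange(0, 1, 0.25)
spaced = np.linspace(0, 1, 5)
print(stepped.tolist())  # [0.0, 0.25, 0.5, 0.75]
print(spaced.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]

# A seeded generator produces the same "random" values on every run.
rng = np.random.default_rng(seed=42)
random_floats = rng.random((3, 2))         # floats in [0, 1), shape (3, 2)
random_ints = rng.integers(0, 10, size=5)  # integers in [0, 10)

# A second generator with the same seed reproduces the same floats.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(random_floats, rng_again.random((3, 2))))  # True
```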

Key Properties of an ndarray

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(f"Array:\n{arr}")
print(f"Shape: {arr.shape}")       # Dimensions of the array (2, 3)
print(f"Number of dimensions: {arr.ndim}") # Number of axes (2)
print(f"Size (total elements): {arr.size}") # Total number of elements (6)
print(f"Data type: {arr.dtype}")   # Type of the elements (e.g., int64)
print(f"Item size (bytes per element): {arr.itemsize}") # Size in bytes of each element
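
The default integer dtype depends on the platform, so the itemsize printed above can vary. The following sketch requests the dtype explicitly and shows how itemsize relates to total memory. The names arr32 and arr64 are illustrative.

```python
import numpy as np

# Requesting the dtype explicitly avoids relying on the platform default.
arr32 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# astype returns a converted copy; arr32 itself is unchanged.
arr64 = arr32.astype(np.float64)

# nbytes is the total data size: size * itemsize.
print(arr32.itemsize, arr32.nbytes)  # 4 24
print(arr64.itemsize, arr64.nbytes)  # 8 48
```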

Indexing and Slicing

Indexing and slicing work much as they do for Python lists, except that the indices for each dimension are separated by commas inside a single pair of brackets.

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(f"Original Array:\n{arr}")
print("-" * 20)
# Element at a specific position (row, column)
print(f"Element at [0, 1]: {arr[0, 1]}") # 2
print("-" * 20)
# Slice a row
print(f"First row: {arr[0, :]}") # or simply arr[0]
print("-" * 20)
# Slice a column
print(f"Second column: {arr[:, 1]}")
print("-" * 20)
# Slice a sub-matrix
# Rows from index 1 to 2, columns from index 2 to 3
print(f"Sub-matrix:\n{arr[1:3, 2:4]}")
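
One difference from Python lists is worth knowing: slicing a list copies the data, whereas a basic NumPy slice is a view onto the same memory. The sketch below illustrates this with the same 3x4 array. The names sub_view and sub_copy are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# A basic slice is a view: writing through it changes the original array.
sub_view = arr[1:3, 2:4]
sub_view[0, 0] = 70
print(arr[1, 2])                        # 70
print(np.shares_memory(arr, sub_view))  # True

# .copy() produces independent data, so the original stays untouched.
sub_copy = arr[1:3, 2:4].copy()
sub_copy[0, 0] = -1
print(arr[1, 2])                        # 70
print(np.shares_memory(arr, sub_copy))  # False
```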

Essential Operations

This is where NumPy truly shines.

a) Arithmetic Operations (Vectorized)

Operations are performed element-wise.

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
print(f"a: {a}")
print(f"b: {b}")
print(f"a + b: {a + b}")
print(f"a * b: {a * b}")
print(f"a / 2: {a / 2}") # Division with a scalar
print(f"a ** 2: {a ** 2}") # Exponentiation
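
Two points follow from element-wise arithmetic. First, a * b is not a matrix or dot product; the @ operator is used for that. Second, operands of different shapes are combined by broadcasting when the shapes are compatible. A short sketch, with illustrative names matrix, row, and shifted:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Element-wise product versus dot product.
print(a * b)  # [ 10  40  90 160]
print(a @ b)  # 300

# Broadcasting: the 1D row is applied to every row of the 2D matrix.
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])               # shape (3,)
shifted = matrix + row
print(shifted)
# [[11 22 33]
#  [14 25 36]]

# Shapes that cannot be broadcast together raise ValueError.
try:
    matrix + np.array([1, 2])  # shape (2,) does not match the trailing 3
except ValueError as exc:
    print(f"Incompatible shapes: {exc}")
```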

b) Universal Functions (ufuncs)

These are functions that operate on ndarray objects in an element-wise fashion.

arr = np.array([0, np.pi/2, np.pi])
print(f"sin(arr): {np.sin(arr)}")
print(f"cos(arr): {np.cos(arr)}")
print(f"sqrt(arr): {np.sqrt(arr)}")

c) Aggregation Functions

These functions reduce an array to a single value, or, when an axis is given, to one value per row or column.

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(f"Array:\n{arr}")
print(f"Sum of all elements: {np.sum(arr)}")
print(f"Sum along columns (axis=0): {np.sum(arr, axis=0)}") # [1+4, 2+5, 3+6]
print(f"Sum along rows (axis=1): {np.sum(arr, axis=1)}")   # [1+2+3, 4+5+6]
print(f"Mean: {np.mean(arr)}")
print(f"Max: {np.max(arr)}")
print(f"Min: {np.min(arr)}")
print(f"Standard Deviation: {np.std(arr)}")
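
The axis argument is easier to reason about in terms of shapes: the axis you reduce over is the one that disappears from the result. The sketch below also shows the keepdims option, which keeps that axis with length 1 so the result can be combined with the original array. The names col_sums, row_means, and centered are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

col_sums = arr.sum(axis=0)  # axis 0 (rows) is collapsed -> shape (3,)
row_sums = arr.sum(axis=1)  # axis 1 (columns) is collapsed -> shape (2,)
print(col_sums.shape, row_sums.shape)  # (3,) (2,)

# keepdims=True keeps the collapsed axis with length 1, giving shape (2, 1),
# which lines up with the (2, 3) array for element-wise subtraction.
row_means = arr.mean(axis=1, keepdims=True)
centered = arr - row_means
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```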

Reshaping and Stacking

You can change the shape of an array or combine multiple arrays.

Reshaping

# Create a 1D array with 12 elements
arr_1d = np.arange(12)
print(f"Original 1D array: {arr_1d}")
print(f"Shape: {arr_1d.shape}")
print("-" * 20)
# Reshape it to a 3x4 matrix
arr_2d = arr_1d.reshape(3, 4)
print(f"Reshaped 2D array:\n{arr_2d}")
print(f"Shape: {arr_2d.shape}")
print("-" * 20)
# Reshape it to a 2x2x3 tensor
arr_3d = arr_1d.reshape(2, 2, 3)
print(f"Reshaped 3D array:\n{arr_3d}")
print(f"Shape: {arr_3d.shape}")
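
Reshaping only works when the new shape holds exactly the same number of elements. The sketch below shows that constraint, along with passing -1 to let NumPy infer one dimension. The names inferred and flat are illustrative.

```python
import numpy as np

arr_1d = np.arange(12)

# -1 asks NumPy to infer that dimension: 12 elements / 4 columns = 3 rows.
inferred = arr_1d.reshape(-1, 4)
print(inferred.shape)  # (3, 4)

# The element count must match: 5 * 3 = 15 is not 12, so this fails.
try:
    arr_1d.reshape(5, 3)
except ValueError as exc:
    print(f"Cannot reshape: {exc}")

# ravel flattens back to one dimension.
flat = inferred.ravel()
print(flat.shape)  # (12,)
```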

Stacking

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Stack vertically (creates more rows)
vstack_arr = np.vstack((a, b))
print(f"Vertical Stack:\n{vstack_arr}")
print("-" * 20)
# Stack horizontally (for 1D inputs this joins them into one longer 1D array)
hstack_arr = np.hstack((a, b))
print(f"Horizontal Stack:\n{hstack_arr}")
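
With 2D inputs the difference between the two directions is clearer. The sketch below stacks two 2x2 arrays and compares the result shapes. It also uses np.concatenate and np.stack, which do not appear in the example above. The names left and right are illustrative.

```python
import numpy as np

left = np.array([[1, 2], [3, 4]])   # shape (2, 2)
right = np.array([[5, 6], [7, 8]])  # shape (2, 2)

v = np.vstack((left, right))  # more rows    -> shape (4, 2)
h = np.hstack((left, right))  # more columns -> shape (2, 4)
print(v.shape, h.shape)       # (4, 2) (2, 4)

# For 2D inputs, np.concatenate with an explicit axis gives the same results.
print(np.array_equal(v, np.concatenate((left, right), axis=0)))  # True
print(np.array_equal(h, np.concatenate((left, right), axis=1)))  # True

# np.stack joins along a new axis instead of an existing one.
s = np.stack((left, right))
print(s.shape)  # (2, 2, 2)
```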

Comparison with Python Lists

Feature | Python List | NumPy ndarray
Data Type | Can hold mixed types (e.g., int, str, object) | Homogeneous (all elements same type)
Performance | Slow for numerical operations due to Python overhead | Extremely fast due to C-level implementation
Memory Usage | Higher memory overhead | Compact and efficient memory usage
Functionality | Basic container | Rich library for linear algebra, statistics, etc.
Operations | Element-wise operations require explicit loops | Vectorized operations on entire arrays
Dimensions | "List of lists" is a clumsy simulation | True multi-dimensional objects

Conclusion

When you hear "multiarray" in a Python context, you should immediately think of NumPy's ndarray. NumPy is the foundational library for scientific computing, data analysis (with Pandas), and machine learning (with Scikit-learn, TensorFlow, PyTorch) in Python. Mastering NumPy arrays is an essential skill for any data scientist or engineer working in Python.
