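NumPy is an extension library for the Python language.
It supports large multi-dimensional arrays and matrix operations, and provides a large library of mathematical functions that operate on arrays.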
import numpy as np
x = np.arange(5)
print(x)
np.arange(5) is a NumPy function call that generates a one-dimensional array starting at 0, with a step of 1, up to (but not including) 5.
Installing NumPy
Install
pip install numpy
Check the version
import numpy as np
print(np.__version__)
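Quick start
The core of the NumPy package is the ndarray object.
It encapsulates n-dimensional arrays of homogeneous data types, and many operations are performed in compiled code for performance.
NumPy's array class is called ndarray.
It is also known by the alias array.
Note that numpy.array is not the same as the Python standard library class array.array, which only handles one-dimensional arrays and offers less functionality.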
>>> import numpy as np
>>> one = np.array([7, 2, 9, 10])
>>> one.shape
(4,)
>>> two = np.array([[5.2, 3.0, 4.5], [9.1, 0.1, 0.3]])
>>> two.shape
(2, 3)
>>> three = np.array([[[1, 1], [1, 1], [1, 1]],
...                   [[1, 1], [1, 1], [1, 1]],
...                   [[1, 1], [1, 1], [1, 1]],
...                   [[1, 1], [1, 1], [1, 1]]])
>>> three.shape
(4, 3, 2)
If any element passed to array is a string, all elements are converted to a string type:
>>> a = np.array([0, 1, 2, 3, 'abc'])
>>> a
array(['0', '1', '2', '3', 'abc'], dtype='<U21')
The basics - An example
ndarray.dtype: an object describing the type of the elements in the array. Standard Python types can be used, and NumPy also provides types of its own, such as numpy.int32, numpy.int16, and numpy.float64.
ndarray.itemsize: the size in bytes of each element of the array. For example, an element of type float64 occupies 8 bytes.
ndarray.ndim: the number of axes (dimensions) of the array.
ndarray.shape: the dimensions of the array, given as a tuple of integers indicating the size along each dimension.
ndarray.size: the total number of elements of the array, equal to the product of the elements of shape.
>>> import numpy as np
>>> a = np.arange(15)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> a.shape
(15,)
>>> a.ndim
1
>>> a.dtype.name
'int64'
>>> a.itemsize
8
>>> a.size
15
>>> type(a)
<class 'numpy.ndarray'>
>>> b = a.reshape(3, 5)
>>> b
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> b.shape
(3, 5)
>>> b.ndim
2
>>> b.dtype.name
'int64'
>>> b.itemsize
8
>>> b.size
15
>>> type(b)
<class 'numpy.ndarray'>
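In NumPy, a.reshape(3, 5) is a method of the array object a. It returns an array with the requested shape, here 3 rows by 5 columns (a 3×5 two-dimensional array), and leaves a itself unchanged.
Key note: the total number of elements must stay the same (3 × 5 = 15 here); otherwise reshape raises a ValueError.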
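The reshape behaviour described above can be checked with a short sketch. The variable names shares_memory and error_message are only illustrative, and the claim that reshape returns a view applies to a contiguous array like this one, not to every input.

```python
import numpy as np

a = np.arange(15)

# Reshape to 3 rows by 5 columns; 3 * 5 matches a.size, so this succeeds.
b = a.reshape(3, 5)

# For a contiguous array like this one, reshape returns a view:
# b shares its data with a instead of copying it.
shares_memory = np.shares_memory(a, b)

# A shape whose product differs from a.size is rejected.
try:
    a.reshape(4, 4)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(b.shape, shares_memory)
print(error_message)
```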
The basics - Array creation
>>> np.zeros((3, 4))
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
>>> d = np.zeros((3, 4))
>>> d
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
>>> e = np.ones((2, 3, 4), dtype=np.int16)
>>> e
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)
>>> f = np.empty((2, 3))
>>> f  # uninitialized memory, so the values vary from run to run
array([[3.44900708e-307, 4.22786102e-307, 2.78145267e-307],
       [4.00537061e-307, 2.23419104e-317, 0.00000000e+000]])
>>> np.arange(10, 30, 5)
array([10, 15, 20, 25])
>>> np.arange(0, 2, 0.3)  # arange also accepts float arguments
array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
>>> np.linspace(0, 2, 9)  # 9 numbers from 0 to 2
array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
>>> x = np.linspace(0, 2 * np.pi, 100)  # useful to evaluate a function at many points
>>> f = np.sin(x)
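When arange is used with floating-point arguments, it is generally not possible to predict the number of elements obtained, because of finite floating-point precision. In that case it is usually better to use linspace, which takes the desired number of elements as an argument instead of the step.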
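As a hedged illustration of the point above, the sketch below compares arange with a floating-point step against linspace. The specific values 1, 1.3 and 0.1 are an assumption chosen for the demonstration; on typical IEEE 754 platforms the arange call can return one element more than a naive count suggests, so its length is printed rather than asserted.

```python
import numpy as np

# With a floating-point step, the element count depends on rounding.
# On typical IEEE 754 platforms this call can return 4 elements, not 3.
r = np.arange(1, 1.3, 0.1)

# linspace takes the number of points directly, so the length is known.
s = np.linspace(1, 1.3, 4)
t = np.linspace(1, 1.2, 3)

print(len(r), r)
print(len(s), s)
print(len(t), t)
```

Because linspace includes both endpoints by default, the caller decides the count and NumPy derives the step, which avoids the rounding question entirely.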
The type of the array can also be explicitly specified at creation time:
>>> c = np.array([[1, 2], [3, 4]], dtype=complex)
>>> c
array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])
The basics - Printing arrays
When you print an array, NumPy displays it in a similar way to nested lists, but with the following layout:
the last axis is printed from left to right,
the second-to-last is printed from top to bottom,
the rest are also printed from top to bottom, with each slice separated from the next by an empty line.
One-dimensional arrays are then printed as rows, bidimensionals as matrices and tridimensionals as lists of matrices.
>>> a = np.arange(6)  # 1d array
>>> a
array([0, 1, 2, 3, 4, 5])
>>> print(a)
[0 1 2 3 4 5]
>>> b = np.arange(12).reshape(4, 3)  # 2d array
>>> print(b)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
>>> c = np.arange(24).reshape(2, 3, 4)  # 3d array
>>> print(c)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
>>> print(np.arange(10000))
[   0    1    2 ... 9997 9998 9999]
>>> print(np.arange(10000).reshape(100, 100))
[[   0    1    2 ...   97   98   99]
 [ 100  101  102 ...  197  198  199]
 [ 200  201  202 ...  297  298  299]
 ...
 [9700 9701 9702 ... 9797 9798 9799]
 [9800 9801 9802 ... 9897 9898 9899]
 [9900 9901 9902 ... 9997 9998 9999]]
The basics - Basic operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.
>>> a = np.array([20, 30, 40, 50])
>>> b = np.arange(4)
>>> b
array([0, 1, 2, 3])
>>> c = a - b
>>> c
array([20, 29, 38, 47])
>>> b ** 2
array([0, 1, 4, 9])
>>> 10 * np.sin(a)
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
>>> a < 35
array([ True,  True, False, False])
The product operator * operates elementwise on NumPy arrays. The matrix product can be performed using the @ operator or the dot function or method:
>>> A = np.array([[1, 1], [0, 1]])
>>> B = np.array([[2, 0], [3, 4]])
>>> A * B  # elementwise product
array([[2, 0],
       [0, 4]])
>>> A @ B  # matrix product
array([[5, 4],
       [3, 4]])
>>> A.dot(B)  # another matrix product
array([[5, 4],
       [3, 4]])
Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.
>>> a = np.ones((2, 5), dtype=int)
>>> a
array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])
>>> b = np.linspace(0, 1, 10).reshape(2, 5)
>>> b
array([[0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444],
       [0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ]])
>>> a *= 3
>>> a
array([[3, 3, 3, 3, 3],
       [3, 3, 3, 3, 3]])
>>> b += a
>>> b
array([[3.        , 3.11111111, 3.22222222, 3.33333333, 3.44444444],
       [3.55555556, 3.66666667, 3.77777778, 3.88888889, 4.        ]])
>>> a += b  # b is not automatically converted to integer type
Traceback (most recent call last):
  File "<python-input-96>", line 1, in <module>
    a += b
numpy._core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
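The casting error at the end of the session above can be avoided in more than one way. The sketch below shows two options that are not part of the original notes: building a new array, or converting the integer array to float before the in-place update. The names in_place_failed and a_float are illustrative.

```python
import numpy as np

a = np.ones((2, 5), dtype=int)
b = np.linspace(0, 1, 10).reshape(2, 5)

# In-place addition would need to store floats in an integer array,
# which NumPy refuses under the default 'same_kind' casting rule.
try:
    a += b
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Option 1: build a new array; the result is upcast to float64.
c = a + b

# Option 2: convert a copy of a to float first, then add in place.
a_float = a.astype(np.float64)
a_float += b

print(in_place_failed, c.dtype, a_float.dtype)
```

Which option fits depends on whether other names still refer to the original integer array: option 1 and option 2 both leave a untouched.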
When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).
>>> a = np.ones(3, dtype=np.int32)
>>> a
array([1, 1, 1], dtype=int32)
>>> b = np.linspace(0, np.pi, 3)
>>> b
array([0.        , 1.57079633, 3.14159265])
>>> b.dtype.name
'float64'
>>> c = a + b
>>> c
array([1.        , 2.57079633, 4.14159265])
>>> c.dtype.name
'float64'
>>> d = np.exp(c * 1j)
>>> d
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
       -0.54030231-0.84147098j])
>>> d.dtype.name
'complex128'
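Upcasting can also be inspected without performing the arithmetic. The sketch below uses np.result_type, which the original notes do not mention, to ask which dtype NumPy would choose for a combination of types; the variable names are illustrative.

```python
import numpy as np

# np.result_type reports the dtype NumPy would pick for a mixed operation.
int_float = np.result_type(np.int32, np.float64)
float_complex = np.result_type(np.float64, np.complex128)

# The same rule applied to actual arrays.
a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)
c = a + b

print(int_float, float_complex, c.dtype)
```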
Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a.sum()
np.int64(10)
>>> a.min()
np.int64(0)
>>> a.max()
np.int64(4)
By specifying the axis parameter you can apply an operation along the specified axis of an array:
>>> b = np.arange(6).reshape(2, 3)
>>> b
array([[0, 1, 2],
       [3, 4, 5]])
>>> b.sum(axis=0)  # sum of each column
array([3, 5, 7])
>>> b.sum(axis=1)  # sum of each row
array([ 3, 12])
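The axis parameter works the same way for other reductions. As a small sketch on a 3×4 array (the array and the variable names are chosen here for illustration), axis=0 collapses the rows and axis=1 collapses the columns:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)        # collapse the rows: one sum per column
row_mins = b.min(axis=1)        # collapse the columns: one minimum per row
row_cumsum = b.cumsum(axis=1)   # running sum along each row

print(col_sums.tolist())    # [12, 15, 18, 21]
print(row_mins.tolist())    # [0, 4, 8]
print(row_cumsum.tolist())  # [[0, 1, 3, 6], [4, 9, 15, 22], [8, 17, 27, 38]]
```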
The basics - Universal functions
NumPy provides familiar mathematical functions such as sin, cos, and exp. In NumPy, these are called "universal functions" (ufunc). They operate elementwise on an array, producing an array as output.
>>> B = np.arange(3)
>>> B
array([0, 1, 2])
>>> np.exp(B)
array([1.        , 2.71828183, 7.3890561 ])
>>> np.sqrt(B)
array([0.        , 1.        , 1.41421356])
>>> C = np.array([2.0, -1.0, 4.0])
>>> np.add(B, C)
array([2., 0., 6.])
The basics - Indexing, slicing and iterating
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.
>>> a = np.arange(10)**3
>>> a
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
>>> a[2]
np.int64(8)
>>> a[2:5]
array([ 8, 27, 64])
>>> a[0:6:2] = 1000
>>> a
array([1000,    1, 1000,   27, 1000,  125,  216,  343,  512,  729])
>>> a[::-1]  # reversed a
array([ 729,  512,  343,  216,  125, 1000,   27, 1000,    1, 1000])
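The array slice a[0:6:2] selects the elements from index 0 up to (but not including) index 6, taking every second element, that is, the elements at indices 0, 2 and 4.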
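The following sketch puts slicing and iteration on a one-dimensional array side by side. Iteration is mentioned in this section without an example, so the loop here is an illustrative addition, and the names every_second and total are hypothetical.

```python
import numpy as np

a = np.arange(10) ** 3

# Slice with start:stop:step, the same notation as for Python lists.
# tolist() takes a snapshot so later writes to a do not affect it.
every_second = a[0:6:2].tolist()   # indices 0, 2, 4 -> [0, 8, 64]

# Iterate over the elements like any Python sequence.
total = 0
for value in a:
    total += int(value)

# Unlike list slices, array slices are views: assigning through the
# slice modifies the original array.
a[0:6:2] = 1000

print(every_second, total)
print(a.tolist())
```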
Shape manipulation
Changing the shape of an array
An array has a shape given by the number of elements along each axis:
>>> a = np.arange(15)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> a.shape
(15,)
>>> b = a.reshape(3, 5)
>>> b
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> b.shape
(3, 5)
>>> c = b.ravel()  # returns the array, flattened
>>> c
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> c.shape
(15,)
>>> e = b.T  # returns the array, transposed
>>> e
array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
>>> e.shape
(5, 3)
Stacking together different arrays
Several arrays can be stacked together along different axes:
>>> a = np.arange(1, 5).reshape(2, 2)
>>> a
array([[1, 2],
       [3, 4]])
>>> b = np.arange(11, 15).reshape(2, 2)
>>> b
array([[11, 12],
       [13, 14]])
>>> c = np.vstack((a, b))
>>> c
array([[ 1,  2],
       [ 3,  4],
       [11, 12],
       [13, 14]])
>>> d = np.hstack((a, b))
>>> d
array([[ 1,  2, 11, 12],
       [ 3,  4, 13, 14]])
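np.vstack((a, b)) stacks the two arrays vertically (vertical stack), that is, it joins them row-wise: b is placed below a, producing a new array.
np.hstack((a, b)) stacks the two arrays horizontally (horizontal stack), that is, it joins them column-wise: b is placed to the right of a, producing a new array.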
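For two-dimensional inputs, both stacking functions can be expressed with np.concatenate and an explicit axis. The notes do not mention concatenate; the sketch below is included only to make the "row-wise" and "column-wise" wording concrete.

```python
import numpy as np

a = np.arange(1, 5).reshape(2, 2)
b = np.arange(11, 15).reshape(2, 2)

# For two-dimensional inputs, vstack joins along axis 0 (rows) ...
v = np.concatenate((a, b), axis=0)
# ... and hstack joins along axis 1 (columns).
h = np.concatenate((a, b), axis=1)

print(v.shape, np.array_equal(v, np.vstack((a, b))))
print(h.shape, np.array_equal(h, np.hstack((a, b))))
```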
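Splitting one array into several smaller ones
Using hsplit, you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur.
vsplit splits along the vertical axis, and array_split allows you to specify along which axis to split.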
>>> a = np.arange(18).reshape(3, 6)
>>> a
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]])
>>> b, c, d = np.hsplit(a, 3)  # split a into 3 equal parts
>>> b
array([[ 0,  1],
       [ 6,  7],
       [12, 13]])
>>> c
array([[ 2,  3],
       [ 8,  9],
       [14, 15]])
>>> d
array([[ 4,  5],
       [10, 11],
       [16, 17]])
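np.hsplit(a, 3) splits the array a column-wise (horizontally) into 3 sub-arrays of equal size.
For a two-dimensional array a, the number of columns must be divisible by 3; otherwise the call raises a ValueError.
For example, if a is an n×6 array (6 columns), 6 is divisible by 3, so a is split into 3 sub-arrays of shape n×2. With 5 columns the division is not even and the split fails.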
>>> x, y, z = np.hsplit(a, (2, 4))  # split a after the second and the fourth column
>>> x
array([[ 0,  1],
       [ 6,  7],
       [12, 13]])
>>> y
array([[ 2,  3],
       [ 8,  9],
       [14, 15]])
>>> z
array([[ 4,  5],
       [10, 11],
       [16, 17]])
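np.hsplit(a, (2, 4)) splits the array a column-wise (horizontally) at the given column indices. Here the split points are column indices 2 and 4, which yields 3 sub-arrays: a[:, :2], a[:, 2:4] and a[:, 4:].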
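vsplit and array_split are mentioned above without an example. The sketch below is one way to exercise them, assuming the same 3×6 array as in the session; the names top, middle, bottom, parts, widths and uneven_failed are illustrative.

```python
import numpy as np

a = np.arange(18).reshape(3, 6)

# vsplit cuts along the vertical axis (rows): three 1 x 6 arrays here.
top, middle, bottom = np.vsplit(a, 3)

# array_split takes the axis explicitly and tolerates uneven division:
# 6 columns into 4 parts gives widths 2, 2, 1, 1.
parts = np.array_split(a, 4, axis=1)
widths = [p.shape[1] for p in parts]

# hsplit insists on equal parts and raises ValueError otherwise.
try:
    np.hsplit(a, 4)
    uneven_failed = False
except ValueError:
    uneven_failed = True

print(top.shape, widths, uneven_failed)
```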
Copies and views
No copy at all
Simple assignments make no copy of objects or their data.
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> b = a  # no new object is created
>>> b
array([0, 1, 2, 3, 4])
>>> a is b  # a and b are two names for the same ndarray object
True
>>> id(a)
1295299280784
>>> id(b)
1295299280784
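One practical consequence of "no copy at all" is that a change made through one name is visible through the other. The sketch below illustrates this, together with the fact that passing an array to a function does not copy it either; the helper same_object is hypothetical.

```python
import numpy as np

a = np.arange(5)
b = a              # b is just another name for the same array object

b[0] = 99          # a change made through b ...
print(a.tolist())  # ... is visible through a: [99, 1, 2, 3, 4]

def same_object(x):
    # Function calls pass the object itself, so no copy is made here either.
    return x is a

print(same_object(a))  # True
```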
View or shallow copy
Different array objects can share the same data. The view method creates a new array object that looks at the same data.
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> c = a.view()
>>> c
array([0, 1, 2, 3, 4])
>>> c is a
False
>>> id(a)
1295299278480
>>> id(c)
1295299279728
>>> c.base is a  # c is a view of the data owned by a
True
>>> c.flags.owndata
False
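This code illustrates the concept of a view in NumPy. The array a is created first, and c = a.view() then creates a view of it. c is a distinct array object (c is a is False, and the two ids differ), but it shares memory with a. This is what distinguishes a view from a copy, which has its own data.
The last two lines check the memory relationship between c and the original array a:
1. c.base is a evaluates to True because c is a view onto the data owned by a.
2. c.flags.owndata evaluates to False because c does not own the data it refers to.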
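Because a view shares memory with its base array, writing through the view changes the original data. The sketch below demonstrates this and also shows that basic slicing returns a view; the slice example is an addition not present in the session above.

```python
import numpy as np

a = np.arange(5)
c = a.view()

c[0] = 100                 # writing through the view changes a's data
s = a[1:3]                 # basic slicing also returns a view
s[:] = -1                  # so this writes into a as well

print(a.tolist())          # [100, -1, -1, 3, 4]
print(s.base is a, np.shares_memory(a, c))
```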
Deep copy
The copy method makes a complete copy of the array and its data.
>>> a = np.arange(5)
>>> b = a.copy()  # a new array object with new data is created
>>> b is a
False
>>> b.base is a  # b does not share anything with a
False
>>> id(a)
1295299279824
>>> id(b)
1295299280784
>>> a
array([0, 1, 2, 3, 4])
>>> b
array([0, 1, 2, 3, 4])
>>> b[2] = 999
>>> a
array([0, 1, 2, 3, 4])
>>> b
array([  0,   1, 999,   3,   4])
Less basic - Broadcasting rules
Broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape.
The first rule of broadcasting is that if the input arrays do not all have the same number of dimensions, a "1" is repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.
The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension.
The value of the array element is assumed to be the same along that dimension for the "broadcast" array.
After application of the broadcasting rules, the sizes of all arrays must match. More details can be found in the Broadcasting section of the NumPy documentation.
>>> import numpy as np
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([2., 4., 6.])
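Broadcasting in NumPy is a mechanism for arithmetic between arrays of different shapes. It automatically stretches the smaller array so that its shape is compatible with the larger one, which avoids explicit shape adjustments (such as copying data) and keeps the operation efficient.
Core principle (shape compatibility): shapes are compared dimension by dimension, starting from the last one, and two dimensions are compatible when they are equal or when one of them is 1.
Broadcasting then proceeds in steps: pad the shorter shape with leading 1s, stretch every dimension of size 1 to match the other array, and apply the operation elementwise.
Typical cases are a scalar with an array, a one-dimensional array with a two-dimensional array, and two two-dimensional arrays. Shapes that fail the compatibility rule cannot be broadcast, and the operation raises an error.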
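The cases listed above can be sketched as follows. The arrays and names (m, v, col, row, r1, r2, r3, failed) are chosen for illustration and do not come from the original notes.

```python
import numpy as np

# Example 1: scalar with array. The scalar is stretched to shape (3,).
a = np.array([1.0, 2.0, 3.0])
r1 = a * 2.0

# Example 2: 1-D with 2-D. Shape (3,) is treated as (1, 3), then
# stretched along the first axis to match (2, 3).
m = np.arange(6).reshape(2, 3)
v = np.array([10, 20, 30])
r2 = m + v

# Example 3: two 2-D arrays. (3, 1) and (1, 4) broadcast to (3, 4).
col = np.arange(3).reshape(3, 1)
row = np.arange(4).reshape(1, 4)
r3 = col + row

# Incompatible: trailing sizes 3 and 4 differ and neither is 1.
try:
    np.ones((2, 3)) + np.ones(4)
    failed = False
except ValueError:
    failed = True

print(r1.tolist())   # [2.0, 4.0, 6.0]
print(r2.tolist())   # [[10, 21, 32], [13, 24, 35]]
print(r3.shape, failed)
```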
Advanced indexing and index tricks
NumPy offers more indexing facilities than regular Python sequences. In addition to indexing by integers and slices, as we saw before, arrays can be indexed by arrays of integers and arrays of booleans.
>>> a = np.arange(12) ** 2  # the first 12 square numbers
>>> a
array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121])
>>> i = np.array([1, 1, 3, 8, 5])  # an array of indices
>>> a[i]  # the elements of a at the positions i
array([ 1,  1,  9, 64, 25])
>>> j = np.array([[3, 4], [9, 7]])  # a bidimensional array of indices
>>> a[j]  # the same shape as j
array([[ 9, 16],
       [81, 49]])
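This code demonstrates array indexing (advanced indexing) in NumPy: selecting elements of an array by means of another array, the index array.
1. The initial array a holds the squares of 0 through 11.
2. Indexing with the one-dimensional index array i returns the elements of a at positions 1, 1, 3, 8 and 5.
3. Indexing with the two-dimensional index array j returns the elements at positions 3, 4, 9 and 7, arranged in the same 2×2 shape as j.
Key point: the result takes the shape of the index array, and an index may appear more than once.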
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> b = np.array([[0, 1, 2], [2, 1, 0]])
>>> a[b]  # each index in b selects a whole row of a
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[ 8,  9, 10, 11],
        [ 4,  5,  6,  7],
        [ 0,  1,  2,  3]]])
When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a.
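Indexing with arrays of booleans is mentioned at the start of this section but not shown. The sketch below is one possible illustration, together with the use of one index array per dimension; the names mask, selected, rows, cols and picked are hypothetical.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# A boolean array of the same shape selects the elements where it is True.
mask = a > 4
selected = a[mask]          # 1-D array of the matching elements

# Boolean indexing also works on the left-hand side of an assignment.
b = a.copy()
b[b > 4] = 0

# One index array per dimension picks individual elements:
# here a[0, 1], a[1, 2] and a[2, 3].
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 3])
picked = a[rows, cols]

print(selected.tolist())  # [5, 6, 7, 8, 9, 10, 11]
print(b.tolist())         # [[0, 1, 2, 3], [4, 0, 0, 0], [0, 0, 0, 0]]
print(picked.tolist())    # [1, 6, 11]
```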
Tricks and tips
"Automatic" reshaping
To change the dimensions of an array, you can omit one of the sizes, which will then be deduced automatically:
>>> a = np.arange(24)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23])
>>> b = a.reshape((2, -1, 4))  # -1 means "whatever is needed"
>>> b.shape
(2, 3, 4)
>>> b
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
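The inferred size must be a whole number, so automatic reshaping only works when the known sizes divide the total number of elements. A brief sketch, with the shapes chosen here for illustration:

```python
import numpy as np

a = np.arange(24)

# -1 asks NumPy to infer the size: 24 / 6 = 4 rows.
b = a.reshape(-1, 6)

# The known sizes must divide the total; 24 is not divisible by 5,
# so no valid size can be inferred and reshape raises ValueError.
try:
    a.reshape(-1, 5)
    failed = False
except ValueError:
    failed = True

print(b.shape, failed)  # (4, 6) True
```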