NumPy quickstart
Prerequisites
You’ll need to know a bit of Python. For a refresher, see the Python
tutorial.
To work the examples, you’ll need matplotlib installed
in addition to NumPy.
Learner profile
This is a quick overview of arrays in NumPy. It demonstrates how n-dimensional
arrays are represented and can be manipulated. In particular, if
you don’t know how to apply common functions to n-dimensional arrays (without
using for-loops), or if you want to understand axis and shape properties for
n-dimensional arrays, this article might be of help.
Learning Objectives
After reading, you should be able to:
- Understand the difference between one-, two- and n-dimensional arrays in NumPy;
- Understand how to apply some linear algebra operations to n-dimensional arrays without using for-loops;
- Understand axis and shape properties for n-dimensional arrays.
The basics
NumPy’s main object is the homogeneous multidimensional array. It is a
table of elements (usually numbers), all of the same type, indexed by a
tuple of non-negative integers. In NumPy, dimensions are called axes.
For example, the array for the coordinates of a point in 3D space,
[1, 2, 1], has one axis. That axis has 3 elements in it, so we say
it has a length of 3. In the example pictured below, the array has 2
axes. The first axis has a length of 2, the second axis has a length of
3.
[[1., 0., 0.],
 [0., 1., 2.]]
NumPy’s array class is called ndarray. It is also known by the alias
array. Note that numpy.array is not the same as the Standard
Python Library class array.array, which only handles one-dimensional
arrays and offers less functionality. The more important attributes of
an ndarray object are:
- ndarray.ndim
the number of axes (dimensions) of the array.
- ndarray.shape
the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
- ndarray.size
the total number of elements of the array. This is equal to the product of the elements of shape.
- ndarray.dtype
an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. Additionally, NumPy provides types of its own; numpy.int32, numpy.int16, and numpy.float64 are some examples.
- ndarray.itemsize
the size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). It is equivalent to ndarray.dtype.itemsize.
- ndarray.data
the buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.
An example
>>> import numpy as np
>>> a = np.arange(15).reshape(3, 5)
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
>>> a.shape
(3, 5)
>>> a.ndim
2
>>> a.dtype.name
'int64'
>>> a.itemsize
8
>>> a.size
15
>>> type(a)
<class 'numpy.ndarray'>
>>> b = np.array([6, 7, 8])
>>> b
array([6, 7, 8])
>>> type(b)
<class 'numpy.ndarray'>
Array creation
There are several ways to create arrays.
For example, you can create an array from a regular Python list or tuple
using the array function. The type of the resulting array is deduced
from the type of the elements in the sequences.
>>> import numpy as np
>>> a = np.array([2, 3, 4])
>>> a
array([2, 3, 4])
>>> a.dtype
dtype('int64')
>>> b = np.array([1.2, 3.5, 5.1])
>>> b.dtype
dtype('float64')
A frequent error consists in calling array with multiple arguments,
rather than providing a single sequence as an argument.
>>> a = np.array(1, 2, 3, 4) # WRONG
Traceback (most recent call last):
...
TypeError: array() takes from 1 to 2 positional arguments but 4 were given
>>> a = np.array([1, 2, 3, 4]) # RIGHT
array transforms sequences of sequences into two-dimensional arrays,
sequences of sequences of sequences into three-dimensional arrays, and
so on.
>>> b = np.array([(1.5, 2, 3), (4, 5, 6)])
>>> b
array([[1.5, 2. , 3. ],
[4. , 5. , 6. ]])
The type of the array can also be explicitly specified at creation time:
>>> c = np.array([[1, 2], [3, 4]], dtype=complex)
>>> c
array([[1.+0.j, 2.+0.j],
[3.+0.j, 4.+0.j]])
Often, the elements of an array are originally unknown, but its size is
known. Hence, NumPy offers several functions to create
arrays with initial placeholder content. These minimize the necessity of
growing arrays, an expensive operation.
The function zeros creates an array full of zeros, the function
ones creates an array full of ones, and the function empty
creates an array whose initial content is random and depends on the
state of the memory. By default, the dtype of the created array is
float64, but it can be specified via the keyword argument dtype.
>>> np.zeros((3, 4))
array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
>>> np.ones((2, 3, 4), dtype=np.int16)
array([[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]],
[[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]]], dtype=int16)
>>> np.empty((2, 3))
array([[3.73603959e-262, 6.02658058e-154, 6.55490914e-260], # may vary
[5.30498948e-313, 3.14673309e-307, 1.00000000e+000]])
To create sequences of numbers, NumPy provides the arange function
which is analogous to the Python built-in range, but returns an
array.
>>> np.arange(10, 30, 5)
array([10, 15, 20, 25])
>>> np.arange(0, 2, 0.3) # it accepts float arguments
array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])
When arange is used with floating point arguments, it is generally
not possible to predict the number of elements obtained, due to the
finite floating point precision. For this reason, it is usually better
to use the function linspace that receives as an argument the number
of elements that we want, instead of the step:
>>> from numpy import pi
>>> np.linspace(0, 2, 9) # 9 numbers from 0 to 2
array([0. , 0.25, 0.5 , 0.75, 1. , 1.25, 1.5 , 1.75, 2. ])
>>> x = np.linspace(0, 2 * pi, 100) # useful to evaluate function at lots of points
>>> f = np.sin(x)
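The effect of finite floating point precision on arange can be seen in a small sketch. The particular start, stop and step values here are illustrative assumptions, and the count returned by arange may differ between platforms, so it is printed rather than stated:

```python
import numpy as np

# With a float step, the element count comes from a floating point
# division, so the stop value may unexpectedly be included.
stepped = np.arange(0.1, 0.4, 0.1)
print(len(stepped), stepped)

# linspace takes the element count directly, so the length is certain.
spaced = np.linspace(0.1, 0.3, 3)
print(len(spaced), spaced)
```

When the number of points matters more than the exact step, linspace is usually the safer choice.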
See also
array,
zeros,
zeros_like,
ones,
ones_like,
empty,
empty_like,
arange,
linspace,
random.Generator.random,
random.Generator.normal,
fromfunction,
fromfile
Printing arrays
When you print an array, NumPy displays it in a similar way to nested
lists, but with the following layout:
- the last axis is printed from left to right,
- the second-to-last is printed from top to bottom,
- the rest are also printed from top to bottom, with each slice separated from the next by an empty line.
One-dimensional arrays are then printed as rows, bidimensionals as
matrices and tridimensionals as lists of matrices.
>>> a = np.arange(6) # 1d array
>>> print(a)
[0 1 2 3 4 5]
>>>
>>> b = np.arange(12).reshape(4, 3) # 2d array
>>> print(b)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
>>>
>>> c = np.arange(24).reshape(2, 3, 4) # 3d array
>>> print(c)
[[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
See below to get more details on reshape.
If an array is too large to be printed, NumPy automatically skips the
central part of the array and only prints the corners:
>>> print(np.arange(10000))
[ 0 1 2 ... 9997 9998 9999]
>>>
>>> print(np.arange(10000).reshape(100, 100))
[[ 0 1 2 ... 97 98 99]
[ 100 101 102 ... 197 198 199]
[ 200 201 202 ... 297 298 299]
...
[9700 9701 9702 ... 9797 9798 9799]
[9800 9801 9802 ... 9897 9898 9899]
[9900 9901 9902 ... 9997 9998 9999]]
To disable this behaviour and force NumPy to print the entire array, you
can change the printing options using set_printoptions.
>>> import sys
>>> np.set_printoptions(threshold=sys.maxsize)  # never summarize, whatever the array size
Basic operations
Arithmetic operators on arrays apply elementwise. A new array is
created and filled with the result.
>>> a = np.array([20, 30, 40, 50])
>>> b = np.arange(4)
>>> b
array([0, 1, 2, 3])
>>> c = a - b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10 * np.sin(a)
array([ 9.12945251, -9.88031624, 7.4511316 , -2.62374854])
>>> a < 35
array([ True, True, False, False])
Unlike in many matrix languages, the product operator * operates
elementwise in NumPy arrays. The matrix product can be performed using
the @ operator (in Python >=3.5) or the dot function or method:
>>> A = np.array([[1, 1],
... [0, 1]])
>>> B = np.array([[2, 0],
... [3, 4]])
>>> A * B # elementwise product
array([[2, 0],
[0, 4]])
>>> A @ B # matrix product
array([[5, 4],
[3, 4]])
>>> A.dot(B) # another matrix product
array([[5, 4],
[3, 4]])
Some operations, such as += and *=, act in place to modify an
existing array rather than create a new one.
>>> rg = np.random.default_rng(1) # create instance of default random number generator
>>> a = np.ones((2, 3), dtype=int)
>>> b = rg.random((2, 3))
>>> a *= 3
>>> a
array([[3, 3, 3],
[3, 3, 3]])
>>> b += a
>>> b
array([[3.51182162, 3.9504637 , 3.14415961],
[3.94864945, 3.31183145, 3.42332645]])
>>> a += b # b is not automatically converted to integer type
Traceback (most recent call last):
...
numpy._core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
When operating with arrays of different types, the type of the resulting
array corresponds to the more general or precise one (a behavior known
as upcasting).
>>> a = np.ones(3, dtype=np.int32)
>>> b = np.linspace(0, pi, 3)
>>> b.dtype.name
'float64'
>>> c = a + b
>>> c
array([1. , 2.57079633, 4.14159265])
>>> c.dtype.name
'float64'
>>> d = np.exp(c * 1j)
>>> d
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
-0.54030231-0.84147098j])
>>> d.dtype.name
'complex128'
Many unary operations, such as computing the sum of all the elements in
the array, are implemented as methods of the ndarray class.
>>> a = rg.random((2, 3))
>>> a
array([[0.82770259, 0.40919914, 0.54959369],
[0.02755911, 0.75351311, 0.53814331]])
>>> a.sum()
3.1057109529998157
>>> a.min()
0.027559113243068367
>>> a.max()
0.8277025938204418
By default, these operations apply to the array as though it were a list
of numbers, regardless of its shape. However, by specifying the axis
parameter you can apply an operation along the specified axis of an
array:
>>> b = np.arange(12).reshape(3, 4)
>>> b
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>>
>>> b.sum(axis=0) # sum of each column
array([12, 15, 18, 21])
>>>
>>> b.min(axis=1) # min of each row
array([0, 4, 8])
>>>
>>> b.cumsum(axis=1) # cumulative sum along each row
array([[ 0, 1, 3, 6],
[ 4, 9, 15, 22],
[ 8, 17, 27, 38]])
Universal functions
NumPy provides familiar mathematical functions such as sin, cos, and
exp. In NumPy, these are called “universal
functions” (ufunc). Within NumPy, these functions
operate elementwise on an array, producing an array as output.
>>> B = np.arange(3)
>>> B
array([0, 1, 2])
>>> np.exp(B)
array([1. , 2.71828183, 7.3890561 ])
>>> np.sqrt(B)
array([0. , 1. , 1.41421356])
>>> C = np.array([2., -1., 4.])
>>> np.add(B, C)
array([2., 0., 6.])
See also
all,
any,
apply_along_axis,
argmax,
argmin,
argsort,
average,
bincount,
ceil,
clip,
conj,
corrcoef,
cov,
cross,
cumprod,
cumsum,
diff,
dot,
floor,
inner,
invert,
lexsort,
max,
maximum,
mean,
median,
min,
minimum,
nonzero,
outer,
prod,
re,
round,
sort,
std,
sum,
trace,
transpose,
var,
vdot,
vectorize,
where
Indexing, slicing and iterating
One-dimensional arrays can be indexed, sliced and iterated over,
much like lists and other Python sequences.
>>> a = np.arange(10)**3
>>> a
array([ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729])
>>> a[2]
8
>>> a[2:5]
array([ 8, 27, 64])
>>> # equivalent to a[0:6:2] = 1000;
>>> # from start to position 6, exclusive, set every 2nd element to 1000
>>> a[:6:2] = 1000
>>> a
array([1000, 1, 1000, 27, 1000, 125, 216, 343, 512, 729])
>>> a[::-1] # reversed a
array([ 729, 512, 343, 216, 125, 1000, 27, 1000, 1, 1000])
>>> for i in a:
...     print(i**(1 / 3.))
...
9.999999999999998 # may vary
1.0
9.999999999999998
3.0
9.999999999999998
4.999999999999999
5.999999999999999
6.999999999999999
7.999999999999999
8.999999999999998
Multidimensional arrays can have one index per axis. These indices
are given in a tuple separated by commas:
>>> def f(x, y):
...     return 10 * x + y
...
>>> b = np.fromfunction(f, (5, 4), dtype=int)
>>> b
array([[ 0, 1, 2, 3],
[10, 11, 12, 13],
[20, 21, 22, 23],
[30, 31, 32, 33],
[40, 41, 42, 43]])
>>> b[2, 3]
23
>>> b[0:5, 1] # each row in the second column of b
array([ 1, 11, 21, 31, 41])
>>> b[:, 1] # equivalent to the previous example
array([ 1, 11, 21, 31, 41])
>>> b[1:3, :] # each column in the second and third row of b
array([[10, 11, 12, 13],
[20, 21, 22, 23]])
When fewer indices are provided than the number of axes, the missing
indices are considered complete slices:
>>> b[-1] # the last row. Equivalent to b[-1, :]
array([40, 41, 42, 43])
The expression within brackets in b[i] is treated as an i
followed by as many instances of : as needed to represent the
remaining axes. NumPy also allows you to write this using dots as
b[i, ...].
The dots (...) represent as many colons as needed to produce a
complete indexing tuple. For example, if x is an array with 5
axes, then
- x[1, 2, ...] is equivalent to x[1, 2, :, :, :],
- x[..., 3] to x[:, :, :, :, 3] and
- x[4, ..., 5, :] to x[4, :, :, 5, :].
>>> c = np.array([[[ 0, 1, 2], # a 3D array (two stacked 2D arrays)
... [ 10, 12, 13]],
... [[100, 101, 102],
... [110, 112, 113]]])
>>> c.shape
(2, 2, 3)
>>> c[1, ...] # same as c[1, :, :] or c[1]
array([[100, 101, 102],
[110, 112, 113]])
>>> c[..., 2] # same as c[:, :, 2]
array([[ 2, 13],
[102, 113]])
Iterating over multidimensional arrays is done with respect to the
first axis:
>>> for row in b:
...     print(row)
...
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
However, if one wants to perform an operation on each element in the
array, one can use the flat attribute, which is an iterator over all the
elements of the array:
>>> for element in b.flat:
...     print(element)
...
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43
See also
Indexing on ndarrays,
Indexing routines (reference),
newaxis,
ndenumerate,
indices
Shape manipulation
Changing the shape of an array
An array has a shape given by the number of elements along each axis:
>>> a = np.floor(10 * rg.random((3, 4)))
>>> a
array([[3., 7., 3., 4.],
[1., 4., 2., 2.],
[7., 2., 4., 9.]])
>>> a.shape
(3, 4)
The shape of an array can be changed with various commands. Note that the
following three commands all return a modified array, but do not change
the original array:
>>> a.ravel() # returns the array, flattened
array([3., 7., 3., 4., 1., 4., 2., 2., 7., 2., 4., 9.])
>>> a.reshape(6, 2) # returns the array with a modified shape
array([[3., 7.],
[3., 4.],
[1., 4.],
[2., 2.],
[7., 2.],
[4., 9.]])
>>> a.T # returns the array, transposed
array([[3., 1., 7.],
[7., 4., 2.],
[3., 2., 4.],
[4., 2., 9.]])
>>> a.T.shape
(4, 3)
>>> a.shape
(3, 4)
The order of the elements in the array resulting from ravel is
normally “C-style”, that is, the rightmost index “changes the fastest”,
so the element after a[0, 0] is a[0, 1]. If the array is reshaped to some
other shape, again the array is treated as “C-style”. NumPy normally
creates arrays stored in this order, so ravel will usually not need to
copy its argument, but if the array was made by taking slices of another
array or created with unusual options, it may need to be copied. The
functions ravel and reshape can also be instructed, using an
optional argument, to use FORTRAN-style arrays, in which the leftmost
index changes the fastest.
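The difference between the two orderings, and the cases in which ravel has to copy, can be sketched as follows. The array name m is an illustrative assumption; order is the optional argument mentioned above, and np.shares_memory is used only to check whether a copy was made:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

c_flat = m.ravel()                  # C order: rightmost index changes fastest
f_flat = m.ravel(order="F")         # Fortran order: leftmost index changes fastest
print(c_flat)                       # [0 1 2 3 4 5]
print(f_flat)                       # [0 3 1 4 2 5]

# A C-contiguous array can be flattened without copying, while
# flattening its transpose in C order requires a copy.
no_copy = np.shares_memory(m, m.ravel())
copied = not np.shares_memory(m, m.T.ravel())
print(no_copy, copied)              # True True
```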
The reshape function returns its
argument with a modified shape, whereas the
ndarray.resize method modifies the array
itself:
>>> a
array([[3., 7., 3., 4.],
[1., 4., 2., 2.],
[7., 2., 4., 9.]])
>>> a.resize((2, 6))
>>> a
array([[3., 7., 3., 4., 1., 4.],
[2., 2., 7., 2., 4., 9.]])
If a dimension is given as -1 in a reshaping operation, its size is
calculated automatically from the other dimensions:
>>> a.reshape(3, -1)
array([[3., 7., 3., 4.],
[1., 4., 2., 2.],
[7., 2., 4., 9.]])
See also
ndarray.shape,
reshape,
resize,
ravel
Stacking together different arrays
Several arrays can be stacked together along different axes:
>>> a = np.floor(10 * rg.random((2, 2)))
>>> a
array([[9., 7.],
[5., 2.]])
>>> b = np.floor(10 * rg.random((2, 2)))
>>> b
array([[1., 9.],
[5., 1.]])
>>> np.vstack((a, b))
array([[9., 7.],
[5., 2.],
[1., 9.],
[5., 1.]])
>>> np.hstack((a, b))
array([[9., 7., 1., 9.],
[5., 2., 5., 1.]])
The function column_stack stacks 1D arrays as columns into a 2D array.
It is equivalent to hstack only for 2D arrays:
>>> from numpy import newaxis
>>> np.column_stack((a, b)) # with 2D arrays
array([[9., 7., 1., 9.],
[5., 2., 5., 1.]])
>>> a = np.array([4., 2.])
>>> b = np.array([3., 8.])
>>> np.column_stack((a, b)) # returns a 2D array
array([[4., 3.],
[2., 8.]])
>>> np.hstack((a, b)) # the result is different
array([4., 2., 3., 8.])
>>> a[:, newaxis] # view `a` as a 2D column vector
array([[4.],
[2.]])
>>> np.column_stack((a[:, newaxis], b[:, newaxis]))
array([[4., 3.],
[2., 8.]])
>>> np.hstack((a[:, newaxis], b[:, newaxis])) # the result is the same
array([[4., 3.],
[2., 8.]])
In general, for arrays with more than two dimensions, hstack stacks
along their second axes, vstack stacks along their first axes, and
concatenate allows for an optional argument giving the number of the
axis along which the concatenation should happen.
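A short sketch with three-dimensional inputs makes the axis each function uses visible in the result shapes. The arrays x and y are illustrative assumptions:

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.ones((2, 3, 4))

v_shape = np.vstack((x, y)).shape               # joined along axis 0
h_shape = np.hstack((x, y)).shape               # joined along axis 1
c_shape = np.concatenate((x, y), axis=2).shape  # axis chosen explicitly

print(v_shape)   # (4, 3, 4)
print(h_shape)   # (2, 6, 4)
print(c_shape)   # (2, 3, 8)
```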
Note
In complex cases, r_ and c_ are useful for creating arrays by stacking
numbers along one axis. They allow the use of range literals :.
>>> np.r_[1:4, 0, 4]
array([1, 2, 3, 0, 4])
When used with arrays as arguments, r_ and c_ are similar to vstack and
hstack in their default behavior, but allow for an optional argument
giving the number of the axis along which to concatenate.
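As a rough illustration with two-dimensional inputs (the arrays p and q are assumptions made for this sketch), r_ and c_ can be compared against vstack and hstack, and the axis can be selected by passing a string containing an integer as the first item:

```python
import numpy as np

p = np.array([[1, 2], [3, 4]])
q = np.array([[5, 6], [7, 8]])

rows = np.r_[p, q]          # stacks along the first axis, like vstack
cols = np.c_[p, q]          # stacks along the last axis, like hstack for 2D input
print(rows.shape, cols.shape)   # (4, 2) (2, 4)

# A leading string holding an integer selects the concatenation axis.
along_1 = np.r_["1", p, q]
print(np.array_equal(along_1, np.hstack((p, q))))   # True
```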
See also
hstack,
vstack,
column_stack,
concatenate,
c_,
r_
Splitting one array into several smaller ones
Using hsplit, you can split an array along its horizontal axis, either
by specifying the number of equally shaped arrays to return, or by
specifying the columns after which the division should occur:
>>> a = np.floor(10 * rg.random((2, 12)))
>>> a
array([[6., 7., 6., 9., 0., 5., 4., 0., 6., 8., 5., 2.],
[8., 5., 5., 7., 1., 8., 6., 7., 1., 8., 1., 0.]])
>>> # Split `a` into 3
>>> np.hsplit(a, 3)
[array([[6., 7., 6., 9.],
[8., 5., 5., 7.]]), array([[0., 5., 4., 0.],
[1., 8., 6., 7.]]), array([[6., 8., 5., 2.],
[1., 8., 1., 0.]])]
>>> # Split `a` after the third and the fourth column
>>> np.hsplit(a, (3, 4))
[array([[6., 7., 6.],
[8., 5., 5.]]), array([[9.],
[7.]]), array([[0., 5., 4., 0., 6., 8., 5., 2.],
[1., 8., 6., 7., 1., 8., 1., 0.]])]
vsplit splits along the vertical axis, and array_split allows one to
specify along which axis to split.
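A minimal sketch of both functions, using a small array t chosen for illustration. It also shows that array_split accepts a division that does not come out even, which hsplit and vsplit do not:

```python
import numpy as np

t = np.arange(12).reshape(4, 3)

top, bottom = np.vsplit(t, 2)            # two (2, 3) blocks, split along axis 0
print(top.shape, bottom.shape)           # (2, 3) (2, 3)

# array_split takes the axis explicitly and tolerates uneven divisions.
parts = np.array_split(t, 2, axis=1)     # 3 columns split into 2 + 1
part_shapes = [part.shape for part in parts]
print(part_shapes)                       # [(4, 2), (4, 1)]
```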
Copies and views
When operating and manipulating arrays, their data is sometimes copied
into a new array and sometimes not. This is often a source of confusion
for beginners. There are three cases:
No copy at all
Simple assignments make no copy of objects or their data.
>>> a = np.array([[ 0, 1, 2, 3],
... [ 4, 5, 6, 7],
... [ 8, 9, 10, 11]])
>>> b = a # no new object is created
>>> b is a # a and b are two names for the same ndarray object
True
Python passes mutable objects as references, so function calls make no
copy.
>>> def f(x):
...     print(id(x))
...
>>> id(a) # id is a unique identifier of an object
148293216 # may vary
>>> f(a)
148293216 # may vary
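One practical consequence is that a function which assigns into its argument changes the caller's array. The helper zero_first below is hypothetical and exists only to illustrate this:

```python
import numpy as np

def zero_first(x):
    """Hypothetical helper: overwrite the first element in place."""
    x[0] = 0

values = np.array([1, 2, 3])
zero_first(values)      # no copy is made, so the caller's array changes
print(values)           # [0 2 3]
```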
View or shallow copy
Different array objects can share the same data. The view method
creates a new array object that looks at the same data.
>>> c = a.view()
>>> c is a
False
>>> c.base is a # c is a view of the data owned by a
True
>>> c.flags.owndata
False
>>>
>>> c = c.reshape((2, 6)) # a's shape doesn't change, reassigned c is still a view of a
>>> a.shape
(3, 4)
>>> c[0, 4] = 1234 # a's data changes
>>> a
array([[ 0, 1, 2, 3],
[1234, 5, 6, 7],
[ 8, 9, 10, 11]])
Slicing an array returns a view of it:
>>> s = a[:, 1:3]
>>> s[:] = 10 # s[:] is a view of s. Note the difference between s = 10 and s[:] = 10
>>> a
array([[ 0, 10, 10, 3],
[1234, 10, 10, 7],
[ 8, 10, 10, 11]])
Deep copy
The copy method makes a complete copy of the array and its data.
>>> d = a.copy() # a new array object with new data is created
>>> d is a
False
>>> d.base is a # d doesn't share anything with a
False
>>> d[0, 0] = 9999
>>> a
array([[ 0, 10, 10, 3],
[1234, 10, 10, 7],
[ 8, 10, 10, 11]])
Sometimes copy should be called after slicing if the original array is no longer required.
For example, suppose a is a huge intermediate result and the final result b contains only
a small fraction of a. In that case, a deep copy should be made when constructing b with slicing:
>>> a = np.arange(int(1e8))
>>> b = a[:100].copy()
>>> del a # the memory of ``a`` can be released.
If b = a[:100] is used instead, a is referenced by b and will persist in memory
even if del a is executed.
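The reference that keeps the original alive can be inspected through the base attribute. This is a small sketch with illustrative names and a deliberately small array:

```python
import numpy as np

big = np.arange(1000)

view = big[:100]            # a slice is a view that keeps `big` alive
print(view.base is big)     # True

small = big[:100].copy()    # an independent array owning its data
print(small.base is None)   # True
print(np.shares_memory(small, big))   # False
```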
See also Copies and views.
Functions and methods overview
Here is a list of some useful NumPy function and method names,
ordered by category. See Routines and objects by topic for the full list.
- Array Creation
arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, linspace, logspace, mgrid, ogrid, ones, ones_like, r_, zeros, zeros_like
- Conversions
ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat
- Manipulations
array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit, hstack, ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack
- Questions
- Ordering
argmax, argmin, argsort, max, min, ptp, searchsorted, sort
- Operations
choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum
- Basic Statistics
- Basic Linear Algebra
cross, dot, outer, linalg.svd, vdot
Less basic
Broadcasting rules
Broadcasting allows universal functions to deal in a meaningful way with
inputs that do not have exactly the same shape.
The first rule of broadcasting is that if all input arrays do not have
the same number of dimensions, a “1” will be repeatedly prepended to the
shapes of the smaller arrays until all the arrays have the same number
of dimensions.
The second rule of broadcasting ensures that arrays with a size of 1
along a particular dimension act as if they had the size of the array
with the largest shape along that dimension. The value of the array
element is assumed to be the same along that dimension for the
“broadcast” array.
After application of the broadcasting rules, the sizes of all arrays
must match. More details can be found in Broadcasting.
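The two rules can be traced on a small case. The arrays col and row are assumptions chosen for this sketch so that both rules come into play:

```python
import numpy as np

col = np.arange(3).reshape(3, 1)    # shape (3, 1)
row = np.array([10, 20, 30, 40])    # shape (4,)

# Rule 1: `row` is treated as if it had shape (1, 4).
# Rule 2: size-1 axes are stretched, giving a common shape of (3, 4).
total = col + row
print(total.shape)      # (3, 4)
print(total)
# [[10 20 30 40]
#  [11 21 31 41]
#  [12 22 32 42]]

# Shapes that still disagree after both rules raise an error.
try:
    np.ones((3, 4)) + np.ones(3)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("incompatible shapes:", err)
```

In the failing case, the shape (3,) becomes (1, 3), and its last axis of length 3 can be neither matched nor stretched to the length 4 of the other operand.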
Advanced indexing and index tricks
NumPy offers more indexing facilities than regular Python sequences. In
addition to indexing by integers and slices, as we saw before, arrays
can be indexed by arrays of integers and arrays of booleans.
Indexing with arrays of indices
>>> a = np.arange(12)**2 # the first 12 square numbers
>>> i = np.array([1, 1, 3, 8, 5]) # an array of indices
>>> a[i] # the elements of `a` at the positions `i`
array([ 1, 1, 9, 64, 25])
>>>
>>> j = np.array([[3, 4], [9, 7]]) # a bidimensional array of indices
>>> a[j] # the same shape as `j`
array([[ 9, 16],
[81, 49]])
When the indexed array a is multidimensional, a single array of
indices refers to the first dimension of a. The following example
shows this behavior by converting an image of labels into a color image
using a palette.
>>> palette = np.array([[0, 0, 0], # black
... [255, 0, 0], # red
... [0, 255, 0], # green
... [0, 0, 255], # blue
... [255, 255, 255]]) # white
>>> image = np.array([[0, 1, 2, 0], # each value corresponds to a color in the palette
... [0, 3, 4, 0]])
>>> palette[image] # the (2, 4, 3) color image
array([[[ 0, 0, 0],
[255, 0, 0],
[ 0, 255, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[ 0, 0, 255],
[255, 255, 255],
[ 0, 0, 0]]])
We can also give indexes for more than one dimension. The arrays of
indices for each dimension must have the same shape.
>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> i = np.array([[0, 1], # indices for the first dim of `a`
... [1, 2]])
>>> j = np.array([[2, 1], # indices for the second dim
... [3, 3]])
>>>
>>> a[i, j] # i and j must have equal shape
array([[ 2, 5],
[ 7, 11]])
>>>
>>> a[i, 2]
array([[ 2, 6],
[ 6, 10]])
>>>
>>> a[:, j]
array([[[ 2, 1],
[ 3, 3]],
[[ 6, 5],
[ 7, 7]],
[[10, 9],
[11, 11]]])
In Python, arr[i, j] is exactly the same as arr[(i, j)]—so we can
put i and j in a tuple and then do the indexing with that.
>>> l = (i, j)
>>> # equivalent to a[i, j]
>>> a[l]
array([[ 2, 5],
[ 7, 11]])
However, we cannot do this by putting i and j into an array,
because this array will be interpreted as indexing the first dimension
of a.
>>> s = np.array([i, j])
>>> # not what we want
>>> a[s]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index 3 is out of bounds for axis 0 with size 3
>>> # same as `a[i, j]`
>>> a[tuple(s)]
array([[ 2, 5],
[ 7, 11]])
Another common use of indexing with arrays is the search of the maximum
value of time-dependent series:
>>> time = np.linspace(20, 145, 5) # time scale
>>> data = np.sin(np.arange(20)).reshape(5, 4) # 4 time-dependent series
>>> time
array([ 20. , 51.25, 82.5 , 113.75, 145. ])
>>> data
array([[ 0. , 0.84147098, 0.90929743, 0.14112001],
[-0.7568025 , -0.95892427, -0.2794155 , 0.6569866 ],
[ 0.98935825, 0.41211849, -0.54402111, -0.99999021],
[-0.53657292, 0.42016704, 0.99060736, 0.65028784],
[-0.28790332, -0.96139749, -0.75098725, 0.14987721]])
>>> # index of the maxima for each series
>>> ind = data.argmax(axis=0)
>>> ind
array([2, 0, 3, 1])
>>> # times corresponding to the maxima
>>> time_max = time[ind]
>>>
>>> data_max = data[ind, range(data.shape[1])] # => data[ind[0], 0], data[ind[1], 1]...
>>> time_max
array([ 82.5 , 20. , 113.75, 51.25])
>>> data_max
array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])
>>> np.all(data_max == data.max(axis=0))
True
You can also use indexing with arrays as a target to assign to:
>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a[[1, 3, 4]] = 0
>>> a
array([0, 0, 2, 0, 0])
However, when the list of indices contains repetitions, the assignment
is done several times, leaving behind the last value:
>>> a = np.arange(5)
>>> a[[0, 0, 2]] = [1, 2, 3]
>>> a
array([2, 1, 3, 3, 4])
This is reasonable enough, but watch out if you want to use Python’s
+= construct, as it may not do what you expect:
>>> a = np.arange(5)
>>> a[[0, 0, 2]] += 1
>>> a
array([1, 1, 3, 3, 4])
Even though 0 occurs twice in the list of indices, the 0th element is
only incremented once. This is because Python requires
a[[0, 0, 2]] += 1 to be equivalent to a[[0, 0, 2]] = a[[0, 0, 2]] + 1:
the right-hand side is computed once from the original values, and the
assignment then writes the same result to index 0 twice.
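If you do want repeated indices to accumulate, one option is the at method of the ufunc, which applies the operation unbuffered, once per index. The sketch below shows np.add.at next to a counting approach based on np.bincount; either may suit, depending on the data.

```python
import numpy as np

a = np.arange(5)
# np.add.at performs unbuffered in-place addition, so index 0 is incremented twice.
np.add.at(a, [0, 0, 2], 1)
print(a)  # [2 1 3 3 4]

# Alternative: count how often each index occurs, then add the counts.
b = np.arange(5)
b += np.bincount([0, 0, 2], minlength=b.size)
print(b)  # [2 1 3 3 4]
```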
Indexing with boolean arrays#
When we index arrays with arrays of (integer) indices we are providing
the list of indices to pick. With boolean indices the approach is
different; we explicitly choose which items in the array we want and
which ones we don’t.
The most natural way one can think of for boolean indexing is to use
boolean arrays that have the same shape as the original array:
>>> a = np.arange(12).reshape(3, 4)
>>> b = a > 4
>>> b # `b` is a boolean with `a`'s shape
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
>>> a[b] # 1d array with the selected elements
array([ 5,  6,  7,  8,  9, 10, 11])
This property can be very useful in assignments:
>>> a[b] = 0 # All elements of `a` higher than 4 become 0
>>> a
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
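The assignment above modifies a in place. When the original array should stay untouched, np.where offers a comparable result as a new array. A minimal sketch (the name capped is just an illustrative choice):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# np.where takes 0 where the condition is True and the value from a elsewhere.
capped = np.where(a > 4, 0, a)
print(capped)

# a itself is unchanged.
print(a.max())  # 11
```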
You can look at the following example to see how to use boolean
indexing to generate an image of the Mandelbrot set:
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> def mandelbrot(h, w, maxit=20, r=2):
...     """Return Mandelbrot divergence times on a grid of shape (3*w+1, 4*h+1)."""
...     x = np.linspace(-2.5, 1.5, 4*h+1)
...     y = np.linspace(-1.5, 1.5, 3*w+1)
...     A, B = np.meshgrid(x, y)
...     C = A + B*1j
...     z = np.zeros_like(C)
...     divtime = maxit + np.zeros(z.shape, dtype=int)
...
...     for i in range(maxit):
...         z = z**2 + C
...         diverge = abs(z) > r                    # who is diverging
...         div_now = diverge & (divtime == maxit)  # who is diverging now
...         divtime[div_now] = i                    # note when
...         z[diverge] = r                          # avoid diverging too much
...
...     return divtime
>>> plt.clf()
>>> plt.imshow(mandelbrot(400, 400))
The second way of indexing with booleans is more similar to integer
indexing; for each dimension of the array we give a 1D boolean array
selecting the slices we want:
>>> a = np.arange(12).reshape(3, 4)
>>> b1 = np.array([False, True, True]) # first dim selection
>>> b2 = np.array([True, False, True, False]) # second dim selection
>>>
>>> a[b1, :] # selecting rows
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[b1] # same thing
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[:, b2] # selecting columns
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
>>>
>>> a[b1, b2] # a weird thing to do
array([ 4, 10])
Note that the length of the 1D boolean array must coincide with the
length of the dimension (or axis) you want to slice. In the previous
example, b1 has length 3 (the number of rows in a), and
b2 (of length 4) is suitable to index the 2nd axis (columns) of
a.
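The "weird" result of a[b1, b2] becomes clearer once each mask is converted to the positions of its True entries. The sketch below also shows np.ix_ (covered in the next section) as one way to select the full block of chosen rows and columns, and what happens when a mask has the wrong length. The names block and too_short are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# a[b1, b2] pairs up the True positions: (1, 0) and (2, 2).
rows, cols = b1.nonzero()[0], b2.nonzero()[0]
print(np.array_equal(a[b1, b2], a[rows, cols]))  # True

# np.ix_ builds an open mesh, selecting every combination of chosen rows and columns.
block = a[np.ix_(b1, b2)]
print(block)
# [[ 4  6]
#  [ 8 10]]

# A mask whose length does not match the axis is rejected.
too_short = np.array([True, False, True])  # length 3, but axis 1 has length 4
try:
    a[:, too_short]
except IndexError as exc:
    print("IndexError:", exc)
```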
The ix_() function#
The ix_ function can be used to combine different vectors so as to
obtain the result for each n-tuple. For example, if you want to compute
a+b*c for every triplet formed by taking one element from each of the
vectors a, b and c:
>>> a = np.array([2, 3, 4, 5])
>>> b = np.array([8, 5, 4])
>>> c = np.array([5, 4, 6, 8, 3])
>>> ax, bx, cx = np.ix_(a, b, c)
>>> ax
array([[[2]],

       [[3]],

       [[4]],

       [[5]]])
>>> bx
array([[[8],
        [5],
        [4]]])
>>> cx
array([[[5, 4, 6, 8, 3]]])
>>> ax.shape, bx.shape, cx.shape
((4, 1, 1), (1, 3, 1), (1, 1, 5))
>>> result = ax + bx * cx
>>> result
array([[[42, 34, 50, 66, 26],
        [27, 22, 32, 42, 17],
        [22, 18, 26, 34, 14]],

       [[43, 35, 51, 67, 27],
        [28, 23, 33, 43, 18],
        [23, 19, 27, 35, 15]],

       [[44, 36, 52, 68, 28],
        [29, 24, 34, 44, 19],
        [24, 20, 28, 36, 16]],

       [[45, 37, 53, 69, 29],
        [30, 25, 35, 45, 20],
        [25, 21, 29, 37, 17]]])
>>> result[3, 2, 4]
17
>>> a[3] + b[2] * c[4]
17
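What ix_ does here can also be written out by hand with np.newaxis, which may make the broadcasting easier to follow. A small sketch comparing the two (the name manual is illustrative):

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

# Insert length-1 axes by hand so the three vectors broadcast against each other:
# shapes (4, 1, 1), (1, 3, 1) and (5,) combine to (4, 3, 5).
manual = a[:, np.newaxis, np.newaxis] + b[np.newaxis, :, np.newaxis] * c

ax, bx, cx = np.ix_(a, b, c)
print(manual.shape)                          # (4, 3, 5)
print(np.array_equal(manual, ax + bx * cx))  # True
```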
You could also implement the reduce as follows:
>>> def ufunc_reduce(ufct, *vectors):
...     vs = np.ix_(*vectors)
...     r = ufct.identity
...     for v in vs:
...         r = ufct(r, v)
...     return r
and then use it as:
>>> ufunc_reduce(np.add, a, b, c)
array([[[15, 14, 16, 18, 13],
        [12, 11, 13, 15, 10],
        [11, 10, 12, 14,  9]],

       [[16, 15, 17, 19, 14],
        [13, 12, 14, 16, 11],
        [12, 11, 13, 15, 10]],

       [[17, 16, 18, 20, 15],
        [14, 13, 15, 17, 12],
        [13, 12, 14, 16, 11]],

       [[18, 17, 19, 21, 16],
        [15, 14, 16, 18, 13],
        [14, 13, 15, 17, 12]]])
The advantage of this version of reduce compared to the normal
ufunc.reduce is that it makes use of the
broadcasting rules
in order to avoid creating an argument array the size of the output
times the number of vectors.
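One caveat worth noting: ufunc_reduce starts from ufct.identity, which is None for ufuncs such as np.maximum that have no identity element. A possible variant, sketched here under the hypothetical name ufunc_reduce_any, folds the broadcastable views with functools.reduce instead and so needs no starting value.

```python
import functools
import numpy as np

def ufunc_reduce_any(ufct, *vectors):
    """Hypothetical variant of ufunc_reduce that does not rely on ufct.identity."""
    vs = np.ix_(*vectors)
    # Fold the broadcastable views pairwise, starting from the first one.
    return functools.reduce(ufct, vs)

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

print(np.maximum.identity)  # None
largest = ufunc_reduce_any(np.maximum, a, b, c)
print(largest.shape)     # (4, 3, 5)
print(largest[3, 2, 4])  # 5, i.e. max(a[3], b[2], c[4])
```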
Indexing with strings#
See Structured arrays.
Tricks and tips#
Here we give a list of short and useful tips.
“Automatic” reshaping#
To change the dimensions of an array, you can omit one of the sizes
which will then be deduced automatically:
>>> a = np.arange(30)
>>> b = a.reshape((2, -1, 3)) # -1 means "whatever is needed"
>>> b.shape
(2, 5, 3)
>>> b
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]],

       [[15, 16, 17],
        [18, 19, 20],
        [21, 22, 23],
        [24, 25, 26],
        [27, 28, 29]]])
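The deduction only works when exactly one size is left out and the remaining sizes divide the total number of elements. A short sketch of both failure cases:

```python
import numpy as np

a = np.arange(30)

# 30 elements / (2 * 3) = 5, so the -1 resolves to 5.
print(a.reshape(2, -1, 3).shape)  # (2, 5, 3)

# 30 is not divisible by 4, so no size fits.
try:
    a.reshape(4, -1)
except ValueError as exc:
    print("ValueError:", exc)

# Only one dimension may be left unknown.
try:
    a.reshape(-1, -1, 3)
except ValueError as exc:
    print("ValueError:", exc)
```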
Vector stacking#
How do we construct a 2D array from a list of equally-sized row vectors?
In MATLAB this is quite easy: if x and y are two vectors of the
same length you only need to do m=[x;y]. In NumPy this works via the
functions column_stack, dstack, hstack and vstack,
depending on the dimension in which the stacking is to be done. For
example:
>>> x = np.arange(0, 10, 2)
>>> y = np.arange(5)
>>> m = np.vstack([x, y])
>>> m
array([[0, 2, 4, 6, 8],
       [0, 1, 2, 3, 4]])
>>> xy = np.hstack([x, y])
>>> xy
array([0, 2, 4, 6, 8, 0, 1, 2, 3, 4])
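For completeness, here is a sketch of the other two functions named above, column_stack and dstack, applied to the same x and y. It also shows np.stack, which is not mentioned above but lets you name the new axis explicitly.

```python
import numpy as np

x = np.arange(0, 10, 2)
y = np.arange(5)

# column_stack turns each 1D vector into a column of a 2D array.
cols = np.column_stack([x, y])
print(cols.shape)  # (5, 2)

# dstack stacks along a third axis, promoting 1D inputs to shape (1, N, 1) first.
deep = np.dstack([x, y])
print(deep.shape)  # (1, 5, 2)

# np.stack with axis=1 gives the same result as column_stack for 1D inputs,
# which is also the transpose of the vstack result.
print(np.array_equal(np.stack([x, y], axis=1), cols))  # True
print(np.array_equal(cols, np.vstack([x, y]).T))       # True
```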
The logic behind those functions in more than two dimensions can be
strange.
Histograms#
The NumPy histogram function applied to an array returns a pair of
vectors: the histogram of the array and a vector of the bin edges. Beware:
matplotlib also has a function to build histograms (called hist,
as in MATLAB) that differs from the one in NumPy. The main difference is
that matplotlib's hist plots the histogram automatically, while
numpy.histogram only generates the data.
>>> import numpy as np
>>> rg = np.random.default_rng(1)
>>> import matplotlib.pyplot as plt
>>> # Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
>>> mu, sigma = 2, 0.5
>>> v = rg.normal(mu, sigma, 10000)
>>> # Plot a normalized histogram with 50 bins
>>> plt.hist(v, bins=50, density=True) # matplotlib version (plot)
(array...)
>>> # Compute the histogram with numpy and then plot it
>>> (n, bins) = np.histogram(v, bins=50, density=True) # NumPy version (no plot)
>>> plt.plot(.5 * (bins[1:] + bins[:-1]), n)
With Matplotlib >=3.4 you can also use plt.stairs(n, bins).
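To make the shape of the returned pair concrete, the sketch below inspects the output of np.histogram without plotting anything. It reuses the same seed and parameters as the example above; the names widths and centers are illustrative.

```python
import numpy as np

rg = np.random.default_rng(1)
v = rg.normal(2, 0.5, 10000)

n, bins = np.histogram(v, bins=50, density=True)
# 50 bins are delimited by 51 edges.
print(n.shape, bins.shape)  # (50,) (51,)

# With density=True the bar areas (height times width) sum to 1.
widths = np.diff(bins)
print(np.isclose(np.sum(n * widths), 1.0))  # True

# Bin centers, as used for the x values in the plt.plot call above.
centers = 0.5 * (bins[1:] + bins[:-1])
print(centers.shape)  # (50,)
```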