indexing.py 16 KB

"""
==============
Array indexing
==============

Array indexing refers to any use of the square brackets ([]) to index
array values. There are many options to indexing, which give numpy
indexing great power, but with power comes some complexity and the
potential for confusion. This section is just an overview of the
various options and issues related to indexing. Aside from single
element indexing, the details on most of these options are to be
found in related sections.

Assignment vs referencing
=========================

Most of the following examples show the use of indexing when
referencing data in an array. The examples work just as well
when assigning to an array. See the section at the end for
specific examples and explanations on how assignments work.

Single element indexing
=======================

Single element indexing for a 1-D array is what one expects. It works
exactly like that for other standard Python sequences. It is 0-based,
and accepts negative indices for indexing from the end of the array. ::

    >>> x = np.arange(10)
    >>> x[2]
    2
    >>> x[-2]
    8

Unlike lists and tuples, numpy arrays support multidimensional indexing
for multidimensional arrays. That means that it is not necessary to
separate each dimension's index into its own set of square brackets. ::

    >>> x.shape = (2,5) # now x is 2-dimensional
    >>> x[1,3]
    8
    >>> x[1,-1]
    9

Note that if one indexes a multidimensional array with fewer indices
than dimensions, one gets a subdimensional array. For example: ::

    >>> x[0]
    array([0, 1, 2, 3, 4])

That is, each index specified selects the array corresponding to the
rest of the dimensions selected. In the above example, choosing 0
means that the remaining dimension of length 5 is being left unspecified,
and that what is returned is an array of that dimensionality and size.
Note that the returned array is not a copy of the original; it points
to the same values in memory as the original array does.
In this case, the 1-D array at the first position (0) is returned.
Using a single index on the returned array therefore returns a single
element. That is: ::

    >>> x[0][2]
    2

So ``x[0,2] == x[0][2]``, though the second form is less efficient
because a new temporary array is created after the first index and is
then indexed by 2.
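
The two points above can be checked directly. This is a minimal sketch,
assuming NumPy is imported as ``np``; the variable name ``row`` is
illustrative only:

```python
import numpy as np

x = np.arange(10).reshape(2, 5)

row = x[0]                       # integer index on the first axis: a view
assert np.shares_memory(row, x)  # no data was copied

row[2] = 99                      # writing through the view changes x
assert x[0, 2] == 99

# Both spellings reach the same element; x[0][2] builds the
# intermediate 1-D view x[0] first and then indexes it.
assert x[0, 2] == x[0][2]
```
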

A note for those used to IDL or Fortran memory order as it relates to
indexing: NumPy uses C-order indexing. That means that the last
index usually represents the most rapidly changing memory location,
unlike Fortran or IDL, where the first index represents the most
rapidly changing location in memory. This difference is a common
source of confusion.
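
One way to see the difference is to inspect array strides, the number of
bytes to step along each axis. The sketch below assumes a 64-bit integer
dtype so that the byte counts are predictable; the names ``a`` and ``f``
are illustrative:

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # C order by default
f = np.asfortranarray(a)                         # same values, Fortran order

# C order: stepping the last index moves one element (8 bytes).
assert a.strides == (24, 8)
# Fortran order: stepping the first index moves one element (8 bytes).
assert f.strides == (8, 16)

# The indexing syntax itself gives the same answer for either layout.
assert a[1, 2] == f[1, 2] == 5
```

The memory layout affects which loops are fast, not which element an
index expression selects.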

Other indexing options
======================

It is possible to slice and stride arrays to extract arrays of the
same number of dimensions, but of different sizes than the original.
The slicing and striding works exactly the same way it does for lists
and tuples except that they can be applied to multiple dimensions as
well. A few examples illustrate this best: ::

    >>> x = np.arange(10)
    >>> x[2:5]
    array([2, 3, 4])
    >>> x[:-7]
    array([0, 1, 2])
    >>> x[1:7:2]
    array([1, 3, 5])
    >>> y = np.arange(35).reshape(5,7)
    >>> y[1:5:2,::3]
    array([[ 7, 10, 13],
           [21, 24, 27]])

Note that slices of arrays do not copy the internal array data but
only produce new views of the original data. This is different from
list or tuple slicing. An explicit ``copy()`` is recommended if
the original data is no longer needed, since a view keeps the whole
original array alive in memory.
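
The contrast with list slicing can be sketched as follows (the names
``view``, ``snapshot`` and ``piece`` are illustrative):

```python
import numpy as np

x = np.arange(10)
view = x[2:5]              # basic slice: a view on x's data
snapshot = x[2:5].copy()   # independent copy of the same elements

x[3] = -1                  # modify the original array
assert view.tolist() == [2, -1, 4]      # the view sees the change
assert snapshot.tolist() == [2, 3, 4]   # the copy does not

lst = list(range(10))
piece = lst[2:5]           # list slicing always copies
lst[3] = -1
assert piece == [2, 3, 4]
```
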

It is possible to index arrays with other arrays for the purposes of
selecting lists of values out of arrays into new arrays. There are
two different ways of accomplishing this. One uses one or more arrays
of index values. The other involves giving a boolean array of the proper
shape to indicate the values to be selected. Index arrays are a very
powerful tool that allows one to avoid looping over individual elements in
arrays and thus greatly improve performance.

It is possible to use special features to effectively increase the
number of dimensions in an array through indexing so the resulting
array acquires the shape needed for use in an expression or with a
specific function.

Index arrays
============

NumPy arrays may be indexed with other arrays (or any other
sequence-like object that can be converted to an array, such as lists,
with the exception of tuples; see the end of this document for why this
is). The use of index arrays ranges from simple, straightforward cases to
complex, hard-to-understand cases. For all cases of index arrays, what
is returned is a copy of the original data, not a view as one gets for
slices.

Index arrays must be of integer type. Each value in the index array
indicates which value of the indexed array to use in place of the
index. To illustrate: ::

    >>> x = np.arange(10,1,-1)
    >>> x
    array([10,  9,  8,  7,  6,  5,  4,  3,  2])
    >>> x[np.array([3, 3, 1, 8])]
    array([7, 7, 9, 2])

The index array containing the values 3, 3, 1 and 8 produces an array
of length 4 (the same as the index array), in which each index is
replaced by the value found at that position in the array being indexed.

Negative values are permitted and work as they do with single indices
or slices: ::

    >>> x[np.array([3,3,-3,8])]
    array([7, 7, 4, 2])

It is an error to have index values out of bounds: ::

    >>> x[np.array([3, 3, 20, 8])]
    Traceback (most recent call last):
      ...
    IndexError: index 20 is out of bounds for axis 0 with size 9

Generally speaking, what is returned when index arrays are used is
an array with the same shape as the index array, but with the type
and values of the array being indexed. As an example, we can use a
multidimensional index array instead: ::

    >>> x[np.array([[1,1],[2,3]])]
    array([[9, 9],
           [8, 7]])
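
The shape, dtype and copy rules described in this section can be checked
with a short sketch (the names ``idx`` and ``picked`` are illustrative):

```python
import numpy as np

x = np.arange(10, 1, -1)            # 10, 9, ..., 2
idx = np.array([[1, 1], [2, 3]])

picked = x[idx]
assert picked.shape == idx.shape    # result takes the index array's shape
assert picked.dtype == x.dtype      # ... and the indexed array's dtype

picked[0, 0] = 0                    # modifying the result ...
assert x[1] == 9                    # ... leaves x untouched: it is a copy
assert not np.shares_memory(picked, x)
```
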

Indexing Multi-dimensional arrays
=================================

Things become more complex when multidimensional arrays are indexed,
particularly with multidimensional index arrays. These tend to be
more unusual uses, but they are permitted, and they are useful for some
problems. We'll start with the simplest multidimensional case (using
the array y from the previous examples): ::

    >>> y[np.array([0,2,4]), np.array([0,1,2])]
    array([ 0, 15, 30])

In this case, if the index arrays have a matching shape, and there is
an index array for each dimension of the array being indexed, the
resultant array has the same shape as the index arrays, and the values
correspond to the index set for each position in the index arrays. In
this example, the first index value is 0 for both index arrays, and
thus the first value of the resultant array is y[0,0]. The next value
is y[2,1], and the last is y[4,2].

If the index arrays do not have the same shape, there is an attempt to
broadcast them to the same shape. If they cannot be broadcast to the
same shape, an exception is raised: ::

    >>> y[np.array([0,2,4]), np.array([0,1])]
    Traceback (most recent call last):
      ...
    IndexError: shape mismatch: indexing arrays could not be broadcast
    together with shapes (3,) (2,)

The broadcasting mechanism permits index arrays to be combined with
scalars for other indices. The effect is that the scalar value is used
for all the corresponding values of the index arrays: ::

    >>> y[np.array([0,2,4]), 1]
    array([ 1, 15, 29])
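
As a sketch of how broadcasting pairs up index arrays, the example below
gives the row indices shape (3, 1) so that they broadcast against column
indices of shape (2,). The names ``rows``, ``cols`` and ``block`` are
illustrative. Both ``IndexError`` and ``ValueError`` are caught because
the exception type for a shape mismatch has varied across NumPy versions:

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows = np.array([0, 2, 4])
cols = np.array([0, 1])

# A scalar broadcasts against the row indices.
assert y[rows, 1].tolist() == [1, 15, 29]

# Shapes (3, 1) and (2,) broadcast to (3, 2):
# every selected row is paired with every selected column.
block = y[rows[:, np.newaxis], cols]
assert block.shape == (3, 2)
assert block.tolist() == [[0, 1], [14, 15], [28, 29]]

# Shapes (3,) and (2,) cannot be broadcast together.
mismatch_raised = False
try:
    y[rows, cols]
except (IndexError, ValueError):
    mismatch_raised = True
assert mismatch_raised
```
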

Jumping to the next level of complexity, it is possible to only
partially index an array with index arrays. It takes a bit of thought
to understand what happens in such cases. For example if we just use
one index array with y: ::

    >>> y[np.array([0,2,4])]
    array([[ 0,  1,  2,  3,  4,  5,  6],
           [14, 15, 16, 17, 18, 19, 20],
           [28, 29, 30, 31, 32, 33, 34]])

What results is the construction of a new array where each value of
the index array selects one row from the array being indexed, and the
resultant array has the shape (number of index elements, size of row).

An example of where this may be useful is for a color lookup table
where we want to map the values of an image into RGB triples for
display. The lookup table could have a shape (nlookup, 3). Indexing
such an array with an image with shape (ny, nx) with dtype=np.uint8
(or any integer type so long as values are within the bounds of the
lookup table) will result in an array of shape (ny, nx, 3) where a
triple of RGB values is associated with each pixel location.

In general, the shape of the resultant array will be the concatenation
of the shape of the index array (or the shape that all the index arrays
were broadcast to) with the shape of any unused dimensions (those not
indexed) in the array being indexed.
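
The lookup-table case can be sketched with a tiny, made-up table and
image. The values in ``lut`` and ``image`` are hypothetical and chosen
only to make the shapes easy to follow:

```python
import numpy as np

# Hypothetical 4-entry lookup table: one RGB triple per row.
lut = np.array([[0, 0, 0],
                [255, 0, 0],
                [0, 255, 0],
                [0, 0, 255]], dtype=np.uint8)

# Hypothetical 2x3 "image" whose values index into the table.
image = np.array([[0, 1, 2],
                  [3, 2, 1]], dtype=np.uint8)

rgb = lut[image]

# Index shape (2, 3) concatenated with the unused dimension (3,).
assert rgb.shape == image.shape + lut.shape[1:]
assert rgb[1, 0].tolist() == [0, 0, 255]   # pixel (1, 0) has value 3
```
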

Boolean or "mask" index arrays
==============================

Boolean arrays used as indices are treated in a different manner
entirely than index arrays. Boolean arrays must be of the same shape
as the initial dimensions of the array being indexed. In the
most straightforward case, the boolean array has the same shape: ::

    >>> b = y>20
    >>> y[b]
    array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34])

Unlike in the case of integer index arrays, in the boolean case, the
result is a 1-D array containing all the elements in the indexed array
corresponding to all the true elements in the boolean array. The
elements in the indexed array are always iterated and returned in
:term:`row-major` (C-style) order. The result is also identical to
``y[np.nonzero(b)]``. As with index arrays, what is returned is a copy
of the data, not a view as one gets with slices.
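
The equivalence with ``np.nonzero`` can be verified with a short sketch
(the names ``rows``, ``cols`` and ``selected`` are illustrative):

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
b = y > 20

rows, cols = np.nonzero(b)       # one index array per dimension of b
assert np.array_equal(y[b], y[rows, cols])
assert np.array_equal(y[b], y[np.nonzero(b)])

selected = y[b]
assert selected.ndim == 1                  # always 1-D when b matches y's shape
assert selected.size == b.sum()            # one element per True in the mask
assert not np.shares_memory(selected, y)   # a copy, not a view
```
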

The result will be multidimensional if y has more dimensions than b.
For example: ::

    >>> b[:,5] # use a 1-D boolean whose first dim agrees with the first dim of y
    array([False, False, False,  True,  True])
    >>> y[b[:,5]]
    array([[21, 22, 23, 24, 25, 26, 27],
           [28, 29, 30, 31, 32, 33, 34]])

Here the 4th and 5th rows are selected from the indexed array and
combined to make a 2-D array.

In general, when the boolean array has fewer dimensions than the array
being indexed, this is equivalent to y[b, ...], which means
y is indexed by b followed by as many : as are needed to fill
out the rank of y.
Thus the shape of the result is one dimension containing the number
of True elements of the boolean array, followed by the remaining
dimensions of the array being indexed.

For example, using a 2-D boolean array of shape (2,3)
with four True elements to select rows from a 3-D array of shape
(2,3,5) results in a 2-D result of shape (4,5): ::

    >>> x = np.arange(30).reshape(2,3,5)
    >>> x
    array([[[ 0,  1,  2,  3,  4],
            [ 5,  6,  7,  8,  9],
            [10, 11, 12, 13, 14]],
           [[15, 16, 17, 18, 19],
            [20, 21, 22, 23, 24],
            [25, 26, 27, 28, 29]]])
    >>> b = np.array([[True, True, False], [False, True, True]])
    >>> x[b]
    array([[ 0,  1,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [20, 21, 22, 23, 24],
           [25, 26, 27, 28, 29]])

For further details, consult the numpy reference documentation on array indexing.

Combining index arrays with slices
==================================

Index arrays may be combined with slices. For example: ::

    >>> y[np.array([0, 2, 4]), 1:3]
    array([[ 1,  2],
           [15, 16],
           [29, 30]])

In effect, the slice and index array operations are independent.
The slice operation extracts columns with index 1 and 2
(i.e., the 2nd and 3rd columns),
followed by the index array operation, which extracts rows with
index 0, 2 and 4 (i.e., the first, third and fifth rows).

This is equivalent to::

    >>> y[:, 1:3][np.array([0, 2, 4]), :]
    array([[ 1,  2],
           [15, 16],
           [29, 30]])

Likewise, slicing can be combined with broadcasted boolean indices: ::

    >>> b = y > 20
    >>> b
    array([[False, False, False, False, False, False, False],
           [False, False, False, False, False, False, False],
           [False, False, False, False, False, False, False],
           [ True,  True,  True,  True,  True,  True,  True],
           [ True,  True,  True,  True,  True,  True,  True]])
    >>> y[b[:,5],1:3]
    array([[22, 23],
           [29, 30]])

Structural indexing tools
=========================

To facilitate easy matching of array shapes with expressions and in
assignments, the np.newaxis object can be used within array indices
to add new dimensions with a size of 1. For example: ::

    >>> y.shape
    (5, 7)
    >>> y[:,np.newaxis,:].shape
    (5, 1, 7)

Note that there are no new elements in the array, just that the
dimensionality is increased. This can be handy to combine two
arrays in a way that otherwise would require explicit reshaping
operations. For example: ::

    >>> x = np.arange(5)
    >>> x[:,np.newaxis] + x[np.newaxis,:]
    array([[0, 1, 2, 3, 4],
           [1, 2, 3, 4, 5],
           [2, 3, 4, 5, 6],
           [3, 4, 5, 6, 7],
           [4, 5, 6, 7, 8]])
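
For comparison, the sketch below shows that ``np.newaxis`` gives the same
result as an explicit ``reshape``, and that the sum above is an ordinary
broadcast of a (5, 1) array against a (1, 5) array. The names ``col``,
``row`` and ``table`` are illustrative:

```python
import numpy as np

x = np.arange(5)
col = x[:, np.newaxis]      # shape (5, 1)
row = x[np.newaxis, :]      # shape (1, 5)

assert col.shape == (5, 1) and row.shape == (1, 5)
assert np.newaxis is None                      # newaxis is an alias for None
assert np.array_equal(col, x.reshape(5, 1))    # same result via reshape

table = col + row                               # broadcast to (5, 5)
assert table.shape == (5, 5)
assert table[3, 4] == x[3] + x[4]
```
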

The ellipsis syntax may be used to indicate selecting in full any
remaining unspecified dimensions. For example: ::

    >>> z = np.arange(81).reshape(3,3,3,3)
    >>> z[1,...,2]
    array([[29, 32, 35],
           [38, 41, 44],
           [47, 50, 53]])

This is equivalent to: ::

    >>> z[1,:,:,2]
    array([[29, 32, 35],
           [38, 41, 44],
           [47, 50, 53]])

Assigning values to indexed arrays
==================================

As mentioned, one can select a subset of an array to assign to using
a single index, slices, and index and mask arrays. The value being
assigned to the indexed array must be shape consistent (the same shape
or broadcastable to the shape the index produces). For example, it is
permitted to assign a constant to a slice: ::

    >>> x = np.arange(10)
    >>> x[2:7] = 1

or an array of the right size: ::

    >>> x[2:7] = np.arange(5)

Note that assignments may silently change values when assigning
higher types to lower types (like floats to ints, which truncates)
or even raise exceptions (assigning complex to floats or ints): ::

    >>> x[1] = 1.2
    >>> x[1]
    1
    >>> x[1] = 1.2j
    TypeError: can't convert complex to int

Unlike some of the references (such as array and mask indices)
assignments are always made to the original data in the array
(indeed, nothing else would make sense!). Note though, that some
actions may not work as one may naively expect. This particular
example is often surprising to people: ::

    >>> x = np.arange(0, 50, 10)
    >>> x
    array([ 0, 10, 20, 30, 40])
    >>> x[np.array([1, 1, 3, 1])] += 1
    >>> x
    array([ 0, 11, 20, 31, 40])

People often expect that the element at index 1 will be incremented
by 3. In fact, it is only incremented by 1. The reason is that
a new array is extracted from the original (as a temporary) containing
the values at 1, 1, 3, 1, then the value 1 is added to the temporary,
and then the temporary is assigned back to the original array. Thus
the value of the array at x[1]+1 is assigned to x[1] three times,
rather than being incremented 3 times.
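
When every occurrence of a repeated index should count, one option that
the text above does not cover is the ufunc method ``np.add.at``, which
applies the operation unbuffered. A minimal sketch contrasting the two
behaviours (the name ``idx`` is illustrative):

```python
import numpy as np

idx = np.array([1, 1, 3, 1])

x = np.arange(0, 50, 10)
x[idx] += 1                  # buffered: x[1] ends up incremented once
assert x.tolist() == [0, 11, 20, 31, 40]

x = np.arange(0, 50, 10)
np.add.at(x, idx, 1)         # unbuffered: every occurrence is applied
assert x.tolist() == [0, 13, 20, 31, 40]
```
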

Dealing with variable numbers of indices within programs
========================================================

The index syntax is very powerful but limiting when dealing with
a variable number of indices. For example, if you want to write
a function that can handle arguments with various numbers of
dimensions without having to write special case code for each
number of possible dimensions, how can that be done? If one
supplies to the index a tuple, the tuple will be interpreted
as a list of indices. For example (using the previous definition
for the array z): ::

    >>> indices = (1,1,1,1)
    >>> z[indices]
    40

So one can use code to construct tuples of any number of indices
and then use these within an index.

Slices can be specified within programs by using the slice() function
in Python. For example: ::

    >>> indices = (1,1,1,slice(0,2)) # same as [1,1,1,0:2]
    >>> z[indices]
    array([39, 40])

Likewise, an ellipsis can be specified in code by using the Ellipsis
object: ::

    >>> indices = (1, Ellipsis, 1) # same as [1,...,1]
    >>> z[indices]
    array([[28, 31, 34],
           [37, 40, 43],
           [46, 49, 52]])

For this reason it is possible to use the output from the np.nonzero()
function directly as an index since it always returns a tuple of index
arrays.
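
As a sketch of the technique, the hypothetical helper below (it is not
part of NumPy) builds an index tuple at run time so that the same code
works for arrays with any number of dimensions:

```python
import numpy as np

def slice_along_axis(a, axis, start, stop):
    # Hypothetical helper: take start:stop along one axis of an
    # array with any number of dimensions.
    index = [slice(None)] * a.ndim     # start with ':' for every axis
    index[axis] = slice(start, stop)   # replace one entry with the slice
    return a[tuple(index)]             # index with a tuple, not a list

z = np.arange(81).reshape(3, 3, 3, 3)
assert np.array_equal(slice_along_axis(z, 2, 0, 2), z[:, :, 0:2, :])

y = np.arange(35).reshape(5, 7)
assert np.array_equal(slice_along_axis(y, 1, 1, 3), y[:, 1:3])
```

The index is converted with ``tuple()`` before use because a list would
be treated as an index array rather than as one index per dimension.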

Because of the special treatment of tuples, they are not automatically
converted to an array as a list would be. As an example: ::

    >>> z[[1,1,1,1]] # produces a large array
    array([[[[27, 28, 29],
             [30, 31, 32], ...
    >>> z[(1,1,1,1)] # returns a single value
    40

"""