Sunday, February 9, 2014

Post 20: Advanced NumPy Array Constructs

In this post, we will look at some of the advanced features of NumPy arrays.

NumPy arrays have a shape (the size of each dimension). So if we create an array of shape = (M, N), it will have M rows and N columns.

Please note that there could be fewer or more dimensions while creating NumPy arrays, as NumPy arrays are really N-dimensional.

- To create a 1-Dimensional array:
>> np.zeros(shape=5, dtype='float')

- To create a 3-Dimensional array of shape 3x3x3:
>> np.zeros(shape=(3,3,3), dtype='float')
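
To make the idea of shape concrete, here is a minimal sketch that builds the two arrays above and inspects them. The variable names v and cube are only for illustration.

```python
import numpy as np

# A 1-D array with 5 elements and a 3-D array of shape 3x3x3.
v = np.zeros(shape=5, dtype='float')
cube = np.zeros(shape=(3, 3, 3), dtype='float')

print(v.shape, v.ndim)        # shape tuple and number of dimensions
print(cube.shape, cube.ndim)
print(cube.size)              # total number of elements
```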

Arrays have a specified data type.

- You can create arrays from Python lists:
>> np.array([[1,2,3], [4,5,6]], dtype='float')

Creating NumPy arrays from Python lists is not very efficient, as native Python data types are slow. So we often read and write directly from files instead, or use other utilities such as zeros(), ones(), diag() and arange().

- Evenly spaced values on an interval:
>> np.arange([start,] end[, step])

arange allows fractional and negative steps; the end value itself is excluded.

- Values equidistant on a linear scale:
>> np.linspace(start, end, num)

- Values equidistant on a log scale (start and end are exponents of base 10):
>> np.logspace(start, end, num)

- Zero-initialized array of shape 2x2:
>> np.zeros((2,2))

- One-initialized array of shape 1x5:
>> np.ones((1, 5))

- Uninitialized empty array:
>> np.empty((1, 3))

- Constant diagonal value of 1 of size 3x3:
>> np.eye(3)

- Multiple diagonal values of 1, 2, 3, 4 of size 4x4:
>> np.diag([1,2,3,4])
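
The constructors above can be tried side by side. This is a small sketch with arbitrarily chosen arguments, mainly to show which endpoints are included and what logspace does with its arguments.

```python
import numpy as np

steps = np.arange(0, 1, 0.25)    # half-open interval: the end value 1 is excluded
down = np.arange(5, 0, -1)       # a negative step counts down
lin = np.linspace(0, 1, 5)       # 5 points, both endpoints included
log = np.logspace(0, 3, 4)       # exponents 0..3, i.e. 10**0 up to 10**3
d = np.diag([1, 2, 3, 4])        # 4x4 array with the list on the diagonal

print(steps)
print(down)
print(lin)
print(log)
print(d)
```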

Note that basic array slices never create copies; they return views that share memory with the original array.


Array Assignment:

- Assign all elements of an array to a scalar:
>> a[:] = x
This copying of the scalar, as needed, is an example of NumPy broadcasting.

- Row or column assignment to a scalar or a sequence:
>> a[i, :] = x
>> a[:, j] = x
>> a[i, :] = [A, B, C, D, E, F]
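
Here is a short sketch of these assignments on a small 3x4 array. The values are arbitrary; the point is that a scalar is broadcast, while a sequence must match the length of the row or column it is assigned to.

```python
import numpy as np

a = np.zeros((3, 4))
a[:] = 7                    # every element becomes 7
a[1, :] = -1                # scalar broadcast across row 1
a[:, 2] = [10, 20, 30]      # sequence length must match the column length (3)

print(a)
```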


Array Math
- Operations with scalars apply to all elements
>> a = np.arange(10)
>> a + 20

- Operations on other arrays are element-wise
>> a + a

These operations create new arrays for the result
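
A quick way to see that a new array is created is to check whether the result shares memory with the input. The sketch below also shows an in-place operator, which is one possible way to avoid the extra allocation when the original values are no longer needed.

```python
import numpy as np

a = np.arange(10)
b = a + 20          # scalar applied to every element; result is a new array
c = a + a           # element-wise sum of two arrays; also a new array

print(np.shares_memory(a, b))   # the result does not reuse a's buffer

# An in-place operator modifies a directly instead of allocating a result.
a += 20
print(np.array_equal(a, b))
```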

- Vectorized conditionals
    * Conditional operations make Boolean arrays
      >> a = np.arange(5)
      >> a > 2

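    * np.where selects from an array
      >> np.where(a > 2)
      >> np.where(a > 2, a, False)

           - With a single argument, np.where returns the indices where the condition is true. With three arguments, it takes values from a where the condition is true and the third argument elsewhere; passing False fills those positions with 0s.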



    * Predicates:
       >> a.any() - Returns True if at least one element of a is true (non-zero)
       >> a.all() - Returns True only if all elements of a are true (non-zero)
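
The vectorized conditionals above can be put together in one small sketch. The names mask, idx and kept are just illustrative.

```python
import numpy as np

a = np.arange(5)
mask = a > 2                    # Boolean array, one entry per element of a
idx = np.where(mask)            # tuple of index arrays, one per dimension
kept = np.where(mask, a, 0)     # values of a where mask is True, 0 elsewhere

print(mask)
print(idx[0])
print(kept)
print(mask.any(), mask.all())   # at least one True / all True
```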

- Reductions: A few of the reduction functions in NumPy are:
a.mean(), a.argmin(), a.argmax(), a.trace(), a.cumsum(), a.cumprod()

- Manipulation: a.argsort(), a.transpose(), a.reshape(…), a.ravel(), a.fill(…), a.clip(…)

- Complex Numbers: These properties and functions are also available on real-valued arrays:
a.real, a.imag, a.conj()
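
As a rough illustration of a few of these methods, the sketch below uses a small 2x3 array m and a complex array z (both made up for this example). Note that without an axis argument, argmax and cumsum work on the flattened array.

```python
import numpy as np

m = np.array([[3, 1, 2],
              [6, 5, 4]])

print(m.mean())            # mean over all elements
print(m.argmax())          # index into the flattened array
print(m.sum(axis=0))       # reduce along rows: one value per column
print(m.cumsum())          # running sum over the flattened array
print(m.argsort(axis=1))   # indices that would sort each row
print(m.clip(2, 5))        # limit values to the range [2, 5]
print(m.transpose().shape)

z = np.array([1 + 2j, 3 - 4j])
print(z.real, z.imag, z.conj())
print(m.imag)              # real-valued arrays have an all-zero imaginary part
```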

Reshape Arrays:

- reshape will reshape an array to any shape with the same total number of elements
- Takes a tuple as its argument

- Use -1 for a wildcard dimension
>> a = np.arange(30)
>> b = a.reshape((3, -1, 2))

>> b
>> b.shape
 

- a.ravel() and a.flatten() reshape arrays to one dimension (np.ravel(a) also works; there is no np.flatten function)
- np.squeeze removes dimensions of length one: np.squeeze(b).shape
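
The reshaping tools can be compared in one small sketch. The array named thin is a made-up example with length-one dimensions, since b itself has none to squeeze.

```python
import numpy as np

a = np.arange(30)
b = a.reshape((3, -1, 2))      # -1 is inferred as 30 / (3 * 2) = 5
print(b.shape)

flat_view = b.ravel()          # returns a view when the memory layout allows it
flat_copy = b.flatten()        # always returns a copy
print(np.shares_memory(flat_view, a))
print(np.shares_memory(flat_copy, a))

thin = np.zeros((1, 3, 1))
print(np.squeeze(thin).shape)  # length-one dimensions are dropped
print(np.squeeze(b).shape)     # b has none, so its shape is unchanged
```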





Array Data Type


- NumPy array elements have a single data type
- The type of object is accessible through the .dtype attribute
- Most common attributes of dtype object:
  - dtype.byteorder – big or little endian
  - dtype.itemsize – element size of the dtype
  - dtype.name – a name of this dtype object
  - dtype.type – type object used to create scalars etc.
- Array dtypes are usually inferred automatically
- But can be specified explicitly: a = np.array([1, 2,3], dtype=np.float32)
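
The dtype attributes listed above can be inspected directly. This is a minimal sketch; the array named inferred is only there to show automatic type inference from mixed integer and float input.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.float32)
print(a.dtype.name)        # name of the dtype
print(a.dtype.itemsize)    # bytes per element
print(a.dtype.byteorder)   # '=' means native byte order
print(a.dtype.type)        # scalar type used for single elements

inferred = np.array([1, 2.5])   # mixed ints and floats are promoted to float64
print(inferred.dtype)
```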



np.datetime64 is a new addition in NumPy 1.7.
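
As a small taste of np.datetime64, here is a sketch with dates chosen only for illustration. It assumes a NumPy version with datetime64 support and uses np.timedelta64 for date arithmetic.

```python
import numpy as np

d = np.datetime64('2014-02-09')
next_week = d + np.timedelta64(7, 'D')      # date arithmetic with timedelta64
gap = np.datetime64('2014-03-01') - d       # difference is a timedelta64

# All days of February 2014 as an array of dates (the end is excluded).
days = np.arange('2014-02', '2014-03', dtype='datetime64[D]')

print(next_week)
print(gap)
print(len(days), days[0], days[-1])
```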


NumPy Data Model
- Metadata: dtype, shape, strides
- Memory pointers to data
- When slicing arrays, NumPy creates new dtype/shape/stride information, but reuses the pointer to the original data
>>> a = np.arange(10)
>>> b = a[3:7]
>>> b
array([3, 4, 5, 6])
 

>>> b[:] = 0
>>> a
array([0, 1, 2, 0, 0, 0, 0, 7, 8, 9])


>>> b.flags.owndata
False
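
To contrast a view with a real copy, the sketch below takes the same slice three ways: a basic slice, an explicit .copy(), and indexing with a list of integers. The names view, copied and fancy are illustrative.

```python
import numpy as np

a = np.arange(10)
view = a[3:7]               # basic slice: new metadata, same data buffer
copied = a[3:7].copy()      # explicit copy owns its own data
fancy = a[[3, 4, 5, 6]]     # indexing with a list of integers also copies

print(view.base is a)                 # the view points back to a
print(copied.flags.owndata)
print(np.shares_memory(a, fancy))

view[:] = 0                 # writes through to a, but not to the copies
print(a)
print(copied, fancy)

every_other = a[::2]        # a strided view: the stride doubles, no data moves
print(a.strides, every_other.strides)
```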



Array Memory Layout

When we extract values from NumPy arrays and return them to Python, what NumPy does under the covers is extract the raw values from the C memory structures and wrap them in a small header object, and that is what is exposed to Python. These array scalars, as they are called, are very lightweight wrappers, and they let Python deal with values inside a NumPy array as transparently and seamlessly as if the array were just a Python list.
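
A small sketch of what an array scalar looks like from the Python side. The int32 dtype is chosen here only so that the scalar type is the same on every platform.

```python
import numpy as np

a = np.arange(3, dtype=np.int32)
x = a[1]                     # a single element comes back as an array scalar

print(type(x))               # a NumPy scalar type, not a plain Python int
print(isinstance(x, np.generic))
print(x.dtype, x.itemsize)   # it carries the same dtype metadata as the array
print(x + 1)                 # but behaves like an ordinary number

plain = x.item()             # convert to a native Python object when needed
print(type(plain))
```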






- Universal Functions
NumPy ufuncs are functions that operate element-wise on one or more arrays. Essentially, ufuncs are two functions in one: at the Python level, a ufunc returns a new array object; when called, it dispatches to an optimized C loop chosen by the array dtype.
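
A few ufuncs in action, as a minimal sketch. The out= argument and the reduce method shown here are standard ufunc features; the array x is just sample data.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])

print(isinstance(np.sqrt, np.ufunc))   # sqrt, add, multiply are ufunc objects
roots = np.sqrt(x)                     # unary ufunc, element-wise
doubled = np.add(x, x)                 # binary ufunc, same as x + x

out = np.empty_like(x)
np.multiply(x, 2, out=out)             # write into an existing array instead

total = np.add.reduce(x)               # ufuncs also provide reductions
print(roots, doubled, out, total)
```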


- Broadcasting
- Broadcasting is a key feature of NumPy.
  - Arrays with different, but compatible shapes can be used as arguments to ufuncs

>> c = a + 10

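Here the scalar 10 is broadcast to the shape of the array a.

- In order for an operation to broadcast, the shapes are compared dimension by dimension, starting from the trailing one; each pair of sizes must either be equal, or one of them must be one. Missing leading dimensions are treated as size one.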

>> A (1-d array):     3
>> B (2-d array): 2 x 3
Result:           2 x 3

>> A (2-d array):     6 x 1
>> B (3-d array): 1 x 6 x 4
Result:           1 x 6 x 4

>> A (4-d array): 3 x 1 x 6 x 1
>> B (3-d array):     2 x 1 x 4
Result:           3 x 2 x 6 x 4
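
The shape rules can be checked without doing any real arithmetic. This sketch uses arrays of ones with the shapes from the last example, plus one deliberately incompatible pair to show the failure case.

```python
import numpy as np

A = np.ones((3, 1, 6, 1))
B = np.ones((2, 1, 4))
print((A + B).shape)              # shapes are aligned from the trailing dimension
print(np.broadcast(A, B).shape)   # same answer without computing the sum

# A row plus a column broadcasts to a full 2-D table.
table = np.arange(3) + np.arange(2)[:, None]
print(table)

# Trailing sizes 3 and 2 are neither equal nor one, so this fails.
try:
    np.ones((2, 3)) + np.ones((2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```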


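numpy.random Example:
- np.random.random((4, 5)) – uniform samples in [0, 1) of shape 4x5
- np.random.normal(loc, scale, size) – normal samples with mean loc and standard deviation scale
- np.random.hypergeometric(20, 10, 5, (10,)) – 10 hypergeometric draws (20 good, 10 bad, 5 sampled)
- np.random.permutation(20) – a random permutation of the integers 0 to 19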
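
Here is a rough sketch of these calls with concrete arguments. The seed and the sizes are arbitrary choices for this example; note that np.random.permutation takes a single argument (an integer or a sequence).

```python
import numpy as np

np.random.seed(0)                                  # arbitrary seed, for repeatability

u = np.random.random((4, 5))                       # uniform samples in [0, 1)
n = np.random.normal(loc=0.0, scale=1.0, size=1000)
h = np.random.hypergeometric(20, 10, 5, (10,))     # ngood, nbad, nsample, size
p = np.random.permutation(10)                      # shuffled integers 0..9

print(u.shape)
print(n.mean(), n.std())                           # should be close to 0 and 1
print(h)
print(p)
```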


numpy.linalg Example:
- Transpose of a matrix - a.T
- Inverse of a matrix - np.linalg.inv(a)
- Dot product of matrices - np.dot(a, x)
- Solution x of the linear system a x = y - np.linalg.solve(a, y)
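
These four operations fit together naturally when solving a small linear system. The 2x2 system below is made up for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([5.0, 7.0])

x = np.linalg.solve(a, y)          # solves a x = y without forming the inverse
print(x)
print(np.dot(a, x))                # multiplying back should reproduce y

a_inv = np.linalg.inv(a)
print(np.dot(a, a_inv))            # approximately the identity matrix
print(a.T)                         # transpose swaps rows and columns
```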





Array Subclasses:
- numpy.matrix – Matrix Operators
- numpy.recarray – Record Arrays
- numpy.ma.MaskedArray – Masked Arrays
- numpy.memmap – Memory-mapped Arrays



numpy.matrix Example:

NumPy matrix objects can be created using MATLAB-like syntax:
>> from numpy import matrix
>> A = matrix('1.0, 2.0; 3.0, 4.0')
>> Y = matrix('5.0; 7.0')
>> from numpy.linalg import solve
>> x = solve(A, Y)
>> A * x
>> A.I
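
The main practical difference between matrix and a plain 2-D array is what the * operator means. The sketch below repeats the example above and compares it with the equivalent ndarray code, assuming a Python and NumPy version that supports the @ operator.

```python
import numpy as np

A = np.matrix('1.0, 2.0; 3.0, 4.0')
Y = np.matrix('5.0; 7.0')

x = np.linalg.solve(A, Y)
print(A * x)                    # for matrix objects, * is the matrix product
print(A * A.I)                  # A.I is the inverse, so this is ~identity

arr = np.asarray(A)             # the same data as a plain 2-D array
print(arr * arr)                # for arrays, * is element-wise
print(arr @ arr)                # @ is the matrix product for arrays
```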




