File Operations

Reading a .mat dataset

A .mat file stores data in the native Octave/MATLAB matrix format. Python can read it with the loadmat function in the scipy.io module, which returns a dictionary (key-value pairs) mapping each saved variable name to its array.
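As a minimal sketch of the round trip, the snippet below first writes a small .mat file with scipy.io.savemat and then reads it back with loadmat. The file name example.mat and the variable names X and y are made up for illustration; a real dataset will have its own names.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Hypothetical data: 4 samples with 3 features, and a column of labels.
X_saved = np.arange(12, dtype=float).reshape(4, 3)
y_saved = np.array([[1], [2], [10], [10]])

with tempfile.TemporaryDirectory() as tmp_dir:
    mat_path = os.path.join(tmp_dir, "example.mat")  # illustrative file name
    savemat(mat_path, {"X": X_saved, "y": y_saved})
    data = loadmat(mat_path)  # a dict: variable name -> NumPy array

# Besides the saved variables, the dict holds metadata keys such as "__header__".
variable_names = sorted(k for k in data if not k.startswith("__"))
X = data["X"]
y = data["y"]
print(variable_names, X.shape, y.shape)
```

Filtering out the keys that start with a double underscore is a quick way to see which variables a file actually contains.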

Joining paths

import os
os.path.join("Data", "dataset1")

Here Data is the name of the folder and dataset1 is the name of the file.
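A short sketch of what the call produces, using the same two names as above. The separator depends on the operating system, so the result is best compared against os.sep rather than a hard-coded slash.

```python
import os

path = os.path.join("Data", "dataset1")
# The separator is chosen for the current OS: "/" on Linux/macOS, "\" on Windows.
print(path)

# The two pieces can be recovered again with os.path.split.
folder, name = os.path.split(path)
print(folder, name)
```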

Indexing

For a NumPy array, either y[0, 0] or y[0][0] can be used to index an element.

For a nested Python list (not a NumPy array), only y[0][0] works.

To check whether an array is a NumPy array:

import numpy as np
isinstance(y, np.ndarray)
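The difference between the two container types can be sketched as follows. The variable names y_list and y_arr are chosen here only for illustration.

```python
import numpy as np

y_list = [[1, 2, 3], [4, 5, 6]]   # plain nested Python list
y_arr = np.array(y_list)          # the same values as a NumPy array

# Both forms work on a NumPy array and give the same element.
same_element = y_arr[0, 0] == y_arr[0][0]

# A nested list only supports chained indexing; a tuple index raises TypeError.
try:
    y_list[0, 0]
    list_accepts_tuple_index = True
except TypeError:
    list_accepts_tuple_index = False

print(isinstance(y_arr, np.ndarray), isinstance(y_list, np.ndarray))
```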

Unrolling a matrix

For a two-dimensional array or matrix, numpy.ravel() unrolls it into a one-dimensional vector.

>>> y = np.array([[1, 2, 3], [4, 5, 6]])
>>> y.ravel()
array([1, 2, 3, 4, 5, 6])
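By default ravel walks the matrix row by row. If the data came from Octave or MATLAB, where y(:) unrolls column by column, the order argument may be needed to reproduce the same vector. A small sketch:

```python
import numpy as np

y = np.array([[1, 2, 3], [4, 5, 6]])

row_major = y.ravel()            # default order="C": row by row
col_major = y.ravel(order="F")   # column by column

print(row_major)
print(col_major)
```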

Dictionary lookup

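X = data['X']   # look up the value stored under the key 'X' and assign it to X
y[y == 10] = 0  # set every element equal to 10 to 0
'''
y == 10 returns a boolean array made up of False and True values.
Indexing y with that boolean array selects the elements equal to 10,
so y[y == 10] is an array whose elements are all 10.
Assigning 0 to it sets those elements to 0.
'''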
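The three steps of the boolean-mask assignment can be made visible on a tiny example. The label values below are made up.

```python
import numpy as np

y = np.array([10, 1, 2, 10, 3])   # made-up labels

mask = y == 10       # boolean array, True where the label is 10
selected = y[mask]   # a copy of the matching elements
y[mask] = 0          # in-place assignment through the mask

print(mask)
print(selected)
print(y)
```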

Random selection

np.random.choice(400, 100, replace=False)
# Randomly select 100 numbers from the 400 integers 0 to 399
# replace=False means sampling without replacement: every chosen number is unique
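A typical use is picking random rows of a data matrix. The sketch below assumes a hypothetical 400-row matrix X and checks that the drawn indices are unique and in range.

```python
import numpy as np

sample_idx = np.random.choice(400, 100, replace=False)

# Sampling without replacement never repeats an index.
is_unique = len(np.unique(sample_idx)) == 100
in_range = sample_idx.min() >= 0 and sample_idx.max() <= 399

# Hypothetical data matrix with 400 rows; keep only the sampled rows.
X = np.arange(800).reshape(400, 2)
X_sample = X[sample_idx, :]
print(X_sample.shape)
```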

Combining matrices

X_t = np.concatenate([np.ones((5, 1)), np.arange(1, 16).reshape(5, 3, order='F')/10.0], axis=1)

np.arange(j, k): generates a one-dimensional array (vector) from j to k - 1 with step 1.

.reshape(5, 3, order='F'): reshapes the array to 5 rows by 3 columns. order controls how the elements are filled in: 'F' fills column by column, 'C' (the default) fills row by row.

np.concatenate: joins several arrays into a new one. axis=1 joins them side by side, so the result gains columns.
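Splitting the one-liner into named steps shows what each piece contributes. The names bias and features are introduced here only for readability.

```python
import numpy as np

# Values 0.1 .. 1.5, filled column by column into a 5x3 matrix.
features = np.arange(1, 16).reshape(5, 3, order="F") / 10.0
# A 5x1 column of ones.
bias = np.ones((5, 1))

# Place the column of ones to the left of the features.
X_t = np.concatenate([bias, features], axis=1)

print(X_t.shape)
print(X_t[0])
```

With order='F' the first column of features holds 0.1 to 0.5, the second 0.6 to 1.0, and the third 1.1 to 1.5.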

Matrix multiplication

For NumPy arrays, the dot method computes the matrix product:

A.dot(B.T)
# matrix product of A and the transpose of B
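A small sketch with made-up matrices shows how the shapes line up: A is 2x3 and B is 4x3, so B.T is 3x4 and the product is 2x4. The @ operator gives the same result.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
B = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [1, 1, 1]])      # shape (4, 3)

P = A.dot(B.T)                 # (2, 3) times (3, 4) gives (2, 4)
same_as_operator = np.array_equal(P, A @ B.T)

print(P)
```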

Listing a directory

import os 
dir_path = 'path/to/dir'
os.listdir(dir_path)
# returns a list containing the names of the entries inside dir_path
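os.listdir returns bare names (not full paths) in arbitrary order, and it includes subdirectories. The sketch below builds a temporary directory with made-up entries to show one way to sort the names and keep only regular files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as dir_path:
    # Create two placeholder files and one subdirectory (names are illustrative).
    for name in ("b.txt", "a.txt"):
        with open(os.path.join(dir_path, name), "w") as f:
            f.write("placeholder")
    os.mkdir(os.path.join(dir_path, "sub"))

    # Sort because the order returned by listdir is not guaranteed.
    entries = sorted(os.listdir(dir_path))
    # Join each name with dir_path before testing whether it is a file.
    files_only = [e for e in entries
                  if os.path.isfile(os.path.join(dir_path, e))]

print(entries)
print(files_only)
```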

Random permutation

indices = np.random.permutation(m)

If the input is an integer m, it returns a randomly ordered array of the integers from 0 to m - 1.

If the input is an array, it returns a randomly permuted copy of that array (a multi-dimensional array is shuffled along its first axis only).
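One common use of the integer form is shuffling a data matrix and its labels with the same index array, so that rows and labels stay paired. The sketch below assumes small made-up arrays X and y.

```python
import numpy as np

m = 5
X = np.arange(m * 2).reshape(m, 2)   # row i is [2*i, 2*i + 1]
y = np.arange(m)                     # label i belongs to row i

indices = np.random.permutation(m)   # a random ordering of 0 .. m-1

# Indexing both arrays with the same indices keeps each row with its label.
X_shuffled = X[indices]
y_shuffled = y[indices]

print(indices)
```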

Rolling a matrix

np.roll(Theta2, 1, axis=0)
# shift the rows by one position along axis=0; the last row wraps around to the top
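The effect is easiest to see on a small matrix. The values below are a stand-in for Theta2, whose real contents are not given here.

```python
import numpy as np

Theta2 = np.arange(6).reshape(3, 2)   # stand-in values: rows [0 1], [2 3], [4 5]

rolled = np.roll(Theta2, 1, axis=0)   # every row moves down one; the last wraps to the top

print(rolled)
```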
