daily notes[34]
Abstract: This note covers the defining properties of a singular matrix: zero determinant, non-invertibility, linearly dependent rows/columns, and a zero eigenvalue. Python examples show how to split a dataset with scikit-learn's train_test_split and how to fit a singular feature matrix with Ridge regression.
Singular Matrix
- A matrix A is singular (i.e. not invertible) if and only if:
- det(A) = 0
- its inverse does not exist
- its rows (or columns) are linearly dependent
- at least one of its eigenvalues is zero
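The conditions above can be checked directly with NumPy. A minimal sketch, using a hypothetical 2x2 matrix whose second row is twice the first:

import numpy as np

# Rows are linearly dependent: row 1 = 2 * row 0
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# det(A) = 1*4 - 2*2 = 0, so A is singular
print(np.linalg.det(A))

# One eigenvalue is (numerically) zero
print(np.linalg.eigvals(A))

# Attempting to invert a singular matrix raises LinAlgError
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as e:
    print("not invertible:", e)

All three checks agree: zero determinant, a zero eigenvalue, and no inverse.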
sklearn.model_selection.train_test_split
splits arrays or matrices into random train and test subsets.
import numpy as np
from sklearn.model_selection import train_test_split
X, y = np.arange(100).reshape((25, 4)), range(25)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
print(X_train, X_test, y_train, y_test)
[[20 21 22 23]
[ 8 9 10 11]
[48 49 50 51]
[60 61 62 63]
[12 13 14 15]
[16 17 18 19]
[80 81 82 83]
[68 69 70 71]
[84 85 86 87]
[72 73 74 75]
[96 97 98 99]
[28 29 30 31]
[40 41 42 43]
[56 57 58 59]
[76 77 78 79]
[24 25 26 27]] [[32 33 34 35]
[64 65 66 67]
[ 0 1 2 3]
[92 93 94 95]
[44 45 46 47]
[36 37 38 39]
[52 53 54 55]
[ 4 5 6 7]
[88 89 90 91]] [5, 2, 12, 15, 3, 4, 20, 17, 21, 18, 24, 7, 10, 14, 19, 6] [8, 16, 0, 23, 11, 9, 13, 1, 22]
Next we create a dataset in matrix form in which one column is a linear combination of the others, as follows.
import numpy as np
from sklearn.model_selection import train_test_split
x1, y = np.arange(100).reshape((25, 4)), range(25)
x2 = x1[:, 1] * 2 + x1[:, 3] * 5  # new column: linear combination of columns 1 and 3
x = np.column_stack((x1, x2))
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
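We can confirm the linear dependence by checking the rank of x. A small sketch, rebuilding the same x as above: since every column of np.arange(100).reshape((25, 4)) is an affine function of the row index, the rank stays at 2 no matter how many such columns we stack, so x.T @ x is singular.

import numpy as np

x1 = np.arange(100).reshape((25, 4))
x2 = x1[:, 1] * 2 + x1[:, 3] * 5      # linear combination of existing columns
x = np.column_stack((x1, x2))

# Rank is far below the number of columns (5), so x.T @ x is singular
print(np.linalg.matrix_rank(x))
print(np.linalg.det(x.T @ x))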
Next, we try to fit a dataset whose feature matrix is singular.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge  # Ridge = L2 regularization
from sklearn.metrics import mean_squared_error

def is_singular_by_eig(matrix, tol=1e-8):
    """
    Test singularity via the eigenvalues:
    a matrix with a zero (or near-zero) eigenvalue is singular.
    """
    eigvals = np.linalg.eigvals(matrix)
    return any(np.abs(eigvals) < tol)

x, y = np.arange(400).reshape((20, 20)), range(20)
print(x)
# example
print(is_singular_by_eig(x))  # True
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
#regressor = LinearRegression().fit(X_train, y_train)
# use Ridge regression instead
regressor = Ridge(alpha=1.0)  # alpha is the regularization strength
regressor.fit(X_train, y_train)
print("Ridge score:", regressor.score(X_train, y_train))
print(regressor.coef_)
print(regressor.intercept_)
y_pred = regressor.predict(X_test)
print(mean_squared_error(y_test, y_pred))
[[  0   1   2 ...  17  18  19]
 [ 20  21  22 ...  37  38  39]
 ...
 [380 381 382 ... 397 398 399]]
True
Ridge score: 0.9999999999998942
[0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025
0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025 0.0025]
-0.4749965178404061
4.4036472951780946e-12
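Ridge works here because adding alpha to the diagonal of X.T @ X makes the system invertible. An alternative worth knowing (not used in the script above) is the Moore-Penrose pseudoinverse, which NumPy computes via SVD and which never requires inverting X.T @ X. A minimal sketch on the same 20x20 singular matrix:

import numpy as np

X = np.arange(400, dtype=float).reshape((20, 20))  # singular (rank-deficient)
y = np.arange(20, dtype=float)

# Minimum-norm least-squares solution via the pseudoinverse
coef = np.linalg.pinv(X) @ y
y_hat = X @ coef

# y lies in the column space of X, so the fit is exact despite singularity
print(np.max(np.abs(y_hat - y)))

The same solution is returned by np.linalg.lstsq(X, y, rcond=None), which is the routine scikit-learn's LinearRegression uses internally.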