Classification Metrics: Accuracy, Precision, and Recall
- TP: True Positives
- FP: False Positives
- TN: True Negatives
- FN: False Negatives
Accuracy
Definition:
The ratio of correctly predicted samples to the total number of samples.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Interpretation:
It measures the overall correctness of the model, but can be misleading if the classes are imbalanced.
Precision
Definition:
The ratio of true positives to all predicted positives.
$$\text{Precision} = \frac{TP}{TP + FP}$$
Interpretation:
It answers: "When the model predicts positive, how often is it correct?"
Useful when false positives are costly (e.g., spam detection).
Recall
Definition:
The ratio of true positives to all actual positives.
$$\text{Recall} = \frac{TP}{TP + FN}$$
Interpretation:
It answers: "Of all actual positives, how many did the model identify?"
Important when missing positives is costly (e.g., disease detection).
F1-Score
Definition:
The harmonic mean of precision and recall.
$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
Or, in terms of TP, FP, and FN:
$$F_1 = \frac{2\,TP}{2\,TP + FP + FN}$$
Interpretation:
The F1-Score balances precision and recall in a single metric.
It is low when either precision or recall is low, and is most useful when:
- You need to balance false positives and false negatives.
- There is a class imbalance and you want a single metric to compare models.
The harmonic mean is never larger than the arithmetic mean and is pulled toward the smaller of the two values, so it favors models that perform well on both precision and recall.
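As a rough illustration of the four definitions above, the following sketch computes all metrics from raw confusion-matrix counts. The function name `classification_metrics`, the example counts, and the choice to report 0.0 for an undefined ratio are assumptions made for this sketch, not part of the original notebook.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    Ratios with a zero denominator are reported as 0.0 (a convention chosen
    for this sketch; other conventions are possible).
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * tp, 2 * tp + fp + fn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 8 true positives, 2 false positives,
# 85 true negatives, 5 false negatives.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 0.93 while the recall is only about 0.62, which shows how a high accuracy can hide missed positives when the negative class dominates.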
Simple Raster Classification Example
We simulate a simple landmine classification on a 10x10 raster grid:
- Each cell in the raster represents either landmine (1) or safe (0).
- The ground truth raster has:
  - One cell labeled as landmine (Positive).
  - All other cells labeled as safe (Negative).
- The predicted raster contains:
  - All cells labeled as safe (Negative), so the model fails to detect the single landmine cell.
What is the accuracy of our prediction?
Note that we don't actually train a model; we just look at a potential prediction and compute the metrics for it.
Overall accuracy: 0.99
              precision    recall  f1-score   support
       clear       0.99      1.00      0.99        99
    landmine       0.00      0.00      0.00         1
    accuracy                           0.99       100
   macro avg       0.49      0.50      0.50       100
weighted avg       0.98      0.99      0.99       100
/opt/conda/lib/python3.11/site-packages/sklearn/metrics/_classification.py:1531: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
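The report above can be reproduced approximately with the sketch below. The position of the landmine cell and the variable names are arbitrary assumptions; the original cell code is not shown in this document. Passing `zero_division=0` to `classification_report` keeps the 0.00 precision for the landmine class but avoids the `UndefinedMetricWarning`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, recall_score

# Ground truth: a 10x10 raster with a single landmine cell (1); all others safe (0).
# The position of the landmine is an arbitrary choice for this sketch.
truth = np.zeros((10, 10), dtype=int)
truth[3, 7] = 1

# Prediction: every cell is labeled safe, so the landmine is missed.
prediction = np.zeros((10, 10), dtype=int)

# scikit-learn expects 1-D label arrays, so flatten both rasters.
y_true = truth.ravel()
y_pred = prediction.ravel()

accuracy = accuracy_score(y_true, y_pred)
landmine_recall = recall_score(y_true, y_pred, pos_label=1)

print(f"Overall accuracy: {accuracy:.2f}")
print(classification_report(
    y_true, y_pred,
    labels=[0, 1],
    target_names=["clear", "landmine"],
    zero_division=0,  # report 0.0 for undefined precision without a warning
))
```

The accuracy of 0.99 looks excellent, yet the recall for the landmine class is 0: the one cell that matters is never found.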
Zonal Statistics from Raster Data with Numpy
In this exercise, we work with continuous field data in a raster representation, and overlay it with vector data to get zonal statistics and to query point locations.
Learning outcomes:
- Get familiar with numpy
- Get to know how raster data files and raster data are typically managed
- Know how to access raster data values and make general computations with them
- Understand no data values
- Calculate zonal statistics using dedicated Python libraries
- Generate plots / maps with meaningful color coding of the calculated results
As datasets, we use:
- monthly avg temperature for September (wc2.1_10m_tavg_09.tif)
- world country masks (masks.tif)
The climate datasets are provided by WorldClim and can be downloaded here: https://worldclim.org/data/worldclim21.html. WorldClim provides different datasets (temperature, precipitation, etc.) at different resolutions. We use the coarsest data for average temperature, which has a resolution of 10 arc-minutes (~340 km^2 per grid cell at the equator).
Before you begin, copy the accompanying file wc2.1_10m_tavg.zip into the directory of this notebook and unzip the data with the following command. The command is commented out by default with the # symbol: remove the #, execute the cell, and then comment the command out again so that the data is not unzipped every time the notebook runs.
/home/jovyan/01geo_data_science/Ex 04 - Raster Data with Numpy-20250519
Projection: EPSG:4326
Number of bands: 1
array data type: <class 'numpy.ndarray'>
ndim: 2
shape: (1080, 2160)
size: 2332800
value at 0,0: -3.4e+38
value at 400, 1200 (in Africa): 29.07325
value at 210, 1200 (in Europe): 13.274368
GeoTransform:
| 0.17, 0.00,-180.00|
| 0.00,-0.17, 90.00|
| 0.00, 0.00, 1.00|
actual coordinates index 400,1200: (20.083333333333314, 23.25)
actual coordinates index 210,1200: (20.083333333333314, 54.91666666666667)
crs coordinates: 13.416666666666657 52.583333333333336
unique values: [-3.4000000e+38 -6.4563248e+01 -6.4562752e+01 ... 3.5770752e+01 3.5774750e+01 3.5790001e+01]
no data value: -3.3999999521443642e+38
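The relation between array indices and coordinates in the GeoTransform output above can be sketched in plain Python. The constants are read off the printed transform under the assumption that the cell size is exactly 1/6 degree (10 arc-minutes, printed as 0.17) and that the reported coordinates are cell centers; the function names are hypothetical.

```python
# Assumed grid parameters, read off the GeoTransform above:
# 10 arc-minute cells (1/6 degree), upper-left corner at (-180, 90).
PIXEL_SIZE = 1.0 / 6.0
ORIGIN_LON = -180.0
ORIGIN_LAT = 90.0


def index_to_lonlat(row, col):
    """Return (lon, lat) of the center of the cell at array index [row, col]."""
    lon = ORIGIN_LON + (col + 0.5) * PIXEL_SIZE
    lat = ORIGIN_LAT - (row + 0.5) * PIXEL_SIZE  # rows count downward from the north
    return lon, lat


def lonlat_to_index(lon, lat):
    """Return the array index (row, col) of the cell containing (lon, lat).

    int() truncates toward zero, which equals floor for points inside the grid.
    """
    col = int((lon - ORIGIN_LON) / PIXEL_SIZE)
    row = int((ORIGIN_LAT - lat) / PIXEL_SIZE)
    return row, col


lon, lat = index_to_lonlat(400, 1200)
row, col = lonlat_to_index(13.416666666666657, 52.583333333333336)
print(lon, lat)
print(row, col)
```

Index (400, 1200) maps to roughly (20.083, 23.25), which agrees with the coordinates printed above. Note that the array is indexed as [row, col], that is, latitude first, while coordinates are usually given as (lon, lat).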
array([[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[ True, True, True, ..., True, True, True],
[ True, True, True, ..., True, True, True],
       [ True,  True,  True, ...,  True,  True,  True]])
<matplotlib.image.AxesImage at 0x7f3a94ecbb90>
<matplotlib.image.AxesImage at 0x7f3a94d85ed0>
(1080, 2160) (1080, 2160)
overall mean: -inf
masked mean: -1.9185529
overall max: 35.79
masked max: 35.79
overall min: -3.4e+38
masked min: -64.56325
/opt/conda/lib/python3.11/site-packages/numpy/core/_methods.py:118: RuntimeWarning: overflow encountered in reduce
  ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
overall mean: -1.9185504
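The overall mean of -inf above comes from including the no-data value (about -3.4e38) in the sum, which overflows float32. A minimal sketch of two common ways to exclude no-data cells follows; the tiny raster and the variable names are hypothetical, and the original notebook may handle this differently.

```python
import numpy as np

# Hypothetical 3x3 float32 raster; -3.4e38 marks cells without data.
NODATA = np.float32(-3.4e38)
raster = np.array(
    [[NODATA, 12.5, 13.0],
     [NODATA, 20.0, 21.5],
     [NODATA, NODATA, 29.0]],
    dtype=np.float32,
)

# Boolean mask: True where the cell holds the no-data value.
is_nodata = raster == NODATA

# Option 1: select only the valid cells with boolean indexing (result is 1-D).
valid_mean = raster[~is_nodata].mean()

# Option 2: a masked array keeps the 2-D shape and skips masked cells.
masked = np.ma.masked_array(raster, mask=is_nodata)
masked_mean = masked.mean()
masked_min = masked.min()

print(valid_mean, masked_mean, masked_min)
```

Boolean indexing is enough for global statistics; the masked array is convenient when the 2-D layout must be preserved, for example for plotting.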
<matplotlib.image.AxesImage at 0x7f3a94b51c90>
array data type: <class 'numpy.ndarray'>
ndim: 2
shape: (1080, 2160)
size: 2332800
values: [ 0 1 2 3 5 6 8 9 10 11 13 14 15 16 17 18 19 20 21 22 23 24 26 27 28 29 30 31 32 34 35 36 37 38 39 40 41 42 43 45 46 47 48 49 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 90 91 92 93 94 95 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 137 138 140 141 142 143 144 145 146 147 148 149 151 152 153 154 156 157 158 159 160 161 162 163 165 166 168 169 170 171 172 173 174 175 176 177 179 180 181 182 183 184 185 186 192 195 196 197 198 199 200 201 202 203 204 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 229 230 231 232 233 235 236 237 238 239 241 242 243 244 246 247 249 250 251]
array([ 0, 1, 2, 3, 5, 6, 8, 9, 10, 11, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29,
30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
45, 46, 47, 48, 49, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
85, 86, 87, 90, 91, 92, 93, 94, 95, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127,
128, 129, 130, 131, 132, 133, 134, 135, 137, 138, 140, 141, 142,
143, 144, 145, 146, 147, 148, 149, 151, 152, 153, 154, 156, 157,
158, 159, 160, 161, 162, 163, 165, 166, 168, 169, 170, 171, 172,
173, 174, 175, 176, 177, 179, 180, 181, 182, 183, 184, 185, 186,
192, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 206, 207,
208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220,
221, 222, 223, 224, 225, 226, 229, 230, 231, 232, 233, 235, 236,
237, 238, 239, 241, 242, 243, 244, 246, 247, 249, 250, 251],
      dtype=uint16)
value at 0,0: 0
value at 400, 1200 (in Africa): 46
value at 210, 1200 (in Europe): 179
<matplotlib.image.AxesImage at 0x7f3a94469850>
country id: 179
<matplotlib.image.AxesImage at 0x7f3a944df390>
[ 1 2 3 5 6 8 9 10 11 13 14 15 16 17 18 19 20 21 22 23 24 26 27 28 29 30 31 32 34 35 36 37 38 39 40 41 42 43 45 46 47 48 49 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 90 91 92 93 94 95 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 137 138 140 141 142 143 144 145 146 147 148 149 151 152 153 154 156 157 158 159 160 161 162 163 165 166 168 169 170 171 172 173 174 175 176 177 179 180 181 182 183 184 185 186 192 195 196 197 198 199 200 201 202 203 204 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 229 230 231 232 233 235 236 237 238 239 241 242 243 244 246 247 249 250 251]
x dim: 1080 y dim: 2160
country id: 1 found 2349 values for country 1 mean value 18.873606557387298
country id: 2 found 109 values for country 2 mean value 16.79780496369808
country id: 3 found 6661 values for country 3 mean value 27.558252762395185
country id: 5 found 2 values for country 5 mean value 16.06737518310547
country id: 6 found 3738 values for country 6 mean value 22.502671495359042
country id: 8 found 211325 values for country 8 mean value -45.0966453536155
country id: 9 found 0 values for country 9 mean value nan
country id: 10 found 10289 values for country 10 mean value 12.107646602092878
country id: 11 found 119 values for country 11 mean value 17.976976843441236
country id: 13 found 23488 values for country 13 mean value 19.89054715923701
country id: 14 found 402 values for country 14 mean value 13.741359532769046
country id: 15 found 653 values for country 15 mean value 18.90772010429347
country id: 16 found 0 values for country 16 mean value nan
country id: 17 found 0 values for country 17 mean value nan
country id: 18 found 2 values for country 18 mean value 32.80362510681152
country id: 19 found 477 values for country 19 mean value 13.864787739730856
country id: 20 found 0 values for country 20 mean value nan
country id: 21 found 967 values for country 21 mean value 10.468221360673086
country id: 22 found 0 values for country 22 mean value nan
country id: 23 found 59 values for country 23 mean value 26.935574159783833
country id: 24 found 352 values for country 24 mean value 27.98873221874237
country id: 26 found 156 values for country 26 mean value 5.640519251426061
country id: 27 found 3474 values for country 27 mean value 22.86677807497443
country id: 28 found 0 values for country 28 mean value nan
country id: 29 found 216 values for country 29 mean value 13.40365163485209
country id: 30 found 1900 values for country 30 mean value 23.16966353567023
country id: 31 found 0 values for country 31 mean value nan
country id: 32 found 25026 values for country 32 mean value 24.90931414174565
country id: 34 found 0 values for country 34 mean value nan
country id: 35 found 0 values for country 35 mean value nan
country id: 36 found 471 values for country 36 mean value 13.063108268057464
country id: 37 found 847 values for country 37 mean value 30.00475261737774
country id: 38 found 89 values for country 38 mean value 19.80923038654113
country id: 39 found 0 values for country 39 mean value nan
country id: 40 found 545 values for country 40 mean value 26.47621983519388
country id: 41 found 1418 values for country 41 mean value 24.78962467957619
country id: 42 found 41121 values for country 42 mean value 3.1718289332493077
country id: 43 found 2 values for country 43 mean value 22.110000610351562
country id: 45 found 1897 values for country 45 mean value 25.484561546136657
country id: 46 found 4033 values for country 46 mean value 29.186000427181774
country id: 47 found 2721 values for country 47 mean value 7.018872113532165
country id: 48 found 35543 values for country 48 mean value 12.66542724738775
country id: 49 found 0 values for country 49 mean value nan
country id: 51 found 2821 values for country 51 mean value 24.54758001421834
country id: 52 found 0 values for country 52 mean value nan
country id: 53 found 1047 values for country 53 mean value 23.883442908781646
country id: 54 found 7133 values for country 54 mean value 24.14694360093684
country id: 55 found 2 values for country 55 mean value 23.560476303100586
country id: 56 found 118 values for country 56 mean value 25.303928003472798
country id: 57 found 987 values for country 57 mean value 25.84965325561338
country id: 58 found 236 values for country 58 mean value 13.379382428476365
country id: 59 found 152 values for country 59 mean value 27.66752683488946
country id: 60 found 0 values for country 60 mean value nan
country id: 61 found 33 values for country 61 mean value 17.759545412930574
country id: 62 found 169 values for country 62 mean value 13.105935514325926
country id: 63 found 239 values for country 63 mean value 6.0675711372407415
country id: 64 found 34 values for country 64 mean value 31.490741112652948
country id: 65 found 0 values for country 65 mean value nan
country id: 66 found 6 values for country 66 mean value 28.219969113667805
country id: 67 found 526 values for country 67 mean value 19.710752923225723
country id: 68 found 1980 values for country 68 mean value 26.226762414219404
country id: 69 found 61 values for country 69 mean value 26.422872918551086
country id: 70 found 86 values for country 70 mean value 24.228967134342636
country id: 71 found 135 values for country 71 mean value 31.0214640016909
country id: 72 found 259 values for country 72 mean value 7.957163917512047
country id: 73 found 58 values for country 73 mean value 20.61734054828512
country id: 74 found 3107 values for country 74 mean value 23.989445535937882
country id: 75 found 16 values for country 75 mean value 4.404160603880882
country id: 76 found 0 values for country 76 mean value nan
country id: 77 found 0 values for country 77 mean value nan
country id: 78 found 1455 values for country 78 mean value 5.654943007537999
country id: 79 found 1646 values for country 79 mean value 14.22964346510237
country id: 80 found 28 values for country 80 mean value 27.04789767946516
country id: 81 found 0 values for country 81 mean value nan
country id: 82 found 9 values for country 82 mean value 2.20082590315077
country id: 83 found 756 values for country 83 mean value 23.606316503393586
country id: 84 found 31 values for country 84 mean value 30.36542474069903
country id: 85 found 279 values for country 85 mean value 16.883271501055756
country id: 86 found 1233 values for country 86 mean value 12.341848748540453
country id: 87 found 729 values for country 87 mean value 26.469824739934975
country id: 90
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In[25]
      0 <Error retrieving source code with stack_data see ipython/ipython#13598>
KeyboardInterrupt:
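The per-country output above ends in a KeyboardInterrupt. The loop code is not shown here, so the cause is unknown; if the bottleneck is scanning the full raster once per country (an assumption), the zonal means can instead be computed in a single pass with `np.bincount`. The small arrays and all variable names below are hypothetical stand-ins for the temperature raster and the country mask.

```python
import numpy as np

# Hypothetical inputs: `zones` holds integer zone ids (0 = no zone),
# `values` holds the field (e.g. temperature), NODATA marks missing cells.
NODATA = np.float32(-3.4e38)
zones = np.array([[0, 1, 1],
                  [2, 2, 1],
                  [2, 3, 3]], dtype=np.uint16)
values = np.array([[NODATA, 10.0, 14.0],
                   [20.0, 22.0, NODATA],
                   [24.0, NODATA, NODATA]], dtype=np.float32)

# Keep only cells that carry data, as flat 1-D arrays.
valid = values != NODATA
zone_ids = zones[valid].astype(np.int64)
zone_vals = values[valid].astype(np.float64)

# One pass over the data: count and sum the valid cells per zone id.
n_zones = int(zones.max()) + 1
counts = np.bincount(zone_ids, minlength=n_zones)
sums = np.bincount(zone_ids, weights=zone_vals, minlength=n_zones)

# Zones without valid cells get NaN instead of a division-by-zero warning.
means = np.full(n_zones, np.nan)
np.divide(sums, counts, out=means, where=counts > 0)

for zone_id in range(1, n_zones):
    print(f"zone {zone_id}: {counts[zone_id]} values, mean {means[zone_id]}")
```

Zones with no valid cells end up with a count of 0 and a mean of NaN, matching the "found 0 values ... mean value nan" lines in the output above.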
./hurricanes.csv
| | RowNames | Number | Name | Year | Type | FirstLat | FirstLon | MaxLat | MaxLon | LastLat | LastLon | MaxInt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 430 | NOTNAMED | 1944 | 1 | 30.2 | -76.1 | 32.1 | -74.8 | 35.1 | -69.2 | 80 |
| 1 | 2 | 432 | NOTNAMED | 1944 | 0 | 25.6 | -74.9 | 31.0 | -78.1 | 32.6 | -78.2 | 80 |
| 2 | 3 | 433 | NOTNAMED | 1944 | 0 | 14.2 | -65.2 | 16.6 | -72.2 | 20.6 | -88.5 | 105 |
| 3 | 4 | 436 | NOTNAMED | 1944 | 0 | 20.8 | -58.0 | 26.3 | -72.3 | 42.1 | -71.5 | 120 |
| 4 | 5 | 437 | NOTNAMED | 1944 | 0 | 20.0 | -84.2 | 20.6 | -84.9 | 19.1 | -93.9 | 70 |

| | RowNames | Number | Name | Year | Type | FirstLat | FirstLon | MaxLat | MaxLon | LastLat | LastLon | MaxInt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 332 | 333 | 1227 | GORDON | 2000 | 1 | 25.2 | -85.4 | 26.1 | -84.9 | 28.0 | -83.8 | 70 |
| 333 | 334 | 1229 | ISAAC | 2000 | 0 | 14.3 | -33.2 | 26.6 | -54.2 | 39.7 | -47.9 | 120 |
| 334 | 335 | 1230 | JOYCE | 2000 | 0 | 12.4 | -38.8 | 12.2 | -42.5 | 10.5 | -48.6 | 80 |
| 335 | 336 | 1231 | KEITH | 2000 | 0 | 17.9 | -86.4 | 17.9 | -87.2 | 22.6 | -97.9 | 120 |
| 336 | 337 | 1233 | MICHAEL | 2000 | 3 | 30.1 | -70.9 | 44.0 | -58.5 | 51.0 | -53.5 | 85 |

| Year | RowNames | Number | Name | Type | FirstLat | FirstLon | MaxLat | MaxLon | LastLat | LastLon | MaxInt |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1944 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1945 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1946 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1947 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1948 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1949 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1950 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 |
| 1951 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| 1952 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1953 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1954 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| 1955 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
| 1956 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1957 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1958 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1959 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1960 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1961 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| 1962 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1963 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1964 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1965 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1966 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1967 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1968 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1969 | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 |
| 1970 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1971 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1972 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1973 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1974 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1975 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1976 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
| 1977 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1978 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1979 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1980 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
| 1981 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1982 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 1983 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1984 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1985 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1986 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1987 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1988 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| 1989 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| 1990 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| 1991 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1992 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1993 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| 1994 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1995 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11 |
| 1996 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 | 9 |
| 1997 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| 1998 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 1999 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| 2000 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |

| Year | Number of Hurricanes |
|---|---|
| 1944 | 7 |
| 1945 | 5 |
| 1946 | 3 |
| 1947 | 5 |
| 1948 | 6 |
| 1949 | 7 |
| 1950 | 11 |
| 1951 | 8 |
| 1952 | 6 |
| 1953 | 6 |
| 1954 | 8 |
| 1955 | 9 |
| 1956 | 4 |
| 1957 | 3 |
| 1958 | 7 |
| 1959 | 7 |
| 1960 | 4 |
| 1961 | 8 |
| 1962 | 3 |
| 1963 | 7 |
| 1964 | 6 |
| 1965 | 4 |
| 1966 | 7 |
| 1967 | 6 |
| 1968 | 5 |
| 1969 | 12 |
| 1970 | 5 |
| 1971 | 6 |
| 1972 | 3 |
| 1973 | 4 |
| 1974 | 4 |
| 1975 | 6 |
| 1976 | 6 |
| 1977 | 5 |
| 1978 | 5 |
| 1979 | 5 |
| 1980 | 9 |
| 1981 | 7 |
| 1982 | 2 |
| 1983 | 3 |
| 1984 | 5 |
| 1985 | 7 |
| 1986 | 4 |
| 1987 | 3 |
| 1988 | 5 |
| 1989 | 7 |
| 1990 | 8 |
| 1991 | 4 |
| 1992 | 4 |
| 1993 | 4 |
| 1994 | 3 |
| 1995 | 11 |
| 1996 | 9 |
| 1997 | 3 |
| 1998 | 10 |
| 1999 | 8 |
| 2000 | 8 |

| Year | Type | RowNames | Number | Name | FirstLat | FirstLon | MaxLat | MaxLon | LastLat | LastLon | MaxInt |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1944 | 0 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 1945 | 0 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1946 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1999 | 0 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 | 7 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2000 | 0 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |
| | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |

128 rows × 10 columns

| Year | Tropical | Baroclinic influences | Baroclinic initiation |
|---|---|---|---|
| 1944 | 5.0 | 2.0 | NaN |
| 1945 | 4.0 | 1.0 | NaN |
| 1946 | 2.0 | 1.0 | NaN |
| 1947 | 5.0 | NaN | NaN |
| 1948 | 5.0 | 1.0 | NaN |
| 1949 | 7.0 | NaN | NaN |
| 1950 | 8.0 | 2.0 | 1.0 |
| 1951 | 4.0 | 2.0 | 2.0 |
| 1952 | 6.0 | NaN | NaN |
| 1953 | 6.0 | NaN | NaN |
| 1954 | 5.0 | 2.0 | 1.0 |
| 1955 | 9.0 | NaN | NaN |
| 1956 | 3.0 | 1.0 | NaN |
| 1957 | 2.0 | NaN | 1.0 |
| 1958 | 7.0 | NaN | NaN |
| 1959 | 1.0 | 3.0 | 3.0 |
| 1960 | 2.0 | NaN | 2.0 |
| 1961 | 7.0 | NaN | 1.0 |
| 1962 | NaN | 2.0 | 1.0 |
| 1963 | 5.0 | 1.0 | 1.0 |
| 1964 | 5.0 | 1.0 | NaN |
| 1965 | 1.0 | 3.0 | NaN |
| 1966 | 2.0 | 1.0 | 4.0 |
| 1967 | 2.0 | 2.0 | 2.0 |
| 1968 | 1.0 | 2.0 | 2.0 |
| 1969 | 5.0 | 3.0 | 4.0 |
| 1970 | 1.0 | 2.0 | 2.0 |
| 1971 | 2.0 | 1.0 | 3.0 |
| 1972 | NaN | 1.0 | 2.0 |
| 1973 | 1.0 | 2.0 | 1.0 |
| 1974 | 3.0 | 1.0 | NaN |
| 1975 | 4.0 | 1.0 | 1.0 |
| 1976 | 1.0 | 4.0 | 1.0 |
| 1977 | NaN | 4.0 | 1.0 |
| 1978 | 2.0 | 2.0 | 1.0 |
| 1979 | 2.0 | 3.0 | NaN |
| 1980 | 4.0 | 2.0 | 3.0 |
| 1981 | 4.0 | 2.0 | 1.0 |
| 1982 | NaN | 2.0 | NaN |
| 1983 | NaN | 1.0 | 2.0 |
| 1984 | NaN | 1.0 | 4.0 |
| 1985 | 4.0 | 1.0 | 2.0 |
| 1986 | NaN | 2.0 | 2.0 |
| 1987 | 1.0 | NaN | 2.0 |
| 1988 | 4.0 | NaN | 1.0 |
| 1989 | 5.0 | 2.0 | NaN |
| 1990 | 3.0 | 2.0 | 3.0 |
| 1991 | NaN | NaN | 4.0 |
| 1992 | NaN | 1.0 | 3.0 |
| 1993 | 1.0 | 3.0 | NaN |
| 1994 | 1.0 | NaN | 2.0 |
| 1995 | 9.0 | 1.0 | 1.0 |
| 1996 | 8.0 | 1.0 | NaN |
| 1997 | 1.0 | NaN | 2.0 |
| 1998 | 5.0 | 3.0 | 2.0 |
| 1999 | 7.0 | 1.0 | NaN |
| 2000 | 5.0 | 1.0 | 2.0 |
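
Tables like the ones above can be produced with pandas `groupby`. The sketch below uses a hypothetical six-row stand-in for hurricanes.csv. The mapping of type codes 0, 1 and 3 to the column names is inferred by comparing the two preceding tables, not taken from the dataset documentation.

```python
import pandas as pd

# Hypothetical miniature of hurricanes.csv with only the columns used here.
hurricanes = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Year": [1944, 1944, 1944, 1945, 1945, 1945],
    "Type": [0, 0, 1, 0, 3, 3],
})

# Type codes as they appear to map onto the column names above
# (an inference from the tables, not from the dataset documentation).
type_names = {0: "Tropical", 1: "Baroclinic influences", 3: "Baroclinic initiation"}

# Number of hurricanes per year.
per_year = hurricanes.groupby("Year").size().rename("Number of Hurricanes")

# Number of hurricanes per year and type, one column per type.
per_year_type = (
    hurricanes.groupby(["Year", "Type"])
    .size()
    .unstack("Type")  # missing (year, type) combinations become NaN
    .rename(columns=type_names)
)

print(per_year)
print(per_year_type)
```

Using `size()` gives a single count per group, which avoids the repeated identical columns that `count()` produces in the wider tables above. The NaN entries explain why the counts are shown as floats.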
array([1, 0])
Type
0    187
1    150
Name: Type, dtype: int64
21.35 -47.85
Training set has 269 samples. Testing set has 68 samples.
SVC(kernel='linear')
0.8661710037174721
Use the model to make predictions from the test data.
Use the predicted classes and the true classes of the test data to calculate the accuracy score.
0.8970588235294118
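The two steps above (predict on the test data, then compare with the true classes) can be sketched as follows. The data here is synthetic: the two features are only loosely modeled on latitude and longitude, and the feature choice, `random_state` and variable names are assumptions. The 269/68 split reported above is consistent with `test_size=0.2` on 337 samples, but the original call is not shown.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the hurricane data: two features per sample
# (think latitude and longitude) and a binary class label.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[15.0, -60.0], scale=4.0, size=(100, 2)),  # class 0
    rng.normal(loc=[30.0, -75.0], scale=4.0, size=(100, 2)),  # class 1
])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = SVC(kernel="linear")
model.fit(X_train, y_train)

# Accuracy on the data the model was trained on.
train_score = model.score(X_train, y_train)

# Accuracy on unseen data: predict first, then compare with the true classes.
y_pred = model.predict(X_test)
test_score = accuracy_score(y_test, y_pred)

print(f"train accuracy: {train_score:.3f}, test accuracy: {test_score:.3f}")
```

`model.score(X_test, y_test)` returns the same value as `accuracy_score(y_test, y_pred)`; computing it in two steps just makes the prediction explicit.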
Tasks
- Train the SVM with different parameters (kernel, C value, etc.) and explore how the training and test scores change.
- How do the scores relate to overfitting?
- You can also experiment with other settings, such as stratifying the train-test split (see the documentation), which ensures that the hurricane types appear in the same proportions in the training and test sets.
Further questions:
Which coordinate (latitude, longitude) is more significant when predicting the formation of tropical hurricanes?
Try and see if the maximum intensity makes a difference in the model!
Would it improve the model to add all other features as well? Which features would make sense? Can we create other interesting features from the data we have?
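One possible way to organize the parameter exploration from the tasks above is a small grid over kernels and C values with a stratified split. This is a sketch on synthetic data from `make_classification`; the class weights, parameter values and variable names are arbitrary assumptions, and the real exercise would use the hurricane features instead.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, imbalanced two-class data standing in for the hurricane features.
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0,
    weights=[0.7, 0.3], random_state=0,
)

# stratify=y keeps the class proportions (about 70/30) in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit one model per (kernel, C) pair and record train and test accuracy.
results = []
for kernel in ("linear", "poly", "rbf"):
    for C in (0.1, 1.0, 100.0):
        model = SVC(kernel=kernel, C=C).fit(X_train, y_train)
        results.append(
            (kernel, C, model.score(X_train, y_train), model.score(X_test, y_test))
        )

for kernel, C, train_acc, test_acc in results:
    print(f"{kernel:>6}  C={C:<6}  train={train_acc:.3f}  test={test_acc:.3f}")
```

A training accuracy that keeps rising while the test accuracy stagnates or drops is the usual sign of overfitting to look for in such a table.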
SVM Parameter Exploration
This interactive tool aims to give you a deeper understanding of the SVM parameters and of how an SVM handles different types of data.
There are four synthetic datasets:
- Linearly Separable Blobs: These datasets consist of points grouped into two distinct blobs that can be separated by a straight line, hence "linearly separable".
- Non-Linearly Separable Blobs: These datasets still consist of points grouped into blobs, but they cannot be separated by a straight line.
- Circles: This dataset is composed of points arranged into two circular patterns, one inside the other.
- Moons: This dataset contains points in the shape of two interlocking half-circles, or "moons".
Through interactive widgets, you can choose the SVM kernel (Linear, Polynomial of degree 3, or Radial basis function), and adjust the parameters that control the behavior of the SVM.
Pay close attention to how changing these parameters affects the SVM's decision boundary, margins, and support vectors. Also, observe how the SVM handles different types of datasets.
The continuous grey line represents the decision boundary; the dashed lines denote the margins. Support vectors are enclosed in circles.
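Without the interactive widgets, the same quantities can be inspected numerically. The sketch below fits an RBF-kernel SVM to a moons-shaped dataset and reports the number of support vectors for several C values. The dataset parameters and C values are arbitrary assumptions, not the settings of the interactive tool.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interlocking half-circles, similar in spirit to the "Moons" dataset.
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

# Refit with different C values and count the support vectors each time.
for C in (0.1, 1.0, 100.0):
    model = SVC(kernel="rbf", C=C, gamma="scale").fit(X, y)
    n_sv = model.support_vectors_.shape[0]
    print(f"C={C:<6} support vectors={n_sv:3d} training accuracy={model.score(X, y):.3f}")

# decision_function is 0 on the decision boundary and +1 / -1 on the margins,
# which is what the continuous and dashed lines in the plot correspond to.
model = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
margin_values = model.decision_function(X[:5])
print(margin_values)
```

A smaller C tolerates more margin violations, which typically leaves more points on or inside the margin and therefore more support vectors.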