GPU Benchmarking
Within the PoA’s health check, the drill test runs standard benchmarks such as MLPerf to measure machine performance. The benchmark results quantify the machine’s efficiency, for example as inference throughput in samples per second, and this measurement serves as an indicator of the machine’s condition and operational reliability.
Workflow of the Drill Test
Sample drill test results on an NVIDIA A100:
MLPerf Results Summary:
| Field | Value |
|---|---|
| SUT name | BERT SERVER |
| Scenario | Offline |
| Mode | PerformanceOnly |
| Samples per second | 1532.17 |
| Result | VALID |
| Min duration satisfied | Yes |
| Min queries satisfied | Yes |
| Early stopping satisfied | Yes |
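As an illustration of how a summary like the one above could feed a health verdict, here is a minimal Python sketch. It is not the PoA implementation: the `DrillSummary` type, the `is_healthy` function, the baseline throughput, and the 10% tolerance are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class DrillSummary:
    """Fields taken from the MLPerf results summary table."""
    samples_per_second: float
    result: str
    min_duration_satisfied: bool
    min_queries_satisfied: bool
    early_stopping_satisfied: bool


def is_healthy(summary: DrillSummary, baseline_sps: float, tolerance: float = 0.10) -> bool:
    """Return True if the run is valid and throughput is within `tolerance` of a baseline.

    `baseline_sps` and `tolerance` are hypothetical knobs, not values from the drill test.
    """
    run_is_valid = (
        summary.result == "VALID"
        and summary.min_duration_satisfied
        and summary.min_queries_satisfied
        and summary.early_stopping_satisfied
    )
    meets_baseline = summary.samples_per_second >= baseline_sps * (1.0 - tolerance)
    return run_is_valid and meets_baseline


# Values from the A100 summary above.
a100_run = DrillSummary(
    samples_per_second=1532.17,
    result="VALID",
    min_duration_satisfied=True,
    min_queries_satisfied=True,
    early_stopping_satisfied=True,
)

# 1500.0 is an illustrative baseline, not a published reference number.
print(is_healthy(a100_run, baseline_sps=1500.0))
```

Checking run validity before comparing throughput keeps an invalid run from being scored at all, whatever number it reports.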
Additional Stats:
| Metric | Value (ns) |
|---|---|
| Min latency | 3,559,383,281 |
| Max latency | 1,292,280,950,807 |
| Mean latency | 788,846,755,872 |
| 50.00 percentile latency | 840,201,049,914 |
| 90.00 percentile latency | 1,234,598,190,171 |
| 95.00 percentile latency | 1,268,998,116,410 |
| 97.00 percentile latency | 1,280,065,956,777 |
| 99.00 percentile latency | 1,289,280,826,440 |
| 99.90 percentile latency | 1,292,043,266,934 |
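The nanosecond values are easier to read in seconds, and they can be cross-checked against the reported throughput. The sketch below assumes that in the Offline scenario all samples are issued in a single query, so the maximum latency approximates the total run time; the variable names are illustrative, and `samples_per_query` is taken from the test parameters table.

```python
NS_PER_SECOND = 1_000_000_000

# Selected latencies from the table above, in nanoseconds.
latencies_ns = {
    "min": 3_559_383_281,
    "mean": 788_846_755_872,
    "p50": 840_201_049_914,
    "p99": 1_289_280_826_440,
    "max": 1_292_280_950_807,
}

# Convert each latency to seconds for readability.
latencies_s = {name: ns / NS_PER_SECOND for name, ns in latencies_ns.items()}

# samples_per_query from the test parameters table.
samples_per_query = 1_980_000

# Assumption: one Offline query holds every sample, so the slowest sample
# marks the end of the run and throughput is samples divided by that time.
implied_throughput = samples_per_query / latencies_s["max"]

for name, seconds in latencies_s.items():
    print(f"{name}: {seconds:.2f} s")
print(f"implied throughput: {implied_throughput:.2f} samples/s")
```

Under that assumption the implied throughput agrees with the 1532.17 samples per second reported in the summary.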
Test Parameters Used:
| Parameter | Value |
|---|---|
| samples_per_query | 1,980,000 |
| target_qps | 3,000 |
| target_latency (ns) | 0 |
| max_async_queries | 1 |
| min_duration (ms) | 600,000 |
| max_duration (ms) | 0 |
| min_query_count | 1 |
| max_query_count | 0 |
| qsl_rng_seed | 13281865557512327830 |
| sample_index_rng_seed | 198141574272810017 |
| schedule_rng_seed | 7575108116881280410 |
| accuracy_log_rng_seed | 0 |
| accuracy_log_probability | 0 |
| accuracy_log_sampling_target | 0 |
| print_timestamps | 0 |
| performance_issue_unique | 0 |
| performance_issue_same | 0 |
| performance_issue_same_index | 0 |
| performance_sample_count | 10,833 |
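Some of these parameters are related by simple arithmetic. The sketch below shows that the reported `samples_per_query` is consistent with `target_qps` multiplied by the minimum duration and a 1.1 slack factor. The slack factor is an assumption about how the load generator sizes an Offline query, not something stated in the results, and the variable names are illustrative.

```python
# Values from the test parameters table.
target_qps = 3_000
min_duration_ms = 600_000
reported_samples_per_query = 1_980_000

# Assumption: the Offline query is sized with a 10% slack over
# target_qps * min_duration. Integer arithmetic avoids float rounding.
SLACK_NUMERATOR, SLACK_DENOMINATOR = 11, 10

min_duration_s = min_duration_ms // 1000
expected_samples_per_query = (
    target_qps * min_duration_s * SLACK_NUMERATOR // SLACK_DENOMINATOR
)

# Fraction of the target rate that the run actually achieved
# (1532.17 samples/s is taken from the results summary).
achieved_fraction = 1532.17 / target_qps

print(expected_samples_per_query == reported_samples_per_query)
print(f"achieved {achieved_fraction:.1%} of target_qps")
```

Read this way, `target_qps` acts as a sizing estimate for the query rather than a pass or fail threshold: the run is still reported as VALID even though the measured throughput is below the target.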