Ubuntu - GPU - AMD GPU - Install Tools
- clinfo
- clpeak
- radeontop
clinfo
List the available OpenCL platforms and devices, including the OpenCL extensions each one supports.
sudo apt install clinfo
clinfo
returns:
Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.1 AMD-APP (3513.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: Radeon RX 7900 XTX
  Device Topology: PCI[ B#12, D#0, F#0 ]
  Max compute units: 48
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 3220Mhz
  Address bits: 64
  Max memory allocation: 21890072576
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 16384
  Max image 3D height: 16384
  Max image 3D depth: 8192
  Max samplers within kernel: 29772
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 32768
  Global memory size: 25753026560
  Constant buffer size: 21890072576
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 65536
  Max pipe arguments: 16
  Max pipe active reservations: 16
  Max pipe packet size: 415236096
  Max global variable size: 21890072576
  Max global variable preferred total size: 25753026560
  Max read/write image args: 64
  Max on device events: 1024
  Queue on device max size: 8388608
  Max on device queues: 1
  Queue on device preferred size: 262144
  SVM capabilities:
    Coarse grain buffer: Yes
    Fine grain buffer: Yes
    Fine grain system: No
    Atomics: No
  Preferred platform atomic alignment: 0
  Preferred global atomic alignment: 0
  Preferred local atomic alignment: 0
  Kernel Preferred work group size multiple: 32
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue on Host properties:
    Out-of-Order: No
    Profiling : Yes
  Queue on Device properties:
    Out-of-Order: Yes
    Profiling : Yes
  Platform ID: 0x7f1ae3ff0eb0
  Name: gfx1100
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 2.0
  Driver version: 3513.0 (HSA1.1,LC)
  Profile: FULL_PROFILE
  Version: OpenCL 2.0
  Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
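To check for one particular extension without reading the whole listing, the clinfo output can be piped through a small filter. The following is a minimal sketch, assuming a hypothetical helper named has_cl_extension (it is not part of clinfo). It runs against a shortened sample of the Extensions line from the output above, so it works without a GPU.

```shell
# has_cl_extension EXT: read clinfo-style text on stdin and succeed if EXT
# appears as a whole word. Hypothetical helper, not part of clinfo.
# Whole-word matching (-w) keeps cl_amd_media_ops from matching
# cl_amd_media_ops2.
has_cl_extension() {
    grep -qw -- "$1"
}

# Shortened sample of the Extensions line shown in the clinfo output.
sample='Extensions: cl_khr_fp64 cl_khr_fp16 cl_khr_subgroups'

if printf '%s\n' "$sample" | has_cl_extension cl_khr_fp64; then
    echo "cl_khr_fp64: supported"
else
    echo "cl_khr_fp64: missing"
fi
```

On a real system the same check would read from the tool directly, for example: clinfo | has_cl_extension cl_khr_fp64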
clpeak
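Benchmark the peak OpenCL performance of the GPU: global memory bandwidth, compute throughput, transfer bandwidth, and kernel launch latency.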
sudo apt install clpeak
clpeak
returns:
Platform: AMD Accelerated Parallel Processing
  Device: gfx1100
    Driver version  : 3513.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 48
    Clock frequency : 3220 MHz

    Global memory bandwidth (GBPS)
      float   : 742.69
      float2  : 790.45
      float4  : 825.86
      float8  : 857.23
      float16 : 878.19

    Single-precision compute (GFLOPS)
      float   : 34058.41
      float2  : 34472.16
      float4  : 34262.00
      float8  : 34510.05
      float16 : 32940.55

    Half-precision compute (GFLOPS)
      half   : 34022.91
      half2  : 65896.89
      half4  : 66729.46
      half8  : 62740.96
      half16 : 64157.85

    Double-precision compute (GFLOPS)
      double   : 1190.26
      double2  : 1188.94
      double4  : 1186.53
      double8  : 1180.77
      double16 : 1148.82

    Integer compute (GIOPS)
      int   : 8555.43
      int2  : 8381.68
      int4  : 8347.36
      int8  : 8427.74
      int16 : 8431.46

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 18.65
      enqueueReadBuffer          : 17.34
      enqueueMapBuffer(for read) : 233422.14
        memcpy from mapped ptr   : 19.16
      enqueueUnmap(after write)  : 360921.62
        memcpy to mapped ptr     : 18.79

    Kernel launch latency : 13.96 us
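When comparing runs it is often enough to keep the best figure from each section. The following is a rough sketch, assuming clpeak prints one "name : value" pair per line under a section header, as it does in a terminal. The helper name peak_in_section is hypothetical (clpeak has no such option), and the sample reuses two sections of the figures above.

```shell
# peak_in_section NAME: read clpeak text on stdin and print the largest
# value listed under the section whose header contains NAME.
# Hypothetical helper, not part of clpeak.
peak_in_section() {
    awk -v name="$1" '
        NF == 0 { next }                 # ignore blank lines
        # A line without " : " is a section header; note whether it is ours.
        index($0, " : ") == 0 { in_section = (index($0, name) > 0); next }
        in_section {
            n = split($0, parts, " : ")
            value = parts[n] + 0         # numeric part after the separator
            if (value > max) max = value
        }
        END { printf "%.2f\n", max }
    '
}

# Two sections copied from the clpeak output.
sample='Single-precision compute (GFLOPS)
  float   : 34058.41
  float2  : 34472.16
  float4  : 34262.00
  float8  : 34510.05
  float16 : 32940.55
Half-precision compute (GFLOPS)
  half   : 34022.91
  half2  : 65896.89
  half4  : 66729.46
  half8  : 62740.96
  half16 : 64157.85'

printf '%s\n' "$sample" | peak_in_section "Single-precision"   # prints 34510.05
printf '%s\n' "$sample" | peak_in_section "Half-precision"     # prints 66729.46
```

On a real system the input would come from the tool, for example: clpeak | peak_in_section "Half-precision". Stand-alone entries such as Kernel launch latency carry their own value on the same line, so this sketch does not treat them as section headers.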
radeontop
View Radeon GPU utilization, both as a total activity percentage and per individual hardware block.
sudo apt install radeontop
radeontop
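For logging instead of interactive viewing, radeontop also has a dump mode: -d FILE writes samples to a file (- for stdout) and -l N stops after N samples. The following is a small sketch of pulling the overall GPU percentage out of such a dump. The helper name gpu_percent is hypothetical, and both the sample line and the assumed "gpu NN.NN%" field layout are invented for illustration, so check them against real output before relying on this.

```shell
# gpu_percent: read radeontop dump lines on stdin and print the "gpu"
# field of each matching line as a bare number. Hypothetical helper; the
# field layout is an assumption about the dump format.
gpu_percent() {
    sed -n 's/.*[, ]gpu \([0-9.]*\)%.*/\1/p'
}

# Invented sample line, used only to exercise the filter.
sample='1685908375.000000: bus 0c, gpu 12.50%, ee 0.00%, vgt 1.67%'
printf '%s\n' "$sample" | gpu_percent    # prints 12.50

# On a real system (take one sample, then exit):
#   radeontop -d - -l 1 | gpu_percent
```

Lines that do not contain a gpu field are skipped silently, so any banner text printed before the first sample does not need special handling.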
ubuntu/gpu/amd_gpu/install_tools.1685908375.txt.gz · Last modified: 2023/06/04 19:52 by peter