====== Ubuntu - GPU - AMD GPU - Install Tools ======
* clinfo
* clpeak
* glxinfo
* radeontop
----
===== clinfo =====
clinfo lists the available OpenCL platforms and devices, including their properties and supported extensions.
sudo apt install clinfo
clinfo
returns:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3513.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: Radeon RX 7900 XTX
Device Topology: PCI[ B#12, D#0, F#0 ]
Max compute units: 48
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 3220Mhz
Address bits: 64
Max memory allocation: 21890072576
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 8192
Max samplers within kernel: 29772
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 32768
Global memory size: 25753026560
Constant buffer size: 21890072576
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 415236096
Max global variable size: 21890072576
Max global variable preferred total size: 25753026560
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 32
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x7f1ae3ff0eb0
Name: gfx1100
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 3513.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
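The full listing is long, so it can help to filter it down to a few headline fields. The following is a minimal sketch, assuming the AMD-style field names shown above; ''clinfo_summary'' and ''bytes_to_gib'' are hypothetical helper names, not part of clinfo.

```shell
# Sketch: pull a few headline fields out of AMD-style clinfo output.
# clinfo_summary and bytes_to_gib are hypothetical helper names.

# Keep only the lines for the chosen fields (reads clinfo output on stdin).
clinfo_summary() {
    grep -E 'Board name|Max compute units|Max clock frequency|Global memory size|Device OpenCL C version'
}

# Convert a byte count to GiB with two decimals.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Typical use on a machine with clinfo installed:
#   clinfo | clinfo_summary
#   bytes_to_gib 25753026560
```

With the global memory size reported above, ''bytes_to_gib 25753026560'' prints ''23.98'', just under the card's 24 GiB.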
----
===== clpeak =====
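clpeak benchmarks the peak performance achievable through OpenCL: global memory bandwidth, compute throughput, host-device transfer bandwidth, and kernel launch latency.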
sudo apt install clpeak
clpeak
returns:
Platform: AMD Accelerated Parallel Processing
Device: gfx1100
Driver version : 3513.0 (HSA1.1,LC) (Linux x64)
Compute units : 48
Clock frequency : 3220 MHz
Global memory bandwidth (GBPS)
float : 742.69
float2 : 790.45
float4 : 825.86
float8 : 857.23
float16 : 878.19
Single-precision compute (GFLOPS)
float : 34058.41
float2 : 34472.16
float4 : 34262.00
float8 : 34510.05
float16 : 32940.55
Half-precision compute (GFLOPS)
half : 34022.91
half2 : 65896.89
half4 : 66729.46
half8 : 62740.96
half16 : 64157.85
Double-precision compute (GFLOPS)
double : 1190.26
double2 : 1188.94
double4 : 1186.53
double8 : 1180.77
double16 : 1148.82
Integer compute (GIOPS)
int : 8555.43
int2 : 8381.68
int4 : 8347.36
int8 : 8427.74
int16 : 8431.46
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 18.65
enqueueReadBuffer : 17.34
enqueueMapBuffer(for read) : 233422.14
memcpy from mapped ptr : 19.16
enqueueUnmap(after write) : 360921.62
memcpy to mapped ptr : 18.79
Kernel launch latency : 13.96 us
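To compare runs quickly, the best figure per section can be extracted from the output. The following is a rough sketch, assuming the section and value layout shown above; ''clpeak_peaks'' is a hypothetical helper name, not a clpeak feature.

```shell
# Sketch: print the highest value in each throughput section of clpeak output.
# clpeak_peaks is a hypothetical helper name; it reads clpeak output on stdin.
clpeak_peaks() {
    awk '
        # Section headers end in a unit such as (GBPS), (GFLOPS) or (GIOPS).
        /\((GBPS|GFLOPS|GIOPS)\)[ \t]*$/ {
            if (section != "") printf "%s: %s\n", section, max
            section = $0
            sub(/^[ \t]+/, "", section)
            sub(/[ \t]+$/, "", section)
            max = 0
            next
        }
        # Inside a section, value lines look like "float4 : 825.86".
        section != "" && /:[ \t]*[0-9.]+[ \t]*$/ {
            value = $NF
            if (value + 0 > max + 0) max = value
        }
        END { if (section != "") printf "%s: %s\n", section, max }
    '
}

# Typical use on a machine with clpeak installed:
#   clpeak | clpeak_peaks
```

Note that in the transfer bandwidth section the maximum will be one of the map/unmap figures, which are far larger than the read/write values and presumably do not reflect an actual copy across the bus.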
----
===== glxinfo =====
glxinfo shows information about the OpenGL and GLX implementations running on a given X display.
sudo apt update
sudo apt install mesa-utils
glxinfo -B
returns:
name of display: :0
display: :0 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
Vendor: AMD (0x1002)
Device: Radeon RX 7900 XTX (gfx1100, LLVM 15.0.3, DRM 3.48, 5.19.0-43-generic) (0x744c)
Version: 22.3.0
Accelerated: yes
Video memory: 24576MB
Unified memory: no
Preferred profile: core (0x1)
Max core profile version: 4.6
Max compat profile version: 4.6
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.2
Memory info (GL_ATI_meminfo):
VBO free memory - total: 23811 MB, largest block: 23811 MB
VBO free aux. memory - total: 32053 MB, largest block: 32053 MB
Texture free memory - total: 23811 MB, largest block: 23811 MB
Texture free aux. memory - total: 32053 MB, largest block: 32053 MB
Renderbuffer free memory - total: 23811 MB, largest block: 23811 MB
Renderbuffer free aux. memory - total: 32053 MB, largest block: 32053 MB
Memory info (GL_NVX_gpu_memory_info):
Dedicated video memory: 24576 MB
Total available memory: 56703 MB
Currently available dedicated video memory: 23811 MB
OpenGL vendor string: AMD
OpenGL renderer string: Radeon RX 7900 XTX (gfx1100, LLVM 15.0.3, DRM 3.48, 5.19.0-43-generic)
OpenGL core profile version string: 4.6 (Core Profile) Mesa 22.3.0-devel
OpenGL core profile shading language version string: 4.60
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL version string: 4.6 (Compatibility Profile) Mesa 22.3.0-devel
OpenGL shading language version string: 4.60
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 22.3.0-devel
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
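The two lines worth checking first are ''direct rendering'' and ''Accelerated''. The following is a small sketch, assuming the Mesa output format shown above, that turns them into a scriptable check; ''gl_is_accelerated'' is a hypothetical helper name.

```shell
# Sketch: decide from `glxinfo -B` output whether hardware acceleration is active.
# gl_is_accelerated is a hypothetical helper name; it reads glxinfo output on
# stdin and returns 0 only if both checks pass.
gl_is_accelerated() {
    awk '
        /direct rendering:[ \t]*Yes/ { direct = 1 }
        /Accelerated:[ \t]*yes/      { accel = 1 }
        END { exit !(direct && accel) }
    '
}

# Typical use on a machine with mesa-utils installed:
#   glxinfo -B | gl_is_accelerated && echo "GPU rendering active"
```

If the check fails, the renderer string is the next thing to look at; a software renderer such as llvmpipe there would suggest the GPU driver is not in use.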
----
===== radeontop =====
radeontop shows Radeon GPU utilization, both as a total activity percentage and per hardware block.
sudo apt install radeontop
radeontop
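radeontop can also run non-interactively via its dump mode (''-d'' for the output file, ''-'' meaning stdout, and ''-l'' to limit the number of lines). The following is a sketch only: the exact dump line format is an assumption and may differ between versions, and ''gpu_load_percent'' is a hypothetical helper name.

```shell
# Sketch: extract the total GPU load from radeontop dump lines.
# Assumption: dump lines look like "<timestamp>: ... gpu 4.17%, ee 0.00%, ..."
# gpu_load_percent is a hypothetical helper name; it reads dump lines on stdin
# and ignores any line that does not match.
gpu_load_percent() {
    sed -n 's/.*[ :]gpu \([0-9.][0-9.]*\)%.*/\1/p'
}

# Typical use (-d - dumps to stdout, -l 1 stops after one line):
#   radeontop -d - -l 1 | gpu_load_percent
```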
----