====== Ubuntu - GPU - Troubleshooting - *ERROR* ring gfx_0.0.0 timeout ======

Random crashes occur when using a browser or icaclient (Citrix client), and the kernel log shows:

  [ 85.861734] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=13365, emitted seq=13367
  [ 85.862162] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process kwin_x11 pid 819 thread kwin_x11:cs0 pid 838

----

===== Fix =====

This is often a power saving issue. Force the GPU to its highest performance level (writing to this file requires root):

  echo high | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level

----

===== An alternative workaround =====

Use the **amdgpu.ppfeaturemask** parameter to narrow down which power feature is causing problems.

  * The bits in that parameter are defined by the **PP_FEATURE_MASK** enum here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/include/amd_shared.h#n199

To list the power features and their current state:

  cat /sys/class/drm/card0/device/pp_features

returns:

  features high: 0x0003ebb8 low: 0x71ffffff
  No. Feature                 Bit  : State
  00. FW_DATA_READ            ( 0) : enabled
  01. DPM_GFXCLK              ( 1) : enabled
  02. DPM_GFX_POWER_OPTIMIZER ( 2) : enabled
  03. DPM_UCLK                ( 3) : enabled
  04. DPM_FCLK                ( 4) : enabled
  05. DPM_SOCCLK              ( 5) : enabled
  06. DPM_MP0CLK              ( 6) : enabled
  07. DPM_LINK                ( 7) : enabled
  08. DPM_DCN                 ( 8) : enabled
  09. VMEMP_SCALING           ( 9) : enabled
  10. VDDIO_MEM_SCALING       (10) : enabled
  11. DS_GFXCLK               (11) : enabled
  12. DS_SOCCLK               (12) : enabled
  13. DS_FCLK                 (13) : enabled
  14. DS_LCLK                 (14) : enabled
  15. DS_DCFCLK               (15) : enabled
  16. DS_UCLK                 (16) : enabled
  17. GFX_ULV                 (17) : enabled
  18. FW_DSTATE               (18) : enabled
  19. GFXOFF                  (19) : enabled
  20. BACO                    (20) : enabled
  21. MM_DPM                  (21) : enabled
  22. SOC_MPCLK_DS            (22) : enabled
  23. BACO_MPCLK_DS           (23) : enabled
  24. THROTTLERS              (24) : enabled
  25. SMARTSHIFT              (25) : disabled
  26. GTHR                    (26) : disabled
  27. ACDC                    (27) : disabled
  28. VR0HOT                  (28) : enabled
  29. FW_CTF                  (29) : enabled
  30. FAN_CONTROL             (30) : enabled
  31. GFX_DCS                 (31) : disabled
  32. GFX_READ_MARGIN         (32) : disabled
  33. LED_DISPLAY             (33) : disabled
  34. GFXCLK_SPREAD_SPECTRUM  (34) : disabled
  35. OUT_OF_BAND_MONITOR     (35) : enabled
  36. OPTIMIZED_VMIN          (36) : enabled
  37. GFX_IMU                 (37) : enabled
  38. BOOT_TIME_CAL           (38) : disabled
  39. GFX_PCC_DFLL            (39) : enabled
  40. SOC_CG                  (40) : enabled
  41. DF_CSTATE               (41) : enabled
  42. GFX_EDC                 (42) : disabled
  43. BOOT_POWER_OPT          (43) : enabled
  44. CLOCK_POWER_DOWN_BYPASS (44) : disabled
  45. DS_VCN                  (45) : enabled
  46. BACO_CG                 (46) : enabled
  47. MEM_TEMP_READ           (47) : enabled
  48. ATHUB_MMHUB_PG          (48) : enabled
  49. SOC_PCC                 (49) : enabled

----

==== Change Kernel Boot Parameters ====

Try adding **amdgpu.ppfeaturemask** to the kernel boot parameters in **/etc/default/grub**:

  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.ppfeaturemask=0xfffd3fff"

Then update GRUB:

  sudo update-grub

Reboot, and check whether the fault happens again.

**NOTE:** The mask 0xfffd3fff clears the bits of the following three features, one or more of which may be causing the fault:

  PP_OVERDRIVE_MASK = 0x4000,
  PP_GFXOFF_MASK = 0x8000,
  PP_STUTTER_MODE = 0x20000,

----

===== Use Mesa Environment Parameters to identify the cause =====

Add **RADV_DEBUG=hang** to **/etc/environment**, then try triggering the fault again.

This dumps a report to $HOME/radv_dumps__
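
----

===== Computing a ppfeaturemask value =====

Returning to the **amdgpu.ppfeaturemask** workaround: the value 0xfffd3fff disables three features at once, so it does not say which one is responsible. The following Python sketch shows how a mask that disables only one feature at a time could be computed. The bit values are the three listed in the NOTE; the helper names ''mask_without'' and ''cleared_features'' are hypothetical, and starting from 0xffffffff is an assumption inferred from the example value, as the driver's default mask may differ.

```python
# Bit values copied from the NOTE in this page (PP_FEATURE_MASK enum).
PP_FEATURE_BITS = {
    "PP_OVERDRIVE_MASK": 0x4000,
    "PP_GFXOFF_MASK": 0x8000,
    "PP_STUTTER_MODE": 0x20000,
}

# Assumption: start from a mask with all 32 bits set, as the example
# value 0xfffd3fff suggests. Pass a different base if needed.
ALL_ENABLED = 0xFFFFFFFF


def mask_without(*names, base=ALL_ENABLED):
    """Return base with the bits of the named features cleared."""
    mask = base
    for name in names:
        mask &= ~PP_FEATURE_BITS[name]
    return mask & 0xFFFFFFFF


def cleared_features(mask):
    """List the known features whose bit is cleared in mask."""
    return [name for name, bit in PP_FEATURE_BITS.items() if not mask & bit]


if __name__ == "__main__":
    # One candidate at a time, to narrow down the culprit.
    for name in PP_FEATURE_BITS:
        print(f"amdgpu.ppfeaturemask=0x{mask_without(name):08x}  # disables {name}")
    # All three together give the value used in the GRUB example.
    print(f"amdgpu.ppfeaturemask=0x{mask_without(*PP_FEATURE_BITS):08x}")
```

Boot with each single-feature mask in turn; if the fault disappears with only one of them, that feature is the likely cause.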