Enabling the wgpu feature results in a panic "The surface isn't supported by this adapter" #5269

Open
VorpalBlade opened this issue Oct 15, 2024 · 17 comments
Labels
bug Something is broken crash crash, panic, segfault, freeze, … egui-wgpu egui-winit porblems related to winit native-linux Problem specific to Linux

Comments

VorpalBlade commented Oct 15, 2024

Describe the bug
Enabling the wgpu feature causes the program to crash immediately at startup. This can be reproduced even with the hello world example.

cargo run    
    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
     Running `/home/arvid/src/egui/target/debug/hello_world`
wp_linux_drm_syncobj_manager_v1#55: error 0: surface already exists
Protocol error 0 on object wp_linux_drm_syncobj_manager_v1@55: 
[2024-10-15T20:47:48Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_present_modes: ERROR_SURFACE_LOST_KHR
[2024-10-15T20:47:48Z ERROR wgpu_hal::vulkan::adapter] get_physical_device_surface_formats: ERROR_SURFACE_LOST_KHR
thread 'main' panicked at crates/egui-wgpu/src/winit.rs:173:18:
The surface isn't supported by this adapter
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The full backtrace is:

stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic_display
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:196:5
   3: core::panicking::panic_str
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:171:5
   4: core::option::expect_failed
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/option.rs:1980:5
   5: core::option::Option<T>::expect
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/option.rs:894:21
   6: egui_wgpu::winit::Painter::configure_surface
             at /home/arvid/src/egui/crates/egui-wgpu/src/winit.rs:170:15
   7: egui_wgpu::winit::Painter::resize_and_generate_depth_texture_view_and_msaa_view
             at /home/arvid/src/egui/crates/egui-wgpu/src/winit.rs:342:9
   8: egui_wgpu::winit::Painter::on_window_resized
             at /home/arvid/src/egui/crates/egui-wgpu/src/winit.rs:405:13
   9: eframe::native::wgpu_integration::WgpuWinitRunning::on_window_event
             at /home/arvid/src/egui/crates/eframe/src/native/wgpu_integration.rs:779:25
  10: <eframe::native::wgpu_integration::WgpuWinitApp as eframe::native::winit_integration::WinitApp>::window_event
             at /home/arvid/src/egui/crates/eframe/src/native/wgpu_integration.rs:457:16
  11: <eframe::native::run::WinitAppWrapper<T> as winit::application::ApplicationHandler<eframe::native::winit_integration::UserEvent>>::window_event::{{closure}}
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:285:22
  12: eframe::native::event_loop_context::with_event_loop_context
             at /home/arvid/src/egui/crates/eframe/src/native/event_loop_context.rs:53:5
  13: <eframe::native::run::WinitAppWrapper<T> as winit::application::ApplicationHandler<eframe::native::winit_integration::UserEvent>>::window_event
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:280:9
  14: winit::event_loop::dispatch_event_for_app
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/event_loop.rs:642:52
  15: winit::platform::run_on_demand::EventLoopExtRunOnDemand::run_app_on_demand::{{closure}}
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform/run_on_demand.rs:76:13
  16: core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:294:13
  17: winit::platform_impl::linux::wayland::event_loop::EventLoop<T>::single_iteration
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform_impl/linux/wayland/event_loop/mod.rs:398:17
  18: winit::platform_impl::linux::wayland::event_loop::EventLoop<T>::pump_events
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform_impl/linux/wayland/event_loop/mod.rs:211:13
  19: winit::platform_impl::linux::wayland::event_loop::EventLoop<T>::run_on_demand
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform_impl/linux/wayland/event_loop/mod.rs:181:19
  20: winit::platform_impl::linux::EventLoop<T>::run_on_demand
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform_impl/linux/mod.rs:813:56
  21: <winit::event_loop::EventLoop<T> as winit::platform::run_on_demand::EventLoopExtRunOnDemand>::run_on_demand
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform/run_on_demand.rs:89:9
  22: winit::platform::run_on_demand::EventLoopExtRunOnDemand::run_app_on_demand
             at /home/arvid/.cargo/registry/src/index.crates.io-6f17d22bba15001f/winit-0.30.5/src/platform/run_on_demand.rs:75:9
  23: eframe::native::run::run_and_return
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:300:5
  24: eframe::native::run::run_wgpu::{{closure}}
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:357:13
  25: eframe::native::run::with_event_loop::{{closure}}
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:52:12
  26: std::thread::local::LocalKey<T>::try_with
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:270:16
  27: std::thread::local::LocalKey<T>::with
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/thread/local.rs:246:9
  28: eframe::native::run::with_event_loop
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:42:5
  29: eframe::native::run::run_wgpu
             at /home/arvid/src/egui/crates/eframe/src/native/run.rs:355:16
  30: eframe::run_native
             at /home/arvid/src/egui/crates/eframe/src/lib.rs:265:13
  31: hello_world::main
             at ./src/main.rs:12:5
  32: core::ops::function::FnOnce::call_once
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:250:5

To Reproduce
Steps to reproduce the behavior:

  1. Go to the egui/eframe examples/hello_world directory
  2. Enable wgpu feature for eframe in Cargo.toml
  3. cargo run

Expected behavior
The program should start without errors, or at least not panic (return an Error instead). And that error should be less cryptic. (I'm not a GPU expert, I'm trying to make a simple GUI.)

I don't know what the error means, but it should probably select a suitable surface instead of an unsupported surface. Whatever a surface is in this context.

Screenshots

Desktop (please complete the following information):

  • OS: Arch Linux (rolling release)
  • Browser: Not applicable, doing a native program, not WASM.
  • Desktop environment: KDE Plasma 6.2.0 Wayland
  • Eframe version: 0.29.1 (but also reproduces on latest master as of writing this, commit g707cd033)
  • Mesa 24.2.4-1
  • Kernel 6.11.2
  • Intel i7-8550U (Skylake) + nVidia MX150 running in pure Intel graphics mode. (But it doesn't make a difference to use hybrid mode or prime-run to run on the nVidia GPU).

Additional context

@VorpalBlade VorpalBlade added the bug Something is broken label Oct 15, 2024
egui-wgpu egui-winit porblems related to winit crash crash, panic, segfault, freeze, … labels Oct 23, 2024
Owner

I agree we shouldn't panic here. Maybe @Wumpf wants to take a look at some point? 🙏

native-linux Problem specific to Linux label Oct 23, 2024
Collaborator

Oct 23, 2024

yep, we definitely shouldn't crash and instead try to forward an error if needed

As to the actual source of the issue (and how to show it better): this might be a tricky one. This piece of log output seems to come directly from Wayland

wp_linux_drm_syncobj_manager_v1#55: error 0: surface already exists
Protocol error 0 on object wp_linux_drm_syncobj_manager_v1@55: 

And searching for this a bit brings up people recommending to downgrade the nvidia driver from 560 to 555, e.g. here
This is also being discussed in the nvidia forum here.
Given that you're saying you're only running with Intel, that's probably not the whole story 🤔

@VorpalBlade
Author

Is it possible to force Egui/eframe with wgpu to use xwayland as a test? Some environment variable perhaps?

Also, vkcube and glxgears both work fine under Wayland with both Intel and nvidia graphics. This indicates to me that both basic platform APIs work (at least to some extent). So it seems like the issue could be on the wgpu side.

Collaborator

Hmm yeah, then it is probably something that wgpu could do differently. There are unfortunately quite a few Wayland issues there that are waiting for contributors to look into them (the maintainers either don't have local repro cases (that includes myself) or don't consider them high priority (most of the time both)).

You could try with unset WAYLAND_DISPLAY and see if that makes a difference

@VorpalBlade
Author

So uh, tried this again. It isn't happening any more (at least in hybrid mode, with and without prime-run, I will reboot to pure Intel and see if it happens there).

Though according to nvidia-smi in hybrid mode without prime-run the WGPU build is still selecting the nVidia GPU. That sounds like a bug (the glow build respects the environment variables prime-run sets). What a mess (and I don't know if this is a nvidia, mesa or wgpu bug).

As I'm on Arch Linux (which is rolling release) that could be from some update or other. In this case the only difference I can find is that I'm now on KDE Plasma 6.2.1 and Kernel 6.11.3. Mesa is the same version.

I did not note down the nvidia driver version before, but now it is 560.35.03-16.

@VorpalBlade
Author

VorpalBlade commented Oct 23, 2024

Unsetting WAYLAND_DISPLAY does not make the program use X for some reason (the program is not listed in the output of xlsclients, whereas some other programs are). No idea how that works.

Building eframe with the wayland feature disabled doesn't make it use X11 either (huh?).

@VorpalBlade
Author

Aha, in pure Intel mode it still fails. That is interesting. If WGPU incorrectly uses the nVidia GPU when it is not supposed to (as indicated above, in hybrid mode) that could definitely cause issues.

As the program exits too quickly, it is hard to know. I tried to set a breakpoint where it crashes and use nvidia-smi to see if anything was using the nVidia GPU at that point, but nvidia-smi comes up blank (so unclear).

However, if I run vkcube while booted into pure Intel mode, it does select the nvidia GPU by default, and manages to run on it somehow. nvidia-smi reports that. For vkcube you can override this with --gpu_number 0. I can't spot any way to force one GPU or the other for eframe + WGPU though? Maybe this field has something but the link to the actual type is broken (goes to a page reading "docs.rs failed to build egui-wgpu-0.29.1")

Collaborator

Huh, strange, why would docs fail for 0.29 Oo. They're up for 0.28 and the struct hasn't changed. Yes, indeed you should be able to adjust which GPU is chosen by changing the power_preference.
To figure out more about the device selection process you may also want to run with RUST_LOG=trace

@lucasmerlin
Collaborator

Docs were fixed by this: #5204

But we'd need to make a new patch release for the docs to show up

@VorpalBlade
Author

Setting WGPU_BACKEND=opengl works around this issue. So it is an issue with vulkan + wgpu + some not yet clear combo of mesa/kernel/PRIME/nvidia.

There should be a way (either automatically by egui/eframe or by the application) to fall back to opengl if vulkan isn't working. Possibly this also means being able to fall back to glow if wgpu isn't working and you compiled in support for both.

A prerequisite for this is to not panic though (I would rather not mess around with catch_unwind, I believe it is generally a bad idea to do so).
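
The fallback idea above could be sketched roughly like this. It is only an illustration under stated assumptions: `Backend`, `init_with_fallback`, and the `try_init` callback are hypothetical names, not eframe or wgpu APIs, and the sketch presumes that initialization reports failure through a `Result` rather than a panic.

```rust
/// Hypothetical list of rendering backends an application might try.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Backend {
    Vulkan,
    OpenGl,
}

/// Try each backend in order and return the first one that initializes.
/// If every backend fails, return all collected errors so that the caller
/// can report them instead of panicking.
pub fn init_with_fallback<T, E>(
    backends: &[Backend],
    mut try_init: impl FnMut(Backend) -> Result<T, E>,
) -> Result<(Backend, T), Vec<(Backend, E)>> {
    let mut failures = Vec::new();
    for &backend in backends {
        match try_init(backend) {
            Ok(value) => return Ok((backend, value)),
            Err(err) => failures.push((backend, err)),
        }
    }
    Err(failures)
}
```

An application would pass its preferred order, for example `&[Backend::Vulkan, Backend::OpenGl]`, plus a closure that performs the real setup for one backend. The same shape would also cover falling back from wgpu to glow.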

> Yes, indeed you should be able to adjust which GPU is chosen by changing the power_preference.

I will try to look into that soon, I'm currently dealing with a rather nasty cold, so it may take a few days.

Collaborator

Yeah, I think that "this surface doesn't have any formats" should be an exclusion criterion for an adapter. That ofc still leaves the separate issue of the surface mysteriously not advertising any formats.
I think this should actually already happen at the wgpu level within its "compatible surface" check, but we should be able to do this within egui-wgpu in addition to getting a patch upstream 🤔
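
A minimal sketch of that exclusion rule, using stand-in types so it runs on its own. `AdapterCandidate` and `pick_adapter` are hypothetical names and this is not how egui-wgpu or wgpu currently select an adapter. With real wgpu, the format list would presumably come from `Surface::get_capabilities(&adapter).formats`.

```rust
/// Hypothetical summary of one adapter as seen through a given surface.
#[derive(Debug, Clone, PartialEq)]
pub struct AdapterCandidate {
    pub name: String,
    /// Texture formats the surface advertises for this adapter.
    /// An empty list means the adapter cannot present to the surface.
    pub surface_formats: Vec<String>,
}

/// Return the first candidate whose surface advertises at least one format,
/// skipping adapters for which the surface reports none.
pub fn pick_adapter(candidates: &[AdapterCandidate]) -> Option<&AdapterCandidate> {
    candidates.iter().find(|c| !c.surface_formats.is_empty())
}
```

Returning an `Option` leaves the "no usable adapter at all" case to the caller, which can then surface a readable error.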

@VorpalBlade
Author

Unfortunately WGPU_POWER_PREF (the environment variable variant of this) has absolutely no effect regardless of the value it is set to:

  • In Intel mode it always crashes.
  • In hybrid mode it always uses the nvidia GPU.

I have not tested messing with this programmatically.

Collaborator

for reference, found this issue on egl-wayland today which might be related. However, the user there reports that glxgears has the issue as well which wasn't the case for you 🤔
NVIDIA/egl-wayland#96

Collaborator

Dec 4, 2024

This probably relevant fix landed in wgpu 23.0.1: gfx-rs/wgpu#6510
this fixes vulkan sometimes incorrectly being identified as a viable adapter

we have to update wgpu in the egui lock file as well for this to take effect on the egui demo

there are ofc two remaining issues:

  • egui shouldn't crash in the first place here - replace expect with error
  • why is vulkan not a viable adapter here, something else is likely wrong as well
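
The first point can be illustrated with a small sketch. The backtrace shows the panic coming from `Option::expect` inside `Painter::configure_surface`, so the idea is to turn that `Option` into a `Result` with `ok_or` and let the caller decide. Everything below is a stand-in: `SurfaceConfig`, `SurfaceError`, and `default_config` are hypothetical and are not the actual egui-wgpu types or functions.

```rust
use std::fmt;

/// Hypothetical stand-in for a surface configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct SurfaceConfig {
    pub width: u32,
    pub height: u32,
}

/// Hypothetical error type; not part of egui-wgpu.
#[derive(Debug, Clone, PartialEq)]
pub enum SurfaceError {
    UnsupportedByAdapter,
}

impl fmt::Display for SurfaceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SurfaceError::UnsupportedByAdapter => {
                write!(f, "the surface isn't supported by this adapter")
            }
        }
    }
}

impl std::error::Error for SurfaceError {}

/// Hypothetical lookup that yields `None` when the adapter cannot present
/// to the surface (simulated here with a plain flag).
fn default_config(supported: bool, width: u32, height: u32) -> Option<SurfaceConfig> {
    supported.then(|| SurfaceConfig { width, height })
}

/// Instead of calling `.expect(..)` on the `Option`, map `None` to an error.
pub fn configure_surface(
    supported: bool,
    width: u32,
    height: u32,
) -> Result<SurfaceConfig, SurfaceError> {
    default_config(supported, width, height).ok_or(SurfaceError::UnsupportedByAdapter)
}
```

A caller further up could then log the error, try another adapter or backend, or return it from `run_native`, rather than aborting at startup.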

Collaborator

wgpu 23.0.1 is now on egui master branch. @VorpalBlade could you please check if that makes any difference for you?

@VorpalBlade
Author

I had time to test it today, using commit 13352d6 I enabled the wgpu feature and ran the hello world demo. (I think this is what I did last time, but it has been a while.)

It appears to work fine. I don't know for sure how to tell if it uses OpenGL or Vulkan though. But this seems to indicate GLES?

RUST_LOG=warn cargo run 
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.38s
     Running `/home/arvid/src/egui/target/debug/hello_world`
[2024-12-09T21:58:21Z WARN  wgpu_hal::vulkan::instance] InstanceFlags::VALIDATION requested, but unable to find layer: VK_LAYER_KHRONOS_validation
[2024-12-09T21:58:21Z WARN  wgpu_hal::gles::egl] Re-initializing Gles context due to Wayland window
[2024-12-09T21:58:21Z WARN  wgpu_hal::gles::adapter] Detected skylake derivative running on mesa i915. Clears to srgb textures will use manual shader clears.
[2024-12-09T21:58:21Z WARN  wgpu_hal::gles::adapter] Detected skylake derivative running on mesa i915. Clears to srgb textures will use manual shader clears.

Collaborator

Dec 10, 2024

nice! That's probably GLES: the way we run with wgpu here, it tries all adapters before it settles on one. If you run with RUST_LOG=debug it should show up somewhere more clearly

regardless let's keep this ticket open: egui still shouldn't panic on surface not supported
also (harder and orthogonal) we still need to figure out why wgpu fails picking Vulkan in some cases where it evidently (== vkcube works) should work
