
Conversation

yamt (Contributor) commented Sep 8, 2023

Note: this doesn't work in some situations, such as blocking system calls.
Most of them will be fixed (at least for POSIX-like platforms) by #2516.

yamt added 5 commits September 8, 2023 12:55
IMO, it's evil to clear exceptions. However, with the current API,
it's necessary for the REPL, at least.

      struct spread_exception_data data;
    - data.skip = exec_env;
    + data.skip = NULL;
Collaborator:
If data.skip is set to NULL, there seems to be no need to check if (exec_env != data->skip) in set_exception_visitor, since exec_env is never NULL. There is also no need for the spread_exception_data struct; how about just passing the exception to the user_data argument of set_exception_visitor?

yamt (Author):
Yes, it can be simplified if we want.
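
For illustration, a minimal sketch of the simplification being discussed. The visitor shape and the traversal contract are assumptions based on this thread; the wasm_runtime_* calls are WAMR's public API, but the exact internal signatures may differ:

```c
#include "wasm_export.h"

/* Sketch only: assumes the cluster traversal invokes the visitor as
 * visitor(node, user_data), where node is a wasm_exec_env_t. Passing the
 * exception string itself as user_data removes the need for the
 * spread_exception_data struct and the data->skip check. */
static void
set_exception_visitor(void *node, void *user_data)
{
    wasm_exec_env_t exec_env = (wasm_exec_env_t)node;
    const char *exception = (const char *)user_data;

    /* exec_env is never NULL here, so no skip check is needed */
    wasm_runtime_set_exception(wasm_runtime_get_module_inst(exec_env),
                               exception);
}
```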

wenyongh (Collaborator) left a comment:
LGTM


    bh_assert(module_inst_comm->module_type == Wasm_Module_Bytecode
              || module_inst_comm->module_type == Wasm_Module_AoT);
    wasm_set_exception(module_inst, "terminated by user");
Collaborator:

It seems our purpose is to terminate all threads; how about calling wasm_cluster_terminate_all directly rather than setting an exception?

yamt (Author):

wasm_cluster_terminate_all seems unused, untested, and broken (e.g. the cluster_list_lock vs cluster->lock locking order).
IMO it should be removed.
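
As background on the locking-order hazard mentioned above: if one code path takes cluster_list_lock and then cluster->lock while another path takes the same pair in the opposite order, the two threads can deadlock. A generic pthread sketch, illustrative only and not WAMR code (the lock names merely mirror the ones in the comment):

```c
#include <pthread.h>

static pthread_mutex_t cluster_list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cluster_lock = PTHREAD_MUTEX_INITIALIZER;

/* Path 1: list lock first, then cluster lock. */
static void *path1(void *arg)
{
    pthread_mutex_lock(&cluster_list_lock);
    pthread_mutex_lock(&cluster_lock); /* blocks if path2 holds it */
    pthread_mutex_unlock(&cluster_lock);
    pthread_mutex_unlock(&cluster_list_lock);
    return NULL;
}

/* Path 2: the opposite order. Run concurrently with path1, each thread
 * can end up holding one lock while waiting forever for the other. */
static void *path2(void *arg)
{
    pthread_mutex_lock(&cluster_lock);
    pthread_mutex_lock(&cluster_list_lock); /* blocks if path1 holds it */
    pthread_mutex_unlock(&cluster_list_lock);
    pthread_mutex_unlock(&cluster_lock);
    return NULL;
}
```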

Collaborator:

@yamt We are implementing the thread suspend/resume mechanism in PR #2320, in which, when a thread wants to suspend or terminate other threads, it first locks the whole thread list (and also gets the acknowledgement of the other threads, which then enter a waiting state). Shall we change to call wasm_cluster_terminate_all after that PR is implemented and tested?

yamt (Author):

> @yamt We are implementing the thread suspend/resume mechanism in PR #2320, in which, when a thread wants to suspend or terminate other threads, it first locks the whole thread list (and also gets the acknowledgement of the other threads, which then enter a waiting state). Shall we change to call wasm_cluster_terminate_all after that PR is implemented and tested?

How it's implemented is not critical to me, as long as it works.
If you want to replace it with something else later, that's fine.

However, I want to provide a way for the terminated thread (the embedder app calling one of our execute APIs) to notice that it was terminated by wasm_runtime_terminate.
A nice property of a trap here is that existing embedders are likely checking for it already.
I'm not sure how wasm_cluster_terminate_all can work in that regard.

Also, the current wasm_cluster_terminate_all seems to join all the threads by itself.
I'm not sure how that's supposed to work for a usage like iwasm, where the wasm module is executed by the main thread.
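
To make the point about traps concrete, here is a hedged sketch of the common embedder pattern using WAMR's public API (setup and error handling omitted). A thread terminated via wasm_runtime_terminate would surface here as a failed call plus the "terminated by user" exception string, so no new embedder-side plumbing is needed:

```c
#include <stdio.h>
#include "wasm_export.h"

/* Typical embedder call pattern: if execution traps, inspect the
 * exception string. Termination by wasm_runtime_terminate shows up
 * the same way as any other trap. */
static void
run_and_check(wasm_exec_env_t exec_env, wasm_function_inst_t func,
              wasm_module_inst_t module_inst)
{
    uint32_t argv[1] = { 0 };

    if (!wasm_runtime_call_wasm(exec_env, func, 0, argv)) {
        const char *exception = wasm_runtime_get_exception(module_inst);
        printf("wasm trap: %s\n", exception ? exception : "(unknown)");
    }
}
```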

Collaborator:

Yes, a trap may be a good option for embedders to check; maybe we can add a flag to wasm_runtime_terminate and wasm_cluster_terminate_all to indicate whether to throw the exception.

The current wasm_cluster_terminate_all joins all the threads, which may include the main thread that initialized the runtime (if the embedder calls the API from another thread it created itself). That means wasm_cluster_terminate_all would wait until the main thread exits (and the runtime is destroyed), which is not what the embedder expects; I think we can change it to wait until the cluster is destroyed in the cluster set. A potential issue with this PR is that wasm_cluster_terminate_all doesn't wait until all threads have finished their jobs; I am not sure whether that will cause unexpected behavior. It would be better if wasm_cluster_terminate_all could figure that out.

wenyongh merged commit a6fda9b into bytecodealliance:main on Sep 13, 2023
vickiegpt pushed a commit to vickiegpt/wamr-aot-gc-checkpoint-restore that referenced this pull request May 27, 2024
Add API wasm_runtime_terminate to terminate a module instance
by setting the "terminated by user" exception on the module instance.

And update the product-mini of POSIX platforms.

Note: this doesn't work in some situations, such as blocking system calls.
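
A minimal usage sketch under the semantics this commit message describes. The watchdog setup is hypothetical, assuming the public signature void wasm_runtime_terminate(wasm_module_inst_t) added by this PR:

```c
#include <pthread.h>
#include <unistd.h>
#include "wasm_export.h"

/* Hypothetical watchdog thread: after a timeout, terminate the module
 * instance that the main thread is running. The running thread observes
 * the "terminated by user" exception and traps out -- unless, per the
 * note above, it is stuck in a blocking system call. */
static void *
watchdog(void *arg)
{
    wasm_module_inst_t module_inst = (wasm_module_inst_t)arg;

    sleep(5);                            /* arbitrary timeout */
    wasm_runtime_terminate(module_inst); /* sets the exception */
    return NULL;
}

/* Usage: pthread_create(&tid, NULL, watchdog, module_inst); then call
 * wasm_runtime_call_wasm(...) on the main thread and check the result. */
```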