RTC: Use prepared queries instead of *_post_meta functions #11325
chriszarate wants to merge 9 commits into WordPress:trunk
src/wp-includes/collaboration/class-wp-sync-post-meta-storage.php
$meta_value = sanitize_meta( self::SYNC_UPDATE_META_KEY, $meta_value, 'post', self::POST_TYPE );
$meta_value = maybe_serialize( $meta_value );

$result = $wpdb->insert(
@dmsnell have you ever found a way to run native prepared statements via $wpdb? Or is concatenating strings via $wpdb->insert() the best alternative that $wpdb offers?
src/wp-includes/collaboration/class-wp-sync-post-meta-storage.php
        'meta_value' => wp_json_encode( $awareness ),
    ),
    array( '%d', '%s', '%s' )
);
When this code path is processed in two concurrent requests, we may end up with two post meta records.
Unfortunately, WordPress doesn't give us many options here. If we had a unique index, we could use INSERT INTO ON DUPLICATE KEY UPDATE, but we don't. If we always used InnoDB tables, we could use SELECT FOR UPDATE, but we don't.
The only solution I can think of is advisory locks, so something like:
    global $wpdb;
    $lock_name = "meta_{$post_id}_{$meta_key}";
    $lock_acquired = $wpdb->get_var(
        $wpdb->prepare("SELECT GET_LOCK(%s, 5)", $lock_name)
    );
    if ($lock_acquired) {
        try {
            // Safe to do get_var/update/insert
        } finally {
            $wpdb->query($wpdb->prepare("SELECT RELEASE_LOCK(%s)", $lock_name));
        }
    }
I don't trust $wpdb->prepare(), but I also don't see any alternatives. insert() and update() use prepare() internally anyway, and the public $wpdb API doesn't seem to provide any alternatives. I'd really love to expose actual prepared statements and deprecate all these string-stitching methods in 7.1 or 7.2.
Isn't this the same race condition that exists in update_post_meta? For single meta values, if the meta key does not exist, it delegates to add_post_meta. That function bails if the key does exist:
wordpress-develop/src/wp-includes/meta.php
Lines 265 to 267 in 0e73dcd
wordpress-develop/src/wp-includes/meta.php
Lines 95 to 102 in 0e73dcd
At any rate, I can think of two alternate solutions:
- Accept the race condition and add ORDER BY meta_id DESC LIMIT 1 to the SELECT query. A duplicate row is created, but all peers will coalesce on the same row.
  - Incidentally, a similar race condition exists for the post creation in get_storage_post_id and is resolved similarly.
- Since we don't have a UNIQUE index to help us, abandon post meta and store the awareness state in the post itself (e.g., post_content).
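To make the first option concrete, here is a minimal in-memory sketch of how reading with ORDER BY meta_id DESC LIMIT 1 lets peers converge on one row after a duplicate insert. It is an illustration only: `Fake_Meta_Table`, its methods, the `awareness` key, and the JSON payloads are hypothetical stand-ins, not the code in this PR and not the `$wpdb` API.

```php
<?php
// Hypothetical in-memory stand-in for the postmeta table (NOT the $wpdb API).
final class Fake_Meta_Table {
	private array $rows  = array();
	private int $next_id = 1;

	// Mimics an auto-increment INSERT and returns the new meta_id.
	public function insert( int $post_id, string $meta_key, string $meta_value ): int {
		$meta_id                = $this->next_id++;
		$this->rows[ $meta_id ] = compact( 'meta_id', 'post_id', 'meta_key', 'meta_value' );
		return $meta_id;
	}

	// Mimics: SELECT ... WHERE post_id = ? AND meta_key = ? ORDER BY meta_id DESC LIMIT 1
	public function select_latest( int $post_id, string $meta_key ): ?array {
		$latest = null;
		foreach ( $this->rows as $row ) {
			$matches = $row['post_id'] === $post_id && $row['meta_key'] === $meta_key;
			if ( $matches && ( null === $latest || $row['meta_id'] > $latest['meta_id'] ) ) {
				$latest = $row;
			}
		}
		return $latest;
	}

	// Mimics: UPDATE ... SET meta_value = ? WHERE meta_id = ?
	public function update( int $meta_id, string $meta_value ): void {
		$this->rows[ $meta_id ]['meta_value'] = $meta_value;
	}

	public function row_count(): int {
		return count( $this->rows );
	}
}

$table   = new Fake_Meta_Table();
$post_id = 1;           // Assumed storage post ID.
$key     = 'awareness'; // Assumed meta key.

// The race: both requests run their SELECT before either INSERT lands,
// so both see "no row yet" and both insert.
$seen_a = $table->select_latest( $post_id, $key );
$seen_b = $table->select_latest( $post_id, $key );
$id_a   = $table->insert( $post_id, $key, '{"peers":["a"]}' );
$id_b   = $table->insert( $post_id, $key, '{"peers":["b"]}' );

// Every later poll reads the newest row and updates that same row,
// so the older duplicate is simply never touched again.
$active = $table->select_latest( $post_id, $key );
$table->update( $active['meta_id'], '{"peers":["a","b"]}' );

printf( "rows: %d, active meta_id: %d\n", $table->row_count(), $active['meta_id'] );
```

In this sketch the duplicate row stays in the table, but because reads and writes both target the highest meta_id, the state written by peer "a" during the race is repopulated on its next poll.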
@adamziel What do you think? Are advisory locks still a better solution?
I've gone ahead and pushed up solution 1 in 4a4596d.
I haven't bothered to implement clean up since there should be at most one duplicate row per room.
Incidentally, a similar race condition exists for the post creation in get_storage_post_id
Just FYI (since I brought it up), I don't think this race condition requires solving. With the polling provider, no sync updates are sent until another collaborator is detected. In other words, we are guaranteed at least two consecutive polling requests sent by the same user to "initialize" the room, so we should never hit this condition.
@chriszarate Thank you for elaborating!
Isn't this the same race condition that exists in update_post_meta?
Sure. I'd have the same note if we'd used update_post_meta here.
Accept the race condition and add ORDER BY meta_id DESC LIMIT 1 to the SELECT query. A duplicate row is created, but all peers will coalesce on the same row.
As long as some important piece of information isn't getting trapped in that second record, this sounds like a good solution. Thank you! And good inline commenting as well.
I haven't bothered to implement clean up since there should be at most one duplicate row per room.
I'm not sure I follow – I don't see any DELETE query for the awareness state so I think we're good?
Also, theoretically you can exploit this race condition to insert 100 duplicates. It doesn't seem very likely, but it is possible. I think ignoring those duplicates as you did in 4a4596d is a smart idea.
With the polling provider, no sync updates are sent until another collaborator is detected.
While I understand that's what the current implementation intends to do, sometimes things don't go as planned. Accounting for unexpected duplicate meta keys early on may save hours or days of trying to reproduce a weird, non-deterministic bug report later.
As long as some important piece of information isn't getting trapped in that second record, this sounds like a good solution.
Awareness state is not critical to sync behavior. If this race condition is hit, we lose the awareness state for a single peer but it will be restored on the next poll (and by that point the race condition no longer exists).
I'm not sure I follow – I don't see any DELETE query for the awareness state so I think we're good?
Oops. I initially implemented a cleanup function but removed it and forgot to update the comment. Fixed in 5168269.
Also, theoretically you can exploit this race condition to insert 100 duplicates. It doesn't seem very likely, but it is possible.
The race condition can only be hit when the meta key does not yet exist, so it does seem pretty unlikely. Maybe a determined bad actor could squeeze out x - 1 duplicates, where x is the number of PHP workers?
While I understand that's what the current implementation intends to do, sometimes things don't go as planned. Accounting for unexpected duplicate meta keys early on may save hours or days of trying to reproduce a weird, non-deterministic bug report later.
Great advice. I'll think about ways to address that in a separate update.
@chriszarate almost there! I've left one more comment to resolve a race condition issue. The rest of the change looks good to me!
adamziel
left a comment
My notes were addressed so I'll give my approval. Thank you @chriszarate! There are still some pending comments from others to address before merging.
Opening this for conversation: What if we used direct database operations instead of post meta functions? Does that alleviate concerns about changing the APIs? What are the risks, aside from the security of the operations themselves?
Trac ticket: https://core.trac.wordpress.org/ticket/64696
Trac ticket: https://core.trac.wordpress.org/ticket/64916
AI usage
Unit tests written with assistance from Claude Opus 4.6.