feat: support concurrent chunk uploads#74
Conversation
Greptile SummaryThis PR adds concurrent chunk uploads to the
Confidence Score: 3/5The new concurrent upload path works correctly on the happy path but leaks curl resources when a chunk upload fails mid-batch; long-running workers doing large uploads could exhaust file descriptors over time. All three upload service files share the same curl handle cleanup gap: when AppwriteException is thrown inside the foreach loop, curl_multi_close is never called and unprocessed handles are abandoned. In high-throughput or long-running PHP-FPM environments this can deplete file descriptors. Additionally, Storage.php has a redundant dead-code initialisation, and the header serialisation in the concurrent path omits the conventional space after the colon, diverging from the sequential path. src/Appwrite/Services/Storage.php, src/Appwrite/Services/Functions.php, and src/Appwrite/Services/Sites.php all share the same concurrent upload implementation and need the curl cleanup and header-format fixes reviewed before merging. Important Files Changed
Reviews (1): Last reviewed commit: "feat: support concurrent chunk uploads" | Re-trigger Greptile |
| foreach ($handles as $handleInfo) { | ||
| $ch = $handleInfo['handle']; | ||
| $body = curl_multi_getcontent($ch); | ||
| $statusCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); | ||
| $contentType = $responseHeaders[spl_object_id($ch)]['content-type'] ?? ''; | ||
|
|
||
| if (curl_errno($ch)) { | ||
| throw new AppwriteException(curl_error($ch), $statusCode, '', $body); | ||
| } | ||
|
|
||
| $chunkResponse = str_starts_with($contentType, 'application/json') ? json_decode($body, true) : $body; | ||
|
|
||
| if($statusCode >= 400) { | ||
| if(is_array($chunkResponse)) { | ||
| throw new AppwriteException($chunkResponse['message'], $statusCode, $chunkResponse['type'] ?? '', json_encode($chunkResponse)); | ||
| } | ||
|
|
||
| throw new AppwriteException($chunkResponse, $statusCode, '', $chunkResponse); | ||
| } | ||
|
|
||
| $completedCount++; | ||
| $uploadedSize += $handleInfo['chunk']['end'] - $handleInfo['chunk']['start']; | ||
| if($isUploadComplete($chunkResponse)) { | ||
| $response = $chunkResponse; | ||
| } | ||
| if($onProgress !== null) { | ||
| $onProgress([ | ||
| '$id' => $uploadId, | ||
| 'progress' => $uploadedSize / $size * 100, | ||
| 'sizeUploaded' => $uploadedSize, | ||
| 'chunksTotal' => $totalChunks, | ||
| 'chunksUploaded' => $completedCount, | ||
| ]); | ||
| } | ||
|
|
||
| curl_multi_remove_handle($multiHandle, $ch); | ||
| } | ||
|
|
||
| curl_multi_close($multiHandle); |
There was a problem hiding this comment.
curl handles never explicitly closed after removal
After curl_multi_remove_handle($multiHandle, $ch), the individual curl handles are never closed with curl_close($ch). In the normal path the multi handle's reference to each handle is released when curl_multi_close is called, allowing eventual GC. However when an exception is thrown inside the foreach loop (either the curl_errno branch or the $statusCode >= 400 branch), both the multi handle and all remaining individual handles are abandoned without cleanup. In long-running PHP-FPM workers or CLI scripts with large multi-chunk uploads, this can exhaust available file descriptors. The same pattern appears in Functions.php and Sites.php.
| $uploadId = ''; | ||
| $uploadId = $fileId ?? ''; | ||
| $totalChunks = (int) ceil($size / Client::CHUNK_SIZE); |
There was a problem hiding this comment.
Dead code:
$uploadId = '' on the first line is immediately overwritten by the assignment on the next line and has no effect. This appears to be a copy-paste artefact from Functions.php (which has no corresponding $fileId initialisation line).
| $uploadId = ''; | |
| $uploadId = $fileId ?? ''; | |
| $totalChunks = (int) ceil($size / Client::CHUNK_SIZE); | |
| $uploadId = $fileId ?? ''; | |
| $totalChunks = (int) ceil($size / Client::CHUNK_SIZE); |
| $headers = []; | ||
| foreach ($chunkHeaders as $key => $value) { | ||
| $headers[] = $key . ':' . $value; | ||
| } |
There was a problem hiding this comment.
HTTP header format missing space after colon
Headers are assembled as key . ':' . value (no space), while the conventional and Client::call()-consistent form is key . ': ' . value. Most servers tolerate both, but some strict reverse-proxies or WAFs reject the no-space variant, meaning the concurrent upload path could produce 400/502 responses that the sequential first-chunk path would not. The same pattern appears in Sites.php and Storage.php.
|
Closing in favor of focused PR #75. |
This PR updates the SDK to support concurrent chunk uploads.