forked from tensorflow/tensorflow
Develop upstream sync 251224 #3170
Open
mmakevic-amd wants to merge 763 commits into develop-upstream from develop-upstream-sync-251224
+64,242
−45,014
+ Allow the chain to start from <transpose, reshape, bitcast> instead of only reshape + Add a layout sensitive mode to the simplification PiperOrigin-RevId: 846150097
Imported from GitHub PR openxla/xla#35479 Add clangd files and directories to .gitignore Copybara import of the project: -- 2999b064c6b756dfc0355d863b863aff1bdea2fa by Eugene Zhulenev <ezv@amazon.com>: Add clangd files and directories to .gitignore Add clangd files and directories to .gitignore Merging this change closes tensorflow#35479 PiperOrigin-RevId: 846156873
PiperOrigin-RevId: 846167560
…intExpression. Helps with narrowing down which constraints are unsat. There can be many constraints (e.g. WGMMA in Mosaic), and while debugging it's unclear which one is violated at a glance. As a follow up, we can also introduce names to each Constraint to make the identification even easier. PiperOrigin-RevId: 846168559
PiperOrigin-RevId: 846171859
PiperOrigin-RevId: 846173555
…TF normalization in emitters 0) Fix a bug (?) in normalization util when normalized dim contains a single dimension 1) Perform normalization OTF for Transpose emitter selection 2) Use normalized shape for unrolling decision in kLoop emitter 3) Use normalized shape to detect slow transposes in triton fusion rewriter PiperOrigin-RevId: 846191206
…t.cc This change updates custom_call_test.cc to dynamically register custom call targets and FFI handlers using the runtime-determined platform name (CUDA or ROCM). This replaces the use of static registration macros, allowing the tests to run correctly across different GPU platforms and the reference interpreter. This way we can avoid compile time branches like `#ifdef GOOGLE_CUDA` and similar. Also: 1. Converts usage of raw CUDA driver API functions to StreamExecutor functionality 2. Replaces some legacy CustomCalls by FFI 3. Converts the while test target to HloRunnerPjRt 4. Removes a test case from the Token tests with a nested type in the output type, since that's not supported by our PjRt implementation. PiperOrigin-RevId: 846196106
The `fd.Size()` check doesn't work when the file descriptor is invalid and only the path was given. PiperOrigin-RevId: 846207406
PiperOrigin-RevId: 846213195
PiperOrigin-RevId: 846214738
PiperOrigin-RevId: 846217449
PiperOrigin-RevId: 846221230
PiperOrigin-RevId: 846221752
The ROCm code path doesn't go through NcclCollectives anymore. Therefore these checks are obsolete. PiperOrigin-RevId: 846226180
PiperOrigin-RevId: 846226345
PiperOrigin-RevId: 846231902
PiperOrigin-RevId: 846234559
PiperOrigin-RevId: 846238886
This migrates `builder.create<Op>()` => `Op::create()` PiperOrigin-RevId: 846246070
This change moves the definition of `AotCompilationResult` into a new header file `compiled_module.h` and renames the class to `CompiledModule`. `CompilationResult` would have been the preferred name, but it's already in-use elsewhere. The original `AotCompilationResult` is kept as a deprecated alias. PiperOrigin-RevId: 846246415
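The rename-with-alias pattern described in this commit can be sketched as follows. This is a minimal illustration, not the contents of `compiled_module.h`: the class body, the `name()` accessor, and the use of the `[[deprecated]]` attribute on the alias are assumptions made for the example.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the renamed class. The real CompiledModule
// declared in compiled_module.h has a different interface.
class CompiledModule {
 public:
  explicit CompiledModule(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

// The old name stays available as an alias, so existing callers keep
// compiling. Marking it [[deprecated]] (an assumption here) makes the
// compiler warn at each remaining use of the old name.
using AotCompilationResult [[deprecated("Use CompiledModule instead")]] =
    CompiledModule;
```

With this shape, code that still writes `AotCompilationResult` compiles with a deprecation warning and refers to exactly the same type as `CompiledModule`, so callers can be migrated incrementally.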
…ests, rather than on the original dimensions. These are simpler both to write and to think about. No behavior changes are intended. PiperOrigin-RevId: 846253300
… its allocation later Imported from GitHub PR openxla/xla#35510 📝 Summary of Changes Initialize collectives pointer to nullptr 🎯 Justification GPU runtime options are initialized in TF and transferred to XLA to execute thunks. Since the memory is not cleared, the collectives pointer refers to uninitialized memory, resulting in a segfault during NCCL collective initialization and operation. 🚀 Kind of Contribution 🐛 Bug Fix Copybara import of the project: -- 2bfc6fbddbf2f9a926dd504169c56be45d2f1a0a by Harsha HS <Harsha.HavanurShamsundara@amd.com>: [ROCm] Initialze collectives to nullptr to force its allocation later Merging this change closes tensorflow#35510 PiperOrigin-RevId: 846266642
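The failure mode in this commit, and why a null initial value forces a later allocation, can be illustrated with a small sketch. The types `Collectives` and `GpuRunOptions` and the helper `GetOrCreateCollectives` are hypothetical stand-ins, not the real TF or XLA definitions.

```cpp
// Hypothetical stand-in for the collectives implementation.
struct Collectives {
  int num_devices = 1;
};

// Hypothetical stand-in for the GPU run options passed from TF to XLA.
struct GpuRunOptions {
  // Default member initializer: without it, a default-initialized object
  // would hold an indeterminate pointer, and the null check below could
  // be skipped, leading to a dereference of garbage memory.
  Collectives* collectives = nullptr;
};

// Lazily provides a collectives instance the first time one is needed.
// This only works if "not set yet" is reliably represented as nullptr.
Collectives* GetOrCreateCollectives(GpuRunOptions& options) {
  static Collectives default_collectives;
  if (options.collectives == nullptr) {
    options.collectives = &default_collectives;
  }
  return options.collectives;
}
```

The design point is that lazy initialization depends on a trustworthy sentinel: once the pointer is guaranteed to start as `nullptr`, the first use allocates or assigns it instead of reading uninitialized memory.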
This migrates `builder.create<Op>()` => `Op::create()` PiperOrigin-RevId: 846268375
…utor_test. The local_defines for CUDA/ROCM are not required for this test. Added explicit includes for headers used in gpu_executor_test.cc. PiperOrigin-RevId: 846269233
Imported from GitHub PR openxla/xla#35482 Sometimes json incorrectly parses compile commands from bazel, and we end up passing them as ``` "-isystem path/to/includes" ``` to `clangd`, and these flags are parsed incorrectly Copybara import of the project: -- adf291e21b098d79fa3be4065ee02fafdf5c660a by Eugene Zhulenev <ezhulenev@google.com>: Correctly generate compile_commands.json Merging this change closes tensorflow#35482 PiperOrigin-RevId: 846269357
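The problem described here is that a flag and its value arrive fused into one argument. One plausible way to repair such an argument list is sketched below. The function name `SplitFusedIsystemFlags` is hypothetical, and this is not necessarily how the actual fix to the `compile_commands.json` generation works.

```cpp
#include <string>
#include <vector>

// Splits arguments of the form "-isystem path/to/includes" (one string)
// into two separate arguments, "-isystem" and "path/to/includes".
// All other arguments are passed through unchanged.
std::vector<std::string> SplitFusedIsystemFlags(
    const std::vector<std::string>& args) {
  static const std::string kPrefix = "-isystem ";
  std::vector<std::string> out;
  for (const std::string& arg : args) {
    const bool fused = arg.size() > kPrefix.size() &&
                       arg.compare(0, kPrefix.size(), kPrefix) == 0;
    if (fused) {
      out.push_back("-isystem");
      // Everything after the first space is treated as the include path.
      out.push_back(arg.substr(kPrefix.size()));
    } else {
      out.push_back(arg);
    }
  }
  return out;
}
```

Emitting the flag and its path as separate entries lets a consumer of the compilation database see the same argument vector that the compiler driver would receive.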
Depending on the compiler, `testing::TempDir() + __FUNCTION__` may generate an invalid file name. PiperOrigin-RevId: 846275995
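A defensive way to build a temp file name from a function name is to sanitize it first. The sketch below assumes the problem comes from characters such as `:` or parentheses appearing in the expanded name, which the commit message does not spell out. The helper `SanitizeForFileName` is hypothetical and is not the change made in this commit.

```cpp
#include <cctype>
#include <string>

// Replaces every character that is not alphanumeric, '_', '-' or '.'
// with '_', so the result is safe to append to a directory path.
std::string SanitizeForFileName(const std::string& raw) {
  std::string out;
  out.reserve(raw.size());
  for (unsigned char c : raw) {
    const bool keep = std::isalnum(c) || c == '_' || c == '-' || c == '.';
    out.push_back(keep ? static_cast<char>(c) : '_');
  }
  return out;
}
```

A test could then use `testing::TempDir() + SanitizeForFileName(__FUNCTION__)`, or avoid the issue entirely by using a fixed literal name per test.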
…iguous send/recv buffers Imported from GitHub PR openxla/xla#35463 With latest NCCL we can use `ncclAlltoall` API directly without having to launch grouped send and recv operations. Copybara import of the project: -- 0630f4d48049b211442dcb1754e521a4b1f37f7b by Eugene Zhulenev <ezv@amazon.com>: [xla:gpu] Support ncclAlltoall directly for contiguous send/recv buffers Merging this change closes tensorflow#35463 PiperOrigin-RevId: 846277559
…is supported by libraries. PiperOrigin-RevId: 846299624
PiperOrigin-RevId: 848297480
Modify Thunk's serialization PiperOrigin-RevId: 848309350
Modify Thunk's serialization PiperOrigin-RevId: 848323137
…lectiveDeviceListBase in place of vector<vector<int>> and reduce cognitive complexity in `GetDefaultCollectiveOpsCreator`. PiperOrigin-RevId: 848356290
PiperOrigin-RevId: 848375547
PiperOrigin-RevId: 848382953
PiperOrigin-RevId: 848393091
PiperOrigin-RevId: 848423026
PiperOrigin-RevId: 848429925
PiperOrigin-RevId: 848434764
PiperOrigin-RevId: 848441651
…stub. The `xtile_compiler` target now acts as a selector, depending on either `xtile_compiler_impl` or `xtile_compiler_stub` based on whether CUDA or ROCm is configured. The full implementation is moved to the new `xtile_compiler_impl` target, while `xtile_compiler_stub` provides a minimal version for other configurations. This has the advantage that build_cleaner can run on xtile_compiler_impl. (Doing that removed around 20 dependencies) PiperOrigin-RevId: 848442213
PiperOrigin-RevId: 848455572
PiperOrigin-RevId: 848467225
PiperOrigin-RevId: 848467272
PiperOrigin-RevId: 848475361
It has to become a part of Compiler::CompilerOptions, but CompilerOptions should not depend on PJRT. So, moving it here. PiperOrigin-RevId: 848523186
PiperOrigin-RevId: 848534440
Collaborator
This test failed; it seems the backend config (…
Motivation
Bi-weekly sync from TensorFlow upstream
Disabled tests:
Submission Checklist