-
Notifications
You must be signed in to change notification settings - Fork 74
Switch amdgpu-windows-interops to rocm-systems subfolder #1358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Switch amdgpu-windows-interops to rocm-systems subfolder #1358
Conversation
…/amdgpu-windows-interop-superrepo
- Separate the dvc exec calls into its own function. - Move the execution of the calls.
build_tools/fetch_sources.py
Outdated
# presently, only amdgpu-windows-interop in rocm-systems uses DVC, but... | ||
# eventually, DVC will be rolled out to math libraries and in linux | ||
print(f"dvc detected in {project_dir}, running dvc pull") | ||
exec(["dvc", "pull"], cwd=project_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to check that dvc
exists before running this command. We can then log a helpful error message like
Could not find
dvc
on PATH so large files could not be fetched
To install dvc, runpip install dvc
orpip install -r requirements.txt
Can check with shutil.which
Thankfully, our documentation already suggests running fetch_sources.py from a venv with the requirements installed (https://github.com/ROCm/TheRock?tab=readme-ov-file#building-from-source), but I would definitely prefer if that wasn't a hard requirement. On my systems I may opt for a system-wide install of dvc since I'm already juggling quite a few Python venvs for various builds and tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied this feedback. Thanks!
build_tools/fetch_sources.py
Outdated
for project in dvc_projects: | ||
if project in projects: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can save on indentation by returning/continuing early here
https://testing.googleblog.com/2017/06/code-health-reduce-nesting-reduce.html
def pull_large_files(dvc_projects, projects):
for project in dvc_projects:
if not project in projects:
continue
submodule_path = get_submodule_path(project)
...
if not dvc_config_file.exists():
log(...)
continue
# check for DVC config in the ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied this feedback. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Structurally this LGTM. I'm wondering about the cost of this new dvc dependency for build environment setup. Curious to see if we can trim it down somehow.
requirements.txt
Outdated
# data version control for pal .lib files | ||
# and eventually large kernels from miopen | ||
# and large logic files from hipblaslt | ||
dvc==3.62.0 | ||
dvc-s3==3.2.2 | ||
# dvc dependencies that pip takes time | ||
# resolving versioning without guidance | ||
aiobotocore==2.24.1 | ||
boto3==1.39.11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes, these new requirements pull in a lot of indirect deps. 108 packages to be specific, compared to 131 total for this whole requirements.txt.
That's another reason to allow (and maybe recommend) installing outside of pip.
We could make these deps conditional on the platform (; platform_system == "Windows"
), though that won't help for when MIOpen and hipblaslt also start to use it. I wonder if there's a more lightweight package or dependency chain we could use instead...
λ py -V:3.12 -m venv 3.12.venv && .\3.12.venv\Scripts\activate.bat
(3.12.venv) λ pip freeze
(3.12.venv) λ pip install dvc==3.62.0 dvc-s3==3.2.2 aiobotocore==2.24.1 boto3==1.39.11
(3.12.venv) λ pip freeze
aiobotocore==2.24.1
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aiohttp-retry==2.9.1
aioitertools==0.12.0
aiosignal==1.4.0
amqp==5.3.1
annotated-types==0.7.0
antlr4-python3-runtime==4.9.3
appdirs==1.4.4
asyncssh==2.21.0
atpublic==6.0.1
attrs==25.3.0
billiard==4.2.1
boto3==1.39.11
botocore==1.39.11
celery==5.5.3
certifi==2025.8.3
cffi==1.17.1
charset-normalizer==3.4.3
click==8.2.1
click-didyoumean==0.3.1
click-plugins==1.1.1.2
click-repl==0.3.0
colorama==0.4.6
configobj==5.0.9
cryptography==45.0.7
dictdiffer==0.9.0
diskcache==5.6.3
distro==1.9.0
dpath==2.2.0
dulwich==0.24.1
dvc==3.62.0
dvc-data==3.16.12
dvc-http==2.32.0
dvc-objects==5.1.1
dvc-render==1.0.2
dvc-s3==3.2.2
dvc-studio-client==0.22.0
dvc-task==0.40.2
entrypoints==0.4
filelock==3.19.1
flatten-dict==0.4.2
flufl.lock==8.2.0
frozenlist==1.7.0
fsspec==2025.9.0
funcy==2.0
gitdb==4.0.12
GitPython==3.1.45
grandalf==0.8
gto==1.8.0
hydra-core==1.3.2
idna==3.10
iterative-telemetry==0.0.10
jmespath==1.0.1
kombu==5.5.4
markdown-it-py==4.0.0
mdurl==0.1.2
multidict==6.6.4
networkx==3.5
omegaconf==2.3.0
orjson==3.11.3
packaging==25.0
pathspec==0.12.1
platformdirs==4.4.0
prompt_toolkit==3.0.52
propcache==0.3.2
psutil==7.0.0
pycparser==2.22
pydantic==2.11.7
pydantic_core==2.33.2
pydot==4.0.1
pygit2==1.18.2
Pygments==2.19.2
pygtrie==2.5.0
pyparsing==3.2.3
python-dateutil==2.9.0.post0
pywin32==311
PyYAML==6.0.2
requests==2.32.5
rich==14.1.0
ruamel.yaml==0.18.15
ruamel.yaml.clib==0.2.12
s3fs==2025.9.0
s3transfer==0.13.1
scmrepo==3.5.2
semver==3.0.4
setuptools==80.9.0
shellingham==1.5.4
shortuuid==1.0.13
shtab==1.7.2
six==1.17.0
smmap==5.0.2
sqltrie==0.11.2
tabulate==0.9.0
tomlkit==0.13.3
tqdm==4.67.1
typer==0.17.3
typing-inspection==0.4.1
typing_extensions==4.15.0
tzdata==2025.2
urllib3==2.5.0
vine==5.1.0
voluptuous==0.15.2
wcwidth==0.2.13
wrapt==1.17.3
yarl==1.20.1
zc.lockfile==3.0.post1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was something I was looking at on Friday. Initially, before I added some of the sub-dependencies to requirements.txt, pip would iterate over every version to find matches for those.
Example: https://github.com/ROCm/TheRock/actions/runs/17331389446/job/49208004639
I managed to at least stop any of that iterative version download sequence with what I added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooof. That doesn't inspire much confidence in those packages being safe to depend on :/. Far too many indirect dependencies in the network to be developer-friendly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll snoop around Hugging Face and some of the other advertised consumers of dvc to see how they're dealing with dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One option for the short term when it's still only for Windows, we could add it the chocolatey install list instead of with pip. I did ask in the dvc Discord about this topic of their python dependency list in the context of a CI runner that is mainly using the tool for pull command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Discord suggested installing the binary package or using the github action https://github.com/iterative/setup-dvc
# check for DVC config in the submodule and run dvc pull if found. | ||
# presently, only amdgpu-windows-interop in rocm-systems uses DVC, but... | ||
# eventually, DVC will be rolled out to math libraries and in linux | ||
print(f"dvc detected in {project_dir}, running dvc pull") | ||
exec(["dvc", "pull"], cwd=project_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this on my dev machine. Seems to work well enough:
(.venv) λ python ./build_tools/fetch_sources.py
++ Exec [D:\projects\TheRock]$ git submodule update --init -- base/half comm-libs/rccl comm-libs/rccl-tests base/rocm-cmake profiler/rocprof-trace-decoder/binaries compiler/hipify compiler/amd-llvm compiler/spirv-llvm-translator rocm-libraries rocm-systems math-libs/BLAS/hipSOLVER math-libs/BLAS/rocSOLVER
dvc detected in D:\projects\TheRock\rocm-systems, running dvc pull
++ Exec [D:\projects\TheRock\rocm-systems]$ dvc pull
Collecting |32.0 [00:00, 483entry/s] Fetching
Building workspace index |6.00 [00:00, 187entry/s] Comparing indexes |39.0 [00:00, 13.6kentry/s] Applying changes |32.0 [00:00, 848file/s] A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddRpcClient.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddRpcShared.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddEventClient.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\zstd.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddNet.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\SettingsRpcService2.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\DriverUtilsService.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\stb_sprintf.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\amdrdf.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\addrlib.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddEventServer.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddEventParser.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\devdriver.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddRpcServer.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\palCompilerDeps.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddCore.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\metrohash.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddYaml.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddCommon.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\pal_lz4.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\dd_settings.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\cwpack.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\vam.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\pal.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\dd_common.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddSocket.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\ddEventStreamer.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\UberTraceService.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\palUtil.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\mpack.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\dd_libyaml.lib
A shared\amdgpu-windows-interop\pal\lib\Release\x64\pal_uuid.lib
32 files added and 32 files fetched
++ Exec [D:\projects\TheRock]$ git update-index --no-skip-worktree -- base/half comm-libs/rccl comm-libs/rccl-tests base/rocm-cmake profiler/rocprof-trace-decoder/binaries compiler/hipify compiler/amd-llvm compiler/spirv-llvm-translator rocm-libraries rocm-systems math-libs/BLAS/hipSOLVER math-libs/BLAS/rocSOLVER
Co-authored-by: Scott Todd <scott.todd0@gmail.com>
… of https://github.com/ROCm/TheRock into users/jayhawk-commits/amdgpu-windows-interop-superrepo
if shutil.which("dvc") is None: | ||
print("Could not find `dvc` on PATH so large files could not be fetched") | ||
print( | ||
"To install dvc, run `pip install dvc` or `pip install -r requirements.txt`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requirements.txt - we need to add this dvc to requirements.txt file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are removing it from the requirements due to how many extra dependencies it pulls and adding the setup to be done in the runner image. Will change the output when we decide on this.
Depends on ROCm/rocm-systems#808 and Windows CI being green again. When Windows CI is green, I can temporarily change the rocm-systems submodule to point to PR 808's branch to run CI against the combination of both changes.
The dvc usage is only needed for the pal libs, so it is guarded with an if Windows. However, this tool will be used by miopen and hipblaslt at some point, and I could not confirm that the Linux pipelines install pip modules before the fetch sources sequence. The Windows pipelines appear to install the pip modules before fetch sources.