MPI FFTW #997

base: master

Conversation
PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

src/post_process/m_start_up.fpp (Outdated)
```fortran
        end do

        call s_mpi_transpose_x2y !! Change pencil from data_cmplx to data_cmplx_y

        do l = 1, Nzloc
            do k = 1, Nxloc
                do j = 1, Ny
                    data_out(j + (k-1)*Ny + (l-1)*Ny*Nxloc) = data_cmplx_y(k, j, l)
                end do
            end do
        end do

        call fftw_execute_dft(fwd_plan_y, data_out, data_in)

        do l = 1, Nzloc
            do k = 1, Nxloc
                do j = 1, Ny
                    data_cmplx_y(k, j, l) = data_in(j + (k-1)*Ny + (l-1)*Ny*Nxloc)
                end do
            end do
        end do

        call s_mpi_transpose_y2z !! Change pencil from data_cmplx_y to data_cmplx_z

        do l = 1, Nyloc2
            do k = 1, Nxloc
                do j = 1, Nz
                    data_in(j + (k-1)*Nz + (l-1)*Nz*Nxloc) = data_cmplx_z(k, l, j)
                end do
            end do
        end do

        call fftw_execute_dft(fwd_plan_z, data_in, data_out)

        do l = 1, Nyloc2
            do k = 1, Nxloc
                do j = 1, Nz
                    data_cmplx_z(k, l, j) = data_out(j + (k-1)*Nz + (l-1)*Nz*Nxloc)
                end do
            end do
        end do

    end subroutine s_mpi_FFT_fwd

    subroutine s_mpi_transpose_x2y
        complex(c_double_complex), allocatable :: sendbuf(:), recvbuf(:)
        integer :: dest_rank, src_rank
        integer :: i, j, k, l

        allocate(sendbuf(Nx*Nyloc*Nzloc))
        allocate(recvbuf(Nx*Nyloc*Nzloc))

        do dest_rank = 0, num_procs_y - 1
            do l = 1, Nzloc
                do k = 1, Nyloc
                    do j = 1, Nxloc
                        sendbuf(j + (k-1)*Nxloc + (l-1)*Nxloc*Nyloc + dest_rank*Nxloc*Nyloc*Nzloc) = data_cmplx(j + dest_rank*Nxloc, k, l)
                    end do
                end do
            end do
        end do

        call MPI_Alltoall(sendbuf, Nxloc*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, &
                          recvbuf, Nxloc*Nyloc*Nzloc, MPI_DOUBLE_COMPLEX, MPI_COMM_CART12, ierr)

        do src_rank = 0, num_procs_y - 1
            do l = 1, Nzloc
                do k = 1, Nyloc
                    do j = 1, Nxloc
                        data_cmplx_y(j, k + src_rank*Nyloc, l) = recvbuf(j + (k-1)*Nxloc + (l-1)*Nxloc*Nyloc + src_rank*Nxloc*Nyloc*Nzloc)
                    end do
                end do
            end do
        end do

        deallocate(sendbuf)
        deallocate(recvbuf)

    end subroutine s_mpi_transpose_x2y

    subroutine s_mpi_transpose_y2z
        complex(c_double_complex), allocatable :: sendbuf(:), recvbuf(:)
        integer :: dest_rank, src_rank
        integer :: j, k, l

        allocate(sendbuf(Ny*Nxloc*Nzloc))
        allocate(recvbuf(Ny*Nxloc*Nzloc))

        do dest_rank = 0, num_procs_z - 1
            do l = 1, Nzloc
                do j = 1, Nxloc
                    do k = 1, Nyloc2
                        sendbuf(k + (j-1)*Nyloc2 + (l-1)*(Nyloc2*Nxloc) + dest_rank*Nyloc2*Nxloc*Nzloc) = data_cmplx_y(j, k + dest_rank*Nyloc2, l)
                    end do
                end do
            end do
        end do

        call MPI_Alltoall(sendbuf, Nyloc2*Nxloc*Nzloc, MPI_DOUBLE_COMPLEX, &
                          recvbuf, Nyloc2*Nxloc*Nzloc, MPI_DOUBLE_COMPLEX, MPI_COMM_CART13, ierr)
```
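In outline, each stage of the forward transform above packs the current pencil into a linear scratch buffer, applies a batched 1D FFTW transform along the contiguous direction, unpacks the result, and then rotates the pencil orientation (x → y → z) via an all-to-all exchange before the next stage.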
✅ Suggestion: The current Alltoall uses 2D Cartesian sub-communicators, so sendcount*comm_size does not match the packed buffer size, risking out-of-bounds access. Create 1D sub-communicators along Y and Z only, and use the MPI datatype that matches complex(c_double_complex). This fixes the communicator-size mismatch and avoids undefined behavior from a datatype mismatch. [possible issue, importance: 10]
Existing code:

```fortran
            call MPI_Cart_sub(MPI_COMM_CART, (/.true., .true., .false./), MPI_COMM_CART12, ierr)
            call MPI_Comm_rank(MPI_COMM_CART12, proc_rank12, ierr)
            call MPI_Cart_coords(MPI_COMM_CART12, proc_rank12, 2, cart2d12_coords, ierr)

            call MPI_Cart_sub(MPI_COMM_CART, (/.true., .false., .true./), MPI_COMM_CART13, ierr)
            call MPI_Comm_rank(MPI_COMM_CART13, proc_rank13, ierr)
            call MPI_Cart_coords(MPI_COMM_CART13, proc_rank13, 2, cart2d13_coords, ierr)
        end if
    end subroutine s_initialize_modules

    subroutine s_mpi_FFT_fwd
        integer :: j, k, l

        do l = 1, Nzloc
            do k = 1, Nyloc
                do j = 1, Nx
                    data_in(j + (k-1)*Nx + (l-1)*Nx*Nyloc) = data_cmplx(j, k, l)
                end do
            end do
        end do

        call fftw_execute_dft(fwd_plan_x, data_in, data_out)

        do l = 1, Nzloc
            do k = 1, Nyloc
                do j = 1, Nx
                    data_cmplx(j, k, l) = data_out(j + (k-1)*Nx + (l-1)*Nx*Nyloc)
                end do
            end do
        end do

        ! ... (remainder of s_mpi_FFT_fwd and the transpose subroutines as quoted in the diff above) ...
```

Suggested change:

```fortran
call MPI_Cart_sub(MPI_COMM_CART, (/.false., .true., .false./), MPI_COMM_CART12, ierr)
call MPI_Cart_sub(MPI_COMM_CART, (/.false., .false., .true./), MPI_COMM_CART13, ierr)
...
call MPI_Alltoall(sendbuf, Nxloc*Nyloc*Nzloc, MPI_C_DOUBLE_COMPLEX, &
                  recvbuf, Nxloc*Nyloc*Nzloc, MPI_C_DOUBLE_COMPLEX, MPI_COMM_CART12, ierr)
...
call MPI_Alltoall(sendbuf, Nyloc2*Nxloc*Nzloc, MPI_C_DOUBLE_COMPLEX, &
                  recvbuf, Nyloc2*Nxloc*Nzloc, MPI_C_DOUBLE_COMPLEX, MPI_COMM_CART13, ierr)
```
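To make the count arithmetic concrete, here is a minimal, standalone sketch (not MFC code; all names are illustrative) of the Alltoall contract the suggestion relies on: each rank sends sendcount elements to every rank in the communicator, so the packed buffer must hold exactly comm_size * sendcount elements.

```fortran
! Standalone illustration of the Alltoall-based transpose pattern:
! pack per-destination blocks contiguously, exchange, unpack by source.
program alltoall_transpose_sketch
    use mpi
    implicit none
    integer, parameter :: nloc = 2               ! elements per destination rank
    integer :: ierr, rank, nprocs, p, i
    complex(kind(1.d0)), allocatable :: sendbuf(:), recvbuf(:)

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

    ! Buffers must hold nprocs*nloc elements: nloc for each peer rank
    allocate(sendbuf(nloc*nprocs), recvbuf(nloc*nprocs))

    ! Pack: the p-th block of sendbuf is destined for rank p
    do p = 0, nprocs - 1
        do i = 1, nloc
            sendbuf(i + p*nloc) = cmplx(rank, p, kind(1.d0))
        end do
    end do

    call MPI_Alltoall(sendbuf, nloc, MPI_DOUBLE_COMPLEX, &
                      recvbuf, nloc, MPI_DOUBLE_COMPLEX, MPI_COMM_WORLD, ierr)

    ! Unpack: the p-th block of recvbuf arrived from rank p
    call MPI_Finalize(ierr)
end program alltoall_transpose_sketch
```

If the communicator were the full 2D grid rather than a 1D line of ranks, nprocs would be larger and Alltoall would read past size(sendbuf), which is exactly the mismatch the suggestion fixes.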
```fortran
j_glb = j + proc_coords(2)*Nxloc
k_glb = k + proc_coords(3)*Nyloc2
```
✅ Suggestion: proc_coords is undefined in this scope, which will fail compilation. Use the computed 3D Cartesian coordinates returned by MPI_CART_COORDS. [possible issue, importance: 10]
Suggested change:

```fortran
j_glb = j + cart3d_coords(2)*Nxloc
k_glb = k + cart3d_coords(3)*Nyloc2
```
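For completeness, a minimal sketch of obtaining those coordinates (assuming, as elsewhere in this PR, that MPI_COMM_CART is the 3D Cartesian communicator and that proc_rank is this rank's rank within it):

```fortran
integer :: cart3d_coords(3)

! Query this rank's coordinates in the 3D Cartesian topology; entries
! 2 and 3 then supply the y/z offsets used in the suggestion above.
call MPI_Cart_coords(MPI_COMM_CART, proc_rank, 3, cart3d_coords, ierr)
```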
src/post_process/m_start_up.fpp (Outdated)
```fortran
        Nf = max(Nx, Ny, Nz)

        @:ALLOCATE(data_in(Nx*Nyloc*Nzloc))
        @:ALLOCATE(data_out(Nx*Nyloc*Nzloc))

        @:ALLOCATE(data_cmplx(Nx, Nyloc, Nzloc))
        @:ALLOCATE(data_cmplx_y(Nxloc, Ny, Nzloc))
        @:ALLOCATE(data_cmplx_z(Nxloc, Nyloc2, Nz))

        @:ALLOCATE(En_real(Nxloc, Nyloc2, Nz))
        @:ALLOCATE(En(Nf))
```
Suggestion: The spectrum bin index kf can exceed Nf, causing out-of-bounds access. Size Nf to the maximum possible radial wavenumber and guard the accumulation to stay within bounds. [possible issue, importance: 9]
Suggested change:

```fortran
Nf = int(sqrt(real((Nx/2)**2 + (Ny/2)**2 + (Nz/2)**2, wp))) + 1
@:ALLOCATE(En(Nf))
...
kf = int(nint(sqrt(real(kx, wp)**2 + real(ky, wp)**2 + real(kz, wp)**2))) + 1
if (kf <= Nf) then
    En(kf) = En(kf) + En_real(j, k, l)
end if
```
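A side note on why the guard is still useful after resizing Nf: the allocation uses int (truncation) while the bin index uses nint (round-to-nearest), so a corner mode can land one bin past Nf. For example, on a 256^3 grid, sqrt(3)*128 ≈ 221.7 gives Nf = 221 + 1 = 222, while the corner bin is nint(221.7) + 1 = 223; the if (kf <= Nf) check absorbs exactly this case.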
Codecov Report

❌ Patch coverage is …

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master     #997      +/-   ##
==========================================
- Coverage   40.92%   40.62%   -0.30%
==========================================
  Files          70       70
  Lines       20299    20464     +165
  Branches     2521     2530       +9
==========================================
+ Hits         8307     8314       +7
- Misses      10454    10609     +155
- Partials     1538     1541       +3
```

☔ View full report in Codecov by Sentry.
User description

Description

Fast Fourier transform for the energy cascade, implemented across multiple ranks. This improves upon Conrad's implementation by using a pencil (2D) decomposition instead of slabs (1D) in post_process, which required Cartesian sub-communicators.

This PR is in the draft stage: the code ran successfully on 1-8 ranks, but I have yet to test for correctness. I will subsequently test for correctness on the TGV problem before merging.
Fixes #(issue) [optional]
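As background for the pencil layout described above, here is a minimal, standalone sketch (not MFC code; all names are illustrative) of building a 2D process grid and the two 1D sub-communicators that drive the transposes:

```fortran
! A pencil decomposition keeps the data contiguous along one axis per
! stage: ranks form a 2D grid over (y,z), and a 1D sub-communicator
! along each grid axis carries the corresponding transpose.
program pencil_grid_sketch
    use mpi
    implicit none
    integer :: ierr, nprocs, comm_cart, comm_y, comm_z
    integer :: dims(2)
    logical :: periods(2), keep(2)

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

    dims = 0                      ! let MPI factor nprocs into a 2D grid
    call MPI_Dims_create(nprocs, 2, dims, ierr)
    periods = .false.

    call MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, .true., comm_cart, ierr)

    keep = (/ .true., .false. /)  ! 1D sub-communicator along the first grid axis
    call MPI_Cart_sub(comm_cart, keep, comm_y, ierr)
    keep = (/ .false., .true. /)  ! 1D sub-communicator along the second grid axis
    call MPI_Cart_sub(comm_cart, keep, comm_z, ierr)

    call MPI_Finalize(ierr)
end program pencil_grid_sketch
```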
Type of change
Please delete options that are not relevant.
Scope
If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
Test Configuration:
Checklist

- I have made corresponding changes to the documentation (docs/)
- I have added example cases in examples/ that demonstrate my new feature performing as expected. They run to completion and demonstrate "interesting physics"
- I ran ./mfc.sh format before committing my code

If your code changes any code source files (anything in src/simulation):

To make sure the code is performing as expected on GPU devices, I have:

- Enclosed the new feature via nvtx ranges so that they can be identified in profiles
- Ran ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
- Ran ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR

PR Type
Enhancement
Description

- Implement MPI-based 3D FFT for energy cascade analysis
- Add pencil decomposition using Cartesian sub-communicators
- Integrate FFTW library with MPI transpose operations
- Add input validation for FFT requirements
File Walkthrough

- m_mpi_common.fpp (src/common/m_mpi_common.fpp): Add FFT-specific processor topology optimization
- m_checker.fpp (src/post_process/m_checker.fpp): Add FFT input validation constraints; add the s_check_inputs_fft subroutine for FFT parameter validation and disallow file_per_process when FFT is enabled (a hypothetical sketch of such a check follows this list)
- m_global_parameters.fpp (src/post_process/m_global_parameters.fpp): Add FFT write parameter; add the fft_wrt logical parameter for FFT output control and set fft_wrt to false in default settings
- m_mpi_proxy.fpp (src/post_process/m_mpi_proxy.fpp): Include FFT parameter in MPI broadcasts; add fft_wrt to the MPI broadcast variable list
- m_start_up.fpp (src/post_process/m_start_up.fpp): Implement complete MPI-based 3D FFT system
- case_dicts.py (toolchain/mfc/run/case_dicts.py): Add FFT parameter to toolchain configuration; add the fft_wrt parameter to the post-processing parameter dictionary
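The m_checker.fpp entry above only names the constraint, so here is a hypothetical sketch of what such a validation could look like (only s_check_inputs_fft, fft_wrt, and file_per_process appear in this PR's walkthrough; the abort helper and message wording are assumptions):

```fortran
! Hypothetical sketch, not the PR's actual implementation.
subroutine s_check_inputs_fft
    ! Reject a per-process file layout when FFT output is requested
    ! (assumed form of the constraint named in the walkthrough above)
    if (fft_wrt .and. file_per_process) then
        call s_mpi_abort('file_per_process is incompatible with fft_wrt. Exiting.')
    end if
end subroutine s_check_inputs_fft
```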