SpatialOps issues
https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues

https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/56
**local() and mapped_value() operators (as part of mapped reduction) only work using native nebo backend**
Siddartha Ravichandran, 2018-01-10

The `local()` and `mapped_value()` Nebo operators, introduced as part of the `NeboMappedReduction` operation, do not work on Kokkos because they depend on the outer index that is determined by the outer loop in the native Nebo backend. Since we lose control of the outer loop when using Kokkos, the information needed to drive the `local()` and `mapped_value()` operators is no longer available.

https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/43
**Nebo assignments cause runtime errors on GPUs when multiplying by a constant**
Tony Saad, 2016-08-15

Consider the following Nebo assignment:
```
const double a = 2.0;
Field1 <<= Field2 * a;
```
where `Field1` and `Field2` are spatial fields and `a` is a double.
The above assignment breaks on the GPU but NOT on the CPU.
If I switch the order of the algebraic operations to
`Field1 <<= a * Field2;`
then things work fine.
To test this:
* Wasatch GPU build (opt or dbg)
* `./sus -gpu -nthreads 2 -mpi inputs/Wasatch/Turbulence/decay-isotropic-turbulence-csmag_32.ups`
With repository code, I get the following error:
```
terminate called after throwing an instance of 'std::runtime_error'
what():
Error trapped while executing expression: ( TurbulentViscosity, STATE_NONE )
details follow...
Request for const field pointer on a device for which it has not been allocated
(Locally allocated, generic system RAM) - /scratch/local/aurora_fast/tsaad/uintah-work/opt-gpu/Wasatch3P/install/SpatialOps/include/spatialops/structured/FieldInfo.h : 789
```
If you modify TurbulentViscosity.cc, line 110 to read
`result <<= mixingLengthSq * rho ; // rho * (Cs * delta)^2 * |S|, Cs is the Smagorinsky constant`
then things work fine. Note that I had to remove `sqrt(2.0 * strTsrSq_->field_ref() )`
because the `sqrt` doesn't work either, although `strTsrSq_->field_ref()` is fine.
https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/33
**Generalize OneSidedStencil for fields other than SVol**
James Sutherland, 2016-03-02

As written, OneSidedOperatorTypes.h in spatialops/structured/stencil will most likely fail on staggered fields. The UnitType should be redefined to work correctly on other volume fields, such as XVol. It could also be generalized to include face fields. The test in spatialops/structured/stencil/test/test_one_sided_stencil.cpp should also be fixed to ensure that the test fields are staggered appropriately.

https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/16
**Introspect core count in SpatialOps**
James Sutherland, 2018-02-25

# Compile-time introspection
CMake provides a way to [determine processor counts](http://www.cmake.org/cmake/help/v3.0/module/ProcessorCount.html). See also [this blog post](http://www.kitware.com/blog/home/post/63).
We could leverage this to help auto-populate the number of threads for SpatialOps. This could, in turn, be used in ExprLib.
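A minimal sketch of what that could look like, using CMake's `ProcessorCount` module; the `NTHREADS` cache variable name is hypothetical, chosen only for illustration:

```cmake
# Seed a default thread count from the build machine's processor count.
include( ProcessorCount )
ProcessorCount( N )
if( NOT N EQUAL 0 )  # ProcessorCount returns 0 when it cannot determine the count
  set( NTHREADS ${N} CACHE STRING "Default SpatialOps thread count" )
endif()
```

Note that this reflects the build machine, not necessarily the machine the code will run on, so a runtime check (below) is still worthwhile.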
# Runtime introspection
Several approaches are given [here](http://stackoverflow.com/questions/150355/programmatically-find-the-number-of-cores-on-a-machine).
# Other considerations
Once the threadcommunicator branch is merged, we have a few things to note:
- The thread counts in ExprLib and SpatialOps are multiplicative (e.g., 4 ExprLib threads each driving 4 SpatialOps threads occupy 16 cores), and their product should never exceed the physical core count on the machine.
- The core count per socket should be divisible by the SpatialOps thread count.
- Thread count should generally not exceed the number of cores per socket if ExprLib is built on top of SpatialOps.
*Note also that execution will halt in the threadcommunicator branch if the number of threads exceeds the number of cores. This could be fixed if we can guarantee that the threadpool is not sized to exceed the physical core count.*

https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/13
**Fix thread bug in Wasatch (happens when SpatialOps is compiled with ENABLE_THREADS=ON)**
James Sutherland, 2018-02-25

When SpatialOps is compiled with `ENABLE_THREADS=ON`, Wasatch local regression tests fail and/or crash, sometimes. I have observed three types of failures:
Exception is thrown, claiming some block of memory has been freed twice (double free). This exception has appeared in the following tests (not an exhaustive list):
- turb-lid-driven-cavity-3D-WALE
- turb-lid-driven-cavity-3D-SMAGPRINSKY
- turb-lid-driven-cavity-3D-VREMAN
- turb-lid-driven-cavity-3D-scalar
- coal-boiler-mini
- intrusion_flow_past_cylinder_xz
- intrusion_flow_past_cylinder_xy
- turbulent-inlet-test-xminus
- intrusion_flow_past_objects_xy
- intrusion_flow_past_oscillating_cylinder_xy
- intrusion_flow_past_cylinder_yz
- channel-flow-xy-xplus-pressure-outlet
- intrusion_flow_over_icse
- turbulent-flow-over-cavity
- channel-flow-zy-yplus-pressure-outlet
- channel-flow-yz-yminus-pressure-outlet
- lid-driven-cavity-3D-Re1000
- channel-flow-xy-xminus-pressure-outlet
- lid-driven-cavity-3D-Re1000-rk2
- channel-flow-zx-zplus-pressure-outlet
- channel-flow-symmetry-bc
- liddrivencavity3DRe1000rk3 (sic)
- lid-driven-cavity-xy-Re1000
- lid-driven-cavity-yz-Re1000
- hydrostatic-pressure-test
- lid-driven-cavity-xz-Re1000
- channel-flow-xz-zminus-pressure-outlet
- reduction-test
- lid-drive-cavity-xy-Re1000-adaptive (sic)
- convection-test-svol-ydir-bc
- convection-test-svol-zdir-bc
- bc-parabolic-inlet-channel-flow-test
- bc-linear-inlet-channel-flow-test
- bc-test-svol-zdir

Test hangs (a test that usually takes less than 3 seconds takes longer than a minute). This behavior has appeared in the following tests:
- varden-projection-mms
- varden-projection-xdir
- varden-projection-ydir
- varden-projection-zdir
- varden-projection-xdir-analytic-dens
- qmom-aggregation-test

Test fails within the testing framework with error code 3384. I do not know what this error code means. This behavior has appeared in the following tests:
- bc-test-svol-xdir
- bc-test-svol-ydir
- convection-test-svol-xdir-bc
Do not take these lists as exhaustive. Since these behaviors generally seem intermittent (I think the test hanging was consistent, but I do not remember at the moment), it is hard to tell exactly what is going on. Also, once a test failed in any way, I removed it from the list of tests I was running. In theory, a test could fail in multiple ways, but I have not seen that behavior.

https://gitlab.multiscale.utah.edu/common/SpatialOps/-/issues/3
**Fix bug in using the threads and GPU Nebo backends at the same time**
James Sutherland, 2018-02-25

First reported by Chris Earl in May, 2014.
This bug only appears on certain systems (prism and a few laptops). To reproduce the bug, set `ENABLE_THREADS=ON` and `ENABLE_CUDA=ON` during configuration.
Example errors:
```
../libspatialops-structured.a(spatialops-structured_generated_CudaMemoryAllocator.cu.o): In function `_GLOBAL__sub_I_tmpxft_000016bb_00000000_3_CudaMemoryAllocator.cudafe1.cpp':
tmpxft_000016bb_00000000-3_CudaMemoryAllocator.cudafe1.cpp:(.text.startup+0x6b): undefined reference to `boost::system::generic_category()'
tmpxft_000016bb_00000000-3_CudaMemoryAllocator.cudafe1.cpp:(.text.startup+0x77): undefined reference to `boost::system::generic_category()'
tmpxft_000016bb_00000000-3_CudaMemoryAllocator.cudafe1.cpp:(.text.startup+0x83): undefined reference to `boost::system::system_category()'
collect2: error: ld returned 1 exit status
```
These errors imply a problem in how boost and CudaMemoryAllocator.cu interact: the undefined references to `boost::system::generic_category()` and `boost::system::system_category()` indicate that the nvcc-compiled CudaMemoryAllocator object depends on Boost.System, but the Boost.System library is most likely missing from, or ordered incorrectly in, the final link line.
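If that diagnosis is right, one possible fix is to link Boost.System explicitly into the structured library. This is only a sketch: the `spatialops-structured` target name is taken from the error message above, but the project's actual CMake layout may differ.

```cmake
# Ensure Boost.System is found and linked after the CUDA-generated
# objects that reference its symbols.
find_package( Boost REQUIRED COMPONENTS system )
target_link_libraries( spatialops-structured ${Boost_SYSTEM_LIBRARY} )
```

Because the undefined symbols come from a `.cu.o` object, the link order matters with static archives: the Boost library must appear after `libspatialops-structured.a` on the link line, which `target_link_libraries` handles when the dependency is declared on the target itself.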