Two more glitches in avx.nim #22

Asc2011 · 2024-02-20T15:52:18Z

Hello guzba,

in avx.nim lines 226-230 ::

226 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}`
...
230 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}

AFAIK should return the same vector-types that they received. These should be double/single-precision vectors of 128b..

func mm_permutevar_pd*(a: M128d, b: M128i): M128d {.importc: "_mm_permutevar_pd".}
...
func mm_permutevar_ps*(a: M128, b: M128i): M128 {.importc: "_mm_permutevar_ps".}

and if you don't mind - sure you will :) - one could fix the terrible Intel-naming just a bit by adding the missing ::

func mm_permutevar_epi32*(a, mask :M128i ): M128i = 
  mm_castsi128_ps(
    mm_permutevar_ps( mm_castps_si128( a ), mask )
  )

func mm_permutevar_epi64*(a, mask :M128i ): M128i =
  mm_castsi128_pd(
    mm_permutevar_pd( mm_castpd_si128( a ), mask )
  )

Since everybd. has to add them anyways - after one got trapped... Same could/should be done for the 256bit-sized vectors.
And don't get me wrong here - i just suggest to add what Intel has left out, but evbd. expects to find. But staying with the Intel-wording. Actually a permutevar_<type> is a permute-operation - well, many operation permute a vector. In this case it is a shuffle-operation..
Maybe one could add a common_avx.nim that adds those missing functions to make the intrinsics a bit more consistent ?

just my 20ct, greets Andreas

we are gettin' closer to nimsimd v2 :)

The text was updated successfully, but these errors were encountered:

Asc2011 · 2024-07-19T16:12:54Z

My pull-request-25 fixes the return vectors in avx.nim 226-230. If you want to keep it minimal, then leave it as it is now.
Otherwise you might want to include as proposed above :

func mm_permutevar_epi32*(a, mask :M128i ): M128i
func mm_permutevar_epi64*(a, mask :M128i ): M128i

--
Intrinsics-Guide references

guzba · 2024-07-19T16:51:58Z

Hey, thanks for updating the PR. I was looking again and noticed that the storebe_ signatures have a return value instead of a second parameter. Such as https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=storebe_i16

As for fixing the other issues, I'm fine with getting all the corrections in this PR. It is easy to make little mistakes with this so I am happy to fix them as we notice them.

Asc2011 · 2024-07-19T17:12:03Z

yeah, another obvious bug - squashed it :) but still untested...

As for fixing the other issues, I'm fine with getting all the corrections in this PR. It is easy to make little mistakes with this so I am happy to fix them as we notice them.

i'm fine with that - lets do this properly. nimsimd is IMHO the most important module of all and it should go into std/simd or whatever name floats-your-boat.
I've added a warning to my MultiStepLRU - it's not on nimble yet - so i don't think anybody will notice, anyways ...

greets, andreas

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Two more glitches in avx.nim #22

Two more glitches in avx.nim #22

Asc2011 commented Feb 20, 2024 •

edited

Loading

Asc2011 commented Jul 19, 2024

guzba commented Jul 19, 2024

Asc2011 commented Jul 19, 2024 •

edited

Loading

Two more glitches in avx.nim #22

Two more glitches in avx.nim #22

Comments

Asc2011 commented Feb 20, 2024 • edited Loading

just my 20ct, greets Andreas

Asc2011 commented Jul 19, 2024

guzba commented Jul 19, 2024

Asc2011 commented Jul 19, 2024 • edited Loading

Asc2011 commented Feb 20, 2024 •

edited

Loading

Asc2011 commented Jul 19, 2024 •

edited

Loading