Skip to content

Two more glitches in avx.nim #22

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Asc2011 opened this issue Feb 20, 2024 · 3 comments
Open

Two more glitches in avx.nim #22

Asc2011 opened this issue Feb 20, 2024 · 3 comments

Comments

@Asc2011
Copy link
Contributor

Asc2011 commented Feb 20, 2024

Hello guzba,

in avx.nim lines 226-230 ::

226 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}`
...
230 func mm_permutevar_pd*(a: M128d, b: M128i): M256d {.importc: "_mm_permutevar_pd".}

AFAIK should return the same vector-types that they received. These should be double/single-precision vectors of 128b..

func mm_permutevar_pd*(a: M128d, b: M128i): M128d {.importc: "_mm_permutevar_pd".}
...
func mm_permutevar_ps*(a: M128, b: M128i): M128 {.importc: "_mm_permutevar_ps".}

and if you don't mind - sure you will :) - one could fix the terrible Intel-naming just a bit by adding the missing ::

func mm_permutevar_epi32*(a, mask :M128i ): M128i = 
  mm_castsi128_ps(
    mm_permutevar_ps( mm_castps_si128( a ), mask )
  )
func mm_permutevar_epi64*(a, mask :M128i ): M128i =
  mm_castsi128_pd(
    mm_permutevar_pd( mm_castpd_si128( a ), mask )
  )

Since everybd. has to add them anyways - after one got trapped... Same could/should be done for the 256bit-sized vectors.
And don't get me wrong here - i just suggest to add what Intel has left out, but evbd. expects to find. But staying with the Intel-wording. Actually a permutevar_<type> is a permute-operation - well, many operation permute a vector. In this case it is a shuffle-operation..
Maybe one could add a common_avx.nim that adds those missing functions to make the intrinsics a bit more consistent ?

just my 20ct, greets Andreas

we are gettin' closer to nimsimd v2 :)

@Asc2011
Copy link
Contributor Author

Asc2011 commented Jul 19, 2024

My pull-request-25 fixes the return vectors in avx.nim 226-230. If you want to keep it minimal, then leave it as it is now.
Otherwise you might want to include as proposed above :

func mm_permutevar_epi32*(a, mask :M128i ): M128i
func mm_permutevar_epi64*(a, mask :M128i ): M128i

--
Intrinsics-Guide references

@guzba
Copy link
Owner

guzba commented Jul 19, 2024

Hey, thanks for updating the PR. I was looking again and noticed that the storebe_ signatures have a return value instead of a second parameter. Such as https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=storebe_i16

As for fixing the other issues, I'm fine with getting all the corrections in this PR. It is easy to make little mistakes with this so I am happy to fix them as we notice them.

@Asc2011
Copy link
Contributor Author

Asc2011 commented Jul 19, 2024

yeah, another obvious bug - squashed it :) but still untested...

As for fixing the other issues, I'm fine with getting all the corrections in this PR. It is easy to make little mistakes with this so I am happy to fix them as we notice them.

i'm fine with that - lets do this properly. nimsimd is IMHO the most important module of all and it should go into std/simd or whatever name floats-your-boat.
I've added a warning to my MultiStepLRU - it's not on nimble yet - so i don't think anybody will notice, anyways ...

greets, andreas

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants