swati24
April 16th, 2005, 05:12 AM
Hi,
This question is about the SSE shuffle macro, _MM_SHUFFLE. I don't understand it fully.
MSDN says that _MM_SHUFFLE(z, y, x, w) expands to:
(z << 6) | (y << 4) | (x << 2) | w
This macro is used in conjunction with the SHUFPS instruction or the SSE intrinsic function _mm_shuffle_ps(m1, m2, int mask), where the macro supplies the mask argument that controls which elements of m1 and m2 are selected. m1 and m2 are 128-bit registers.
MSDN has the following example:
Let m1 be a : b : c : d
Let m2 be e : f : g : h
where each of a, b, c, d, e, f, g, h is a 32-bit single-precision floating-point value. a is the highest doubleword and d is the lowest doubleword. Similarly, e is the highest doubleword and h is the lowest doubleword.
Now, when the following function is applied to the m1 and m2 SSE (XMM) registers:
_mm_shuffle_ps(m1,m2,_MM_SHUFFLE(1,0,3,2))
we get
m3 = g : h : a : b
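A minimal sketch for reproducing the MSDN example, assuming an x86 compiler that provides xmmintrin.h. The function name shuffle_example and the numbers 1 to 8 standing in for a to h are placeholders chosen here, not something MSDN uses.

```c
#include <xmmintrin.h>  /* SSE: _mm_set_ps, _mm_shuffle_ps, _mm_storeu_ps, _MM_SHUFFLE */

/* Runs the example and stores the result in out[], lowest element first
   (out[0] is the lowest doubleword, out[3] is the highest). */
void shuffle_example(float out[4])
{
    /* _mm_set_ps takes its arguments highest element first, which matches
       the a : b : c : d notation. The values are stand-ins for a..h. */
    __m128 m1 = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);  /* a=1 : b=2 : c=3 : d=4 */
    __m128 m2 = _mm_set_ps(5.0f, 6.0f, 7.0f, 8.0f);  /* e=5 : f=6 : g=7 : h=8 */

    __m128 m3 = _mm_shuffle_ps(m1, m2, _MM_SHUFFLE(1, 0, 3, 2));

    /* Unaligned store, so out[] needs no special alignment. */
    _mm_storeu_ps(out, m3);
}
```

Calling shuffle_example(out) and printing out[3] down to out[0] lists the elements in the same highest-first order as the notation above, which makes it easy to compare against g : h : a : b.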
-------------
Working
I tried to work this example myself but couldn't arrive at the right answer.
1 << 6 gives 0100 0000
0 << 4 gives 0000 0000
3 << 2 gives 0000 1100
2 gives 0000 0010
Bitwise OR operation gives 0100 1110
If we apply this mask to m1 and m2, how do we end up with m3? What am I doing wrong here? Any insights will be most helpful.
Thank you
Swati
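For reference, the mask arithmetic in the "Working" section can be checked in plain C, and the resulting byte can be read back as four 2-bit selectors rather than as a single bit pattern. This is only a sketch: it assumes the SHUFPS behaviour where the two low result elements are picked from m1 and the two high result elements from m2, and the names shuffle_mask and shuffle_emulated are hypothetical helpers, not part of any header.

```c
#include <assert.h>

/* Same expansion MSDN gives for _MM_SHUFFLE, written as a function so it
   does not clash with the real macro. Each selector must fit in 2 bits. */
unsigned shuffle_mask(unsigned z, unsigned y, unsigned x, unsigned w)
{
    assert(z < 4 && y < 4 && x < 4 && w < 4);
    return (z << 6) | (y << 4) | (x << 2) | w;
}

/* Emulates the selection without intrinsics. Arrays are indexed lowest
   element first: index 0 is the lowest doubleword, index 3 the highest. */
void shuffle_emulated(const float m1[4], const float m2[4],
                      unsigned mask, float out[4])
{
    out[0] = m1[mask & 3u];         /* bits 1..0 (w) pick from m1 */
    out[1] = m1[(mask >> 2) & 3u];  /* bits 3..2 (x) pick from m1 */
    out[2] = m2[(mask >> 4) & 3u];  /* bits 5..4 (y) pick from m2 */
    out[3] = m2[(mask >> 6) & 3u];  /* bits 7..6 (z) pick from m2 */
}
```

With m1 = a : b : c : d stored as {d, c, b, a} and m2 = e : f : g : h stored as {h, g, f, e}, the mask 0100 1110 splits into selectors 01, 00, 11, 10. Under the assumption above, that picks b and a from m1 for the low half and h and g from m2 for the high half, i.e. g : h : a : b.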