Question

This is very simple, but I haven't been able to figure it out yet.

This question is about MMX assembly, but it's pure logic.

Imagine the following scenario:

MM0: 04 03 02 01 04 03 02 01  <-- input  
MM1: 02 02 02 02 02 02 02 02  
MM2: 04 03 02 01 04 03 02 01  <-- copy of input

after pcmpgtw MM0, MM1

MM0: FF FF 00 00 FF FF 00 00  <-- words where MM0 is greater than MM1 (comparing words)  
MM1: 02 02 02 02 02 02 02 02  
MM2: 04 03 02 01 04 03 02 01

after pand MM0, MM2  

MM0: 04 03 00 00 04 03 00 00  <-- almost there...
MM1: 02 02 02 02 02 02 02 02  
MM2: 04 03 02 01 04 03 02 01  

What I want is to fill the zeros of MM0 with 02. I suppose I would have to invert the mask that pcmpgtw left in MM0, changing the FFs to 00s and the 00s to FFs, then AND that with MM1, and finally OR the two results to merge them.

If I was able to get:

MM3: 00 00 FF FF 00 00 FF FF

then pand MM1, MM3 would give:

MM0: 04 03 00 00 04 03 00 00  
MM1: 00 00 02 02 00 00 02 02

finally por MM0, MM1 would give me the desired outcome:

MM0: 04 03 02 02 04 03 02 02  <-- Aha!

Summing up, how can I get that MM3 register as 00 00 FF FF 00 00 FF FF? How can I invert the bits, provided I only have AND, OR, XOR and AND-NOT (PANDN) instructions available for MMX registers?

Any answer is greatly appreciated. Thanks.

Solution

You can also generate the inverted mask with a second pcmpgtw by swapping the order of the arguments. That way you can save a register:

MM0: 04 03 02 01 04 03 02 01  <-- input  
MM1: 02 02 02 02 02 02 02 02  
MM2: 04 03 02 01 04 03 02 01  <-- copy of input


pcmpgtw MM0, MM1    ; MM0 = FF FF 00 00 FF FF 00 00 
pcmpgtw MM1, MM2    ; MM1 = 00 00 FF FF 00 00 FF FF

You may have to make a copy of MM1 first, because it is overwritten when the second mask is generated, but this is often faster than loading or generating a 64-bit constant.
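
One thing to check with this variant: the swapped comparison is a strict less-than, so a word exactly equal to the threshold is 0000 in both masks. Here is a minimal C sketch that models pcmpgtw on a plain 64-bit integer to make that visible. The helper names cmpgt_words and select_two_compares are made up for illustration, and the byte order simply follows the register dumps above.

```c
#include <stdint.h>

/* Model of PCMPGTW: signed greater-than on each of the four 16-bit lanes.
   A lane becomes 0xFFFF where a > b and 0x0000 otherwise. */
static uint64_t cmpgt_words(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        int16_t x = (int16_t)(uint16_t)(a >> (16 * lane));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * lane));
        if (x > y)
            mask |= (uint64_t)0xFFFF << (16 * lane);
    }
    return mask;
}

/* Merge using two compare masks (hypothetical helper):
   gt selects lanes of input, lt selects lanes of threshold. */
static uint64_t select_two_compares(uint64_t input, uint64_t threshold)
{
    uint64_t gt = cmpgt_words(input, threshold);  /* input > threshold */
    uint64_t lt = cmpgt_words(threshold, input);  /* threshold > input */
    return (input & gt) | (threshold & lt);
}
```

With the example input this returns 0x0403020204030202, but a lane equal to the threshold is selected by neither mask and comes out as 0000, so this form is only safe when equal words cannot occur or when zero is acceptable there.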

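An alternative is to use PANDN, which computes dest = (NOT dest) AND src, so the mask must be the destination operand:

pcmpgtw MM0, MM1    ; MM0 = FF FF 00 00 FF FF 00 00 (mask)

pand    MM2, MM0    ; MM2 = 04 03 00 00 04 03 00 00, input kept where the mask is FF
pandn   MM0, MM1    ; MM0 = 00 00 02 02 00 00 02 02, MM1 kept where the mask is 00
por     MM0, MM2    ; MM0 = 04 03 02 02 04 03 02 02, combined result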
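
To sanity-check the operand order, here is a small C model of the same four-instruction idea. It assumes the documented PANDN behavior, dest = (NOT dest) AND src, which means the mask has to sit in the destination operand of that step. The function names are hypothetical and the registers are plain 64-bit integers.

```c
#include <stdint.h>

/* Model of PCMPGTW: signed greater-than on each of the four 16-bit lanes. */
static uint64_t cmpgt_words(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        int16_t x = (int16_t)(uint16_t)(a >> (16 * lane));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * lane));
        if (x > y)
            mask |= (uint64_t)0xFFFF << (16 * lane);
    }
    return mask;
}

/* Model of PANDN dest, src: the DESTINATION is inverted, then ANDed with src. */
static uint64_t pandn(uint64_t dest, uint64_t src)
{
    return ~dest & src;
}

/* Hypothetical helper that walks through the sequence register by register. */
static uint64_t select_with_pandn(uint64_t input, uint64_t threshold)
{
    uint64_t mm0 = input, mm1 = threshold, mm2 = input;
    mm0 = cmpgt_words(mm0, mm1);  /* pcmpgtw MM0, MM1 : mask              */
    mm2 = mm2 & mm0;              /* pand    MM2, MM0 : input where FF    */
    mm0 = pandn(mm0, mm1);        /* pandn   MM0, MM1 : threshold where 00 */
    mm0 = mm0 | mm2;              /* por     MM0, MM2 : merged result     */
    return mm0;
}
```

Because a single mask and its complement are used, a word equal to the threshold simply takes the threshold value, and MM1 is left untouched.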

OTHER TIPS

So you have a mask = 0xFFFF0000FFFF0000; then:

all_ones = 0xFFFFFFFFFFFFFFFF;

inverted_mask = mask XOR all_ones;

Merging M0 and M1 is then:

M0 = M0 AND mask;
M1 = M1 AND inverted_mask;
M0 = M0 OR M1;

This modifies M0 and M1 in place, so their original values are destroyed. If you want to preserve M1, store the intermediate result in a temporary variable, register, or memory location:

M0 = M0 AND mask;
TEMP = M1 AND inverted_mask;
M0 = M0 OR TEMP;
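
The pseudocode above maps directly onto integer operations. Below is a minimal C sketch of it, assuming the mask comes from a word-wise signed compare as in the question. The names cmpgt_words and select_with_xor are invented for this example. In MMX itself, an all-ones register is commonly produced by comparing a register with itself for equality (for example pcmpeqw MM3, MM3) rather than loading a constant.

```c
#include <stdint.h>

/* Model of PCMPGTW: signed greater-than on each of the four 16-bit lanes.
   A lane becomes 0xFFFF where a > b and 0x0000 otherwise. */
static uint64_t cmpgt_words(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;
    for (int lane = 0; lane < 4; ++lane) {
        int16_t x = (int16_t)(uint16_t)(a >> (16 * lane));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * lane));
        if (x > y)
            mask |= (uint64_t)0xFFFF << (16 * lane);
    }
    return mask;
}

/* Keep lanes of m0 that exceed m1 and take m1 everywhere else.
   The temporary keeps m1 itself unchanged, as in the second variant above. */
static uint64_t select_with_xor(uint64_t m0, uint64_t m1)
{
    const uint64_t all_ones = UINT64_MAX;
    uint64_t mask = cmpgt_words(m0, m1);
    uint64_t inverted_mask = mask ^ all_ones;  /* XOR with all ones = NOT */
    uint64_t temp = m1 & inverted_mask;
    return (m0 & mask) | temp;
}
```

For the values in the question, select_with_xor(0x0403020104030201, 0x0202020202020202) yields 0x0403020204030202.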
Licensed under: CC-BY-SA with attribution