Intel and AMD

YourSurrogateGod
December 1st, 2004, 02:58 PM
From what I hear, AMD is a clone of Intel. Does that mean that assembly that executes fine on Intel will work just as well on AMD? Sorry if this question has been asked already...

Maven
December 3rd, 2004, 03:19 PM
It may be slower, but it will run, unless you use optimizations that are specific to Intel or AMD.

kahlinor
December 4th, 2004, 12:45 AM
If you do not use AMD-specific instruction extensions, there won't be a conflict.

NoHero
December 15th, 2004, 02:01 PM
Quoting YourSurrogateGod: "From what I hear, AMD is a clone of Intel. Does that mean that assembly that executes fine on Intel will work just as well on AMD? Sorry if this question has been asked already..."
They both use the basic instruction set specified by IBM several years ago. If you look deeper into some graphical/multimedia extensions, you will find differences between them there. So beware: your source code will work on both machines if you don't use any Intel multimedia extensions.

Sam Hobbs
December 15th, 2004, 05:24 PM
Just think a bit. Have you ever seen versions of software that were specific to AMD or Intel? In other words, have you ever seen software that requires either an AMD or Intel processor? Well, maybe. If so, then they require a specific processor due to some extra features of the specific processor, but if so, then it is rare. As far as I know, at least 99% of all Pentium software works with either Intel or AMD.

Sam Hobbs
December 15th, 2004, 05:39 PM
Quoting NoHero: "They both use the basic instruction set specified by IBM several years ago."
Specified where? I am very interested in seeing the IBM specifications describing the basic Pentium instruction set.
The basic Pentium instruction set of course is based upon the processor used in the original IBM PC, so I suppose it would be those specifications that are relevant. So did IBM contract with Intel to build the processor designed by IBM for the PC?
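
Several replies above hinge on whether code uses vendor-specific extensions. On x86, a program can ask the processor who made it with the CPUID instruction: leaf 0 returns a 12-character vendor string spread over EBX, EDX and ECX. The following is only a minimal sketch in C of how those three register values decode into text. The helper name `vendor_from_regs` is hypothetical, and the register values are passed in as plain integers so the sketch does not depend on any particular compiler's CPUID intrinsic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the thread): rebuild the 12-character
 * vendor string from the EBX, EDX and ECX values returned by CPUID
 * leaf 0. Bytes are taken least-significant first, in the register
 * order EBX, EDX, ECX. `out` must have room for 13 bytes. */
static void vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx,
                             char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    assert(out != NULL);
    for (int r = 0; r < 3; r++) {
        for (int b = 0; b < 4; b++) {
            /* Extract byte b of register r. */
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
        }
    }
    out[12] = '\0';
}
```

For example, the values 0x756e6547, 0x49656e69, 0x6c65746e decode to "GenuineIntel", and 0x68747541, 0x69746e65, 0x444d4163 decode to "AuthenticAMD". In practice it is usually safer to test the individual CPUID feature bits for the extension you need than to branch on the vendor name.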

YourSurrogateGod
December 15th, 2004, 08:01 PM
Quoting Sam Hobbs: "Just think a bit. Have you ever seen versions of software that were specific to AMD or Intel? In other words, have you ever seen software that requires either an AMD or Intel processor? Well, maybe. If so, then they require a specific processor due to some extra features of the specific processor, but if so, then it is rare. As far as I know, at least 99% of all Pentium software works with either Intel or AMD."
:shrug: I'm new to assembly, hence the question. I know that C, C++ and Java will work fine; I just wasn't sure if the linker did something different on an AMD machine in order to get them to run, as opposed to an Intel one.

Sam Hobbs
December 15th, 2004, 09:10 PM
I have not done a lot of assembler programming, but I have done some, both for Intel processors and for old-style big IBM "Mainframe" processors. I am familiar with what compilers and linkers do, and what happens during execution. I don't say that to brag; I am trying to avoid making you feel bad for not being familiar with those things. For me, it is very obvious that there is no difference between the machine instructions for AMD and Intel processors.
The output of a linker is dependent on the operating system. For a Linux system, I assume the output would be the same in terms of format, but the machine instructions would vary depending on the processor.
The fundamental purpose of a linker is to combine files created by compilers; the files might have been created by the same compiler or they could be from different compilers (and an assembler, of course). In most environments, a linker is the only way to mix source code from more than one compiler into a single static link. A linker also allows multiple source files for a single compiler/assembler to be combined into a single executable.

NoHero
December 16th, 2004, 05:51 AM
Quoting Sam Hobbs: "Specified where? I am very interested in seeing the IBM specifications describing the basic Pentium instruction set. The basic Pentium instruction set of course is based upon the processor used in the original IBM PC, so I suppose it would be those specifications that are relevant. So did IBM contract with Intel to build the processor designed by IBM for the PC?"
In those old days when IBM ruled this business, they made a specification (based on the Intel processor's engine) of how a Personal Computer should work. This was necessary because Intel was unable to protect their label "x86". So other companies created clone processors called "7086", for example (I remember my friend had such a thing; he was really pissed off when he realized that normal DOS didn't work on it. I can't say whether it was created by AMD or not), but those were not compatible with any standard. IBM specified that, based on the work of Intel; you may see "IBM compatible" on old software and on old devices. Intel renamed its processor to Pentium so they could protect it and avoid clones such as the "7086". That's the story behind IBM and Intel and why AMD needs to fit this specification. MIPS or PPC processors would be another story.

Sam Hobbs
December 16th, 2004, 12:11 PM
Again, show us the specifications. I doubt the accuracy of most of what you are saying. I searched the internet for "+7086 processor"; I did not see anything about a 7086 processor. So if you can find anything about what you are talking about, then that will be interesting.

NoHero
December 16th, 2004, 12:15 PM
Quoting Sam Hobbs: "Again, show us the specifications. I doubt the accuracy of most of what you are saying. I searched the internet for "+7086 processor"; I did not see anything about a 7086 processor. So if you can find anything about what you are talking about, then that will be interesting."
I always thought so. :blush: Otherwise I cannot explain the "IBM Compatible" sticker my old 486 processor has... If you can, please feel free to do so.

NoHero
December 16th, 2004, 12:25 PM
"IBM, a strategic partner of Intel, soon started manufacturing PCs based on this CPU. Several manufacturers (Compaq, Columbia, Kayro, etc.) also built machines with i8086 inside, and claimed them to be "IBM-compatible" (however, their BIOSes weren't totally compatible with IBM's one). About 14.9 million units were sold (including clones)."
OK, it's not exactly what I meant, but it points in the same direction... Sorry for confusing you all: http://www.alasir.com/x86ref/

Maven
December 17th, 2004, 05:08 PM
As I said above, unless you use specific instructions related to Intel or AMD, your code will run. They both use the x86 language.
I also said that it might run slower. This is true when you move from one type of processor to another: moving from Intel to AMD, or even from one Intel chip to a different Intel chip. The reason I say this is that some things work better on some processors than on others, because of how the chip handles the instructions internally; some handle certain instructions well and others badly. Things that may have been in fashion for a P2 may not be in fashion for a P4.
The only instruction that I know of that you should avoid on any processor is "Div" or any form of it. Division instructions whore clock cycles like a *****. Even in high level languages I do workarounds for division when they are in loops to keep the compiler from spitting out a div instruction.

Sam Hobbs
December 17th, 2004, 06:26 PM
Quoting Maven: "Even in high level languages I do workarounds for division when they are in loops to keep the compiler from spitting out a div instruction."
Assuming that the denominator is totally variable (unpredictable), are there workarounds that are faster than the Pentium processor's instructions for doing the division? I know that division is among the most processor-intensive instructions for all processors, but the reason why is fundamental to division. I assume that the Intel and AMD engineers have done the best they could to optimize the process, and if you have something better, then you should send them a message. They are likely to be very interested in your algorithm.
Certainly when dividing by 2 or a power of 2, it is much more efficient to shift the bits, but that works only when it is known at design time what the denominator is.

NoHero
December 18th, 2004, 07:09 AM
Yes, division is quite processor intensive. But implementing it in the ALU as a series of subtractions made it quite fast. Don't forget the prefetch unit, which improves the speed considerably. And it would be interesting to know whether your division code is faster than the instruction on the Intel processor. I never looked deeply enough into this to say whether the MMX clone of div is faster than the normal div/idiv, but most programmers use that if they need a faster division algorithm.
Back to the topic of the clones: here (http://www.paradicesoftware.com/specs/cpuid/) you can find some old Intel x86 clones. Just scroll down to "5. Identifying CPU manufacturer through CPUID.". The old ones are those that are not links.

Maven
December 18th, 2004, 05:42 PM
Quoting Sam Hobbs: "Assuming that the denominator is totally variable (unpredictable), are there workarounds that are faster than the Pentium processor's instructions for doing the division? I know that division is among the most processor-intensive instructions for all processors, but the reason why is fundamental to division. I assume that the Intel and AMD engineers have done the best they could to optimize the process, and if you have something better, then you should send them a message. They are likely to be very interested in your algorithm. Certainly when dividing by 2 or a power of 2, it is much more efficient to shift the bits, but that works only when it is known at design time what the denominator is."

Well, the first thing you want to do is see if you can avoid the use of division in your algorithm. For example, I saw an xor encryption algorithm the other day that used division to keep track of its location in the key buffer.

Example code (purpose: xor message with a key):

    for (i=0; i<=msgsize; i++)
    {
        crypt[i] = msg[i] ^ key[i % keysize];
    }

The % operator is a division instruction that returns the modulus, which is stored in edx. In other words, in assembly it'll spit out an idiv instruction. Now in this example, we could get rid of the need for division by using pointers to keep track of our key.

Example in asm view:

    SetKeyBack:
        sub esi, ebx
        jmp lblCon

    align 16
    For_Loop:
        cmp esi, esp
        je SetKeyBack
    lblCon:
        mov al, [edi]
        add ecx, 1
        add edi, 1
        xor al, [esi]
        add esi, 1
        mov byte ptr [edx], al
        add edx, 2
        cmp ecx, ebp
        jne For_Loop

Basically what that does is add 1 to the pointer into the key string, and when it gets to the end, we set it back to the start of the string.

I'm sure you're asking at this point: what if I had to have the division, though? Depending on the nature of the numbers we are working with, we can do different things. Let's take the xor example and do a workaround for division instead of getting rid of it completely.

The rule is, it's faster to multiply than to divide.

    64 / 8 = 0.125 * 64

It's faster to use 0.125 * 64 than it is to use 64 / 8. So how can we make use of this?

    float num = 1 / keysize;
    for (i=0; i<=msgsize; i++)
    {
        crypt[i] = msg[i] ^ key[i * num];
    }

This is faster simply because we got rid of division inside the loop.

Another thing you can do to avoid division: if you have a full 32-bit number, you can multiply it and take the top 32-bit half as the answer. As you also pointed out, when you're working with numbers that are a power of 2, you can shift bits to do division or multiplication.

The fastest thing you can do to do division is subtraction or addition. In other words, see how many of x can fit inside of y. For example:

    mov edx, 9
    mov ecx, 3
    xor eax, eax
    divloop:
        add eax, 1
        sub edx, ecx
        jae divloop     ; keep going until the subtraction borrows (jae, so exact multiples such as 9 / 3 are counted fully)
    sub eax, 1          ; undo the final, overshooting count

This can really bite you, though, because it's a hard thing to do correctly. If you do it like above, the more x it takes to fit inside of y, the slower the algorithm becomes, until div is faster. It takes clever little algorithms to keep that from happening, and more often than not it ties you to a processor faster than you can say: pneumonoultramicroscopicsilicovolcanoconiosis

Sam Hobbs
December 18th, 2004, 08:35 PM
Maven, you did not provide the requested clarification of how to "do workarounds for division" that will work when the "denominator is totally variable (unpredictable)". You were just too strong in what you said earlier; you indicated it is always, or nearly always, possible to write code that does not use division that is faster than code that does use division. It is good to say to avoid division when possible, but it is not always possible. Whether you want to admit it or not, sometimes it is necessary to divide and therefore more efficient to use the processor's division instructions. Even if you don't want to admit it, I know that other people know what I am saying.

Maven
December 18th, 2004, 08:50 PM
Quoting Sam Hobbs: "Maven, you did not provide the requested clarification of how to "do workarounds for division" that will work when the "denominator is totally variable (unpredictable)". You were just too strong in what you said earlier; you indicated it is always, or nearly always, possible to write code that does not use division that is faster than code that does use division. It is good to say to avoid division when possible, but it is not always possible. Whether you want to admit it or not, sometimes it is necessary to divide and therefore more efficient to use the processor's division instructions. Even if you don't want to admit it, I know that other people know what I am saying."
Multiplying will always work to get a quotient and is always faster.
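
The pointer-wrap idea from the xor example above can also be written directly in C. What follows is a minimal sketch, not the poster's code: the function names `xor_with_modulo` and `xor_with_wrap` are made up for illustration, and the loops run over exactly `msgsize` bytes. The second function replaces `i % keysize` with a key index that is reset to zero when it reaches the end of the key. Note that multiplying by a precomputed reciprocal yields a quotient, not the remainder that `%` produces, so it is not a drop-in replacement for the key index; the wrap counter is.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: one remainder operation per message byte. */
static void xor_with_modulo(const unsigned char *msg, size_t msgsize,
                            const unsigned char *key, size_t keysize,
                            unsigned char *out)
{
    assert(keysize > 0);
    for (size_t i = 0; i < msgsize; i++) {
        out[i] = msg[i] ^ key[i % keysize];
    }
}

/* Same output, but the key position is a counter that wraps back to
 * zero at the end of the key, so the loop body contains no division. */
static void xor_with_wrap(const unsigned char *msg, size_t msgsize,
                          const unsigned char *key, size_t keysize,
                          unsigned char *out)
{
    size_t k = 0; /* current position in the key */

    assert(keysize > 0);
    for (size_t i = 0; i < msgsize; i++) {
        out[i] = msg[i] ^ key[k];
        if (++k == keysize) {
            k = 0; /* reached the end of the key: start over */
        }
    }
}
```

Both functions produce identical output, and applying either one twice with the same key restores the original message. Whether the wrapped version is actually faster depends on the compiler and the processor, so it is worth measuring before relying on it.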

Maven
December 18th, 2004, 08:54 PM
Quoting Maven: "Multiplying will always work to get a quotient and is always faster."
Example
Problem: 5/n
Solution: 0.2 * n

NoHero
December 19th, 2004, 06:11 AM
Wow. Never thought of such a solution.
@Sam Hobbs: Is there a case where this way can't work? I can't think of one.

Sam Hobbs
December 19th, 2004, 02:35 PM
This will work when the numerator is known, but what if it varies? How do we determine what to multiply by if the numerator varies and is unpredictable (at design time or whatever)?
This will work if the denominator varies, but won't work if the numerator varies. In those situations where both the numerator and denominator vary, division is a necessity.
Some people can write many words except one: sorry, as in I'm sorry, I made a mistake.

Maven
December 19th, 2004, 07:49 PM
Quoting Sam Hobbs: "This will work when the numerator is known, but what if it varies? How do we determine what to multiply by if the numerator varies and is unpredictable (at design time or whatever)? This will work if the denominator varies, but won't work if the numerator varies. In those situations where both the numerator and denominator vary, division is a necessity. Some people can write many words except one: sorry, as in I'm sorry, I made a mistake."
The only case where you cannot work around it is when the denominator and numerator change with every iteration of a loop and the numbers are not full 32-bit numbers. If that is the case, then you would have no choice but to put a division inside your critical code. However, I must say that it's quite rare to run into such a thing. If you do, though, the only advice I can give you is to xor edx, edx before you do your division. It'll make it run in about half the time.
If you're simply saying that you don't know what the numerator will be at design time but know that it is not going to change during your critical code, you can use a workaround to get the division out of your loop.

Example:

    for i = 1 to 2000
        val = n / d
    next i

change to:

    buf = 1 / d
    for i = 1 to 2000
        val = n * buf
    next i

Sam Hobbs
December 19th, 2004, 08:05 PM
Quoting Maven: "The only case where you cannot work around it is when the denominator and numerator change with every iteration of a loop and the numbers are not full 32-bit numbers. If that is the case, then you would have no choice but to put a division inside your critical code. However, I must say that it's quite rare to run into such a thing. If you do, though, the only advice I can give you is to xor edx, edx before you do your division. It'll make it run in about half the time."
Very good. It is important to say things like that for the benefit of those we are helping.
Another important point that you emphasize is the importance of the code. As you indicate, if the code is executed only once, then it is reasonable to use whatever is convenient for us, such as a divide instruction.

Maven
December 19th, 2004, 08:17 PM
Quoting Sam Hobbs: "Some people can write many words except one: sorry, as in I'm sorry, I made a mistake."
My comments have been accurate.
Quoting Maven: "The only instruction that I know of that you should avoid on any processor is "Div" or any form of it. Division instructions whore clock cycles like a *****. Even in high level languages I do workarounds for division when they are in loops to keep the compiler from spitting out a div instruction."
I don't see where I have failed to demonstrate that. However, let's point out your comments:
Quoting Sam Hobbs: "Maven, you did not provide the requested clarification of how to "do workarounds for division" that will work when the "denominator is totally variable (unpredictable)""
Then you come back and say:
Quoting Sam Hobbs: "This will work if the denominator varies, but won't work if the numerator varies. In those situations where both the numerator and denominator vary, division is a necessity."
The best part is you ask me to tell you sorry for accurately answering your question. Makes one suspect you're a troll on a fishing expedition.
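
On the advice above to `xor edx, edx` before a division: the 32-bit x86 `div` instruction does not divide EAX alone. It divides the 64-bit value EDX:EAX by its operand, leaves the quotient in EAX and the remainder in EDX, and raises a divide error if the divisor is zero or the quotient does not fit in 32 bits. So clearing EDX is first of all a matter of getting the intended result, whatever it does to timing. The following is only a sketch that models this behaviour in portable C; the function name `model_div32` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of 32-bit `div r32`: the dividend is the 64-bit
 * value EDX:EAX. Returns 0 and stores quotient and remainder on
 * success; returns -1 where the real instruction would raise a divide
 * error (divisor is zero, or the quotient needs more than 32 bits). */
static int model_div32(uint32_t edx, uint32_t eax, uint32_t divisor,
                       uint32_t *quotient, uint32_t *remainder)
{
    assert(quotient != NULL && remainder != NULL);
    if (divisor == 0) {
        return -1;
    }

    /* Combine the two registers into the full 64-bit dividend. */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t q = dividend / divisor;
    if (q > UINT32_MAX) {
        return -1; /* quotient would not fit in EAX */
    }

    *quotient = (uint32_t)q;
    *remainder = (uint32_t)(dividend % divisor);
    return 0;
}
```

With EDX = 0, EAX = 25 and a divisor of 3, the model gives quotient 8 and remainder 1. With EDX = 1 instead, the dividend becomes 2^32 + 25 and the quotient is 1431655773, and with EDX = 3 and EAX = 0 the quotient no longer fits in 32 bits, which is the divide-error case.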

Maven
December 19th, 2004, 08:23 PM
Quoting Sam Hobbs: "Very good. It is important to say things like that for the benefit of those we are helping. Another important point that you emphasize is the importance of the code. As you indicate, if the code is executed only once, then it is reasonable to use whatever is convenient for us, such as a divide instruction."
It's a waste of time and effort to optimize code that isn't critical to our applications. Before you set out to optimize code, you should always identify what is critical and what is not.

Maven
December 19th, 2004, 08:31 PM
While I'm here:

    ; 58 clock cycles
    mov eax, 25
    mov ecx, 3
    div ecx

The following runs in half the time:

    ; 29 clock cycles
    xor edx, edx
    mov eax, 25
    mov ecx, 3
    div ecx

There is also a site written by Agner Fog that has a nice tutorial on dividing with a constant. You can find it here: http://www.agner.org/assem/

Sam Hobbs
December 19th, 2004, 08:59 PM
Quoting Maven: "Makes one suspect you're a troll on a fishing expedition."
Actually, I was fishing for a positive finish for this discussion. Accusing me of trolling means you don't have anything more relevant to accuse me of.

codeguru.com
Copyright WebMediaBrands Inc., All Rights Reserved.