Why aren't there any standard stack allocation routines? An answer could be that stack is an implementation terminology that might not hold relevance on certain platforms and hence, allocation routines that would be based on it are not standardized. But if that is the case, are there any practical implementations (OS or Compiler) that do not relate to a stack as a concept? If there are, how do they handle normal automatic variable allocations, function call stack and exception handling? Is that alternative efficient enough as compared to dynamic allocations?
Jim_King_2000
November 26th, 2006, 11:48 AM
I know some compilers for microcontrollers that may allocate memory without relation to stack. One example is 8051. Keil C compiler can use on-chip and out-chip RAMs or even registers to store parameters and local auto variables. But the stack concept still exists. Stack is a hardware concept. You may read books on computer architecture or assembly language to get aware of it. The allocation process is to modify the registers relevant to stack, such as EBP. However, dynamic allocation(heap or memory pool) in any platform is expensive, for the sake of concurrency and protection.
SuperKoko
November 26th, 2006, 12:37 PM
But if that is the case, are there any practical implementations (OS or Compiler) that do not relate to a stack as a concept?
Having a stack is not sufficient for having a stack allocation mechanism.
For example, on i386 architecture, it is not possible to have a stack allocation routine, allocating a variable amount of memory, without some specific stack frame model.
It means that, though it's possible on i386, compilers that disable or detect the absence of such allocation can produce better code.
Another problem: There are probably platforms on which such allocation mechanism would be quite impossible or dramatically reduce performances.
For example, on 6502 CPU (which has not more than 256 bytes of stack), it would be quite impossible to do it... Except if the allocation of "dynamic stack data" is not done on the real stack, but on a separate, independent stack.
I think that many CPU would require such workaround.
So, why would you want it? It's better to write a class encapsuling these operations.
Furthermore, dynamic allocation on the stack (if and only if the same stack is used for function calls) is very unsafe, since you can't know how much stack space you'll ever use, and the stack overflow condition may occur after the allocation itself, and when a function is called....
Assuming that the standard require compilers for throwing software exceptions when a stack overflow condition occur:
1) It would require a lot of checking code on some platforms (except if this stack is managed separately), dramatically reducing performances.
2) It would produce unrecoverable exceptions... Since, *any* function call or perhaps worst... any statement using temporary variables... would become a potentially throwing operation!
Consequently, it would become impossible to write exception safe code (no-throw operations are necessary for exception safety).
exterminator
November 26th, 2006, 01:34 PM
First off, thanks to Jim_King_2000 and SuperKoko for taking your time and responding.I know some compilers for microcontrollers that may allocate memory without relation to stack. One example is 8051. Keil C compiler can use on-chip and out-chip RAMs or even registers to store parameters and local auto variables. But the stack concept still exists. Stack is a hardware concept. You may read books on computer architecture or assembly language to get aware of it.I have no knowledge of assembly programming as such. But if there's an article (explanation) I can give it a try. But what do you mean when you say stack is a hardware concept? Whatever is the way to work with stack based storage, I query about any possibilities of dynamic allocation on it. I have a little idea that there are alignment issues between stack based allocations and heap based allocations (which is preferred for placement new, but let's not deviate). Can they be a governing factor why this is not universally possible?Having a stack is not sufficient for having a stack allocation mechanism.
For example, on i386 architecture, it is not possible to have a stack allocation routine, allocating a variable amount of memory, without some specific stack frame model.Frankly, I am not quite aware of stack frame models. Can you provide some specifics or a link to a brief article that explains something about these?It means that, though it's possible on i386, compilers that disable or detect the absence of such allocation can produce better code.Let's keep better code in the left hand for the moment. We will come back to this once some of my basic questions get a little clear.Another problem: There are probably platforms on which such allocation mechanism would be quite impossible or dramatically reduce performances.Why would it be impossible? Dramatically reducing performance is in my left hand, we will revisit it in some time. :)For example, on 6502 CPU (which has not more than 256 bytes of stack), it would be quite impossible to do it... Except if the allocation of "dynamic stack data" is not done on the real stack, but on a separate, independent stack. I think that many CPU would require such workaround.What I am asking about is allowing me to do dynamic allocation at runtime. I am never asking the dynamic allocation to be so big as to blow up the stack. That can always happen at static allocation as well, stack overflow exception would only occur at runtime. Isn't it? Again, I am not asking for huge allocations for which of course, I would prefer dynamic allocation on the free-store and not on stack regardless of it being static or dynamic.So, why would you want it? It's better to write a class encapsuling these operations.Which operations? Stack allocations or stack-like allocations?Furthermore, dynamic allocation on the stack (if and only if the same stack is used for function calls) is very unsafe, since you can't know how much stack space you'll ever use, and the stack overflow condition may occur after the allocation itself, and when a function is called....But the same can occur with the stack allocations at known at compile time. There is no check done for that as well. Also, I am not asking for huge allocations, very small, say 5-10 elements. I don't think that should be a problem. Stack overflow can always happen and the programmer is left with the decision to choose between dynamic or stack allocation depending upon how much he wants to do. The exact same case holds here as well. I just ask for this ability. I ask if this ability is not provided what could be the possible reasons?Assuming that the standard require compilers for throwing software exceptions when a stack overflow condition occur:
1) It would require a lot of checking code on some platforms (except if this stack is managed separately), dramatically reducing performances.No checks asked for. I don't ask for any checks to be made before allocation. Let the code go as it goes. If I ask for more than 1 MB (assuming that's the limit), and the program stack is susceptible to overflow, let it. That would be my responsibility. In fact that happens otherwise as well, as I mentioned above.2) It would produce unrecoverable exceptions... Since, *any* function call or perhaps worst... any statement using temporary variables... would become a potentially throwing operation!
Consequently, it would become impossible to write exception safe code (no-throw operations are necessary for exception safety).I understand but that is ok.
I am more interested in finding out why implementing something like this could be tough? Basically, standardization, which means across all platforms. There can be two possibilities:
1. There is a stack but dynamic stack allocation is problematic - I want to find out why/how?
2. There is no stack - but then in this case how and where do they allocate automatic (or stack based) variables/objects.
SuperKoko
November 26th, 2006, 03:36 PM
But what do you mean when you say stack is a hardware concept?
How could you define formally (on an abstract machine) what a stack is in a useful way, without restricting implementations from being high level?
For now, I can only see low-level definitions of stack, heavily relying on assembly-level concepts that the standard should never mention.
There are perhaps possibilities to define this word with a high-level meaning, but I doubt it'll ever be a useful concept.
On the other side, it is easy to define a "stack-allocating function" such as the alloca supported by a few compilers, because it doesn't require defining the word "stack".
Whatever is the way to work with stack based storage, I query about any possibilities of dynamic allocation on it
What, if dynamic allocation is impossible on it?
Frankly, I am not quite aware of stack frame models. Can you provide some specifics or a link to a brief article that explains something about these?
For example, consider a program that would do something like:
int func(size_t n) {
int x=42; // assume it's not put in a register.
char* p=(char*)dynamically_allocate_on_the_stack(n);
return x;
};
With an efficient stack frame, it's impossible:
func:
sub esp,8
mov dword ptr [esp+4], 42 ; assigns a value to x
sub esp, [esp+12] ; dynamically allocates the stuff
mov [esp+???], esp ; assign that to p
mov eax,[esp+???] ; loads x
add esp, ??? ; where could we store the amount of stack used?
ret
How do you want the compiler to know where x is, relatively to esp?
He can't know.... Because the stack space is variable.
If there is only static allocation on the stack, the compiler has no problem.
It is possible to work around this problem, using the ebp register for addressing the stack space:
func:
push ebp ; needs to save this register
mov ebp, esp ; this register is now reserved for this usage (1 register lost on the very few registers of the i386)
sub esp,8 ; stack space for x and p
mov [ebp-4], 42 ; assigns 42 to x
sub esp, [esp+8] ; allocates space on stack
mov [ebp-8], esp ; assigns it to p
mov eax, [ebp-4] ; loads x
mov esp, ebp
pop ebp ; clean up stack frame
ret
1) four enter/exit instructions are required.
2) a register is condemned.
What I am asking about is allowing me to do dynamic allocation at runtime. I am never asking the dynamic allocation to be so big as to blow up the stack.
That's not the problem.
The 6502 has no direct mean to decrement the stack pointer by an arbitrary number, and it has no mean to access memory relative to the stack pointer except the element on the top of the stack.
It has only a call/ret/push/pop mechanism, plus instructions to move the stack pointer into the X register or the X register into the stack pointer.
So, in fact, the workaround equivalent to the use of ebp on i386, would require storing the address of S when the function is entered (because there are only very very few registers, we can't condemn any of them), somewhere in the page zero, and access the stack with indirect accesses, not only for the dynamically allocated data, but for any variable on the stack...
Furthermore, putting this "dynamic stack allocation" elsewhere than in the hardware stack, would not increase this penalty for dynamic stack allocation, but would avoid the very serious performance penalty on the hardware stack.
So, my point is that, the hardware stack must not be used here. There can be a software stack for dynamic allocation, but it will be equivalent to a library stack.
But the same can occur with the stack allocations at known at compile time. There is no check done for that as well.
Except when using recursion, it is possible to prove that a stack overflow can't occur in a program.
1. There is a stack but dynamic stack allocation is problematic - I want to find out why/how?
Read a few architecture manuals.... Even if there is a stack, dynamic stack allocation is likely to be a problem.... In fact, it is *always* a problem, because it fundamentally require storing an extra stack frame pointer, in addition to the usual stack pointer.
It can be stored in another register, on some platforms... It must be stored at a static place in memory on other platforms.
Even the old K&R C prototype-free functions (thus, requiring the caller to clean up the stack frame) are problematic and tough to implement on many platforms.
http://cm.bell-labs.com/cm/cs/who/dmr/clcs.html
Fortunately, this problem nowadays, exists only with variadic parameter functions.
Prototype-free functions or variadic argument functions are far less problematic to implement than dynamic allocation on the stack.
2. There is no stack - but then in this case how and where do they allocate automatic (or stack based) variables/objects.
That's probably true for C++ interpretors (such as CINT).
There may be a stack, not being the hardware stack.
For example, suppose that an interpretor wants to let the user use many recursive functions... a std::deque might be an adequate container for that.... But, it is not possible to allocate an arbitrary large contiguous chunk on a std::deque or any similar data structure.
Another, very possible scenario: a stack formed of pointers (all of the same size) to objects of an internal Variable class type.
This stack would only store pointers to the heap, and dynamic allocation on the stack itself would not be possible, though a feature such as alloca would well be possible... Allocating on the heap, of course.
Conclusion: alloca-like functions are implementable in an efficient way... But, most of the time, it would not allocate the stuff on the hardware stack, but on a separate software stack or on the heap.... Even if the standard would say that this function must allocate on a "hardware stack" (I don't see how the standard could define this term), the as-is rule would imply that compilers are free to implement it as they can, out of the hardware stack.
Anyway, I don't think it would give any feature that we don't already have... It's easy to write a library for software stack allocation.
And, on computers where it is possible to efficiently dynamically allocate on the hardware stack (or on a software stack), compilers writers are free to allocate the room for automatic std::vector variables on this stack if they can prove that it makes no difference.
Zaccheus
November 26th, 2006, 04:43 PM
Why aren't there any standard stack allocation routines?
Are you asking for fast memory allocation or for automatic deletion?
Jim_King_2000
November 26th, 2006, 09:18 PM
Stack is LIFO(last in, first out). And many CPUs implement it by means of hardware. Of course you can implement it by software(such as STL::stack). However, CPUs and compilers use the first one to store local variables. On Windows, you can employ "_alloca" to allocate stack memory dynamically, and the memory allocated by this way will be deallocated automatically. This method need a function which is not belong to ANSI C, so I don't think it would work on other platforms. So, I think you should find some substitutions.
Jim_King_2000
November 26th, 2006, 10:34 PM
For the reason to not providing stack dynamic operations within ANSI, I think:
1. It's not permanent. The stack will vary with function calls and returns. So if you store something in stack, and then you call some functions or some functions return, the content of which you stored may change, even the space is never belong to you. So you must use "_alloca" locally and carefully.
2. It's not safe. Many important data(including function return address) are stored in stack. So manipulations of stack will cause a lot of system errors such as invalid memory access, stack overflow, thread crash, etc.
3. It's not necessary. You can use local auto array variables, or memory pool. Why do you love dynamic stack allocation so much?
Zaccheus
November 27th, 2006, 03:50 AM
Personally I can see no reason why a compiler should not be able to cope with:
void f(size_t n)
{
int arr[n];
// ...
}
I don't think arr would even have to live on the stack, as long as it is destroyed when it goes out of scope.
On systems where it can live on the stack, it could be more efficient than std::vector.
exterminator
November 27th, 2006, 06:10 AM
Yeah, that is what is my reason for asking about it. Had it been possible, we could have had variable length fixed arrays in C++ directly without relying on the free store to keep the allocation.
This question of mine has source from one of the discussions that I came across on comp.lang.c++.moderated.
The problem with std::vector is that, it is not possible to lock it's inherent "growable" property. And hence it sometimes becomes tough to have multi-dimensional arrays or even single dimensional arrays which should in fact be arrays with fixed size (known at runtime) and don't grow and also provide auto-cleanup. Moreover, vector of vector is actually jagged and not contiguous. (Let's not discuss for the moment if vector of vector is the right choice for implementing 2-D arrays) and you can't restrict one of the rows to grow more than others.
One can surely craft a class encapsulating this logic and allocating using new, but that would be on the heap. I had a feeling if it had direct support for allocation on stack, it would be better.
I am not sure what implementations are used for C's VLA's and if it is efficient and portable. alloca routines provided by gcc and vc++ are closer match but I guess they are platform dependent extensions.
SuperKoko
November 27th, 2006, 07:19 AM
Personally I can see no reason why a compiler should not be able to cope with:
void f(size_t n)
{
int arr[n];
// ...
}
I don't think arr would even have to live on the stack, as long as it is destroyed when it goes out of scope.
Yes, of course, VLA are possible to implement, even though a good implementation will probably not put VLA on the hardware stack.
On systems where it can live on the stack, it could be more efficient than std::vector.
But, it brings nothing new from a programmatic point-of-view compared to std::vector... And for optimizations, implementations are allowed to allocate std::vector on the stack as far as the as-is rule is respected (though, it might be hard because vectors are guaranteed to be exception safe)!
I know that many people are tempted to add into their language a builtin feature for each of the specific feature that his favorite platforms (such as SIMD) supports.
That's very bad for the long term.... Because hardware changes and those features would become more and more emulated while newer ones would appear and the language will finally become a huge heap consisting of all the CPU features of all existing platforms (+ platforms of the past)...
Fortunately, the C and C++ standard maintain a level of abstraction, avoiding hardware details.
SIMD and other things, should be handled by the optimizer, not the programmer... It guarantees that C++ will remain portable to several platforms and for a long time.
Yeah, that is what is my reason for asking about it. Had it been possible, we could have had variable length fixed arrays in C++ directly without relying on the free store to keep the allocation.
It's a better idea to write a better (faster) free-store... I know that VC++ free store is really very slow, but there exists much better implementations requiring only a few CPU cycles for allocating a new block (and it's exception safe).
The problem with std::vector is that, it is not possible to lock it's inherent "growable" property. And hence it sometimes becomes tough to have multi-dimensional arrays or even single dimensional arrays which should in fact be arrays with fixed size (known at runtime) and don't grow and also provide auto-cleanup.
I don't see the problem... If you don't want to make the vector grow, simply don't call resize() or push_back()... Or write your vector-like class containing a vector and looking like a vector except that resize(), push_back() and pop_back() would be removed from the interface.
Moreover, vector of vector is actually jagged and not contiguous. (Let's not discuss for the moment if vector of vector is the right choice for implementing 2-D arrays) and you can't restrict one of the rows to grow more than others.
I don't see how stack allocation would resolve this problem.
VLA don't solve this problem either, except if you're refering to the proposal of Dennis Ritchie (http://cm.bell-labs.com/cm/cs/who/dmr/vararray.html)... Which is unrelated to stack and inadequate for C++ because classes better handle the issue.
I had a feeling if it had direct support for allocation on stack, it would be better.
How would it be better? The class wouldn't be easier to write, and it wouldn't be much faster than a correctly-written heap manager.
And if a heap can't be fast enough, why wouldn't you write a stack class and create a single instance of it (or one instance for each thread)?
It would be perhaps more efficient than using alloca!
It would be exception-safe.
Moreover a not-too-bad heap, the CPU time consumed by allocation/deallocation is very small, especially for large memory blocks such as 2D matrices.
I am not sure what implementations are used for C's VLA's and if it is efficient and portable
It is possible to write efficient VLA implementations, but, they are not guaranteed (fortunately) to use the hardware stack.... For example, a software stack or the heap will be usually more adequate.
alloca routines provided by gcc and vc++ are closer match but I guess they are platform dependent extensions
Since alloca produces memory deallocated by longjmp(), it's very much platform-dependent.
I hope you don't use setjmp()/longjmp() often. Do you?
If you want, you can propose a stack allocation class as a library extension.
I don't think C++ is a weak language that needs to be filled of abstraction-lacking features. C++ is powerful enough to be extended by libraries.
PS: Adding any feature to C++ has a cost: forward-compatibility is almost as useful as backward-compatibility.
Zaccheus
November 27th, 2006, 05:21 PM
Fortunately, the C and C++ standard maintain a level of abstraction, avoiding hardware details.
I agree, but the 1970's limitation that an array of automatic variables must have a length which is known at compile time seems to be there because of assumed limitations of the hardware.
SuperKoko
November 28th, 2006, 07:59 AM
I agree, but the 1970's limitation that an array of automatic variables must have a length which is known at compile time seems to be there because of assumed limitations of the hardware.
No, I think you're wrong.
It is partly due to history & the fact that arrays are manipulated through pointers and have not special runtime size information.
For example, if arrays whose length is not known at compile time, were used in multidimensional arrays, it would become very complex for the compiler to handle those things.
C99 VLA are a very ugly, inconsistent, and are recognized as being a hack to avoid the painful usage of malloc()/free() in C where there is no std::vector class.
C99 VLA are hacks.
Real arrays (having a real type, known at compile time) are not hacks.
If arrays had to be variable in length, Dennis Ritchie's proposal (whose link is available in one of my previous post) would have been 10 times more meaningful, and close to the real problem.
Anyway, C99 VLA are easy to implement on all hardware because they're not guaranteed to put the stuff on the stack... So, they can use the heap and can be as portable as the heap itself (excluding the setjmp/longjmp issue).
In C++, C99 VLA are absolutely meaningless and inappropriate... except for compatibility with C99.
Zaccheus
November 28th, 2006, 09:01 AM
... arrays are manipulated through pointers and have not special runtime size information.
And the problem is?
For example, if arrays whose length is not known at compile time, were used in multidimensional arrays, it would become very complex for the compiler to handle those things.
In what way?
C99 VLA are a very ugly, inconsistent, and are recognized as being a hack to avoid the painful usage of malloc()/free() in C where there is no std::vector class.
I'm not talking about C99 VLA.
C99 VLA are hacks.
Real arrays (having a real type, known at compile time) are not hacks.
You do have a point there.
Anyway, C99 VLA are easy to implement on all hardware because they're not guaranteed to put the stuff on the stack
Where the array lives should always be an implementation issue.
In C++, C99 VLA are absolutely meaningless and inappropriate... except for compatibility with C99.
Why? I don't think std::vector is equivalent, partly because of the swap behaviour and the custom allocator feature.
codeguru.com
Copyright WebMediaBrands Inc., All Rights Reserved.