Memory Question for self Developed compiler in Visual C++


aijazasoomro
January 23rd, 2009, 07:19 AM
Dear all,

I am developing a parser for XML (a compiler, in simple words).

You probably know how a compiler is built.

There are six phases, such as lexical analysis, syntax analysis, semantic analysis, etc.

To implement the lexical phase, we need to divide the source code (that is, a stream of characters) into tokens.

Source code example :

“<book> C++</book>” (may be taken from a text box)

Tokens: book as a valid identifier, < (less-than) as a separate token, / as a separate token, ….

To do this, suppose String s = "<book> C++</book>"

This means every character of the string s must be processed individually, as a character array.

To create a character array I could write something like

Char c [1000] = s.tocharArray();

Here I have created a character array c of size 1000, which means I can process 1000 characters. But if I need to process 10000000000 characters, I must create a character array of size 10000000000 in advance.

But after creating such a big character array, I may end up using only a few hundred characters.

To avoid wasting this unused memory, I need automatically resizable arrays.

I can do this by creating an ArrayList in Visual C++.

The problem is that I need to create not just one ArrayList but many ArrayLists, in place of many static arrays.

Expected problem: when a new element is added, the ArrayList object may have to resize its internal array structure.

My question: which technique is better,
a big static array or an automatically resizing ArrayList?
The static array takes a fixed, huge amount of memory, while ArrayList may incur resizing overhead each time an item is added.

I have used ArrayList, but I want to confirm whether I will suffer from its automatic resizing, or whether creating a big static array will take more memory and computational resources.



I hope you have understood what I want to know. If it is still not clear, I can explain more here, call you anywhere in the world, or talk by voice on messenger.

Waiting for your kind cooperation, if you have the insight to answer my question.


Regards
Aijaz Soomro

aijazasoomro@hotmail.com
aijazasoomro@yahoo.com

Alex F
January 23rd, 2009, 11:07 AM
If you are using VS2005 or later, use generic containers (like List< >) and not ArrayList.
ArrayList is not resized, because it is a list and not an array. New elements are simply added to it; they are not placed in a contiguous memory block.

If you know something about the input data, you can decide what array size you need. A list may be too slow.

About programming language: why do you want to write this in managed C++? If you want to use .NET, write in C#. If you don't need .NET, use native C++. C++/CLI is used only for interoperability.

aijazasoomro
January 23rd, 2009, 07:12 PM
Dear Alex,

I am using Visual Studio .NET; there is no List class in System::Collections. You wrote that ArrayList is not resized. I don't want to resize manually; it grows automatically, and whatever elements we add are managed by the class itself. I don't know how they are stored. I cannot decide anything about the input data, since my program places no limit on input size: sometimes it may take millions of array elements, sometimes only hundreds. That is why I want an expert opinion: if I use ArrayList instead of creating such a huge static array, which approach will be more efficient?

I am writing this in managed code because in Visual C++ .NET I am unable to declare any unmanaged type, since I wanted to write the code in a Windows Forms application. In Visual C++, native C++ and MFC are not easy for me to program; I have not yet understood the document/view architecture. I cannot move to C# because the rest of my application is already developed in C++, and I will integrate this with it in the future.

Can you read my example above again?


Aijaz.

Alex F
January 24th, 2009, 02:37 AM
Well, if you want to use old C++ code, managed C++ is the right choice. Though it is better to wrap it in a managed C++ DLL and use C# for the user interface.
What Visual Studio version do you use? You need to use 2005 or later. Writing code in a previous version is not a good decision:
1. ArrayList should not be used, because it has been replaced by generics. It is obsolete.
2. Managed C++ syntax is obsolete; it is supported only for backward compatibility. The C++/CLI language introduced in VS2005 has a new syntax.

The difference between a list and an array is the same as in unmanaged C++. An array keeps its data in a contiguous memory block. When there is not enough room, a new, larger block must be allocated and the existing elements copied to it. A list is never resized. Every new list item is dynamically allocated, and a pointer (reference) to it is stored in the previous list item.
An array is faster than a list for any operation except deleting from the middle and adding a new element when resizing is needed. A list is much slower, takes more memory, and does not allow direct access by index, but has constant time for Add and Delete operations.
It is impossible to say which is better; it depends on the situation.
In any case, allocating a static array of huge size doesn't look like good programming design.

darwen
January 24th, 2009, 09:08 PM
I am developing a Parser for xml


You do realise there's a multitude to pick from: .NET has its own XML parsing (System::Xml::XmlDocument) and there are lots of freeware ones (I've personally used libxml, which seems quite good).

Also mixing managed and native code like this is not a good idea when you start learning.

If you need collections in C++ you can use the std classes. These are available in VC++ Express.

You can also write C++/CLI .NET wrapper classes to interface between C# and your native code, which might be another way to go.

Darwen.