adamjmac
June 18th, 2009, 01:23 AM
Hi,
I'm using a function to create, fill, and return a vector which runs in multiple threads (with pthreads) simultaneously. When I run the program in a single thread, it works. In multiple threads on one processor, it works. Only when I use multiple processors does it crash, consistently on each thread. After spending the day troubleshooting I'm about to blame the vector template and make my own dynamic array, but hopefully someone can pick up on something I missed...
AgentBase and SystemBase are classes, and xy is a two-element float array (typedef float xy[2];). Summarized, the code is:
typedef vector<AgentBase *> AgentList; // dynamic list of agents
AgentBase **agent; // within SystemBase
------ in the thread -------
AgentList SystemBase::GetAgentsInRange(xy center, float range) {
    float rangesq = range * range; // compare squared distances, no sqrt needed
    AgentList list;
    for (int i = 0; i < n_agent; i++) {
        AgentBase *a = agent[i];
        if (xy_distsq(center, a->s) <= rangesq) {
            list.push_back(a);
        }
    }
    return list;
}
Called with (also inside thread):
AgentList list = GetAgentsInRange(...);
When I wrap the function call in a mutex, it runs, but with a mutex around the code inside the function it still crashes. The crash happens in push_back, and each thread segfaults on its first call to the function. Also, if I put any printf calls in the function, it works. Yeah, one of those bugs.
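For reference, here is a stripped-down sketch of the variant that does run for me, with the lock taken around the call site rather than inside the function. It is not the real code: the AgentBase and SystemBase stand-ins, the global g_query_mutex, WorkerArg, QueryWorker and RunLockedQueries are all made up for this post, and the agent positions are dummy values.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real classes, for illustration only.
struct AgentBase { float s[2]; };
typedef std::vector<AgentBase*> AgentList;

// One global lock shared by every worker thread (assumed name).
static pthread_mutex_t g_query_mutex = PTHREAD_MUTEX_INITIALIZER;

// Squared distance between two 2-D points.
static float xy_distsq(const float a[2], const float b[2]) {
    const float dx = a[0] - b[0];
    const float dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

struct SystemBase {
    AgentBase** agent;
    int n_agent;

    // Same loop as in the post: collect every agent within range of center.
    AgentList GetAgentsInRange(const float center[2], float range) const {
        const float rangesq = range * range;
        AgentList list;
        for (int i = 0; i < n_agent; i++) {
            AgentBase* a = agent[i];
            if (xy_distsq(center, a->s) <= rangesq) {
                list.push_back(a);
            }
        }
        return list;
    }
};

struct WorkerArg {
    const SystemBase* sys;
    float center[2];
    float range;
    std::size_t found;  // written by the worker, read after pthread_join
};

// Thread entry point: the mutex surrounds the whole call, including the
// construction of the returned vector.
static void* QueryWorker(void* p) {
    WorkerArg* arg = static_cast<WorkerArg*>(p);
    pthread_mutex_lock(&g_query_mutex);
    AgentList list = arg->sys->GetAgentsInRange(arg->center, arg->range);
    pthread_mutex_unlock(&g_query_mutex);
    arg->found = list.size();
    return NULL;
}

// Starts n_threads workers on the same four dummy agents and returns the
// total number of agents found across all threads.
static std::size_t RunLockedQueries(int n_threads) {
    AgentBase storage[4] = { {{0.0f, 0.0f}}, {{1.0f, 0.0f}},
                             {{0.0f, 1.5f}}, {{5.0f, 5.0f}} };
    AgentBase* ptrs[4] = { &storage[0], &storage[1], &storage[2], &storage[3] };
    SystemBase sys;
    sys.agent = ptrs;
    sys.n_agent = 4;

    std::vector<pthread_t> threads(n_threads);
    std::vector<WorkerArg> args(n_threads);
    for (int i = 0; i < n_threads; i++) {
        args[i].sys = &sys;
        args[i].center[0] = 0.0f;
        args[i].center[1] = 0.0f;
        args[i].range = 2.0f;
        args[i].found = 0;
        int rc = pthread_create(&threads[i], NULL, QueryWorker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }

    std::size_t total = 0;
    for (int i = 0; i < n_threads; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].found;
    }
    return total;
}
```

Obviously this serializes every query, so it defeats the point of running on multiple processors. I'm only showing it to make clear which placement of the lock runs.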
I also tried replacing vector with a basic array, and it works fine. This is my fallback, but I'd really like to use vector so I don't have to worry about returning the size as well as the array. But this is why I'm starting to think vector is not actually thread-safe, even though it should be - at least for what I'm doing.
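The array fallback looks roughly like the sketch below. The name GetAgentsInRangeArray, the free-function form and the out/out_capacity parameters are made up for this post. The caller has to supply the buffer and keep track of the returned count, which is exactly the bookkeeping I was hoping vector would spare me.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real class, for illustration only.
struct AgentBase { float s[2]; };

// Squared distance between two 2-D points.
static float xy_distsq(const float a[2], const float b[2]) {
    const float dx = a[0] - b[0];
    const float dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

// Writes up to out_capacity matching agents into out and returns how many
// were written. No allocation happens inside the function.
static int GetAgentsInRangeArray(AgentBase* const* agent, int n_agent,
                                 const float center[2], float range,
                                 AgentBase** out, int out_capacity) {
    const float rangesq = range * range;
    int count = 0;
    for (int i = 0; i < n_agent && count < out_capacity; i++) {
        AgentBase* a = agent[i];
        if (xy_distsq(center, a->s) <= rangesq) {
            out[count++] = a;
        }
    }
    return count;
}
```

Called from the thread with a buffer sized for the worst case (n_agent entries), e.g. AgentBase* found[MAX_AGENTS]; int n = GetAgentsInRangeArray(agent, n_agent, center, range, found, MAX_AGENTS); where MAX_AGENTS is again just a placeholder.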
The trace from gdb is:
Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 2132.0x338]
0x75f3a09e in ?? ()
(gdb) backtrace
#0 0x75f3a09e in ?? ()
#1 0x00419e0c in AgentBase** std::__copy_trivial<AgentBase*>(AgentBase* const*, AgentBase* const*, AgentBase**)
   (__first=0xa133a8, __last=0xa133ac, __result=0x40f100)
   at C:/Dev-Cpp/include/c++/3.3.1/bits/stl_algobase.h:252
#2 0x00419d5f in AgentBase** std::__copy_aux2<AgentBase*>(AgentBase**, AgentBase**, AgentBase**, __true_type)
   (__first=0xa133a8, __last=0xa133ac, __result=0x40f100)
   at C:/Dev-Cpp/include/c++/3.3.1/bits/stl_algobase.h:272
#3 0x00419d26 in __gnu_cxx::__normal_iterator<AgentBase**, std::vector<AgentBase*, std::allocator<AgentBase*> > > std::__copy_ni2<AgentBase**, __gnu_cxx::__normal_iterator<AgentBase**, std::vector<AgentBase*, std::allocator<AgentBase*> > >
...
#9 0x00419965 in std::vector<AgentBase*, std::allocator<AgentBase*> >::push_back(AgentBase* const&)
   (this=0xeffeb0, __x=@0xeffdd0)
   at C:/Dev-Cpp/include/c++/3.3.1/bits/stl_vector.h:603
#10 0x00404375 in SystemBase::GetAgentsInRange(float*, float) (this=0x3511d0, center=0x35f398, range=2) at abmsim.cpp:840
#11 0x00404e12 in SystemBase::Physics_Repulsion_Thread(int) (this=0x3511d0, id=1) at abmsim.cpp:1163
#12 0x00404946 in SystemBase::Physics1_Thread(int) (this=0x3511d0, id=1) at abmsim.cpp:1103
#13 0x004015aa in cpu_func_physics1(void*) (arg=0x35135c) at abmsim.cpp:158
#14 0x69ec12fa in ptw32_threadStart@4 ()
...
It also sometimes crashes in __default_alloc_template<true, 0> or something like that. And on the occasions when I get it past this point, it crashes on a different push_back on a different vector in the same thread.
Please help. I recently switched over from C to C++ so I'm not used to templates. I can provide more info if it's vague at this point.