This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



failing malloc in multithreaded program using cygwin 1.3.2


Hello,

I'm trying to port an application which I built on Linux to Windows 95.

It consists of a GUI displaying data that is produced by a technical
simulation program. The main thread does the GUI processing; a second
thread runs the simulation program as a background process. Once in a
while, the simulation program produces data that I keep in a list, so as to
be able to scan through it and display some of it in graphs. When new data
is added to the list, I signal an event to the main thread and the GUI then
updates the displays. The following lines are an extract with the important
ingredients.



// ------------------------------------------------------------
// Event to communicate that new data was made by the process
Event new_data;
// Mutex to protect data
Mutex data_mutex;

// GUI class
class GUI {
public:
  //...
  void update() {
    // this function sets the data and text labels 
    // to be displayed

    data_mutex.Acquire();
    // scan data and extract some numbers
    // ...
    // get id number of the data
    int id = data.get_id();
    data_mutex.Release();

    // some text processing, for example
    std::ostrstream text;
    text << " data id: " << id << std::endl;  // SIGSEGV typically occurs in
    std::string temp_s = text.str();          // one of these statements
    const char * temp_c = temp_s.c_str();     // ....
    strncpy(label_, temp_c, 100);
    // some more processing ...
  }    
  // .....
private:
  char label_[100];
};


int main() {
  // making and showing the gui
  GUI gui;     
  gui.show();
  // making and starting the background process
  // produces new data, typically every 1/2 sec
  Process proc;  
  proc.Start();

  while(proc.is_running()) {
    if (new_data.Test()) {
      // if new data were produced: update gui
      gui.update();
      gui.redisplay();
    }
    // wait a little while before trying to update
    microsec_sleep(200);
  }
}
// --------------------------------------------------------------------



The simulation program is actually some old C code (and even some FORTRAN
routines) wrapped in a C++ class. The C is compiled with gcc, the FORTRAN
with g77, the rest is compiled and the whole thing linked with g++. The C
code uses printf(), which I know is not thread safe, but as there is only
one thread executing the C code, I don't expect any problems there, at least
not of the kind that I now experience.

This program runs fine on Linux for more than a day (although there is some
memory leak which I still have to find), but crashes on Windows 95 with a
segmentation fault after about five minutes. For a single build, the point
in the source code where this happens is more or less the same, but the
point in time (the number of times that the GUI was updated) can vary. The
SIGSEGV mostly occurs in the main thread, typically in an operation
involving std::string. I have the feeling that it happens when allocating
memory for strings, though I wasn't always able to trace it. It can also
occur in the secondary thread (much less often, one time out of ten I'd
say), when allocating memory for the newly produced data.

When I change the code, for example to avoid string processing, the error
will occur somewhere else where memory is involved, like when creating a
temporary std::vector.

In some older e-mails (1998), I read that linking with mmalloc might solve
the problem. I gave it a shot but got the following message from gdb:


-------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x004c2776 in mmalloc () at gui_process_port.cxx:78
Current language:  auto; currently c++
(gdb) bt
#0  0x004c2776 in mmalloc () at gui_process_port.cxx:78
#1  0x004c29e8 in malloc () at gui_process_port.cxx:78
#2  0x004d0876 in __builtin_new (sz=100)
#3  0x004d0b13 in __builtin_vec_new (sz=100)
#4  0x004c573e in default_alloc () at gui_process_port.cxx:78
#5  0x004c94bd in _IO_str_overflow () at gui_process_port.cxx:78
#6  0x004c60f2 in strstreambuf::overflow () at gui_process_port.cxx:78
#7  0x004c86c1 in __overflow () at gui_process_port.cxx:78
#8  0x004c8ab2 in _IO_default_xsputn () at gui_process_port.cxx:78
#9  0x004c6eaa in streambuf::xsputn () at gui_process_port.cxx:78
#10 0x004c4b44 in ostream::write () at gui_process_port.cxx:78
#11 0x004e4d92 in ostream & operator<<<char, string_char_traits<char>,
__default_alloc_template<false, 0> > (o=@0xcbf7b4, 
    s=@0xcbf71c) at /usr/include/g++-3/std/bastring.cc:470
#12 0x0041c717 in TR_Axis::set_label (this=0x1a96cc8) at TR_Axis.cxx:204
#13 0x0041b011 in Graph_Plot_Area_Window::display (this=0x1a98f00,
curve_list=@0x1a8d16c) at Graph_Plot_Area.cxx:232
#14 0x0040b7b2 in Graph_Window::display (this=0x1a8d280,
curve_list=@0x1a8d16c) at Graph_Window.cxx:137
#15 0x00429b08 in TP_Curve_Manager::update_running_curves (this=0x1a8d100,
running_case=0x15fae00) at TP_Curve_Manager.cxx:154
#16 0x004fbe33 in Graph_Window::update_running_curves (this=0x1a8d280,
running_case=0x15fae00) at Graph_Window.h:45
#17 0x00409c0c in Process_Plot::update_running_case (this=0xcbfbcc,
running_case=0x15fae00) at Process_Plot.cxx:146
#18 0x00421eba in TR_Running_Case::time_out (v=0x15fae00) at TR_Case.cxx:581
#19 0x0043800d in Fl::wait () at gui_process_port.cxx:78
#20 0x004380e8 in Fl::run () at gui_process_port.cxx:78
#21 0x004065d9 in main () at trfplot.cxx:10
#22 0x61003aea in _libwsock32_a_iname ()
#23 0x61003cbd in _libwsock32_a_iname ()
#24 0x61003cfc in _libwsock32_a_iname ()
#25 0x004cea93 in cygwin_crt0 () at gui_process_port.cxx:78
----------------------------------------------



There are still some tests that I could do before I give up (like using a
pseudo process, and / or going back to single-threaded processing), but I
wondered whether this sounds familiar to anybody. Am I looking at a bug in
my code (why then would operator new fail?) or am I using cygwin in the
wrong way? I recently upgraded from cygwin 1.1 to 1.3.2, but get exactly the
same results. Different hardware (one box with 64 MB RAM, another with
256 MB) doesn't make a difference either.
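
The pseudo-process test could look roughly like the sketch below. It is a
hypothetical stand-alone program, not taken from the application: the
function names format_labels, produce_data and run_stress_test are made up
for illustration. One thread formats labels the way GUI::update() does while
a second thread allocates and frees data records. It assumes a C++11
compiler with std::thread, which the gcc shipped with cygwin 1.3.2 does not
provide; there the same structure would have to be written with
pthread_create().

```cpp
// Hypothetical stress test: all names are illustrative assumptions.
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Mimics GUI::update(): build a label through a string stream.
// Returns the number of labels whose content is not what was written.
std::size_t format_labels(int iterations) {
    std::size_t bad = 0;
    for (int id = 0; id < iterations; ++id) {
        std::ostringstream text;
        text << " data id: " << id;
        std::string label = text.str();
        if (label != " data id: " + std::to_string(id)) ++bad;
    }
    return bad;
}

// Mimics the simulation thread: allocate a record, check it, free it.
// Returns the number of records whose content is not what was written.
std::size_t produce_data(int iterations) {
    std::size_t bad = 0;
    for (int i = 0; i < iterations; ++i) {
        std::vector<double> record(64 + i % 64, static_cast<double>(i));
        if (record.back() != static_cast<double>(i)) ++bad;
    }
    return bad;
}

// Runs both workloads at the same time and returns the total number of
// mismatches. With a thread-safe allocator the result should be 0.
std::size_t run_stress_test(int iterations) {
    assert(iterations >= 0);
    std::size_t bad_records = 0;
    std::thread producer([&] { bad_records = produce_data(iterations); });
    std::size_t bad_labels = format_labels(iterations);  // "GUI" thread
    producer.join();  // bad_records is only read after the join
    return bad_labels + bad_records;
}
```

Calling run_stress_test() from main() with a large iteration count and no
GUI or simulation code linked in would show whether concurrent allocation
alone is enough to trigger the crash. If it is not, the problem is more
likely in the application code than in the allocator.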

I'm not subscribed to the mailing list, so please copy me in when replying.
Thanks in advance for any help.



Geert Karman




