This is the mail archive of the
ecos-discuss@sources.redhat.com
mailing list for the eCos project.
RE: Question about ecos server performance
- From: Pieter Truter <ptruter at intrinsyc dot com>
- To: eCos Discussion <ecos-discuss at sources dot redhat dot com>
- Date: Tue, 13 Aug 2002 08:15:27 -0700
- Subject: RE: [ECOS] Question about ecos server performance
Make sure the priority of the application is lower than that of the network
threads; otherwise the network housekeeping threads do not get CPU time.
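In eCos a numerically larger priority value means a *lower* scheduling priority, so the application thread should be created with a larger number than the network threads use (the network threads' priorities come from the network package's configuration options). A minimal sketch using the eCos kernel C API — the priority value 12, the stack size, and the function names are arbitrary choices for illustration, not values from the original post:

```c
#include <cyg/kernel/kapi.h>

#define APP_STACK_SIZE 4096
static unsigned char app_stack[APP_STACK_SIZE];
static cyg_thread    app_thread;
static cyg_handle_t  app_handle;

static void app_main(cyg_addrword_t data)
{
    /* ... server loop ... */
}

void start_app(void)
{
    /* Priority 12 is only an example: pick a value numerically
     * LARGER (i.e. lower priority) than the configured network
     * thread priorities, so the stack's housekeeping threads
     * are not starved by the application. */
    cyg_thread_create(12, app_main, 0, "server",
                      app_stack, APP_STACK_SIZE,
                      &app_handle, &app_thread);
    cyg_thread_resume(app_handle);
}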
-----Original Message-----
From: NavEcos [mailto:ecos@navosha.com]
Sent: Tuesday, August 13, 2002 8:17 AM
To: Gary Thomas; ecos@navosha.com
Cc: Andrew Lunn; eCos Discussion
Subject: Re: [ECOS] Question about ecos server performance
On Tuesday 13 August 2002 07:56, Gary Thomas wrote:
> On Tue, 2002-08-13 at 08:51, NavEcos wrote:
> > On Tuesday 13 August 2002 07:04, Gary Thomas wrote:
> > > On Tue, 2002-08-13 at 08:03, NavEcos wrote:
> > > > [SNIP]
> > > >
> > > > > > The bug is as follows:
> > > > > >
> > > > > > 1) The server (eCos app) starts,
> > > > > > 2) Connect to the server with telnet, port 4000
> > > > >
> > > > > Then what? What do you have to do [from the "client" side] to
> > > > > evoke the crash?
> > > >
> > > > Connect. That's all. My crash happens in less than 3000 bytes
> > > > of transferred data, always.
> > > >
> > > > If you want, I can send you my entire environment but before I
> > > > do that I'll update CVS. Maybe it was a bad day when I downloaded?
> > >
> > > No, I was able to duplicate this. I just asked before trying it as
> > > I didn't want to waste time if there was more that was necessary.
> > >
> > > The problem is obvious and, indeed, the program tells you exactly why.
> > > It's reporting "too many mbufs to tx", which comes from the logical
> > > network layer as it tries to pack up a packet to be sent and give it
> > > to the physical driver. However, in this case, the data structure
> > > which represents the packet has [perhaps] hundreds of little tiny
> > > pieces in it. The method used by the physical layer can't handle that
> > > [currently]. I'll have to think a bit about how to fix this.
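One way a driver layer can cope with such heavily fragmented packets is to coalesce the mbuf chain into a single contiguous buffer before transmission. This is not eCos code — just a portable, simplified sketch of the idea, using an invented `mbuf` struct that stands in for the real BSD-style one:

```c
#include <string.h>

/* Simplified stand-in for a BSD-style mbuf: each node holds one
 * small fragment of a packet's data. */
struct mbuf {
    struct mbuf *m_next;   /* next fragment in this packet       */
    int          m_len;    /* bytes of data in this fragment     */
    char        *m_data;   /* pointer to the fragment's data     */
};

/* Count the fragments in a chain - the failure described above
 * occurs when this number exceeds what the physical driver's
 * transmit path can handle. */
static int mbuf_frag_count(const struct mbuf *m)
{
    int n = 0;
    for (; m != NULL; m = m->m_next)
        n++;
    return n;
}

/* Copy a fragmented chain into one contiguous buffer so a driver
 * with a small scatter/gather limit can still send the packet.
 * Returns the total length, or -1 if the packet will not fit. */
static int mbuf_coalesce(const struct mbuf *m, char *buf, int bufsize)
{
    int total = 0;
    for (; m != NULL; m = m->m_next) {
        if (total + m->m_len > bufsize)
            return -1;
        memcpy(buf + total, m->m_data, m->m_len);
        total += m->m_len;
    }
    return total;
}
```

The trade-off is an extra copy per packet, which only matters on the (rare) path where the fragment count exceeds the driver's limit.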
> >
> > Well, the documentation states that running out of mbufs will not
> > crash the TCP/IP layer. Why does it? I suspected that it was
> > because there were a bunch of tiny pieces but I didn't debug it. I
> > did see the error message, of course, and assumed they may be linked.
> > I probably should have mentioned that.
> >
> > Maybe incorporating a counting semaphore to cause threads allocating
> > mbufs to block would do it? I am not sure how much overhead there
> > would be in doing that, but it would nicely block the threads when there
> > were no more mbufs.
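The counting-semaphore suggestion above can be illustrated with a toy fixed-size buffer pool. This is not the eCos stack's allocator — a portable sketch using POSIX semaphores, with invented names and sizes; a real version would also need a lock around the free-slot scan for multi-threaded use:

```c
#include <semaphore.h>
#include <stddef.h>

/* Toy pool: the semaphore counts free buffers, so allocators
 * block (or, with trywait, are refused) instead of driving the
 * pool into exhaustion and crashing. */
#define POOL_SIZE 4

static char  pool[POOL_SIZE][128];
static int   in_use[POOL_SIZE];
static sem_t pool_sem;

void pool_init(void)
{
    /* Initial count = number of free buffers. */
    sem_init(&pool_sem, 0, POOL_SIZE);
}

/* Non-blocking variant: returns NULL when the pool is empty.
 * A blocking caller would use sem_wait() instead, which is the
 * behaviour proposed above. */
void *pool_alloc(void)
{
    if (sem_trywait(&pool_sem) != 0)
        return NULL;
    /* NOTE: in multi-threaded code this scan needs a mutex. */
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL; /* unreachable if the count stays consistent */
}

void pool_free(void *p)
{
    int i = (int)(((char (*)[128])p) - pool);
    in_use[i] = 0;
    sem_post(&pool_sem);
}
```

The overhead is one semaphore operation per allocation and free, which is cheap compared to crashing the stack.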
>
> This has *nothing* to do with running out of mbufs. That's not what
> the message says at all. It says that it [currently] can't handle
> a data packet which is composed of so many mbufs.
Sorry. As I said, I didn't do much work in debugging it.
> > > I would say that this is an aberrant program though and just happened to
> > > run into this limitation.
> >
> > Well, I agree, it's an atypical example but it's still a serious problem
> > when you can crash it for whatever reason. The code is legal.
> >
> > I don't care about performance for such a program, and I do not
> > think ANYBODY would write awful code like that. But what concerns
> > me is that the stack crashes. There are instances in which you may
> > get a bunch of small packets being sent.
> >
> > For example, say you have a profiler that sends out the PC at the
> > time of an interrupt at regular intervals. If you get the interval
> > just right, you'll crash the box. You may do this as a low priority
> > thread too that sends all available data. In a quiet system, it will
> > end up sending 4 bytes almost always.
>
> But probably not continuously, as your example does.
In most cases no, but in a critical system it would be dangerous.
For example, a medical system.
> I agree that there is a problem with the stack. It's simply not a
> scenario I ever imagined (nor, until today, experienced). It's been
> filed as a bug and will get fixed [someday].
>
> Of course, you're free to fix it yourself. Remember that for the
> most part, eCos is now a *volunteer* project. I'm certainly not
> getting paid to fix this (any more). Things will get fixed if and
> when there is time.
I am well aware that it's a volunteer project. I've contributed
several patches before for the XScale board and I fixed a driver
problem with the stack back in September of last year before it was
a volunteer project. That bug also caused a crash.
If you could give me some advice as to what exactly is going
on and where, I'll submit a patch as time permits. I'm busy too
but I think I can look into it before next week. I am trying to gain
a mastery of this OS, so I'll be happy to try to fix it. I just wrote the
list to confirm it was indeed a bug. I don't have a huge sampling
of hardware here.
-Rich
--
Before posting, please read the FAQ: http://sources.redhat.com/fom/ecos
and search the list archive: http://sources.redhat.com/ml/ecos-discuss