[ircd-ratbox] ratbox-services hangs on run.

Jeremy Chadwick jdc at koitsu.org
Wed Oct 16 03:18:30 UTC 2013


The futex() call is not necessarily the problem; EAGAIN may get returned
there all the time in normal operation -- we simply don't know.  I see
all kinds of syscalls on FreeBSD spit out what look like "errors" to the
untrained eye.  :-)

What I don't see is an infinite loop happening where a series of
syscalls are getting executed over and over.  The last syscall to get
executed -- successfully -- is getuid32().  From there it appears the
CPU core is spinning in some way (either due to userland code, or kernel
code -- we don't know which).
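
One way to narrow down the userland-versus-kernel question on Linux is
to sample the utime and stime counters in /proc/<pid>/stat a couple of
seconds apart.  What follows is only a rough Python sketch, assuming a
Linux-style /proc is mounted; the function names parse_cpu_ticks and
sample are made up for illustration.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the last ')' rather than on spaces.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def sample(pid, interval=2.0):
    """Print how much user and kernel CPU time a pid used over `interval`."""
    def read():
        with open("/proc/%d/stat" % pid) as f:
            return parse_cpu_ticks(f.read())

    u0, s0 = read()
    time.sleep(interval)
    u1, s1 = read()
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    print("user: %.2fs  kernel: %.2fs" % ((u1 - u0) / hz, (s1 - s0) / hz))
```

Call sample() with the pid of the hung process.  If nearly all of the
growth is in user time, the spin is most likely in the program or one of
its libraries; if it is in kernel time, the process is more likely stuck
inside a syscall or the kernel itself.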

The problem with strace and some other utilities (see below) is that
they don't tell you which syscall is currently being executed (if there
is one) *before* it returns; they only print the line after the fact.
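
On Linux specifically, /proc/<pid>/syscall can sometimes fill that gap:
per proc(5) it holds the number of the syscall currently in progress
plus its argument registers, "-1" if the process is blocked outside a
syscall, or just the word "running" if it is not blocked at all.  A
minimal Python sketch of interpreting it, assuming the kernel was built
with support for that file and that you have permission to read it;
describe_syscall is a hypothetical helper name:

```python
def describe_syscall(text):
    """Summarise the contents of /proc/<pid>/syscall."""
    fields = text.split()
    if not fields:
        return "no data"
    if fields[0] == "running":
        # The task is on a CPU (or runnable), not sleeping in a syscall.
        return "running (not blocked)"
    if fields[0] == "-1":
        return "blocked, but not inside a syscall"
    # Otherwise the first field is the syscall number; the rest are the
    # argument registers, stack pointer and program counter.
    return "inside syscall number " + fields[0]


def current_syscall(pid):
    """Read and summarise /proc/<pid>/syscall for a live process."""
    with open("/proc/%d/syscall" % pid) as f:
        return describe_syscall(f.read())
```

A process burning CPU that keeps reporting "running" across repeated
reads is consistent with a spin outside any blocking syscall.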

gdb may be able to help here -- if you're skilled at using it.  Most
end-users are not, and even as an occasional software developer I tend
to resort to printf debugging since gdb tends to piss me off.
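
For a process that is already spinning, gdb does not have to be driven
interactively: it can attach, dump every thread's stack, and exit.  The
sketch below wraps that in Python using gdb's -p, -batch and -ex
options; it assumes gdb is installed and that you are allowed to attach
to the process, and the function names are invented for this example.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid and dumps all stacks."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtrace(pid):
    """Run gdb against a live process and return everything it printed."""
    result = subprocess.run(gdb_backtrace_command(pid),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout
```

Running dump_backtrace() a few times against the hung pid and comparing
the top frames should show whether it is stuck in the same function each
time.  Without debug symbols the frames may only show addresses.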

Short story: late last week while building fetchmail at work on one of
our dedicated Solaris 10 machines (specifically made to build software,
with no other purpose), I ran into this situation: fetchmail worked; it
would output a couple lines, but then suddenly suck up 100% of a CPU
core.  truss said the last syscall was getuid() and then nothing more.

Root cause?  A previous developer who has since left the company had
installed gcc 4.7.0 on that system without telling anyone.  I reverted
to a package of gcc 4.3.6 which we knew worked reliably, rebuilt
fetchmail, and suddenly everything was fine.  Nothing else changed --
only the compiler.

Not saying that's the cause here, but I am saying that the pastebin
log does not show a loop of syscalls being run, which seems to point at
something even lower level.

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Tue, Oct 15, 2013 at 08:42:12PM -0600, Jason Booth wrote:
> Here is his strace
> 
> http://pastebin.com/iBGQeisX
> 
> 
> futex(0xbec43778, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL,
> b6cc0000) = -1 EAGAIN (Resource temporarily unavailable)
> 
> http://man7.org/linux/man-pages/man2/futex.2.html
> 
> I'd guess that whatever resource ircd-ratbox wanted, it either could not
> get or did not have permission to use. Perhaps someone else on the
> mailing list might know.
> 
> I think it would be good if you did a gdb session as well because it was
> not clear to me what was causing this.
> 
> gdb ircd-ratbox
> and once inside gdb, run:
> (gdb) r
> 
> Then if it crashes you can look at the backtrace. That should give you a
> little more detail I would think.
> 
> For example
> 
> root@system1:~# gdb ratbox-services
> 
> (gdb) r
> Starting program: /usr/sbin/ratbox-services
> [Thread debugging using libthread_db enabled]
> ratbox-services will not run as root
> 
> Program exited with code 0377.
> 
> Or as a non-root user
> 
> (gdb) r -f
> Starting program: /usr/sbin/ratbox-services -f
> [Thread debugging using libthread_db enabled]
> ratbox-services: version 1.2.1(20080628_1-25639)
> ratbox-services: pid 8017
> ratbox-services: running in foreground
> [New Thread 0x7ffff716f700 (LWP 8020)]
> [Thread 0x7ffff716f700 (LWP 8020) exited]
> 
> Program received signal SIGPIPE, Broken pipe.
> 0x00007ffff766c00d in write () at ../sysdeps/unix/syscall-template.S:82
> 82      ../sysdeps/unix/syscall-template.S: Permission denied.
>         in ../sysdeps/unix/syscall-template.S
> 
> 
> I hope that helps get you to the next step.
> 
> -JB
> 
> 
> On Tue, Oct 15, 2013 at 3:52 PM, Jeremy Chadwick <jdc at koitsu.org> wrote:
> 
> > On Tue, Oct 15, 2013 at 05:39:51PM -0400, Weasel Grease wrote:
> > > I've come to a complete halt on options with this one.  I've got a
> > > Raspberry Pi running ircd-ratbox.  When I attempt to run
> > > ratbox-services it hangs, using around 86% CPU.  I get no log files
> > > from the ircd saying it failed to connect, and I get absolutely no log
> > > files from ratbox-services at all.  It hangs even when I give
> > > ratbox-services -t.  The only things I can do with ratbox-services is
> > > check the version and see the help.  If anyone has any idea of where I
> > > can look in the system to figure out why it's hanging, it'd be much
> > > appreciated.  I spent roughly eight hours thinking it was a
> > > configuration issue, but I'd expect the test configuration option to
> > > tell me that and not hang like it is.
> >
> > Find out what the process is spinning on, syscall-wise, using truss,
> > strace, or ktrace (I have no familiarity with Raspberry Pi therefore
> > cannot tell you which of those tools to use).
> >
> > --
> > | Jeremy Chadwick                                   jdc at koitsu.org |
> > | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> > | Making life hard for others since 1977.             PGP 4BD6C0CB |
> >
> > _______________________________________________
> > ircd-ratbox mailing list
> > ircd-ratbox at lists.ratbox.org
> > http://lists.ratbox.org/cgi-bin/mailman/listinfo/ircd-ratbox
> >

