Many "lfd - (child) Statistic" processes

Posted: 02 Jul 2014, 20:06
by laxbobber
Can anyone help me determine what causes a process called "lfd - (child) Statistic" to be spawned? We had a server go into high load today and there were over 800 of these processes. I'm not seeing anything obvious in the logs so I'd like more to go on. What causes the "lfd - (child) Statistic" process?

Thanks in advance,
Bob

Re: Many "lfd - (child) Statistic" processes

Posted: 02 Jul 2014, 21:11
by ForumAdmin
That is a subprocess that processes iptables log lines in the kernel log for the ST_ENABLE option. If those processes were hanging, the first process most likely held a stuck lock on /var/lib/csf/stats/iptables_log for some reason. It's impossible to say after the fact what caused the stuck lock; it would need to be investigated at the time. The most likely scenario is that the /var partition ran out of space or inodes. The only other possibility might be that ST_IPTABLES has been set to something strange.
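
For anyone wanting to rule out the space/inode scenario quickly, a minimal Python sketch along these lines may help. It only reads filesystem counters via os.statvfs; the function name filesystem_headroom is made up for illustration and is not part of csf/lfd.

```python
import os


def filesystem_headroom(path="/var"):
    """Return the percentage of free space and free inodes for `path`.

    A value is None when the filesystem does not report that counter
    (some filesystems report zero total inodes, for example).
    """
    st = os.statvfs(path)
    # f_bavail / f_favail are the amounts available to unprivileged users.
    bytes_pct = 100.0 * st.f_bavail / st.f_blocks if st.f_blocks else None
    inodes_pct = 100.0 * st.f_favail / st.f_files if st.f_files else None
    return {"bytes_free_pct": bytes_pct, "inodes_free_pct": inodes_pct}


if __name__ == "__main__":
    print(filesystem_headroom("/var"))
```

Either figure sitting near zero would point towards the "ran out of space or inodes" explanation.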

Re: Many "lfd - (child) Statistic" processes

Posted: 02 Jul 2014, 21:31
by laxbobber
Thanks for your reply! It's definitely not disk space or inodes, as we've got plenty to spare there. Any advice on how to investigate it if/when it happens again? We definitely had disk contention at the time, with disk I/O near 100% and significant swapping.

Our ST_IPTABLES is still at the default of 100.

Re: Many "lfd - (child) Statistic" processes

Posted: 02 Jul 2014, 22:06
by ForumAdmin
If it recurs, you'd need to search through all the processes to see which have /var/lib/csf/stats/iptables_log open, to try to find out why it is locked. This is most simply done using lsof:

lsof | grep /var/lib/csf/stats/iptables_log

Then work through the PIDs with strace to see whether any of those processes is doing something odd rather than just waiting on a lock.
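
If lsof is too slow to run on a heavily loaded box, roughly the same information can be read straight from /proc. The following is a minimal Python sketch of that alternative, not the lsof method itself; it is Linux-only, needs root to see other users' processes, and the function name pids_with_file_open is hypothetical.

```python
import os


def pids_with_file_open(target):
    """Return a sorted list of PIDs that currently have `target` open.

    Works by reading the symlinks in /proc/<pid>/fd (Linux only).
    Processes that exit mid-scan or that we may not inspect are skipped.
    """
    wanted = os.path.realpath(target)
    try:
        entries = os.listdir("/proc")
    except OSError:
        return []  # no /proc available on this system
    holders = set()
    for entry in entries:
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process gone, or permission denied
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == wanted:
                    holders.add(int(entry))
                    break
            except OSError:
                continue  # descriptor closed while we were looking
    return sorted(holders)


if __name__ == "__main__":
    for pid in pids_with_file_open("/var/lib/csf/stats/iptables_log"):
        print(pid)
```

Each PID printed can then be inspected with strace as described above.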

Unfortunately, this could be quite tricky on a system that is suffering high load with masses of processes running.

It is a very odd thing to happen and not one we've ever come across before. If you do find it happening regularly we can look into using a system we use elsewhere in lfd to skip routines when a lock is stuck. The main issue with that, though, is it would mean log lines being missed.
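
To illustrate the general idea of skipping a routine when a lock is stuck, here is a minimal Python sketch using a non-blocking flock. It is not lfd's code (how lfd actually locks the stats file is not shown in this thread), and the name run_if_unlocked is hypothetical.

```python
import errno
import fcntl


def run_if_unlocked(lock_path, routine):
    """Run routine() only if an exclusive lock is available right now.

    Returns True if the routine ran, False if it was skipped because
    another holder already has the lock.
    """
    with open(lock_path, "a") as handle:
        try:
            # LOCK_NB makes flock fail immediately instead of queueing.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK, errno.EACCES):
                return False  # lock is held elsewhere: skip this run
            raise
        try:
            routine()
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
    return True
```

The trade-off is the one mentioned above: a skipped run means that batch of work (here, log lines) is not processed, in exchange for waiting processes not piling up behind the lock.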

One last thought: do ensure that you have RESTRICT_SYSLOG set to "3", just in case it is a nefarious DoS against lfd.

Re: Many "lfd - (child) Statistic" processes

Posted: 23 Jul 2014, 02:34
by sneader
Hi Jonathan. We had this happen just now. lfd sent an email to let us know that the 5 minute average had hit 8.8 (the 1 minute average on this alert email says 24). This is a 16-CPU system. The email has ps.txt attached, and in that file it shows nearly 200 of these "lfd - (child) Statistics..." processes running, with timestamps within 2 minutes of each other. The parent process is "lfd - scanning log files".

Unfortunately, by the time I got to the server, load was already over 115 and I decided to just get the server under control instead of trying some of the ideas listed.

I would be VERY grateful for any information you can provide regarding a system that could skip routines when a lock is stuck. Also, if you have any interest in seeing this ps.txt, I could get it to you.

Thanks!

- Scott

Re: Many "lfd - (child) Statistic" processes

Posted: 23 Jul 2014, 08:15
by ForumAdmin
Scott, if you could email us the ps.txt at the usual address please do.

Re: Many "lfd - (child) Statistic" processes

Posted: 23 Jul 2014, 17:29
by sneader
Sent! Thanks so much!!

- Scott

Re: Many "lfd - (child) Statistic" processes

Posted: 24 Jul 2014, 09:47
by ForumAdmin
v7.06 has been released to alleviate this. It isn't possible from the information we have to know whether this is a symptom of a server problem or vice versa; however, the code change will prevent queued processes from building up for the stats settings:
http://blog.configserver.com/

Re: Many "lfd - (child) Statistic" processes

Posted: 24 Jul 2014, 17:36
by sneader
Thanks, Jonathan! We had another similar event earlier this morning (about 58 of the lfd statistics processes), and I forwarded you the info from that, but this was BEFORE our update to v7.07. I will keep you posted!

- Scott

Re: Many "lfd - (child) Statistic" processes

Posted: 17 Dec 2014, 16:26
by sneader
As an FYI, we had a server in a high load state this morning. lfd sent out an email with the output of ps, and there are over 7 HUNDRED "lfd - (child) Statistics..." processes listed. I'll email you the ps.txt, if it is of interest. We are running v7.56 (latest version)

- Scott