Description
I had this test.py running in the 3600 (1h) directory just fine for 10 hours.
I moved it to 7200 (2h) and now it dies, as shown in the logs below.
It runs fine the first time (after chmod -x test.py; sleep 15; chmod +x test.py), but 2 hours later I get this issue again.
This all started when I wanted a collector to run every 6 hours and couldn't work out why it kept dying after one run (same message, but with a bigger period of inactivity).
The script runs fine from the console and exits with code 0.
hbase@d3xbucharest ~ $ cat tcollector/collectors/7200/test.py
#!/usr/bin/python
from __future__ import print_function
import re
import time
import sys
from collectors.lib import utils

def main():
    utils.drop_privileges()
    ts = int(time.time())
    print ("test.test %d 100 product=test" % (ts))
    print ("test.test %d 100 product=test" % (ts), file=sys.stderr)
    time.sleep(300)
    sys.stdout.flush()

if __name__ == "__main__":
    main()
    sys.exit(0)
Here are the logs:
hbase@d3xbucharest ~ $ grep 'test' tcollector/tcollector.log
2015-11-27 14:17:37,818 tcollector[7194] INFO: test.py (interval=7200) needs to be spawned
2015-11-27 14:17:37,820 tcollector[7194] INFO: spawned test.py (pid=1914)
2015-11-27 14:17:37,890 tcollector[7194] DEBUG: reading test.py got 38 bytes on stderr
2015-11-27 14:17:37,890 tcollector[7194] WARNING: test.py: test.test 1448626657 100 product=test
2015-11-27 14:18:22,996 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_sent 1448626697 0 collector=test.py host=d3xbucharest
2015-11-27 14:18:22,996 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_received 1448626697 0 collector=test.py host=d3xbucharest
2015-11-27 14:18:22,996 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_invalid 1448626697 0 collector=test.py host=d3xbucharest
2015-11-27 14:19:32,167 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_sent 1448626763 0 collector=test.py host=d3xbucharest
2015-11-27 14:19:32,167 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_received 1448626763 0 collector=test.py host=d3xbucharest
2015-11-27 14:19:32,167 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_invalid 1448626763 0 collector=test.py host=d3xbucharest
2015-11-27 14:20:32,270 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_sent 1448626823 0 collector=test.py host=d3xbucharest
2015-11-27 14:20:32,270 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_received 1448626823 0 collector=test.py host=d3xbucharest
2015-11-27 14:20:32,270 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_invalid 1448626823 0 collector=test.py host=d3xbucharest
2015-11-27 14:21:34,401 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_sent 1448626888 0 collector=test.py host=d3xbucharest
2015-11-27 14:21:34,401 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_received 1448626888 0 collector=test.py host=d3xbucharest
2015-11-27 14:21:34,401 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_invalid 1448626888 0 collector=test.py host=d3xbucharest
2015-11-27 14:22:34,535 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_sent 1448626948 0 collector=test.py host=d3xbucharest
2015-11-27 14:22:34,535 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_received 1448626948 0 collector=test.py host=d3xbucharest
2015-11-27 14:22:34,535 tcollector[7194] DEBUG: SENDING: put tcollector.collector.lines_invalid 1448626948 0 collector=test.py host=d3xbucharest
2015-11-27 16:17:44,280 tcollector[7194] INFO: test.py (interval=7200) needs to be spawned
2015-11-27 16:17:44,282 tcollector[7194] INFO: spawned test.py (pid=8245)
2015-11-27 16:17:45,079 tcollector[7194] DEBUG: reading test.py got 38 bytes on stderr
2015-11-27 16:17:45,079 tcollector[7194] WARNING: test.py: test.test 1448633864 100 product=test
2015-11-27 16:17:59,296 tcollector[7194] WARNING: Terminating collector test.py after 6921 seconds of inactivity
2015-11-27 16:17:59,297 tcollector[7194] INFO: Waiting 5s for PID 8245 (test.py) to exit...
2015-11-27 16:18:00,297 tcollector[7194] ERROR: test.py still has a process (pid=8245) and is being reset, terminating
Which is true: the script isn't run for the next 2 hours. Should I increase the inactivity time? Why? The collector is meant to run at an interval longer than the allowed inactivity time, and tcollector should take that into consideration. This is how tcollector is run:
hbase 7194 0.2 0.0 223116 15860 pts/8 Sl Nov26 3:00 /usr/bin/python2.7 /home/hbase/tcollector/tcollector.py -c /home/hbase/tcollector/collectors -H 127.0.0.1 -t host=d3xbucharest -P /home/hbase/tcollector/tcollector.pid --logfile /home/hbase/tcollector/tcollector.log --allowed-inactivity-time=3600 --backup-count=10 -v
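For what it's worth, the numbers in the log seem to line up with the inactivity clock starting at the previous run's output rather than at the new spawn. The short Python sketch below works backwards from the "6921 seconds" message using only values copied from the logs, the script and the command line above. The `is_stale` function at the end is a hypothetical illustration of a check that takes the collector interval into account; it is not tcollector's actual code, and its name and formula are assumptions.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Values copied from the log excerpt, test.py and the command line above.
first_spawn = datetime.strptime("2015-11-27 14:17:37", FMT)
terminated_at = datetime.strptime("2015-11-27 16:17:59", FMT)
reported_inactivity = 6921   # "after 6921 seconds of inactivity"
allowed_inactivity = 3600    # --allowed-inactivity-time=3600
interval = 7200              # collectors/7200/
sleep_before_flush = 300     # time.sleep(300) before sys.stdout.flush()

# Work backwards: when was the last data point seen on stdout?
last_datapoint = terminated_at - timedelta(seconds=reported_inactivity)

# When the first run would have flushed its stdout line.
expected_flush = first_spawn + timedelta(seconds=sleep_before_flush)

print("last data point :", last_datapoint.strftime(FMT))
print("first run flush :", expected_flush.strftime(FMT))


def is_stale(idle_seconds, interval, allowed_inactivity):
    """Hypothetical check (an assumption, not tcollector's code):
    an interval collector is expected to be silent for up to one full
    interval between runs, so only idle time beyond
    interval + allowed_inactivity counts as inactivity."""
    return idle_seconds > interval + allowed_inactivity


# A fixed threshold flags the collector; the interval-aware one does not.
print("fixed threshold  :", reported_inactivity > allowed_inactivity)
print("interval-aware   :", is_stale(reported_inactivity, interval, allowed_inactivity))
```

The two timestamps land within a second of each other, which suggests the 6921 seconds were counted from the first run's flushed output. Any collector whose interval is longer than the allowed inactivity time would then be flagged shortly after its second spawn, before it has had a chance to write anything to stdout.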