write_points method fails if called at exactly 5 second intervals

I'm experiencing a strange problem with the DB client that took me a while to wrap my head around. When running a very simple logging script that POSTs data to a remote database every five seconds (the five seconds is important), about one in ten writes fails with a `requests.exceptions.ConnectionError`.

Here is some example code that exhibits this behavior very consistently for me:

```
import calendar
import time

from influxdb import InfluxDBClient

client = InfluxDBClient('ADDRESS', 8086, 'USERNAME', 'PASSWORD', 'log', timeout=None)
json_body = [{
    "points": [],
    "name": "test.connectivity",
    "columns": ["time", "value", "errortype"]
}]

while True:
    # Queue one point stamped with the current UTC time in whole seconds.
    inttime = calendar.timegm(time.gmtime())
    json_body[0]["points"].append([inttime, 1, None])
    try:
        client.write_points(json_body)
        print("Write succeeded")
        # Clear the buffer only after a successful write.
        json_body[0]["points"] = []
    except Exception as e:
        print("Write failed:  %s %s" % (e, type(e)))
        # Keep the point for the next attempt, marked as failed.
        json_body[0]["points"][-1][1] = 0
        json_body[0]["points"][-1][2] = str(type(e))
    time.sleep(5)
```

The `time.sleep(5)` is important: the following behavior only occurs with a delay of five seconds between writes. Delays of four or six seconds cause no problems. Roughly one in ten `write_points` calls throws an exception that prints like this:

```
Write failed:  HTTPConnectionPool(host='ADDRESS', port=8086): Max retries exceeded with url: /db/log/series?p=PASSWORD&time_precision=s&u=USERNAME (Caused by <class 'httplib.BadStatusLine'>: '') <class 'requests.exceptions.ConnectionError'>
```

After watching the sequence of packets sent to InfluxDB in Wireshark, I've come to the conclusion that this issue is caused by the way the requests library is being used.

If requests are sent at intervals greater than five seconds, each request is sent and receives an HTTP 200 OK; exactly five seconds later, a FIN, ACK packet arrives from the server, closing the TCP connection. Each subsequent request then opens a brand new connection with a SYN before sending the POST.

For a request interval of less than five seconds, the connection is kept alive and no FIN is received between database writes.

If the InfluxDB write interval is exactly five seconds, though (seemingly not an uncommon logging interval), the packet sequence looks like this (two requests, five seconds apart; the second one failed):
![screenshot from 2015-02-16 15 44 10](https://cloud.githubusercontent.com/assets/1697414/6219718/fcf97f0a-b5f2-11e4-8e3f-1a0e39564405.png)

To me, it appears that the POST request from influxdb-python and the FIN, ACK from the InfluxDB server must be in flight at the same time, since the ACK number of 402 is the same as the one the remote server sent after receiving the first POST and does not include the size of the second packet. Somehow the Python client ends up interpreting this as an `httplib.BadStatusLine`, which causes the unexpected exception to be thrown.
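The three cases above can be summarized in a small timing model. This is only a sketch of the reasoning, not of anything inside influxdb-python or requests: the function name `classify_write_interval` and the `race_window_s` parameter are hypothetical, and the 50 ms window is a guessed allowance for latency and jitter rather than a measured value. Only the five-second idle timeout comes from the capture.

```python
def classify_write_interval(interval_s, idle_timeout_s=5.0, race_window_s=0.05):
    """Predict what happens to a kept-alive connection for a write interval.

    idle_timeout_s: server-side idle close seen in the capture (5 s).
    race_window_s:  ASSUMED allowance for network latency and jitter.
    """
    if interval_s < idle_timeout_s - race_window_s:
        # The next POST reaches the server before it decides to close.
        return "reuse"
    if interval_s > idle_timeout_s + race_window_s:
        # The FIN has already arrived, so the client opens a new connection.
        return "reconnect"
    # The POST and the server's FIN may cross in flight.
    return "race"


if __name__ == "__main__":
    for seconds in (4, 5, 6):
        print("%s s -> %s" % (seconds, classify_write_interval(seconds)))
```

Under these assumptions, four seconds falls in the "reuse" case, six seconds in "reconnect", and exactly five seconds in the "race" window, which matches the intervals I tested.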

I understand that this is quite a complicated bug report for what seems like a simple issue, so thank you for taking the time to look it over. For now I'll just decrease my logging interval to four seconds to avoid this timing issue, but this seems like behavior that should be fixed so it doesn't trap unsuspecting users of the client, who might think they have connectivity or server problems when in reality the cause seems to be a very specific TCP state timing issue.
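Another possible stopgap, besides changing the interval, would be to retry a failed write once, since a fresh attempt should go out on a new connection. The helper below is a minimal sketch under that assumption: `call_with_retry` and its parameters are hypothetical names of mine, not part of influxdb-python or requests, and I have not verified this against the client's connection handling.

```python
import time


def call_with_retry(func, retry_on, attempts=2, delay_s=0.1):
    """Call func(); if it raises retry_on, wait briefly and try again.

    func:     zero-argument callable performing the write.
    retry_on: exception class (or tuple of classes) worth retrying.
    attempts: total number of tries, including the first one.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                # Out of tries: let the caller see the original error.
                raise
            time.sleep(delay_s)
```

With the script above, this might be used as `call_with_retry(lambda: client.write_points(json_body), requests.exceptions.ConnectionError)`. One caveat: if the first POST did reach the server before the connection was torn down, a retry could write the same points twice.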
