Adding basic chunk support #39

Conversation
Hey man, thanks for this. I'll try to deploy it to nuget.org tonight in v6.0.2. Just one quick question, as I haven't had the time to take a more careful look at the code: for the end user, is this as simple as consuming any other API call? The end user doesn't need to collect these chunks into a collection and then put them together? It works out of the box and puts the whole message together in the background?
To your question: no, there is still work left for the user. I'm currently using it like this, where DataPoint is an OxyPlot data point:

```csharp
var query = "SELECT value FROM measurement"
    + $" WHERE time >= '{start:yyyy-MM-dd HH:mm:ss}'"
    + $" AND time < '{end:yyyy-MM-dd HH:mm:ss}'"
    + " AND (item_tag = '" + string.Join("' OR item_tag = '", tags) + "')"
    + " GROUP BY item_tag";

var response = await influxClient.Client.QueryChunkedAsync(database, query, 50000);

// Collect the values belonging to the same series
var datapoints = new Dictionary<string, List<DataPoint>>();
foreach (var serie in response)
{
    var tag = serie.Tags.First().Value;
    var data = new List<DataPoint>(serie.Values.Count);
    for (int i = 0; i < serie.Values.Count; i++)
    {
        var newDatapoint = new DataPoint(DateTimeAxis.ToDouble(serie.Values[i][0]), Convert.ToDouble(serie.Values[i][1]));
        data.Add(newDatapoint);
    }
    if (datapoints.ContainsKey(tag))
        datapoints[tag].AddRange(data);
    else
        datapoints.Add(tag, data);
}
```
Taking a quick look, it seems doable by using the statement_id and partial values in the response. The difficulty might be that the chunks for different series can be interleaved.
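The interleaving problem mentioned above can be sketched like this. Note that `Chunk` and `Serie` below are hypothetical shapes roughly mirroring InfluxDB's chunked JSON response (`statement_id`, series name, tags, rows, `partial` flag), not the library's actual types; the idea is just to key rows by statement id plus series identity so interleaved chunks still land in the right bucket:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var c1 = new Chunk(0, new List<Serie> {
    new Serie("m", new Dictionary<string, string> { { "tag", "a" } },
        new List<object[]> { new object[] { 1, 2 } }) }, true);
var c2 = new Chunk(0, new List<Serie> {
    new Serie("m", new Dictionary<string, string> { { "tag", "a" } },
        new List<object[]> { new object[] { 3, 4 } }) }, false);

var merged = ChunkMerger.Merge(new[] { c1, c2 });
Console.WriteLine(merged.Count); // the two interleaved chunks collapse into one series

// Hypothetical shapes mirroring InfluxDB's chunked JSON response.
record Serie(string Name, Dictionary<string, string> Tags, List<object[]> Values);
record Chunk(int StatementId, List<Serie> Series, bool Partial);

static class ChunkMerger
{
    // Merge interleaved chunks: rows belonging to the same statement and the
    // same series (name + tag set) are appended to one result series.
    public static Dictionary<(int, string), List<object[]>> Merge(IEnumerable<Chunk> chunks)
    {
        var merged = new Dictionary<(int, string), List<object[]>>();
        foreach (var chunk in chunks)
        {
            foreach (var serie in chunk.Series)
            {
                // Build a stable key from the series name and its sorted tag set.
                var tagKey = string.Join(",",
                    serie.Tags.OrderBy(t => t.Key).Select(t => $"{t.Key}={t.Value}"));
                var key = (chunk.StatementId, $"{serie.Name}|{tagKey}");
                if (!merged.TryGetValue(key, out var rows))
                    merged[key] = rows = new List<object[]>();
                rows.AddRange(serie.Values);
            }
        }
        return merged;
    }
}
```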
Hi, thanks for the PR. Just to let you know, I'll merge this in about a week or two. Cheers!
There is a slight problem with chunks: since there is no longer a limit on the amount of data InfluxDB sends, it is rather easy to get an out-of-memory exception after running a query with a large result set.
Merged this. Seems really solid. Thanks!
Also, while I was merging the epoch time format from another PR, I had to change the code a bit because overloads and additional methods started piling up, so chunkSize ended up being an optional param on one of the original methods. Not a huge change, but once I deploy the NuGet package, you'll probably have a few things to update if you used your own build in the meantime.
Hello pootzko. Is this already deployed to NuGet? I just updated to 7.0.3 and this method does not seem to be included. Thanks.
@yellboy hi! I removed the separate method and instead added an optional chunkSize parameter to one of the original query methods. Let me know if that helped. Cheers!
Thanks for the answer. It answers my question, but I actually wanted to fetch each chunk asynchronously and do some processing on every chunk as it arrives, so that I don't have to wait for all chunks to be fetched before the processing starts. However, unless I misunderstood the docs, there is no way to do this right now except manually. I guess I will have to go that way. If I find a nice way to do this that could enrich this library, I will fork and create a PR. Thanks for the quick answer once again.
I'm actually not fully sure that will work (if you need the payload), as the beginning and the end might contain crucial parts of the payload, such as headers or something like that. So you might not be able to use the payload that's in the middle, but I'm curious what you will find out. Good luck, and let us know! :)
Hi @pootzko! I am not completely sure what you mean. Here is what I needed: given a measurement containing, for example, 100000 rows, I wanted to fetch all the data and process it. However, this would take too much time if done synchronously, because the data would first have to be fetched in full and only then processed. What I did instead was create queries with LIMIT and OFFSET clauses, with LIMIT being, let's say, 10000. That means creating 10 threads that run in parallel, each thread fetching one chunk and then processing it. This made the process a lot faster, because processing starts before all the data is fetched. But I don't think this should be part of your library, since it is nothing more than creating queries containing LIMIT and OFFSET clauses.
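The LIMIT/OFFSET approach described above can be sketched roughly like this. `fetchPage` and `process` are hypothetical caller-supplied delegates (in practice `fetchPage` would run one paged query through the InfluxDB client); the point is that each page is processed as soon as its own query completes, rather than after the whole result set has arrived:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var offsets = new List<int>();
var processed = 0;

await PagedFetcher.FetchAllAsync(25, 10,
    (limit, offset) =>
    {
        // In real use this would be e.g. a "... LIMIT {limit} OFFSET {offset}" query.
        lock (offsets) offsets.Add(offset);
        return Task.FromResult<object>(offset);
    },
    page => System.Threading.Interlocked.Increment(ref processed));

Console.WriteLine(processed); // three pages cover 25 rows at LIMIT 10

static class PagedFetcher
{
    // fetchPage(limit, offset) runs one paged query and returns its rows;
    // process consumes one page. Pages are fetched concurrently, and each
    // page is processed the moment its own fetch completes.
    public static Task FetchAllAsync(int totalRows, int limit,
        Func<int, int, Task<object>> fetchPage, Action<object> process)
    {
        var tasks = Enumerable.Range(0, (totalRows + limit - 1) / limit)
            .Select(async i =>
            {
                var page = await fetchPage(limit, i * limit);
                process(page); // processing starts before other pages arrive
            });
        return Task.WhenAll(tasks);
    }
}
```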
I see. Yeah, I'm not sure the lib should be doing what you needed either. The chunked option simply means more async under the hood; it's not really supposed to be something you "hook into" to process the data in parallel. I think your approach is great for your needs, and I'm glad you solved it.
Adds support for obtaining chunked responses from InfluxDb.
After all chunks have been received they are processed and returned in one list.
Chunks are not stitched together, as a chunk can be split on either series boundaries or row count. The consumer has to handle that part.
It would be nice to process each chunk as soon as the http data has been received, but that requires (I think) a lot more work.
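The stitching left to the consumer could look roughly like this, assuming a hypothetical `Serie` shape with a name and a list of rows (the real InfluxData.Net series type may differ, and a production version would also need to account for tags, as in the usage example earlier in the thread):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var chunked = new[]
{
    new Serie("measurement", new List<object[]> { new object[] { 1 } }),
    new Serie("measurement", new List<object[]> { new object[] { 2 } }),
};

var stitched = SerieStitcher.Stitch(chunked);
Console.WriteLine(stitched.Count); // the two chunk-series collapse into one

// Hypothetical series shape; the library's actual type may differ.
record Serie(string Name, List<object[]> Values);

static class SerieStitcher
{
    // Concatenate the rows of every chunk-series that shares a name, so each
    // logical series appears exactly once in the result.
    public static List<Serie> Stitch(IEnumerable<Serie> chunkedSeries) =>
        chunkedSeries
            .GroupBy(s => s.Name)
            .Select(g => new Serie(g.Key, g.SelectMany(s => s.Values).ToList()))
            .ToList();
}
```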