Bug: Firestore AggregationQuery stuck when running locally in Docker #316

Open
@ArmandBriere

Description

Firestore AggregationQuery getting stuck

We get a Client.Timeout error when running an AggregationQuery to count the documents matching a query inside a Docker container.

How to reproduce

Use the code provided below with the following folder structure:

.
├── credentials.json
├── Dockerfile
├── main.py
└── requirements.txt
  • credentials.json is used to authenticate to Google Cloud and access Firestore. For this example, we assume that Firestore is set up for the project and can be accessed with this service account key.
  • Dockerfile, main.py and requirements.txt are provided below.
docker build -t bug .
docker run -d -p 8888:8080 --name bug bug:latest
  • Running the following curl command outputs the expected result:
$ curl http://localhost:8888
Count: 0.0

At this point the code works as expected. The issue appears when we restart the Docker container and then send multiple concurrent requests to the endpoint with the hey HTTP load generator, to simulate real traffic on our application:

docker restart bug
hey -c 10 -n 100 -m GET http://localhost:8888/

From that point on, we get no response from the application, and hey reports: Get "http://localhost:8888/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

This error has been reproduced on Linux and Mac.

The error only seems to appear after we restart the container, and it goes away when we restart the container again: each restart alternates between stuck and unstuck. We did not manage to reproduce the bug when running this code outside of Docker.

Is there any undocumented caching or network protocol used by that tool that we should know about and that requires some Docker configuration?

What we tested

  • Running the same code without the data = aggregate_query.count().get() line resolves the timeout issue, but then we no longer get the data we need. This isolates the issue to that line.
  • Passing a timeout, as in aggregate_query.count().get(timeout=2), makes no difference for us; the parameter does not seem to have any effect.
  • We tested this code on different networks to rule out firewall rules that could block network calls.
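Since the timeout argument appears to have no effect, one way to confirm that the call itself blocks is to run it in a worker thread under a hard deadline. The following is a minimal diagnostic sketch, not a fix: `call_with_deadline` is a hypothetical helper introduced here for illustration, and a call that is truly stuck keeps occupying its worker thread after the deadline expires.

```python
"""Hypothetical diagnostic helper: bound a blocking call with a hard deadline."""
import concurrent.futures
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_deadline(func: Callable[[], T], seconds: float) -> T:
    """Run func in a worker thread; raise TimeoutError if it exceeds `seconds`.

    The worker thread is not killed on timeout; it keeps running in the background.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func)
    try:
        return future.result(timeout=seconds)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"call did not finish within {seconds} s") from None
    finally:
        # Do not wait for a possibly stuck worker when leaving this function.
        executor.shutdown(wait=False)
```

In main.py this could wrap the suspect line, for example `data = call_with_deadline(lambda: aggregate_query.count().get(), 5.0)`, so that a stuck request surfaces as a TimeoutError in the logs instead of hanging silently.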

Source code

  • main.py
"""BUGGED module."""

from datetime import datetime, timedelta
from typing import Tuple

import flask
import functions_framework
from flask import Response
from google.cloud.firestore_v1 import Query
from google.cloud.firestore_v1.aggregation import AggregationQuery
from google.cloud.firestore_v1.base_query import FieldFilter
from google.cloud.firestore_v1.client import Client as FirestoreClient

FIRESTORE_CLIENT = FirestoreClient()


def count_data_in_query_bugged(query: Query) -> int:
    """Count data in query."""
    print("Start counting data in query")
    # Transform to aggregation query to count
    aggregate_query: AggregationQuery = AggregationQuery(query)
    data = aggregate_query.count().get()
    count = data[0][0].value
    print("end counting data in query")
    return count


@functions_framework.http
def entry_point(request: flask.Request) -> Tuple[Response | str, int]:
    """Count the "acceptable" statistics documents from the last 24 hours."""
    print("Request received")
    start = datetime.now() - timedelta(days=1)
    end = datetime.now()

    query = (
        FIRESTORE_CLIENT.collection("statistics")
        .where(filter=FieldFilter("status", "==", "acceptable"))
        .where(filter=FieldFilter("timestamp", ">=", start))
        .where(filter=FieldFilter("timestamp", "<", end))
    )

    count = count_data_in_query_bugged(query)
    print(count)
    return f"Count: {count}", 200
  • Dockerfile
FROM python:3.11

WORKDIR /app

COPY . .

# Install requirements
RUN pip install -r requirements.txt

ENV FUNCTION_TARGET="entry_point"
ENV GOOGLE_APPLICATION_CREDENTIALS="/app/credentials.json"

# Run cloud function locally
CMD functions-framework --target=$FUNCTION_TARGET --debug
  • requirements.txt
blinker==1.7.0
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
cloudevents==1.10.1
deprecation==2.1.0
Flask==3.0.2
functions-framework==3.5.0
google-api-core==2.18.0
google-auth==2.29.0
google-cloud-core==2.4.1
google-cloud-firestore==2.15.0
googleapis-common-protos==1.63.0
grpcio==1.62.1
grpcio-status==1.62.1
gunicorn==21.2.0
idna==3.6
itsdangerous==2.1.2
Jinja2==3.1.3
MarkupSafe==2.1.5
packaging==24.0
proto-plus==1.23.0
protobuf==4.25.3
pyasn1==0.5.1
pyasn1-modules==0.3.0
requests==2.31.0
rsa==4.9
urllib3==2.2.1
watchdog==4.0.0
Werkzeug==3.0.1

Metadata

Assignees

No one assigned

    Labels

    P3, question (Further information is requested)
