Slow response when data URL is produced by ERDDAP #439

@Mikejmnez

problem description

This does not appear to be a pydap issue; it seems to be a requests issue:

It takes a very long time to get a response from requests.session.get when the data URL is hosted by an ERDDAP server. A response normally takes on the order of milliseconds, but here it takes on the order of minutes to get a 200 status response.

Minimal example

import requests
session = requests.session()
url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41"
session.get(url+".dds", allow_redirects=True)

Then

%%time
session.get(url+".dds", allow_redirects=True)

It takes close to 2 minutes just to fetch the dds (metadata) response.

With curl, on the other hand, it takes about a second or less. Below is the entire output:

Jimenez@Work: ~ > curl -I -v -L "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds"
* Host coastwatch.pfeg.noaa.gov:443 was resolved.
* IPv6: 2610:20:90a3:3bcc::15
* IPv4: 161.55.160.15
*   Trying [2610:20:90a3:3bcc::15]:443...
*   Trying 161.55.160.15:443...
* Connected to coastwatch.pfeg.noaa.gov (161.55.160.15) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted http/1.1
* Server certificate:
*  subject: C=US; ST=California; L=La Jolla; O=Southwest Fisheries Science Center; CN=coastwatch.pfeg.noaa.gov
*  start date: Dec 13 00:00:00 2024 GMT
*  expire date: Jan 13 23:59:59 2026 GMT
*  subjectAltName: host "coastwatch.pfeg.noaa.gov" matched cert's "coastwatch.pfeg.noaa.gov"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* using HTTP/1.x
> HEAD /erddap/griddap/jplMURSST41.dds HTTP/1.1
> Host: coastwatch.pfeg.noaa.gov
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 
HTTP/1.1 200 
< Date: Thu, 06 Mar 2025 22:45:46 GMT
Date: Thu, 06 Mar 2025 22:45:46 GMT
< Server: Apache
Server: Apache
< Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< X-Frame-Options: SAMEORIGIN, SAMEORIGIN
X-Frame-Options: SAMEORIGIN, SAMEORIGIN
< Last-Modified: Thu, 06 Mar 2025 22:45:46 GMT
Last-Modified: Thu, 06 Mar 2025 22:45:46 GMT
< xdods-server: dods/3.7
xdods-server: dods/3.7
< erddap-server: 2.25_1
erddap-server: 2.25_1
< Content-Description: dods-dds
Content-Description: dods-dds
< Content-Encoding: identity
Content-Encoding: identity
< vary: accept-encoding
vary: accept-encoding
< Content-Type: text/plain;charset=ISO-8859-1
Content-Type: text/plain;charset=ISO-8859-1
< X-XSS-Protection: 1; mode=block
X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin
< Content-Security-Policy: script-src 'self' https://accounts.google.com https://apis.google.com https://code.jquery.com/ https://www.google-analytics.com https://www.googletagmanager.com https://www.gstatic.com https://stackpath.bootstrapcdn.com https://fp1.formmail.com https://coastwatch.noaa.gov https://polarwatch.noaa.gov 'unsafe-inline' 'unsafe-eval'; font-src 'self' https://fonts.googleapis.com https://stackpath.bootstrapcdn.com; frame-ancestors 'self'  https://heatherwelch.shinyapps.io;
Content-Security-Policy: script-src 'self' https://accounts.google.com https://apis.google.com https://code.jquery.com/ https://www.google-analytics.com https://www.googletagmanager.com https://www.gstatic.com https://stackpath.bootstrapcdn.com https://fp1.formmail.com https://coastwatch.noaa.gov https://polarwatch.noaa.gov 'unsafe-inline' 'unsafe-eval'; font-src 'self' https://fonts.googleapis.com https://stackpath.bootstrapcdn.com; frame-ancestors 'self'  https://heatherwelch.shinyapps.io;
< Connection: close
Connection: close
< 

* Closing connection
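
The curl trace above lists both an IPv6 and an IPv4 address for the host and ends up connected over IPv4. One way to check whether the IPv6 path is what stalls is to time a plain TCP connect per address family. This is only a minimal sketch using the standard library; `time_connect` and `compare_families` are hypothetical helper names, and the 5 second timeout is an arbitrary assumption.

```python
import socket
import time


def time_connect(host, port, family, timeout=5.0):
    """Return the seconds needed to open a TCP connection over `family`, or None on failure."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except OSError:
        return None  # the host has no address of this family
    fam, socktype, proto, _, sockaddr = infos[0]
    start = time.perf_counter()
    try:
        with socket.socket(fam, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError:
        return None  # unreachable, refused, or timed out
    return time.perf_counter() - start


def compare_families(host, port=443):
    """Print the connect time for IPv4 and IPv6 separately."""
    for name, family in (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)):
        elapsed = time_connect(host, port, family)
        status = "failed or timed out" if elapsed is None else f"{elapsed:.3f} s"
        print(f"{name}: {status}")
```

Calling `compare_families("coastwatch.pfeg.noaa.gov")` would then show whether only the IPv6 attempt fails or times out, which would be consistent with requests waiting on IPv6 before falling back to IPv4.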

potential solution

From the curl output above, it turns out the connection is made over IPv4. I read that requests uses IPv6, so one can force the connection to IPv4 with requests as follows:

import socket
import requests

url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds"

# Resolve the IPv4 address
ipv4_address = socket.gethostbyname("coastwatch.pfeg.noaa.gov")
ipv4_url = f"https://{ipv4_address}/erddap/griddap/jplMURSST41.dds"

# Create the session used to send the request to the resolved IPv4 address
session = requests.Session()

Now, the following works, but only when I set verify=False (not the default):

%%time
response = session.get(ipv4_url, headers={"Host": "coastwatch.pfeg.noaa.gov"}, timeout=10, verify=False)

This takes on the order of milliseconds, but it emits the warning InsecureRequestWarning: Unverified HTTPS request is being made to host '161.55.160.15'. Adding certificate verification is strongly advised.

I haven't figured out how to make it work with verify=True, but the above works and gets the correct DDS for the GRID file.
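
A possible reason verify=True fails here is that the certificate is issued for the hostname (see the subjectAltName line in the curl output), while the request URL contains the bare IP address; the Host header does not change what the TLS layer checks. One alternative that keeps the hostname in the URL is to restrict urllib3's address lookup to IPv4 by overriding `urllib3.util.connection.allowed_gai_family`. The sketch below assumes that helper keeps its current name and role, patches it process-wide (so it is not thread-safe), and has not been tested against this server; `force_ipv4` and `fetch_dds_ipv4` are hypothetical names.

```python
import contextlib
import socket

import requests
import urllib3.util.connection as urllib3_connection


@contextlib.contextmanager
def force_ipv4():
    """Temporarily make urllib3 (and therefore requests) look up IPv4 addresses only."""
    original = urllib3_connection.allowed_gai_family
    urllib3_connection.allowed_gai_family = lambda: socket.AF_INET
    try:
        yield
    finally:
        # Restore the original behavior even if the request raised.
        urllib3_connection.allowed_gai_family = original


def fetch_dds_ipv4(url, timeout=10):
    """GET `url` over IPv4 with the default certificate verification left on."""
    with force_ipv4(), requests.Session() as session:
        return session.get(url, timeout=timeout)
```

With this, `fetch_dds_ipv4("https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds")` would send the request to the hostname, so the certificate can be checked as usual and no InsecureRequestWarning should be needed.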
