problem description
This does not appear to be a pydap issue; it seems to be a requests issue.
It takes a very long time to get a response from session.get (with session = requests.session()) when the data URL is hosted by an ERDDAP server. A request that normally takes O(ms) is taking on the order of minutes to return a 200 status response.
Minimal example
import requests
session = requests.session()
url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41"
session.get(url+".dds", allow_redirects=True)
Then, timing the same request again:
%%time
session.get(url+".dds", allow_redirects=True)
It takes close to 2 minutes just to fetch the DDS (metadata) response.
With curl, on the other hand, it takes about a second or less. Below is the entire output:
Jimenez@Work: ~ > curl -I -v -L "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds"
* Host coastwatch.pfeg.noaa.gov:443 was resolved.
* IPv6: 2610:20:90a3:3bcc::15
* IPv4: 161.55.160.15
* Trying [2610:20:90a3:3bcc::15]:443...
* Trying 161.55.160.15:443...
* Connected to coastwatch.pfeg.noaa.gov (161.55.160.15) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* (304) (IN), TLS handshake, Unknown (8):
* (304) (IN), TLS handshake, Certificate (11):
* (304) (IN), TLS handshake, CERT verify (15):
* (304) (IN), TLS handshake, Finished (20):
* (304) (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* ALPN: server accepted http/1.1
* Server certificate:
* subject: C=US; ST=California; L=La Jolla; O=Southwest Fisheries Science Center; CN=coastwatch.pfeg.noaa.gov
* start date: Dec 13 00:00:00 2024 GMT
* expire date: Jan 13 23:59:59 2026 GMT
* subjectAltName: host "coastwatch.pfeg.noaa.gov" matched cert's "coastwatch.pfeg.noaa.gov"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert Global G2 TLS RSA SHA256 2020 CA1
* SSL certificate verify ok.
* using HTTP/1.x
> HEAD /erddap/griddap/jplMURSST41.dds HTTP/1.1
> Host: coastwatch.pfeg.noaa.gov
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200
HTTP/1.1 200
< Date: Thu, 06 Mar 2025 22:45:46 GMT
Date: Thu, 06 Mar 2025 22:45:46 GMT
< Server: Apache
Server: Apache
< Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
< X-Frame-Options: SAMEORIGIN, SAMEORIGIN
X-Frame-Options: SAMEORIGIN, SAMEORIGIN
< Last-Modified: Thu, 06 Mar 2025 22:45:46 GMT
Last-Modified: Thu, 06 Mar 2025 22:45:46 GMT
< xdods-server: dods/3.7
xdods-server: dods/3.7
< erddap-server: 2.25_1
erddap-server: 2.25_1
< Content-Description: dods-dds
Content-Description: dods-dds
< Content-Encoding: identity
Content-Encoding: identity
< vary: accept-encoding
vary: accept-encoding
< Content-Type: text/plain;charset=ISO-8859-1
Content-Type: text/plain;charset=ISO-8859-1
< X-XSS-Protection: 1; mode=block
X-XSS-Protection: 1; mode=block
< X-Content-Type-Options: nosniff
X-Content-Type-Options: nosniff
< Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin
< Content-Security-Policy: script-src 'self' https://accounts.google.com https://apis.google.com https://code.jquery.com/ https://www.google-analytics.com https://www.googletagmanager.com https://www.gstatic.com https://stackpath.bootstrapcdn.com https://fp1.formmail.com https://coastwatch.noaa.gov https://polarwatch.noaa.gov 'unsafe-inline' 'unsafe-eval'; font-src 'self' https://fonts.googleapis.com https://stackpath.bootstrapcdn.com; frame-ancestors 'self' https://heatherwelch.shinyapps.io;
Content-Security-Policy: script-src 'self' https://accounts.google.com https://apis.google.com https://code.jquery.com/ https://www.google-analytics.com https://www.googletagmanager.com https://www.gstatic.com https://stackpath.bootstrapcdn.com https://fp1.formmail.com https://coastwatch.noaa.gov https://polarwatch.noaa.gov 'unsafe-inline' 'unsafe-eval'; font-src 'self' https://fonts.googleapis.com https://stackpath.bootstrapcdn.com; frame-ancestors 'self' https://heatherwelch.shinyapps.io;
< Connection: close
Connection: close
<
* Closing connection
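The curl trace above tries both the IPv6 and the IPv4 address and ends up connected over IPv4. One way to check from Python whether one address family is the slow one is to time a plain TCP connect per family. The following is only a minimal sketch using the standard library; the helper name time_connect and the 5 second default timeout are my own choices for illustration, not part of requests or pydap.

```python
import socket
import time


def time_connect(host, port=443, family=socket.AF_INET, timeout=5.0):
    """Time one TCP connect to host:port over a single address family.

    Returns the elapsed seconds, or None if the host has no address in
    that family or the connection fails or times out.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return None  # no address of this family for the host
    af, socktype, proto, _canonname, sockaddr = infos[0]
    start = time.perf_counter()
    try:
        with socket.socket(af, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError:
        return None  # refused, unreachable, or timed out
    return time.perf_counter() - start


# Example usage (needs network access, so it is left commented out):
# host = "coastwatch.pfeg.noaa.gov"
# print("IPv4:", time_connect(host, family=socket.AF_INET))
# print("IPv6:", time_connect(host, family=socket.AF_INET6))
```

If the IPv6 call returns None only after the full timeout while the IPv4 call returns in milliseconds, that would be consistent with the delay coming from IPv6 connection attempts rather than from the server's response time.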
potential solution
From the curl output above, it turns out the connection is made over IPv4. I read that requests uses IPv6, so one can force the connection over IPv4 with requests as follows:
import socket
import requests
url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds"
# Resolve the IPv4 address
ipv4_address = socket.gethostbyname("coastwatch.pfeg.noaa.gov")
ipv4_url = f"https://{ipv4_address}/erddap/griddap/jplMURSST41.dds"
# Send request with resolved IPv4
session = requests.Session()
Now, the following works, but only when I set verify=False (non-default):
%%time
response = session.get(ipv4_url, headers={"Host": "coastwatch.pfeg.noaa.gov"}, timeout=10, verify=False)
This completes in O(ms), with the warning InsecureRequestWarning: Unverified HTTPS request is being made to host '161.55.160.15'. Adding certificate verification is strongly advised.
I haven't figured out how to make it work with verify=True, but the above works and gets the correct DDS for the GRID file.
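A likely reason verify=True fails here is that the URL contains the bare IP address, while the certificate is issued for coastwatch.pfeg.noaa.gov, so the hostname check cannot match. A possible alternative is to keep the hostname in the URL and restrict name resolution to IPv4 while the request runs. The following is only a sketch and has not been tested against this server. The helper name ipv4_only is hypothetical, and it assumes that requests (through urllib3) resolves hosts by calling socket.getaddrinfo. The patch is process-wide, so it also affects other threads while it is active.

```python
import contextlib
import socket


@contextlib.contextmanager
def ipv4_only():
    """Temporarily make socket.getaddrinfo return IPv4 results only.

    Hypothetical helper: it swaps socket.getaddrinfo for a wrapper that
    ignores the requested family and always asks for AF_INET, then
    restores the original function on exit.
    """
    original_getaddrinfo = socket.getaddrinfo

    def getaddrinfo_ipv4(host, port, family=0, type=0, proto=0, flags=0):
        # Override whatever family the caller asked for with AF_INET.
        return original_getaddrinfo(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = getaddrinfo_ipv4
    try:
        yield
    finally:
        socket.getaddrinfo = original_getaddrinfo


# Example usage (needs requests and network access, so it is commented out):
# import requests
# url = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/jplMURSST41.dds"
# with ipv4_only():
#     response = requests.Session().get(url, timeout=10)  # verify stays True
```

Because the URL still carries the hostname, no Host header override is needed and certificate verification can stay at its default.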