Glue Iceberg REST API and PyIceberg
Source: dev.to
Access Glue Iceberg tables via the Iceberg REST API
AWS has quietly released support for the Iceberg REST API in Glue. This is a standard API for accessing Iceberg tables across different platforms. More information can be found at https://iceberg.apache.org/concepts/catalog/
PyIceberg is a Python library with generic Iceberg support. It also supports the REST API.
from pyiceberg.catalog import load_catalog
import logging

# Set up logging to show debug messages
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# Specifically for PyIceberg logging
logger = logging.getLogger('pyiceberg')
logger.setLevel(logging.DEBUG)


def main():
    rest_catalog = load_catalog(
        "ibtest1",
        **{
            "type": "rest",
            "uri": "https://glue.eu-central-1.amazonaws.com/iceberg",
            "rest.sigv4-enabled": "true",
            "rest.signing-name": "glue",
            "rest.signing-region": "eu-central-1"
        }
    )
    print(rest_catalog.list_namespaces())
    print(rest_catalog.list_tables("ibtest"))
    print(rest_catalog.load_table("ibtest.ibtest1").scan().to_pandas())


if __name__ == "__main__":
    main()
Glue Catalog version
For comparison, this is the native Glue version in PyIceberg. It talks to Glue through the boto API instead of the REST API.
from pyiceberg.catalog import load_catalog


def main():
    glue_catalog = load_catalog("glue", **{"type": "glue"})
    print(glue_catalog.list_namespaces())
    print(glue_catalog.list_tables("ibtest"))
    print(glue_catalog.load_table("ibtest.ibtest1").scan().to_pandas())


if __name__ == "__main__":
    main()
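The two versions differ only in the properties passed to load_catalog. As a rough sketch of that idea, the helper below returns the property set for either variant. The name catalog_properties and its kind/region parameters are hypothetical and not part of PyIceberg; the property keys and values are the ones used in the two examples above.

```python
def catalog_properties(kind: str, region: str = "eu-central-1") -> dict:
    """Return load_catalog properties for the REST or the native Glue variant.

    Hypothetical helper for illustration; only the keys and values come
    from the examples in this post.
    """
    if kind == "rest":
        return {
            "type": "rest",
            "uri": f"https://glue.{region}.amazonaws.com/iceberg",
            "rest.sigv4-enabled": "true",
            "rest.signing-name": "glue",
            "rest.signing-region": region,
        }
    if kind == "glue":
        return {"type": "glue"}
    raise ValueError(f"unknown catalog kind: {kind!r}")


# Intended use (needs pyiceberg and AWS credentials, so it is not run here):
#   catalog = load_catalog("ibtest1", **catalog_properties("rest"))
print(catalog_properties("rest"))
print(catalog_properties("glue"))
```

The rest of the script (list_namespaces, list_tables, load_table) stays the same whichever property set is chosen.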
Output
ListNameSpace
2024-12-22 15:13:41,061 - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials
2024-12-22 15:13:41,062 - botocore.auth - DEBUG - Calculating signature using v4 auth.
2024-12-22 15:13:41,062 - botocore.auth - DEBUG - CanonicalRequest:
GET
/iceberg/v1/config
accept:*/*
accept-encoding:gzip, deflate
content-type:application/json
host:glue.eu-central-1.amazonaws.com
x-amz-date:20241222T141341Z
x-client-version:0.14.1
x-iceberg-access-delegation:vended-credentials
accept;accept-encoding;content-type;host;x-amz-date;x-client-version;x-iceberg-access-delegation
xxxxxxxx
2024-12-22 15:13:41,062 - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20241222T141341Z
20241222/eu-central-1/glue/aws4_request
xxxxxxx
2024-12-22 15:13:41,062 - botocore.auth - DEBUG - Signature:
xxxxxx
2024-12-22 15:13:41,062 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): glue.eu-central-1.amazonaws.com:443
2024-12-22 15:13:41,213 - urllib3.connectionpool - DEBUG - https://glue.eu-central-1.amazonaws.com:443 "GET /iceberg/v1/config HTTP/1.1" 200 327
2024-12-22 15:13:41,237 - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials
2024-12-22 15:13:41,237 - botocore.auth - DEBUG - Calculating signature using v4 auth.
2024-12-22 15:13:41,237 - botocore.auth - DEBUG - CanonicalRequest:
GET
/iceberg/v1/catalogs/311141556126/namespaces
accept:*/*
accept-encoding:gzip, deflate
content-type:application/json
host:glue.eu-central-1.amazonaws.com
x-amz-date:20241222T141341Z
x-client-version:0.14.1
x-iceberg-access-delegation:vended-credentials
accept;accept-encoding;content-type;host;x-amz-date;x-client-version;x-iceberg-access-delegation
xxxxxx
2024-12-22 15:13:41,237 - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20241222T141341Z
20241222/eu-central-1/glue/aws4_request
xxxxxxx
2024-12-22 15:13:41,237 - botocore.auth - DEBUG - Signature:
xxxxxx
2024-12-22 15:13:41,237 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): glue.eu-central-1.amazonaws.com:443
2024-12-22 15:13:41,435 - urllib3.connectionpool - DEBUG - https://glue.eu-central-1.amazonaws.com:443 "GET /iceberg/v1/catalogs/311141556126/namespaces HTTP/1.1" 200 48
[('ibtest',), ('sourcedata_sales',)]
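The log does not show the response body, only the resulting list of tuples. The following sketch illustrates how such a list could be derived from a JSON body, assuming the body has the shape of the Iceberg REST specification's list-namespaces response (a "namespaces" array whose entries are arrays of name parts). The body string and the function name namespaces_from_response are assumptions for illustration, not PyIceberg internals.

```python
import json

# Assumed response body; the real one is not shown in the log above.
body = '{"namespaces":[["ibtest"],["sourcedata_sales"]]}'


def namespaces_from_response(raw: str) -> list:
    """Turn each namespace (a list of name parts) into a tuple."""
    payload = json.loads(raw)
    return [tuple(parts) for parts in payload.get("namespaces", [])]


namespaces = namespaces_from_response(body)
print(namespaces)  # [('ibtest',), ('sourcedata_sales',)]
```

A namespace is a tuple rather than a plain string because the specification allows multi-level namespaces.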
ListTable
2024-12-22 15:13:41,462 - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials
2024-12-22 15:13:41,463 - botocore.auth - DEBUG - Calculating signature using v4 auth.
2024-12-22 15:13:41,463 - botocore.auth - DEBUG - CanonicalRequest:
GET
/iceberg/v1/catalogs/311141556126/namespaces/ibtest/tables
accept:*/*
accept-encoding:gzip, deflate
content-type:application/json
host:glue.eu-central-1.amazonaws.com
x-amz-date:20241222T141341Z
x-client-version:0.14.1
x-iceberg-access-delegation:vended-credentials
accept;accept-encoding;content-type;host;x-amz-date;x-client-version;x-iceberg-access-delegation
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2024-12-22 15:13:41,463 - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20241222T141341Z
20241222/eu-central-1/glue/aws4_request
xxxxxxx
2024-12-22 15:13:41,463 - botocore.auth - DEBUG - Signature:
xxxx
2024-12-22 15:13:41,541 - urllib3.connectionpool - DEBUG - https://glue.eu-central-1.amazonaws.com:443 "GET /iceberg/v1/catalogs/311141556126/namespaces/ibtest/tables HTTP/1.1" 200 59
[('ibtest', 'ibtest1')]
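The last line of the CanonicalRequest for this call (e3b0c442...) was not masked. It is not a secret: it is the SHA-256 hash of an empty request body, which SigV4 uses as the payload hash for a GET without a body. A quick check with the standard library:

```python
import hashlib

# SHA-256 of zero bytes, i.e. the payload hash of a request without a body
EMPTY_PAYLOAD_HASH = hashlib.sha256(b"").hexdigest()
print(EMPTY_PAYLOAD_HASH)
```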
Scan Tables
2024-12-22 15:13:41,567 - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials
2024-12-22 15:13:41,567 - botocore.auth - DEBUG - Calculating signature using v4 auth.
2024-12-22 15:13:41,567 - botocore.auth - DEBUG - CanonicalRequest:
GET
/iceberg/v1/catalogs/311141556126/namespaces/ibtest/tables/ibtest1
accept:*/*
accept-encoding:gzip, deflate
content-type:application/json
host:glue.eu-central-1.amazonaws.com
x-amz-date:20241222T141341Z
x-client-version:0.14.1
x-iceberg-access-delegation:vended-credentials
accept;accept-encoding;content-type;host;x-amz-date;x-client-version;x-iceberg-access-delegation
xxxxxx
2024-12-22 15:13:41,567 - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20241222T141341Z
20241222/eu-central-1/glue/aws4_request
xxxxx
2024-12-22 15:13:41,567 - botocore.auth - DEBUG - Signature: xxxxxx
2024-12-22 15:13:41,712 - urllib3.connectionpool - DEBUG - https://glue.eu-central-1.amazonaws.com:443 "GET /iceberg/v1/catalogs/311141556126/namespaces/ibtest/tables/ibtest1 HTTP/1.1" 200 2123
    id  name                 created
0  001  test 2024-12-22 13:48:31.381
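All three calls in the logs share the same path layout: a catalog prefix containing the account ID, followed by the namespace and table. The sketch below rebuilds those paths. The function resource_path is hypothetical, and it is an assumption here that the prefix (catalogs/311141556126 in the logs) is handed to the client by the initial /iceberg/v1/config call.

```python
BASE_PATH = "/iceberg/v1"


def resource_path(prefix, namespace=None, table=None):
    """Build the request path for namespaces, tables, or a single table."""
    parts = [BASE_PATH, prefix, "namespaces"]
    if namespace is not None:
        parts += [namespace, "tables"]
        if table is not None:
            parts.append(table)
    return "/".join(parts)


prefix = "catalogs/311141556126"  # account ID as it appears in the logs
print(resource_path(prefix))
print(resource_path(prefix, "ibtest"))
print(resource_path(prefix, "ibtest", "ibtest1"))
```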
Conclusion
With the latest release of Glue you can access Iceberg tables on AWS using the standard Iceberg REST API, which opens the infrastructure to multiple tools.
The only AWS-specific part is signing the requests with SigV4 (https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html). Many tools already use it for S3 access.
This decouples the code from AWS-specific access and allows you to use more generic tools.
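To make the signing step less of a black box, here is a minimal sketch of the SigV4 calculation whose intermediate values (CanonicalRequest, StringToSign, Signature) appear in the debug logs. It uses only the standard library. The secret key is a dummy value, the header set is reduced to host and x-amz-date for brevity, and the function name sigv4_sign is made up for this example. In practice PyIceberg and botocore do this for you.

```python
import hashlib
import hmac


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_sign(secret_key, amz_date, region, service, canonical_request):
    """Return (string_to_sign, signature) for an already built canonical request."""
    date_stamp = amz_date[:8]
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request"
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return string_to_sign, signature


# Simplified canonical request for the config call (fewer headers than in the log)
canonical_request = "\n".join([
    "GET",
    "/iceberg/v1/config",
    "",  # empty query string
    "host:glue.eu-central-1.amazonaws.com\nx-amz-date:20241222T141341Z\n",
    "host;x-amz-date",
    hashlib.sha256(b"").hexdigest(),  # payload hash of an empty body
])

string_to_sign, signature = sigv4_sign(
    "dummy-secret-key",  # placeholder, not a real credential
    "20241222T141341Z", "eu-central-1", "glue", canonical_request,
)
print(string_to_sign)
print(signature)
```

The first three lines of the printed StringToSign match the ones in the logs. The hash and the signature differ because the key and the header set are not the real ones.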
...