S3 Integration
Audience: Data Owners, Data Users, and System Administrators
Content Summary: Immuta supports an S3-style REST API, which allows you to communicate with Immuta the same way you would with S3. Consequently, Immuta easily integrates with tools you may already be using to work with S3.
S3 as a Filesystem
In this integration, Immuta implements a single bucket (with data sources broken up as sub-directories under that bucket), since some S3 tools only support the new virtual-hosted style requests.
The three APIs (outlined below) used in this integration support basic AWS functionality; the requests and responses for each are identical to those in S3.
GET Bucket
This request returns the bucket configured within Immuta.
Method | Path | Successful Status Code |
---|---|---|
GET | /s3p | 200 |
GET Bucket Contents
This request returns the contents of the given bucket.
Method | Path | Successful Status Code |
---|---|---|
GET | /s3p/{bucket} | 200 |
GET Object
This request returns a stream from the requested object within Immuta.
Method | Path | Successful Status Code |
---|---|---|
GET | /s3p/{bucket}/{dataSource}/{key*} | 200 |
Example Request:
curl \
    --request GET \
    --header "Authorization: AWS <API KEY>:immuta" \
    https://demo.immuta.com/s3p/immuta/my_data_source/path/to/file/myfile.json
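The same GET Object request can be assembled in Python with only the standard library. This is a minimal sketch rather than an official Immuta client: the helper name build_object_request and its parameters are hypothetical, and the path layout and Authorization header simply mirror the curl example above.

```python
import urllib.parse
import urllib.request


def build_object_request(base_url, api_key, data_source, key, bucket="immuta"):
    """Build (but do not send) a GET Object request for the S3-style API.

    Hypothetical helper: the path layout and Authorization header mirror
    the curl example; nothing here is an official Immuta client.
    """
    # Percent-encode each path piece but keep "/" so nested keys stay nested.
    path = "/".join(
        urllib.parse.quote(part, safe="/") for part in (bucket, data_source, key)
    )
    url = f"{base_url.rstrip('/')}/s3p/{path}"
    request = urllib.request.Request(url, method="GET")
    # Same header value as the curl example: "AWS <API KEY>:immuta".
    request.add_header("Authorization", f"AWS {api_key}:immuta")
    return request
```

To send the request, pass the returned object to urllib.request.urlopen and read the response as a stream.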
Example: HTTP Request and Response
GET Bucket Example Request:
curl \
    --request GET \
    --header "Authorization: AWS <API KEY>:immuta" \
    "https://demo.immuta.com/s3p/immuta?delimiter=/&prefix=my_data_source/path/to/file"
Note: There is a single file in the requested directory.
GET Bucket Example Response:
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://doc.s3.amazonaws.com/2006-03-01/">
    <IsTruncated>false</IsTruncated>
    <Marker></Marker>
    <Name>immuta</Name>
    <Prefix>my_data_source/path/to/file</Prefix>
    <MaxKeys>1000</MaxKeys>
    <Delimiter>/</Delimiter>
    <Contents>
        <Key>my_data_source/path/to/file/myfile.json</Key>
        <LastModified>2018-11-05T21:25:04.000Z</LastModified>
        <ETag>5b0810c82a69a70e552cece19b20585fc94b67fe4eaa8b</ETag>
        <Size>389</Size>
        <StorageClass>STANDARD</StorageClass>
        <Owner>
            <ID>Immuta</ID>
            <DisplayName>Immuta</DisplayName>
        </Owner>
    </Contents>
</ListBucketResult>
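A client that calls the REST endpoint directly needs to pull the object keys out of this XML. The sketch below shows one way to do that with Python's standard library. The function name parse_bucket_listing and the shape of the returned dictionaries are assumptions made for illustration; only the element names come from the response above.

```python
import xml.etree.ElementTree as ET


def parse_bucket_listing(xml_text):
    """Return one dict per <Contents> entry in a ListBucketResult document."""
    if isinstance(xml_text, str):
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)

    def local(tag):
        # Drop the "{namespace}" prefix ElementTree adds to tag names.
        return tag.rsplit("}", 1)[-1]

    objects = []
    for element in root:
        if local(element.tag) != "Contents":
            continue
        fields = {local(child.tag): (child.text or "") for child in element}
        objects.append(
            {
                "key": fields.get("Key", ""),
                "size": int(fields.get("Size") or 0),
                "last_modified": fields.get("LastModified", ""),
            }
        )
    return objects
```

Matching on the local tag name rather than the full namespaced name keeps the sketch working even if the namespace URI in the response differs from the one shown here.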
Example: Using Boto 3 to Download Objects
Boto 3 is the official Amazon Web Services client SDK for Python and is widely used by developers for accessing S3 objects. With Immuta's S3 integration, Immuta users can use boto3 to download policy-enforced files or tables.
The first step is to create a Session and, from it, an S3 client that points to your Immuta endpoint and is authenticated with a user-specific API key.
import boto3

session = boto3.session.Session()
# The user's Immuta API key is the access key; the secret is the literal string 'immuta'.
s3_client = session.client(
    service_name='s3',
    aws_access_key_id='<YOUR_USER_API_KEY>',
    aws_secret_access_key='immuta',
    endpoint_url='https://<YOUR_IMMUTA_URL>:443/s3p'
)
To find out what objects are available for download, you can list the objects in the immuta bucket. To filter down to a particular data source, pass in a Prefix that corresponds to the SQL table name of your Immuta data source.
bucket_contents = s3_client.list_objects(
    Bucket='immuta',
    Delimiter='/',
    Prefix='<SQL_TABLE_NAME>'
).get("Contents")
print(bucket_contents[0])
Example output:
{
    'Key': '<SQL_TABLE_NAME>/<SINGLE_OBJECT_KEY>',
    'ETag': 'aa0492082b95c5d8bb90377a006e...',
    'StorageClass': 'STANDARD',
    'Owner': {'DisplayName': 'Immuta', 'ID': 'Immuta'}
}
Once you have an object key, you can use the download_file method to download the object to your local development environment.
s3_client.download_file(
    Bucket="immuta",
    Key="<SQL_TABLE_NAME>/<SINGLE_OBJECT_KEY>",
    Filename="<OUTPUT_FILE_PATH>"
)
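The listing and download steps above can be combined to fetch every object under a data source. The following is a rough sketch under stated assumptions: download_prefix is a hypothetical helper, it expects the s3_client created earlier, it reads only the first page of results (whether the endpoint paginates is not covered here), and it saves each object under its base file name.

```python
import os


def download_prefix(s3_client, prefix, dest_dir, bucket="immuta"):
    """Download every object listed under `prefix` into `dest_dir`.

    `s3_client` is the boto3 client created earlier. Only the first page
    of results is read; pagination is not handled in this sketch.
    """
    os.makedirs(dest_dir, exist_ok=True)
    response = s3_client.list_objects(Bucket=bucket, Delimiter="/", Prefix=prefix)
    downloaded = []
    # "Contents" is absent when nothing matches, so fall back to an empty list.
    for entry in response.get("Contents") or []:
        key = entry["Key"]
        if key.endswith("/"):
            continue  # skip directory-style placeholder keys
        target = os.path.join(dest_dir, os.path.basename(key))
        s3_client.download_file(Bucket=bucket, Key=key, Filename=target)
        downloaded.append(target)
    return downloaded
```

A call such as download_prefix(s3_client, '<SQL_TABLE_NAME>', '<OUTPUT_DIRECTORY>') returns the list of local paths that were written. Objects that share a base file name would overwrite one another, so a real script may want to preserve more of the key in the local path.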