2.3.1.1 API Access: Setting Up and Using the AWS S3 API
Data Boutique offers programmatic access to its datasets through the AWS S3 API. This guide explains step by step how to set up access, configure your tools, and retrieve files from the Data Boutique S3 bucket.
Overview
To access files via the AWS S3 API, you need an active Data Boutique subscription. Log in to your Data Boutique account to obtain your AWS credentials; with them you can read the datasets in your directory of the Data Boutique S3 bucket and automate your data ingestion.
Prerequisites
- Data Boutique Subscription: Ensure you have an active subscription with a recurring payment method.
- AWS Credentials: Obtain your AWS access key and AWS secret key by logging into your Data Boutique account.
Accessing Files via AWS S3 API
The datasets are stored in the databoutique.com S3 bucket (region eu-central-1), at the following location:
https://s3.eu-central-1.amazonaws.com/databoutique.com/buyers/[AWS access key]
Replace [AWS access key] with your unique access key to access your specific directory.
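As a small illustration of this path layout, the following sketch builds the key prefix and the s3:// URI for one buyer directory. The names `buyer_prefix` and `buyer_s3_uri` are hypothetical helpers, not part of any Data Boutique or AWS library; the bucket name and the buyers/ path are taken from the URL above.

```python
BUCKET = "databoutique.com"  # bucket name taken from the URL above


def buyer_prefix(access_key: str) -> str:
    """Return the key prefix of the directory that belongs to one access key."""
    if not access_key or "/" in access_key:
        raise ValueError("access_key must be a non-empty string without '/'")
    return f"buyers/{access_key}/"


def buyer_s3_uri(access_key: str) -> str:
    """Return the s3:// URI form accepted by the AWS CLI."""
    return f"s3://{BUCKET}/{buyer_prefix(access_key)}"


if __name__ == "__main__":
    print(buyer_s3_uri("YOUR_ACCESS_KEY"))
```

The prefix form (without the bucket) is what SDK calls expect in their Prefix or Key arguments, while the URI form is what the AWS CLI expects.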
Tools You Can Use to Access AWS S3
You can use various tools and programming languages to access Data Boutique files via the AWS S3 API:
- AWS Command Line Interface (CLI)
- Python (using boto3)
- Other programming languages with AWS SDKs (e.g., Java, JavaScript, Ruby)
Using AWS CLI
Step 1: Install AWS CLI
If you haven’t installed the AWS CLI, you can do so by following the instructions on AWS’s official page.
Step 2: Configure AWS CLI
Configure the AWS CLI with your access key and secret key:
aws configure
You will be prompted to enter your AWS credentials:
AWS Access Key ID [None]: YOUR_ACCESS_KEY
AWS Secret Access Key [None]: YOUR_SECRET_KEY
Default region name [None]: eu-central-1
Default output format [None]: json
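`aws configure` stores the two keys in an INI-style file, by default ~/.aws/credentials, and boto3 reads the same file when no keys are passed explicitly. The sketch below checks that a profile contains both keys without printing the secret. `profile_has_keys` is a hypothetical helper name, and the default path is assumed not to be overridden through the AWS_SHARED_CREDENTIALS_FILE environment variable.

```python
import configparser
from pathlib import Path

# Default location written by `aws configure` (assumed not overridden)
DEFAULT_CREDENTIALS = Path.home() / ".aws" / "credentials"


def profile_has_keys(path: Path = DEFAULT_CREDENTIALS, profile: str = "default") -> bool:
    """Return True if the profile defines both an access key and a secret key."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file simply yields no sections
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return bool(section.get("aws_access_key_id")) and bool(section.get("aws_secret_access_key"))


if __name__ == "__main__":
    print("default profile configured:", profile_has_keys())
```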
Step 3: Access the Files
To list files in your directory:
aws s3 ls s3://databoutique.com/buyers/YOUR_ACCESS_KEY/
To download a file:
aws s3 cp s3://databoutique.com/buyers/YOUR_ACCESS_KEY/yourfile.csv ./yourfile.csv
Using Python (boto3)
Step 1: Install boto3
Install boto3, the AWS SDK for Python:
pip install boto3
Step 2: Configure boto3
Set up a session using your AWS credentials:
import boto3
# Create a session using your credentials
session = boto3.Session(
    aws_access_key_id='YOUR_ACCESS_KEY',
    aws_secret_access_key='YOUR_SECRET_KEY',
    region_name='eu-central-1'
)
# Create an S3 client
s3 = session.client('s3')
# List files in the bucket
response = s3.list_objects_v2(Bucket='databoutique.com', Prefix='buyers/YOUR_ACCESS_KEY/')
for obj in response.get('Contents', []):
    print(obj['Key'])
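A single `list_objects_v2` call returns at most 1,000 keys, so a directory with many files needs pagination. The following sketch uses boto3's built-in paginator. `iter_keys` is a hypothetical helper name, and the `s3` argument is assumed to be a client created as in the session example above.

```python
from typing import Iterator


def iter_keys(s3, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under prefix, following pagination."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page holds no objects
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage, assuming `s3` is the client from the session example:
#   for key in iter_keys(s3, "databoutique.com", "buyers/YOUR_ACCESS_KEY/"):
#       print(key)
```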
Step 3: Download a File
To download a specific file:
# Download a specific file
s3.download_file('databoutique.com', 'buyers/YOUR_ACCESS_KEY/yourfile.csv', 'yourfile.csv')
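For automated ingestion it is usually more convenient to fetch several files in one pass. The sketch below downloads a list of keys into a local directory and skips files that already exist locally. `download_keys` is a hypothetical helper name, and skipping existing files assumes that a published file is not modified later under the same name, which this guide does not confirm.

```python
import os
from typing import Iterable, List


def download_keys(s3, bucket: str, keys: Iterable[str], dest_dir: str) -> List[str]:
    """Download each key into dest_dir and return the local paths written."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    for key in keys:
        if key.endswith("/"):
            continue  # zero-byte "folder" placeholder, nothing to download
        local_path = os.path.join(dest_dir, os.path.basename(key))
        if os.path.exists(local_path):
            continue  # assumption: an existing local copy is still current
        s3.download_file(bucket, key, local_path)
        written.append(local_path)
    return written


# Usage, assuming `s3` is the client from the session example:
#   download_keys(s3, "databoutique.com",
#                 ["buyers/YOUR_ACCESS_KEY/yourfile.csv"], "./data")
```

Local names are derived from the last path segment only, so two keys in different subdirectories with the same file name would collide; keep the relative path instead if your directory is nested.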
Using Other Programming Languages
AWS SDKs are available for other programming languages, including Java, JavaScript, and Ruby. Refer to the AWS SDK documentation for specific setup and usage instructions in your preferred language.
Common Issues and Troubleshooting
- Access Denied: Check that your AWS access key and secret key are correct and have the necessary permissions.
- Region Issues: Ensure you are using the correct region (eu-central-1).
- Bucket Name and Path: Verify that you are using the correct bucket name and path format.
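When a request fails in boto3, the S3 error code often indicates which of the checks above applies. The sketch below maps a few common S3 error codes to those hints. The mapping is illustrative and not exhaustive, `hint_for` and `list_with_hint` are hypothetical helper names, and the exact code S3 returns can vary with the operation and the permissions involved.

```python
# Hints keyed by S3 error code; the wording mirrors the checklist above.
HINTS = {
    "InvalidAccessKeyId": "Access key not recognized: copy it again from your account.",
    "SignatureDoesNotMatch": "Secret key does not match the access key.",
    "AccessDenied": "Request not permitted: check the keys and the path under buyers/.",
    "NoSuchBucket": "Bucket name is wrong: it should be databoutique.com.",
    "NoSuchKey": "File not found: check the path under buyers/YOUR_ACCESS_KEY/.",
    "PermanentRedirect": "Wrong region: use eu-central-1.",
    "AuthorizationHeaderMalformed": "Wrong region in the request signature: use eu-central-1.",
}


def hint_for(error_code: str) -> str:
    """Return a troubleshooting hint for an S3 error code."""
    return HINTS.get(error_code, f"Unrecognized error code: {error_code}")


def list_with_hint(s3, bucket: str, prefix: str):
    """List keys under prefix, printing a hint if S3 rejects the request."""
    from botocore.exceptions import ClientError  # installed together with boto3

    try:
        response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    except ClientError as exc:
        print(hint_for(exc.response["Error"]["Code"]))
        return []
    return [obj["Key"] for obj in response.get("Contents", [])]
```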
By following this guide, you can set up and use the AWS S3 API to access Data Boutique datasets and automate your data ingestion.