3.2.3 How to Upload Content to Activate the Contract or Deliver an Order on Data Boutique

Updated by Andrea Squatrito

To activate a data seller contract (in "Selecting" status) or to fulfill an “On Demand” order, sellers must upload the required files to Data Boutique’s AWS S3 storage at the specified delivery path. This path, provided on your contract page, ensures that files are correctly associated with your dataset and meet platform requirements.

Validation Process

After you upload a file, a validator will examine both the delivery path and file format to ensure compliance:

  • Delivery Path: Files must be uploaded to the exact path specified in your contract. Incorrect paths will result in the file not being accepted or published.
  • Schema Compliance: Each contract is linked to a precise schema. The validator will check that your file adheres to this schema. Any mismatches in structure or content will result in the file being refused.

Following the correct path and schema requirements is essential to ensure your data is published successfully and made available for buyers.

Please ensure that your file meets the following specifications:

  • File Type: Use a .txt file in standard UTF-8 encoding.
  • File Compression: Files should be uncompressed.
  • Headers: Do not include a header row; the first row should contain data.
  • Enclosures: All content should be enclosed in quotes.
  • Separator: Use a semicolon (;) as the field separator.
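As a sketch, a file meeting these specifications can be written with Python's csv module. The rows and column layout below are purely illustrative; the real fields come from your contract's schema:

```python
import csv

# Hypothetical rows: the actual columns are defined by your contract's schema.
rows = [
    ["recXXXXXXXXXXXXXX", "AKIAS3XXXXXXXXXXXXXX", "PRD01", "Example Product", "19.99"],
    ["recXXXXXXXXXXXXXX", "AKIAS3XXXXXXXXXXXXXX", "PRD01", "Another Product", "24.50"],
]

# UTF-8 encoding, no header row, semicolon separator, every field quoted.
with open("data_file.txt", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f, delimiter=";", quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
```

csv.QUOTE_ALL ensures every field is enclosed in quotes, matching the enclosure requirement above.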

Mandatory Fields

  1. Contract ID: You must use the exact Contract ID associated with your listing (e.g., recXXXXXXXXXXXXXX). If this field is incorrect, the file will be refused.
  2. Seller ID: Use your own Seller ID (e.g., AKIAS3XXXXXXXXXXXXXX). If this field is incorrect, the file will be refused.
  3. dbq_prd_type: Use the exact code specified in the field list for dbq_prd_type. If this code is incorrect, the file will be refused.
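Because an incorrect Contract ID or Seller ID gets the whole file refused, a quick local check before uploading can save a round trip. This sketch assumes, purely for illustration, that the Contract ID is the first field and the Seller ID the second; adjust the indices to match your contract's actual field list:

```python
import csv

def check_mandatory_fields(path, contract_id, seller_id):
    """Raise ValueError if any row carries the wrong Contract ID or Seller ID.

    Assumed (hypothetical) layout: Contract ID in column 0, Seller ID in
    column 1. Change the indices to match your contract's schema.
    """
    with open(path, encoding="utf-8", newline="") as f:
        for i, row in enumerate(csv.reader(f, delimiter=";"), start=1):
            if row[0] != contract_id:
                raise ValueError(f"row {i}: unexpected Contract ID {row[0]!r}")
            if row[1] != seller_id:
                raise ValueError(f"row {i}: unexpected Seller ID {row[1]!r}")
```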

File Naming

The file name must always be data_file.txt. The directory path will vary based on the delivery folder, but the file name itself should remain constant.

AWS S3 Delivery Path Format

Your delivery path will be structured as follows:

s3://databoutique.com/sellers/[Your AWS Seller ID]/[Contract ID]/[Collection date in YYYY-MM-DD format]/data_file.txt

Example:

If your AWS Seller ID is AKIABCDEFGH123456789, your Contract ID is recxxxxxxxxxxxxxx, and the collection date is 2024-11-05, your delivery path would look like this:

s3://databoutique.com/sellers/AKIABCDEFGH123456789/recxxxxxxxxxxxxxx/2024-11-05/data_file.txt
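Since the path is purely positional, it can be assembled from its parts; this small sketch just rebuilds the example above:

```python
bucket = "databoutique.com"
seller_id = "AKIABCDEFGH123456789"
contract_id = "recxxxxxxxxxxxxxx"
collection_date = "2024-11-05"  # collection date, YYYY-MM-DD

delivery_path = (
    f"s3://{bucket}/sellers/{seller_id}/{contract_id}/"
    f"{collection_date}/data_file.txt"
)
print(delivery_path)
```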

Below are examples of how to upload files to this path using Python (boto3), AWS CLI, and Java.

Uploading Using Python (boto3)

Using Python’s boto3 library is a convenient way to upload files to AWS S3 programmatically.

Step 1: Install boto3

If you haven’t installed boto3, you can do so using pip:

pip install boto3

Step 2: Upload File

Here’s an example of how to upload a file using boto3:

import boto3

# Define your AWS credentials and bucket details
aws_access_key_id = 'YOUR_ACCESS_KEY'
aws_secret_access_key = 'YOUR_SECRET_KEY'
bucket_name = 'databoutique.com'
seller_id = 'AKIABCDEFGH123456789'
contract_id = 'recxxxxxxxxxxxxxx'
collection_date = '2024-11-05'
file_path = 'path/to/your/data_file.txt'

# Create S3 client
s3 = boto3.client(
    's3',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name='eu-central-1'
)

# Define the destination path
s3_key = f'sellers/{seller_id}/{contract_id}/{collection_date}/data_file.txt'

# Upload the file
s3.upload_file(file_path, bucket_name, s3_key)
print("File uploaded successfully.")

Uploading Using AWS CLI

If you prefer using the AWS CLI, follow these steps:

Step 1: Configure AWS CLI

Ensure your AWS CLI is configured with your credentials:

aws configure

Step 2: Upload File

Use the following command to upload your file:

aws s3 cp path/to/your/data_file.txt s3://databoutique.com/sellers/AKIABCDEFGH123456789/recxxxxxxxxxxxxxx/2024-11-05/data_file.txt

Replace path/to/your/data_file.txt with the path to your local file. The file will be uploaded to the specified S3 path.

Uploading Using Java (AWS SDK)

For Java users, the AWS SDK provides methods to upload files to S3.

Step 1: Add AWS SDK Dependency

If you’re using Maven, add the following dependency:

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>2.17.19</version>
</dependency>

Step 2: Upload File

Use the following Java code to upload the file:

import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import java.nio.file.Paths;

public class UploadToS3 {
    public static void main(String[] args) {
        String bucketName = "databoutique.com";
        String sellerId = "AKIABCDEFGH123456789";
        String contractId = "recxxxxxxxxxxxxxx";
        String collectionDate = "2024-11-05";
        String filePath = "path/to/your/data_file.txt";

        S3Client s3 = S3Client.builder()
                .region(Region.EU_CENTRAL_1)
                .credentialsProvider(ProfileCredentialsProvider.create())
                .build();

        String s3Key = "sellers/" + sellerId + "/" + contractId + "/" + collectionDate + "/data_file.txt";

        PutObjectRequest putObjectRequest = PutObjectRequest.builder()
                .bucket(bucketName)
                .key(s3Key)
                .build();

        s3.putObject(putObjectRequest, Paths.get(filePath));
        System.out.println("File uploaded successfully.");
    }
}

In Summary

  • AWS S3 Delivery Path: Ensure your file is uploaded to the correct path as specified on the contract page, following the exact format required.
  • Validation Requirements: The file’s delivery path and schema must match the contract’s specifications. Any errors will prevent the file from being accepted.
  • Supported Tools: Use Python (boto3), AWS CLI, or Java (AWS SDK) to upload your file.

Uploading to the correct S3 path and adhering to schema requirements ensures that your data will pass validation, allowing the dataset to be activated or delivered for buyers as intended.

