How to deal with AWS (Amazon Web Services)

5 min read · Aug 26, 2020

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides easy-to-use management features so you can organize your data and configure finely-tuned access controls to meet your specific business, organizational, and compliance requirements.

Benefits:

Industry-leading performance, scalability, availability, and durability
Wide range of cost-effective storage classes
Unmatched security, compliance and audit capabilities
Easily manage data and access controls
Most supported cloud storage service

Prerequisites: an understanding of Buckets, Containers and Objects.

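import boto3
import os

# Create a low-level S3 client (credentials are read from the environment)
s3 = boto3.client('s3')

# Collect the key (name) of every object whose key starts with the prefix
files = list(map(lambda x: x['Key'], s3.list_objects_v2(
            Bucket='bucket_name',
            Prefix='the_starting_string_to_your_folders/objects')['Contents']))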

s3.list_objects_v2: Returns some or all (up to 1000 per call) of the objects in a bucket. You can use the request parameters, such as a prefix, as selection criteria to return a subset of the objects in a bucket.

Prefix — Limits the response to keys that begin with the specified prefix

Bucket name — Name of the bucket (String)
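Since one call returns at most 1000 objects, a bigger bucket needs several calls. Here is a minimal sketch of that, assuming a boto3 S3 client is passed in. The helper name list_all_keys is made up for this example; get_paginator is the boto3 client method used to walk through the pages.

```python
def list_all_keys(s3_client, bucket, prefix=""):
    """Collect every key under `prefix`, following pagination past 1000 objects."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is missing from a page when nothing matches the prefix
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

With credentials in place it could be called as list_all_keys(boto3.client('s3'), 'bucket_name', 'txt_data/').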

s3.list_objects_v2 returns a dictionary in which the value of the key 'Contents' is:

'Contents': [
    {
        'Key': 'string',
        'LastModified': datetime(2015, 1, 1),
        'ETag': 'string',
        'Size': 123,
        'StorageClass': 'STANDARD'|'REDUCED_REDUNDANCY'|'GLACIER'|'STANDARD_IA'|'ONEZONE_IA',
        'Owner': {
            'DisplayName': 'string',
            'ID': 'string'
        }
    },
],

We need 'Key' from each entry in 'Contents', which is basically the name of the file/object we want.
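One thing to watch: when no object matches the prefix, the response has no 'Contents' key at all, so indexing it directly raises a KeyError. A small sketch of a safer way to pull out the keys, using a hand-written sample response (the helper name extract_keys and the sample values are made up for illustration):

```python
def extract_keys(response):
    """Return the 'Key' of every object in a list_objects_v2 response."""
    # Fall back to an empty list when nothing matched the prefix
    return [obj["Key"] for obj in response.get("Contents", [])]


# Hand-written sample shaped like a real response (values are illustrative)
sample_response = {
    "Contents": [
        {"Key": "txt_data/random_0.txt", "Size": 123},
        {"Key": "txt_data/random_1.txt", "Size": 456},
    ]
}
sample_keys = extract_keys(sample_response)
no_match_keys = extract_keys({"KeyCount": 0})
```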

How to download the file using s3

# file is the key of the object to download, e.g. one entry of the files list
file_name = 'name of the file in which we want to save our data, basically the path of the file'
s3.download_file('bucket_name', file, file_name)
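Keys often contain folder-like parts such as 'txt_data/random_0.txt', and download_file does not create missing local folders for you. A rough sketch of one way to handle that, assuming a boto3 S3 client is passed in (local_path_for and download_key are hypothetical helper names, not boto3 functions):

```python
import os


def local_path_for(key, download_dir):
    """Map an S3 key to a local path under download_dir."""
    return os.path.join(download_dir, *key.split("/"))


def download_key(s3_client, bucket, key, download_dir):
    """Download one object, creating the local folders it needs first."""
    target = local_path_for(key, download_dir)
    parent = os.path.dirname(target)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # boto3 signature: download_file(Bucket, Key, Filename)
    s3_client.download_file(bucket, key, target)
    return target
```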

How to upload the file

uploading_path = 'The file name from which you want to upload the data'
uploaded_path = 'The name with which you want to save the data on aws'
s3.upload_file(uploading_path, bucket_name, uploaded_path)
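If you just want to drop a local file into a folder of the bucket under its own name, the key can be built from the file name. A minimal sketch, assuming a boto3 S3 client is passed in (upload_key_for and upload_into_folder are made-up names for this example):

```python
import os
import posixpath


def upload_key_for(local_path, s3_folder):
    """Build the S3 key: the folder in the bucket plus the local file name."""
    filename = os.path.basename(local_path)
    # S3 keys always use forward slashes, so join with posixpath
    return posixpath.join(s3_folder, filename)


def upload_into_folder(s3_client, local_path, bucket, s3_folder):
    """Upload a local file into s3_folder, keeping its file name."""
    key = upload_key_for(local_path, s3_folder)
    # boto3 signature: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(local_path, bucket, key)
    return key
```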

How to list all the items in the bucket on aws:

Run the below command in your terminal:

aws s3 ls s3://bucket_name/

For further listing give the folder/object path:

aws s3 ls s3://bucket_name/object_path/

Like this, navigate inside all the objects inside the bucket.

Move the objects inside the bucket:

aws s3 mv s3://bucket_name/move_from_file s3://bucket_name/move_to_path

move_from_file = 'path from where you want to move the file, along with the file name'

If we have an object named text_data inside the bucket and move_file.txt is the file we want to move, i.e. it is inside the object/folder text_data, then move_from_file will be 'text_data/move_file.txt'.

move_to_path = 'path where you want to move the file move_file.txt'

Let's say we want to move this .txt file to the object final_data inside another object dataset. Then move_to_path will be 'dataset/final_data/' (if we want to keep the same file name, i.e. move_file.txt), otherwise 'dataset/final_data/new_name.txt'.
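The rule for move_to_path (a trailing slash keeps the file name, anything else is taken as the new name) can be written down in a few lines. This is only a sketch of the rule as described above, not the CLI's own code, and resolve_destination is a made-up name:

```python
import posixpath


def resolve_destination(source_key, destination):
    """Return the final key for a move, given the move_to_path value."""
    if destination == "" or destination.endswith("/"):
        # Folder-style destination: keep the original file name
        return destination + posixpath.basename(source_key)
    # Otherwise the destination is already the full new name
    return destination


same_name = resolve_destination("text_data/move_file.txt", "dataset/final_data/")
new_name = resolve_destination("text_data/move_file.txt", "dataset/final_data/new_name.txt")
```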

What if we want to move multiple files to a new destination ??

Python script to the rescue✌🏻😎:

files is a list which contains all the files you want to move to the new destination.

file = 'path of the file we want to copy/delete', e.g. 'text_data/move_file.txt'
copy_to_path = the full destination name, e.g. 'dataset/final_data/new_name.txt'

for file in files:
    # copy_object needs a full destination key, so keep each file's own name under the new folder
    copy_to_path = 'dataset/final_data/' + file.split('/')[-1]
    s3.copy_object(Bucket=bucket_name, Key=copy_to_path, CopySource=bucket_name + '/' + file)
    s3.delete_object(Bucket=bucket_name, Key=file)

Basically, first copy the object to the new destination (this saves the original file if things go wrong).
Then delete the previous one, if you don't need the file anymore.
Run this Python script and copy/delete/move anything.
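The same copy-then-delete idea can be wrapped in a function that also reports where each file ended up. A minimal sketch, assuming a boto3 S3 client is passed in and that every file keeps its own name in the destination folder (move_objects is a made-up helper name):

```python
import posixpath


def move_objects(s3_client, bucket, source_keys, destination_folder):
    """Copy each key into destination_folder, then delete the original."""
    moved = {}
    for source_key in source_keys:
        destination_key = posixpath.join(
            destination_folder, posixpath.basename(source_key)
        )
        # Copy first, so the original still exists if the copy fails
        s3_client.copy_object(
            Bucket=bucket,
            Key=destination_key,
            CopySource={"Bucket": bucket, "Key": source_key},
        )
        s3_client.delete_object(Bucket=bucket, Key=source_key)
        moved[source_key] = destination_key
    return moved
```

If copy_object raises an error, the loop stops before delete_object runs, so the original file is left untouched.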

How to access the files/folder/objects on aws

aws s3 ls s3://bucket_name/

The above command will show the content of the bucket.

For example, the content of bucket_name is shown below:

                       PRE s3:/
                       PRE data_nlp/
                       PRE dataset/
                       PRE txt_data/
YYYY-MM-DD HH:MM:SS    KKKK random.json
YYYY-MM-DD HH:MM:SS    JJJJ random.xlsx
YYYY-MM-DD HH:MM:SS    PPPP random.npy
YYYY-MM-DD HH:MM:SS    OOOO random.csv
YYYY-MM-DD HH:MM:SS    TTTT random_0.txt
YYYY-MM-DD HH:MM:SS    TTTT random_1.txt
YYYY-MM-DD HH:MM:SS    TTTT random_2.txt
YYYY-MM-DD HH:MM:SS    AAAA random.pkl

1. To access move_file.txt, i.e. the files inside the object txt_data:

files = list(map(lambda x: x['Key'],s3.list_objects_v2(
            Bucket=bucket_name,
            Prefix ='txt_data/')['Contents']))

files will be the list of all the files inside txt_data.

2. To access all txt files — random_0.txt, random_1.txt, random_2.txt:

txt_files = list(map(lambda x: x['Key'],s3.list_objects_v2(
            Bucket=bucket_name,
            Prefix ='random_')['Contents']))

Output: txt_files = ['random_0.txt', 'random_1.txt', 'random_2.txt']
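Prefix only matches the start of the key, so it cannot select files by extension. One option is to filter the returned keys on your side. The sketch below mimics the prefix match on a hand-written list of keys taken from the example listing, so it runs without AWS access (keys_with_prefix and keys_with_suffix are made-up helper names):

```python
def keys_with_prefix(keys, prefix):
    """Mimic the Prefix parameter: keep keys that start with prefix."""
    return [key for key in keys if key.startswith(prefix)]


def keys_with_suffix(keys, suffix):
    """Client-side filter for things Prefix cannot do, e.g. an extension."""
    return [key for key in keys if key.endswith(suffix)]


# Hand-written sample of the keys in the example bucket
bucket_keys = [
    "random.json", "random.xlsx", "random.npy", "random.csv",
    "random_0.txt", "random_1.txt", "random_2.txt", "random.pkl",
]
txt_files = keys_with_prefix(bucket_keys, "random_")
csv_files = keys_with_suffix(bucket_keys, ".csv")
```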

Possible Errors (Life is not that easy🤓):

While running the command

aws s3 ls s3://bucket_name/

Error 1: Unable to locate credentials. You can configure credentials by running "aws configure".

Solution: We need two things to run this command

1. AWS_ACCESS_KEY_ID
2. AWS_SECRET_ACCESS_KEY

If you have both of them then just run

export AWS_ACCESS_KEY_ID=value
export AWS_SECRET_ACCESS_KEY=value

in the terminal.
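Before running a Python script, you can quickly check whether both variables are visible to it. This sketch only looks at environment variables; credentials saved through "aws configure" live in a file and are not detected here. The helper name missing_credentials is made up for this example.

```python
import os

REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(environment=None):
    """Return the names of the required variables that are unset or empty."""
    env = os.environ if environment is None else environment
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Please export:", ", ".join(missing))
```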

Error 2:

zsh: command not found: aws

Solution 1: If you already have aws in your system run this in terminal

export PATH=~/bin:$PATH
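To see whether the shell can find the aws command at all, the same lookup can be done from Python with the standard library. A small sketch (find_command is a made-up wrapper around shutil.which):

```python
import shutil


def find_command(name):
    """Return the full path of a command on PATH, or None if it is not found."""
    return shutil.which(name)


aws_path = find_command("aws")
if aws_path is None:
    print("aws is not on PATH: fix PATH or install the AWS CLI")
else:
    print("aws found at", aws_path)
```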

Solution 2: Install aws in the system ( What were you expecting without installing aws 😒😏 )

Follow https://docs.aws.amazon.com/cli/latest/userguide/install-macos.html#install-macosos-prereq

1. curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
2. unzip awscli-bundle.zip
3. ./awscli-bundle/install -b ~/bin/aws
4. aws --version

References:

Introduction to Amazon S3:
Cloud Object Storage | Store & Retrieve Data Anywhere | Amazon Simple Storage Service (S3) (aws.amazon.com)

You can find everything related to S3 here, so dig in!
S3 - Boto 3 Docs 1.9.42 documentation (boto3.amazonaws.com)

Find useful things related to the Command Line Interface here:
s3 - AWS CLI 1.18.125 Command Reference (docs.aws.amazon.com)

Installation help:
Install, Update, and Uninstall the AWS CLI version 1 on macOS (docs.aws.amazon.com)

Introduce yourself to Buckets, Containers and Objects:
Key terms | Cloud Storage | Google Cloud (cloud.google.com)