Home Project
This project is a little more complex than the previous ones: it combines elements of
concurrent programming with elements of network programming.
The requirements are as follows:
Build a thread called DownloadThread that connects to a specified URL and
downloads a file to the local filesystem. The file is encrypted using a simple Caesar
cipher (https://en.wikipedia.org/wiki/Caesar_cipher). Make three
instances of this thread that connect to the following locations:
• https://advpython.000webhostapp.com/s1.txt
• https://advpython.000webhostapp.com/s2.txt
• https://advpython.000webhostapp.com/s3.txt
The threads will save the downloaded content to the files s1_enc.txt, s2_enc.txt and
s3_enc.txt, respectively.
pip install click
pip install requests
# threading ships with the Python standard library, so it needs no pip install
import click
import requests
import threading

# Note: the code below handles one chunk of the file per thread,
# downloading the content from the specified location to storage.
def Handler(start, end, url, filename):
    # ask the server only for the bytes start..end (the range is inclusive)
    headers = {'Range': 'bytes=%d-%d' % (start, end)}
    # request the specified part and keep the response
    r = requests.get(url, headers=headers, stream=True)
    # open the pre-allocated file and write the chunk
    # at its own offset
    with open(filename, "r+b") as fp:
        fp.seek(start)
        fp.write(r.content)
@click.command(help="It downloads the specified file with specified name")
@click.option('--number_of_threads', default=4, help="No of Threads")
@click.option('--name', type=click.Path(), help="Name of the file with extension")
@click.argument('url_of_file')
@click.pass_context
def download_file(ctx, url_of_file, name, number_of_threads):
    r = requests.head(url_of_file)
    if name:
        file_name = name
    else:
        # default to the last path component of the URL, e.g. s1.txt
        file_name = url_of_file.split('/')[-1]
    try:
        file_size = int(r.headers['content-length'])
    except (KeyError, ValueError):
        print("Invalid URL")
        return
    # size of each chunk, as a whole number of bytes
    part = file_size // number_of_threads
    # pre-allocate the output file so every thread can seek into it
    with open(file_name, "wb") as fp:
        fp.write(b'\0' * file_size)
    # We create the threads and pass the Handler function, which has the main functionality:
    for i in range(number_of_threads):
        start = part * i
        # HTTP ranges are inclusive; the last thread also takes the remainder
        end = file_size - 1 if i == number_of_threads - 1 else start + part - 1
        # create a Thread with start and end locations
        t = threading.Thread(target=Handler,
            kwargs={'start': start, 'end': end, 'url': url_of_file, 'filename': file_name})
        t.daemon = True
        t.start()
    # wait for every worker thread before reporting success
    main_thread = threading.current_thread()
    for t in threading.enumerate():
        if t is main_thread:
            continue
        t.join()
    print('%s downloaded' % file_name)

if __name__ == '__main__':
    download_file(obj={})
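The listing above splits a single file into byte ranges, whereas the requirement asks for one DownloadThread per URL. A minimal sketch of that variant follows, using only the standard library. The fetch_bytes helper, the fetch parameter, the error attribute and the download_all function are illustrative names that do not come from the assignment.

```python
import threading
import urllib.request


def fetch_bytes(url):
    """Default fetcher (assumed helper): return the raw body of `url`."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


class DownloadThread(threading.Thread):
    """Download one URL and save it to one local file."""

    def __init__(self, url, filename, fetch=fetch_bytes):
        super().__init__()
        self.url = url
        self.filename = filename
        self.fetch = fetch   # hypothetical hook so the class can be tried offline
        self.error = None    # set in run() if the download or the write fails

    def run(self):
        try:
            data = self.fetch(self.url)
            with open(self.filename, "wb") as fp:
                fp.write(data)
        except OSError as exc:
            # urllib's URLError derives from OSError, so network and
            # disk failures are both recorded here
            self.error = exc


def download_all(jobs, fetch=fetch_bytes):
    """Start one DownloadThread per (url, filename) pair, then wait for all."""
    threads = [DownloadThread(url, name, fetch) for url, name in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return threads
```

With the three locations from the requirement, the call would be download_all([("https://advpython.000webhostapp.com/s1.txt", "s1_enc.txt"), ...]); each returned thread's error attribute can be checked afterwards.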
Build a thread called DecryptThread which opens a file downloaded in the previous step
and decrypts its content using the Caesar algorithm, taking into account that the offset
is 8. Each decrypted content will be saved into a data structure in memory.
# key.txt holds one substitution pair per line: encrypted letter, then decrypted letter
keylist1 = []
keylist2 = []
with open("key.txt", "r") as keyFile:
    for line in keyFile:
        keylist1.append(line.split()[0])
        keylist2.append(line.split()[1])

with open("encrypted.txt", "r") as encryptedfile:
    lines = encryptedfile.readlines()

decrypt = ""
for line in lines:
    for currentletter in line:
        if not currentletter.isalpha():
            # keep punctuation, digits and whitespace unchanged
            decrypt += currentletter
        else:
            # look the letter up in the substitution table
            for o in range(len(keylist1)):
                if currentletter == keylist1[o]:
                    decrypt += keylist2[o]
print(decrypt)
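The listing above decrypts with a lookup table read from key.txt. For the Caesar cipher with offset 8 named in the requirement, the shift can be computed directly. The sketch below assumes that encryption shifted each letter forward by 8, so decryption shifts back by 8. It also assumes the results are stored in a dict keyed by file name and guarded by a lock. The caesar_decrypt function and the constructor parameters are illustrative names.

```python
import threading


def caesar_decrypt(text, offset=8):
    """Shift each ASCII letter back by `offset`; leave other characters as-is."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") - offset) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") - offset) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


class DecryptThread(threading.Thread):
    """Decrypt one file and store the plain text in a shared dict."""

    def __init__(self, filename, results, lock, offset=8):
        super().__init__()
        self.filename = filename
        self.results = results   # shared dict: file name -> decrypted text (assumed layout)
        self.lock = lock         # threading.Lock shared by all DecryptThread instances
        self.offset = offset

    def run(self):
        with open(self.filename, "r") as fp:
            plain = caesar_decrypt(fp.read(), self.offset)
        # only the write to the shared structure needs the lock
        with self.lock:
            self.results[self.filename] = plain
```

Keying the dict by file name, rather than appending to a list, means the order in which the threads finish no longer matters when the content is read back.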
Build a class called Combiner which retrieves the content saved in the data structure in
the previous step and writes it to a file called s_final.txt. Be careful: the content must be
written to the file in order. Because the threads can execute in any order, the
content of the data structure may look like
[s2.txt, s3.txt, s1.txt]
or
[s3.txt, s2.txt, s1.txt]
etc., but we want it to look like
[s1.txt, s2.txt, s3.txt]
import logging
import os
from time import time
from download import setup_download_dir, get_links, download_link

logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

def main():
    ts = time()
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = get_links(client_id)
    for link in links:
        download_link(download_dir, link)
    logging.info('Took %s seconds', time() - ts)

if __name__ == '__main__':
    main()
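The listing above does not show the Combiner itself. A minimal sketch follows, assuming the decrypt step stored its results in a dict keyed by source file name (for example "s1_enc.txt") and protected by a lock. The dict layout, the lock and the combine method name are assumptions and are not part of the requirements.

```python
import threading


class Combiner:
    """Write the decrypted parts to one file, ordered by source file name."""

    def __init__(self, results, lock, output="s_final.txt"):
        self.results = results   # shared dict: file name -> decrypted text (assumed layout)
        self.lock = lock         # the same lock the decrypt threads used
        self.output = output

    def combine(self):
        # take a sorted snapshot while holding the lock, then write without it
        with self.lock:
            items = sorted(self.results.items())
        with open(self.output, "w") as fp:
            for _name, text in items:
                fp.write(text)
        # return the order used, which is handy for checking
        return [name for name, _text in items]
```

Sorting by name is enough for s1, s2 and s3; with ten or more files a plain string sort would place s10 before s2, so a numeric key would be needed.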
Also pay attention to the concurrency issues that could occur when you access shared
memory data structures.
The main program should contain (in pseudo-code):
1. instantiate DownloadThread
2. start and join DownloadThread
3. instantiate DecryptThread
4. start and join DecryptThread
5. instantiate Combiner
6. display the content of the s_final.txt file
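The "start and join" steps above can be sketched with a small helper that runs one stage to completion before the next stage begins. The run_stage and demo names are illustrative, and the threads in demo are placeholders that only record their stage; in the real program they would be the DownloadThread and DecryptThread instances.

```python
import threading


def run_stage(threads):
    """Start every thread of one stage, then block until all have finished."""
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def demo():
    """Run two placeholder stages and return the order in which work happened."""
    log = []
    lock = threading.Lock()

    def record(stage, n):
        # the shared list is only touched while holding the lock
        with lock:
            log.append((stage, n))

    downloads = [threading.Thread(target=record, args=("download", n)) for n in (1, 2, 3)]
    decrypts = [threading.Thread(target=record, args=("decrypt", n)) for n in (1, 2, 3)]
    run_stage(downloads)   # steps 1-2: every download finishes first
    run_stage(decrypts)    # steps 3-4: decryption starts only afterwards
    return log
```

Within a stage the three entries may appear in any order, but because run_stage joins every thread before returning, no "decrypt" entry can come before the last "download" entry.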
import logging
import os
from redis import Redis
from rq import Queue
from download import setup_download_dir, get_links, download_link

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logging.getLogger('requests').setLevel(logging.CRITICAL)
logger = logging.getLogger(__name__)

def main():
    client_id = os.getenv('IMGUR_CLIENT_ID')
    if not client_id:
        raise Exception("Couldn't find IMGUR_CLIENT_ID environment variable!")
    download_dir = setup_download_dir()
    links = get_links(client_id)
    q = Queue(connection=Redis(host='localhost', port=6379))
    for link in links:
        q.enqueue(download_link, download_dir, link)

if __name__ == '__main__':
    main()