Must run in a Windows environment IDE.

Your Final Scripting Project Details:

Your final script will scrape https://casl.website/ within CyberApolis. (The writer can use any website of their choice, since they cannot access this school website.)

IMPORTANT: You are allowed to use standard Python libraries and any 3rd party library that we have used during the class.

Your script will generate a report that contains the following information:

1) Unique URLs of **all** the pages found on the website. This means all sub pages of the domain. (pretty table column 1)
2) Unique URL links to images found on the website (pretty table column 2)
3) Extract any phone numbers found on the website (pretty table column 3)
4) Extract all text content from each of the pages and store it in a string variable (for use in parts 6-8)
5) Extract any Zip Codes (pretty table column 4)

NOTE: For items 6-8 you will be utilizing NLTK to process all the text found on the website, using the text content you extracted during item 4 above.

6) A list of all unique vocabulary found on the website (unique_words.txt)
7) A list of all possible verbs (unique_verbs.txt)
8) A list of all possible nouns (unique_nouns.txt)

NOTE: REGEX PATTERN HELP

ZipCode Regex pattern = `b'\d{5}(?:-\d{4})?'`
Phone Number Regex pattern = `b'\(?\d{3}\)?-? *\d{3}-? *-?\d{4}'`

You will submit:

1) Your Python script (.py file)
2) The full output of your script
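As a rough illustration of items 1-5, here is a minimal sketch of the per-page extraction step. It uses only the Python standard library so that it runs without a network connection; a real submission would probably fetch pages and build the table with whichever third-party libraries the class covered. The names `PageParser`, `extract_page`, and `SAMPLE_HTML`, and the `example.com` URLs, are hypothetical and not part of the assignment. The two regex patterns are the assignment's hints written as `str` patterns instead of bytes, on the assumption that the page has already been decoded to text.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# The assignment's hint patterns, as str patterns (assumption: decoded text).
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")
PHONE_RE = re.compile(r"\(?\d{3}\)?-? *\d{3}-? *-?\d{4}")


class PageParser(HTMLParser):
    """Hypothetical helper: collects links, image URLs, and visible text."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()
        self.images = set()
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.add(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.add(urljoin(self.base_url, attrs["src"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract_page(html, base_url):
    """Return unique same-site links, images, phones, zips, and page text."""
    parser = PageParser(base_url)
    parser.feed(html)
    text = " ".join(parser.text_parts)
    site = urlparse(base_url).netloc
    return {
        "links": sorted(u for u in parser.links if urlparse(u).netloc == site),
        "images": sorted(parser.images),
        "phones": sorted(set(PHONE_RE.findall(text))),
        "zips": sorted(set(ZIP_RE.findall(text))),
        "text": text,
    }


# Made-up sample page used in place of a live request.
SAMPLE_HTML = """
<html><body>
<a href="/about.html">About</a> <a href="/about.html">About again</a>
<a href="https://other.example/x">External</a>
<img src="img/logo.png">
<p>Call (555) 123-4567 or visit us in Tucson, AZ 85721-0001.</p>
<script>var zip = "99999";</script>
</body></html>
"""

result = extract_page(SAMPLE_HTML, "https://example.com/index.html")
for key in ("links", "images", "phones", "zips"):
    print(key, result[key])
```

In a full script, one plausible approach is to call something like `extract_page` on every fetched page, queue the returned same-site links that have not been visited yet, merge the sets across pages, and append each page's `text` to the single string that items 6-8 pass to NLTK. Note that the zip code hint matches any run of five digits, so some false positives should be expected on real pages.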
