2024-01-15 Weekly Links
“Artists are driven by the tension between the desire to communicate and the desire to hide. It is a joy to be hidden... but disaster not to be found.” - DW Winnicott
1. Scientific Method + Data Science = Better Results
A Data Scientific Method. How to take a pragmatic and goal-driven… | by Peter Turner | Towards Data Science - Link - DB
This article grabbed me from the get-go:
The main aim of data science is simple: it is to extract value from data.
I could not say it better myself. But to extract value from data, you gotta get yourself organized, man.
And to get organized, you gotta have a framework: an overarching template that lets you ask the right questions, measure the results in a systematic way and reproduce those results in the future. If this sounds a lot like elementary science class, you would be right.
The author proposes to modify the stages of the scientific method for data science projects into 6 identifiable stages:
- Identify
- Understand
- Process
- Analyze
- Conclude
- Communicate
Additional References
2. Quit the NFL and Sell Pokemon Cards? A Story in 2 Parts.
Blake Martinez quit the NFL to sell Pokémon cards, brings in millions - Link - DB
Blake Martinez Attempts NFL Comeback After Allegations of Pokémon Business Scam - Sports Illustrated - Link - DB
I had originally posted the first link as a feel-good, out-of-the-ordinary story: an NFL player quits the league to focus on a booming Pokémon card business, forgoing future bodily harm while simultaneously earning more money.
Recent allegations indicate the story might not be what I first thought: “amid allegations by customers who claimed they were scammed by the [player’s] company,” he was banned from the Whatnot marketplace and is now attempting an NFL comeback.
These are nothing but allegations so far, but they certainly put the first part in context.
The real lesson from this story? Double check your sources, and don’t believe everything you read.
Additional References
- Blake Martinez Pokemon card scam, explained: Why NFL player's company was banned from Whatnot | Sporting News - Link - DB
3. Scheduling Functions on DigitalOcean
How to Schedule Functions :: DigitalOcean Documentation - Link - DB
Similar to GitHub Actions, DigitalOcean Functions are “blocks of code that run on demand without the need to manage any infrastructure.” They can be scheduled to run at a given time using cron.
Other cron schedulers
- Windows - taskscheduleR - Link
- Mac/UNIX - Package cronR - Link
- python-crontab - Link
- How to Execute a Cron Job on Mac With Crontab - Link
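Whichever scheduler you pick, the schedule itself is usually written as a cron expression. Here is a minimal sketch in Python that labels the five fields of the standard syntax (minute, hour, day of month, month, day of week). The helper name describe_cron and the example schedule are my own assumptions, it only splits the string and does not validate values, and individual schedulers may accept extensions, so check their docs.

```python
# Hypothetical helper: label the five fields of a standard cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a five-field cron expression into named fields (no value validation)."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


# Assumed example schedule: every Monday at 09:00.
weekly = describe_cron("0 9 * * 1")
print(weekly)
```

Reading the result back field by field is a quick sanity check before pasting an expression into a scheduler.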
Additional References
- How To Scrape a Website Using Node.js and Puppeteer | DigitalOcean - Link
- Functions Python Runtime | DigitalOcean Documentation - Link
- How to Scrape Etsy Product Data with Python - Link
- Simple Web Scraping using BeautifulSoup and Python in Google Colab | by Nathanael Victorious | Medium - Link - DB
4. Automatically Annoying a Teenager: Building a “Back in My Day” Machine
Simple Telegram Bot with Python and AWS Lambda | by Daniils Petrovs | Level Up Coding - Link - DB
Guide: Telegram bot powered by GPT & DigitalOcean function | by Kuba Płoskonka | JavaScript in Plain English - Link - DB
I was reminiscing that it had been about 30 years since Snoop Dogg’s album Doggystyle hit #1 on the charts.
I drop this kind of musical knowledge on my 14-year-old at random and it annoys him to no end.
So methinks: how could I automate this?
And so blossomed the idea to create a Telegram bot that automatically sends out notifications on the anniversary of an album hitting #1 on the Billboard charts.
Goal
- Create a Telegram bot that sends weekly notification(s) on the anniversary of the top Billboard 200 album from 10, 20, 30 etc. years ago.
Tools
- R - to scrape the source and organize the data.
- Packages
- tidyverse
- lubridate
- rvest
- Python - to run on AWS Lambda.
- Telegram - for the Bot and the channel.
- AWS S3 - file storage and URL.
- AWS Lambda - serverless code functions.
Process
- Step 0: Setup Project + Directory + File Structure
- Step 1: Gather the Data Locally
- Generate a list of urls to scrape
- Save as csv
- Use rvest to scrape Wikipedia pages for the annual top album from the Billboard 200 from 1964 to 2024.
- Items to extract
- date_issue - Issue Date
- album_name - Album Name
- album_url - Album url
- artist_name - Artist Name
- artist_url - Artist url
- Transform the data to a data.frame and save as a csv file.
- Scrape each Album url
- album_summ - Summary of the album
- album_url - Album image url
- Scrape each Artist url
- Download each album image
- Notes & Research
- Example Billboard 200 Starting URL from 1964: https://en.m.wikipedia.org/wiki/List_of_Billboard_200_number-one_albums_of_1964
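The "generate a list of urls and save as csv" part of Step 1 can be sketched as below. This is Python rather than the R/rvest setup named under Tools, and it assumes every year's page follows the same pattern as the 1964 example URL, which I have not verified for all years. The function names build_urls and to_csv are made up for illustration.

```python
import csv
import io

# Pattern copied from the 1964 example URL; only the year changes (an assumption).
BASE_URL = "https://en.m.wikipedia.org/wiki/List_of_Billboard_200_number-one_albums_of_{year}"


def build_urls(first_year: int = 1964, last_year: int = 2024) -> list[dict]:
    """Return one row per year with the page URL to scrape."""
    return [
        {"year": year, "url": BASE_URL.format(year=year)}
        for year in range(first_year, last_year + 1)
    ]


def to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text; write this string to a file to save it."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["year", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


urls = build_urls()
csv_text = to_csv(urls)
print(len(urls), urls[0]["url"])
```

Keeping the year as its own column makes it easy to re-scrape a single year later without rebuilding the whole list.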
- Step 2: Build the Data Structure
- Telegram Bots can do 2-way communication with the Telegram API via either Long-Polling or through a Webhook.
- If we organize our data correctly, and since our Bot only sends messages, we can avoid the need for either method and use a Python HTTP GET request to process the correct url from AWS S3.
- This leads to a decision on how to organize and structure the data on AWS S3.
- From the scraped data we know:
- date_bill - the Billboard date: the date (or week) the album hit number 1 on the Billboard 200.
- date_curr - the current date.
- date_years_ - the round-number anniversaries we want information on: 10, 20, 30, 40 or 50 years as of the current date.
- We can then calculate:
- date_diff - the difference between the current date and the Billboard date.
- date_ann_ - the date of the album's anniversary n years in the future, e.g. date_ann_10.
- send_now - if date_ann_ is within 1 day of date_curr, the bot should send the relevant information.
- Notes & Research
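The Step 2 date logic can be sketched in a few lines of Python. The names date_bill, date_curr and send_now come from the notes above; anniversary_date, tolerance_days and the Feb 29 fallback are my assumptions, and the dates in the example are illustrative, not scraped data.

```python
from datetime import date

ANNIVERSARY_YEARS = (10, 20, 30, 40, 50)  # round-number anniversaries from the notes


def anniversary_date(date_bill: date, years: int) -> date:
    """date_ann_<years>: the Billboard date shifted forward by `years`."""
    try:
        return date_bill.replace(year=date_bill.year + years)
    except ValueError:
        # Feb 29 has no match in a non-leap year; fall back to Feb 28 (assumption).
        return date_bill.replace(year=date_bill.year + years, day=28)


def send_now(date_bill: date, date_curr: date, tolerance_days: int = 1) -> list[int]:
    """Return the anniversaries (in years) within the tolerance of date_curr."""
    hits = []
    for years in ANNIVERSARY_YEARS:
        date_ann = anniversary_date(date_bill, years)
        if abs((date_curr - date_ann).days) <= tolerance_days:
            hits.append(years)
    return hits


# Illustrative dates only.
print(send_now(date(1994, 1, 15), date(2024, 1, 15)))
print(send_now(date(1994, 1, 15), date(2024, 3, 1)))
```

If the job runs weekly rather than daily, a wider tolerance_days may be a better fit; the notes leave that choice open.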
- Step 3: Write the data to AWS S3
- Once we have added the Step 2 fields to the csv file, we can upload it to the S3 bucket.
- Upload an object to your bucket - Amazon Simple Storage Service - Link
- Step 4: Bot - Create the Telegram Bot
- Step 5: Host - Create the AWS Lambda Structure or a DigitalOcean Function
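Since the bot only sends messages, Steps 4 and 5 could boil down to a single HTTP call to the Telegram Bot API's sendMessage method. The sketch below uses only the Python standard library and just prepares the request, so it runs without a network. The token, channel name, message wording and function names are all placeholders I made up, not the final design.

```python
import json
import urllib.parse
import urllib.request

TELEGRAM_API = "https://api.telegram.org/bot{token}/sendMessage"


def build_message(album_name: str, artist_name: str, years: int) -> str:
    """Format the notification text (the wording is a placeholder)."""
    return f"{years} years ago this week, {album_name} by {artist_name} hit #1 on the Billboard 200."


def build_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Prepare a POST to sendMessage without sending it."""
    payload = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(TELEGRAM_API.format(token=token), data=payload, method="POST")


def send(request: urllib.request.Request) -> dict:
    """Send the prepared request and return Telegram's JSON reply (needs network and a real token)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; nothing is sent until send(req) is called.
req = build_request("TOKEN", "@my_channel", build_message("Example Album", "Example Artist", 30))
print(req.full_url, req.get_method())
```

Splitting build_request from send keeps the part that needs a real token separate, which should make it easier to drop the same code into either an AWS Lambda handler or a DigitalOcean Function.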
Additional References
- RStudio | DigitalOcean Documentation - Link
- I Built a Telegram Bot To Schedule Recurring Messages on Telegram | by Hui Shun Chua | Better Programming - Link - DB
- Cron - Apps on Google Play - Link
- Free cronjobs - from minutely to once a year. - cron-job.org - Link
- How To Build a Telegram Quotes Generator Bot With Node.js, Telegraf, Jimp, and Pexels | DigitalOcean - Link - DB
5. Pre-Admissioning
Even if They Didn’t Apply, Some Students Get College Admission Offers - The New York Times - Link - DB
Just sit back, and let the offers roll in kids! Offers to go to school AND go six-figures in debt? Sign me up!
Apparently “direct” admission is a thing now. But whether pre-admission works to get more asses in the lecture hall is still up in the air. “A study published last year looked at six four-year colleges offering automatic acceptance through the Common App. It found that students offered direct admission were about 12 percent more likely to apply to college but were no more likely to enroll than students who applied through traditional channels — in part because of the cost of attending.”
But what else can you apply this “pre-admission” thinking to (other than credit cards)?
How about churches (thou hast been chosen!), doctors' offices (we hear you get sick a lot…), jobs (we heard you were killer at chopping onions; you’re now the executive chef at Spago) or shopping (you are probably out of ketchup so here is a bottle. Love, Target. ps you owe us $5.99).
Whatever the reasoning, when barriers to entry start dropping by the wayside, you should start questioning the value of the product.
Additional References
- None