Project Description

Domains: Web Scraping, NLP, APIs for Communication, Database Management

Project Overview: The project aims to build techno-sales products for the financial firm AlgoBulls. The first step is to build a repository of active traders from information available on social media platforms, especially LinkedIn and Twitter. Once the repository is ready, each person's social profile will be built using classical and NLP tools, and a communication system will be set up to follow up with them automatically by email. Prototype AI-based chat bots will also be explored as an additional feature.

Work Description

Roles: 3 Web Scraping/NLP developers
Stipend: Paid project (join the community for more details)
Project Duration: 3+ months

Tasks/Deliverables

  • R&D on web scraping methods (APIs, use of proxies, etc.)
  • Initial data extraction and back-end scripting
  • Social profiling through NLP tools
  • API integration for email communication
  • Chat bot design and implementation
  • Analysis and cleaning of web-scraping outputs
  • Pipeline design and optimization
  • Design and maintenance of the database

Skills Learned

  • Proficiency in web scraping methodologies
  • Knowledge of API integration and usage
  • Data extraction, processing and cleaning
  • NLP tools for social profiling
  • Chat bot integration
  • Database management
  • Back-end scripting

Qualifications Required

Experience: Strong Python programming skills are required. Experience with web-scraping libraries (bs4, Scrapy, etc.) and familiarity with databases (PostgreSQL) are a plus.

Year of Study: Second year or above

How to Apply?

Submission Link: https://forms.gle/sWKMSDpAG2urdoKM9

Deadline: 11:59 PM, 18th February, 2024

To enroll in the project, you must fill out the form above. For additional credit, you may attempt the assignment below and submit it to the best of your ability, using any online tools you find helpful. We will contact you personally if you are shortlisted for an interview.

(Optional) Assignment:
The goal of this assignment is to perform web scraping to gather relevant information from social media sites. You can scrape either LinkedIn or Twitter.

LinkedIn Scraping:

  • Scrape the LinkedIn profiles of about 100 professionals who work in software development
  • Extract the following attributes for each profile:
  1. Name
  2. Current Position
  3. Skills
  4. LinkedIn URL
  • Structure this data and present it as a PostgreSQL table
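
One possible way to structure the four attributes is sketched below in Python. The table name, column names, and the choice of a TEXT[] column for skills are assumptions made for illustration; the assignment does not prescribe a schema. The load_profiles helper expects a DB-API connection, for example one returned by psycopg2.connect, which is also an assumption about your setup.

```python
from dataclasses import dataclass, field

# Hypothetical table and column names; the assignment only lists the attributes.
CREATE_PROFILES_SQL = """
CREATE TABLE IF NOT EXISTS linkedin_profiles (
    linkedin_url     TEXT PRIMARY KEY,
    name             TEXT NOT NULL,
    current_position TEXT,
    skills           TEXT[]
);
"""

# %s placeholders follow the psycopg2 parameter style (an assumption).
INSERT_PROFILE_SQL = """
INSERT INTO linkedin_profiles (linkedin_url, name, current_position, skills)
VALUES (%s, %s, %s, %s)
ON CONFLICT (linkedin_url) DO NOTHING;
"""


@dataclass
class Profile:
    name: str
    current_position: str
    linkedin_url: str
    skills: list = field(default_factory=list)

    def to_row(self):
        """Return a tuple in the column order used by INSERT_PROFILE_SQL."""
        # Trim whitespace and a trailing slash so the same URL is stored once.
        url = self.linkedin_url.strip().rstrip("/")
        # Drop empty entries and duplicates, then sort for a stable order.
        skills = sorted({s.strip() for s in self.skills if s.strip()})
        return (url, self.name.strip(), self.current_position.strip(), skills)


def load_profiles(conn, profiles):
    """Create the table if needed and insert the given profiles."""
    rows = [p.to_row() for p in profiles]
    with conn.cursor() as cur:
        cur.execute(CREATE_PROFILES_SQL)
        cur.executemany(INSERT_PROFILE_SQL, rows)
    conn.commit()
    return len(rows)
```

Using the profile URL as the primary key means that re-running the scraper does not create duplicate rows; a surrogate key would work equally well if you prefer to keep repeated snapshots.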

Twitter Scraping:

  • Search for the hashtag “#coding” and scrape 1000 tweets that contain this tag
  • Extract the following information for each tweet:
  1. Tweet content
  2. Number of likes
  3. Twitter handle of the user
  • Present this data as a PostgreSQL table
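
Before loading tweets into a table, it usually helps to clean the raw scraper output. The Python sketch below assumes each scraped tweet is a dict with the keys content, handle, and likes; that input shape, the table name, and the column names are all hypothetical. It keeps only tweets that contain the exact hashtag, removes duplicates, and stops at the requested count.

```python
import re

# Hypothetical schema; the assignment only names the three fields.
CREATE_TWEETS_SQL = """
CREATE TABLE IF NOT EXISTS coding_tweets (
    id      SERIAL PRIMARY KEY,
    handle  TEXT NOT NULL,
    content TEXT NOT NULL,
    likes   INTEGER NOT NULL DEFAULT 0
);
"""

# Match "#coding" as a whole tag, so "#codinglife" does not count.
HASHTAG = re.compile(r"(?<!\w)#coding(?!\w)", re.IGNORECASE)


def clean_tweets(raw_tweets, limit=1000):
    """Filter, de-duplicate, and normalise scraped tweets."""
    seen = set()
    cleaned = []
    for tweet in raw_tweets:
        # Collapse runs of whitespace and newlines into single spaces.
        content = " ".join(tweet["content"].split())
        if not HASHTAG.search(content):
            continue
        # Store every handle with exactly one leading "@".
        handle = "@" + tweet["handle"].strip().lstrip("@")
        key = (handle.lower(), content)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"handle": handle, "content": content, "likes": int(tweet["likes"])}
        )
        if len(cleaned) == limit:
            break
    return cleaned
```

The cleaned dicts can then be inserted with a parameterised INSERT statement through whichever PostgreSQL driver you use.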

Submit the web-scraping script, a CSV file with the relevant information, and a snapshot of the data table in a GitHub repository.
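
For the CSV part of the submission, Python's standard csv module is sufficient. The sketch below assumes the scraped records are already a list of dicts; the function name and the example field names are hypothetical.

```python
import csv


def write_csv(rows, path, fieldnames):
    """Write a list of dicts to a UTF-8 CSV file with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Example call, assuming tweet records with these three keys:
# write_csv(tweets, "tweets.csv", ["handle", "content", "likes"])
```

A file written this way can also be bulk-loaded from psql with the \copy command in CSV format, which is one way to produce the table snapshot from the same data.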

Contact Us

For any general queries, join the ProSpace WhatsApp group: https://chat.whatsapp.com/E09qtrcuShp1uf2w82LCsa

For assignment queries, contact:

Email: satyamm435@gmail.com

Phone: 9324865787

Announcements

Recruitment Open: Web Scraping & NLP

14-Feb-2024

Recruitment for the AlgoBulls project on web scraping is now open! Join the project and gain valuable skills in SQL, NLP and more. Check out the submission link and assignment to apply.
