Language: Python

Package: BeautifulSoup

Recently, a client asked us to collect social IDs from their public page. Ruby has been my default language for a few years, so I spent a few days learning Python and wrote a simple script to do this.

from bs4 import BeautifulSoup as soup  # HTML data structure
from urllib.request import urlopen as uReq # Web client
import re

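def getLinks(url):
    uClient = uReq(url)
    page_html = uClient.read()  # raw HTML of the page
    uClient.close()
    page_soup = soup(page_html, "html.parser")
    links = []

    # Keep only anchors whose href is an absolute http(s) URL
    full_links = page_soup.findAll('a', attrs={'href': re.compile("^http(s)?://")})
    for link in full_links:
        links.append(link.get('href'))

    # Remove duplicates before returning
    return list(set(links))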

print( getLinks("") )
# ['', '', '']
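
The href filter in the script only keeps absolute http(s) links. As a quick check of what the pattern lets through, here is a small sketch that applies the same regex to a few href values. The sample hrefs are made up for illustration and are not from the client's page.

```python
import re

# Same pattern the script passes to findAll for the href attribute
ABSOLUTE_URL = re.compile("^http(s)?://")

# Made-up href values, for illustration only
sample_hrefs = [
    "https://example.com/profile",
    "http://example.com/about",
    "/contact",                    # relative path
    "mailto:someone@example.com",  # not http(s)
    "//cdn.example.com/app.js",    # protocol-relative
    "#top",                        # in-page anchor
]

# Keep only the values the pattern accepts
kept = [href for href in sample_hrefs if ABSOLUTE_URL.search(href)]
print(kept)
# ['https://example.com/profile', 'http://example.com/about']
```

So relative, protocol-relative, and mailto links are skipped, which is fine here if the social links on the page are written as full URLs.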


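  • BeautifulSoup - HTML parser (builds a searchable tree from the page)
  • uReq - alias for urllib.request.urlopen, used as the web client
  • re - Python regular expressions
  • set - Python built-in (converts the list to a set, which removes duplicates)
  • list - Python built-in (converts the set back to a list, similar to a Ruby Array)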
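
One thing to keep in mind with set: it does not keep the links in page order. If order matters, a possible alternative is to dedupe through dict keys instead. This is only a sketch; `unique_in_order` is a hypothetical helper name and the sample links are made up.

```python
def unique_in_order(links):
    """Drop duplicates while keeping the first occurrence of each link."""
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(links))


# Made-up links, for illustration only
sample = ["https://a.example/", "https://b.example/", "https://a.example/"]
print(unique_in_order(sample))
# ['https://a.example/', 'https://b.example/']
```

In getLinks, this would take the place of list(set(links)) in the return line.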

Overall, I think this is very easy to implement in Python.