Google Drive to record the "Story Games" and other Pit stuff
Oliver Colt:
--- Quote from: Shinkurex on January 29, 2014, 11:23:11 pm ---
--- End quote ---
HOLY FUNK WHERE DID THAT COME FROM D:
ramjamslam:
Oh, that print link is very useful; it makes screen scraping a thread a lot easier. Here is a very quick and dirty five-minute Python hack that uses the print page to scrape the story. It needs a bunch of cleaning up: to get just the story, you will need to remove people quoting each other and remove the bracketed comments. But here it is (it's very rough):
import requests
from bs4 import BeautifulSoup
from html2text import html2text

# Fetch the print-page version of the thread.
html = requests.get('https://gunsoficarus.com/community/forum/index.php?action=printpage;topic=3086.0').text
p = BeautifulSoup(html, 'html.parser')
# Post bodies are in <dd> elements on the print page; join their text into one string.
print(' '.join(html2text(dd.get_text()).strip() for dd in p.find_all('dd')))
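The cleanup mentioned above (dropping quoted posts and bracketed comments) could be sketched roughly as follows. This is only a guess at one approach, not part of the original hack: `clean_story_text` is a hypothetical helper, the quote markers are assumed to look like the "--- Quote from: ... ---" and "--- End quote ---" lines the print view shows, and "bracketed comments" is assumed to mean text inside `(...)` or `[...]`.

```python
import re

# Assumed quote markers, modeled on the "--- Quote from: ... ---" lines in the print view.
QUOTE_BLOCK = re.compile(r"--- Quote from:.*?--- End quote ---", re.DOTALL)
# Assumption: a "bracketed comment" is non-nested text inside (...) or [...].
BRACKETED = re.compile(r"\([^()]*\)|\[[^\[\]]*\]")


def clean_story_text(text):
    """Hypothetical helper: drop quoted posts and bracketed asides."""
    text = QUOTE_BLOCK.sub(" ", text)
    text = BRACKETED.sub(" ", text)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "The ship rose. --- Quote from: someone on January 1, 2014 --- "
    "old text --- End quote --- (nice one!) Then the storm hit."
)
print(clean_story_text(sample))
```

Nested quotes and nested brackets are not handled here, and any story text that legitimately uses parentheses would be removed too, so the output would still need a read-through.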
Oliver Colt:
--- Quote from: ramjamslam on January 30, 2014, 01:07:29 am ---
html = requests.get('https://gunsoficarus.com/community/forum/index.php?action=printpage;topic=3086.0').text
--- End quote ---
Wait is that the whole thread or just the first page?
Edit: I could probably figure this out if I look at the links closely when I open stuff but I'd rather be sure hearing it from someone who knows xD
Piemanlives:
The entire thread.
macmacnick:
I'm too lazy to learn Python.