Scraping posts, comments and replies from Facebook.
$ git clone https://github.com/utkarsh512/fbscraper.git
$ cd fbscraper
$ pip install . -r requirements.txt
Create a Session object for scraping:
from fbscraper import Session
sess = Session(
    credentials=(EMAIL, PASSWORD),
    chromeDriverPath="chromedriver"
)
where (EMAIL, PASSWORD) are your Facebook credentials and chromeDriverPath is the path to the chromedriver executable.
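To keep the credentials out of the script itself, one option is to read them from environment variables. The sketch below is only an illustration: the helper load_credentials and the variable names FB_EMAIL and FB_PASSWORD are assumptions made for this example and are not part of fbscraper.

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from the environment.

    FB_EMAIL and FB_PASSWORD are hypothetical variable names chosen
    for this example; fbscraper itself does not define them.
    """
    env = os.environ if env is None else env
    try:
        return env["FB_EMAIL"], env["FB_PASSWORD"]
    except KeyError as missing:
        # Report which variable is absent instead of a bare KeyError.
        name = missing.args[0]
        raise RuntimeError(f"environment variable {name} is not set") from None
```

The returned tuple can then be passed as Session(credentials=load_credentials(), chromeDriverPath="chromedriver").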
Then you can extract the recent post URLs of a public page as follows:
sess.getPage("nytimes")
sess.scroll(10)
postURLs = sess.getPostURLs()
Now that you have the list of URLs for the required posts, the post data (including comments) can be scraped as follows:
sess.getPost(
    postURL="https://mbasic.facebook.com/story.php?...",
    dump="posts.pkl",
    getComments=True,
    getReplies=True,
    nComments=1000,
    nReplies=10
)
where
- postURL is the URL of the post
- dump is the name of the binary file used for dumping the post data
- getComments should be True if you want to scrape the comments on the post as well
- getReplies should be True if you want to scrape the replies to the comments as well
- nComments is the upper bound on the number of comments per post
- nReplies is the upper bound on the number of replies per comment
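To scrape every URL returned by getPostURLs(), the call can be wrapped in a loop. This is a minimal sketch and not part of fbscraper: scrape_posts and its prefix parameter are hypothetical names, and it uses only the getPost keyword arguments described above. It is not stated here whether repeated calls append to or overwrite the same dump file, so the sketch writes each post to its own file.

```python
def scrape_posts(sess, post_urls, prefix="posts", n_comments=1000, n_replies=10):
    """Call sess.getPost once per URL and return the URLs that failed.

    Hypothetical helper: each post goes to its own dump file
    (prefix_0.pkl, prefix_1.pkl, ...) so nothing is lost if getPost
    overwrites an existing file.
    """
    failed = []
    for index, url in enumerate(post_urls):
        try:
            sess.getPost(
                postURL=url,
                dump=f"{prefix}_{index}.pkl",
                getComments=True,
                getReplies=True,
                nComments=n_comments,
                nReplies=n_replies,
            )
        except Exception as exc:
            # One bad post should not abort the whole run.
            print(f"skipping {url}: {exc}")
            failed.append(url)
    return failed
```

With a session and URL list as created earlier, failed = scrape_posts(sess, postURLs) returns the URLs that raised an exception so they can be retried.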
Note: Make sure postURL starts with https://mbasic.facebook.com instead of https://www.facebook.com, https://mobile.facebook.com, etc.
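If a URL was copied from another Facebook front end, its host can be rewritten before it is passed as postURL. The helper below is a sketch using only the Python standard library. to_mbasic is a hypothetical name, and the sketch assumes that the same path and query string are valid on the mbasic host, which is not confirmed here.

```python
from urllib.parse import urlsplit, urlunsplit

MBASIC_HOST = "mbasic.facebook.com"


def to_mbasic(url):
    """Return url with its scheme and host replaced by https://mbasic.facebook.com.

    Assumption: the path and query are kept unchanged and still
    resolve on the mbasic host.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Accept facebook.com and any of its subdomains (www, mobile, m, ...).
    if host != "facebook.com" and not host.endswith(".facebook.com"):
        raise ValueError(f"not a facebook.com URL: {url}")
    return urlunsplit(("https", MBASIC_HOST, parts.path, parts.query, parts.fragment))
```

For example, to_mbasic("https://www.facebook.com/story.php?...") keeps the path and query as given and changes only the host.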