Monday, 2 September 2013

Equivalent to Python's HTML parsing function/module in Go?

I'm teaching myself Go and am stuck on fetching and parsing HTML/XML.
In Python, I usually write the following code when I do web scraping:
from urllib.request import urlopen, Request
url = "http://stackoverflow.com/"
req = Request(url)
html = urlopen(req).read()
After that, I have the raw HTML/XML as either a string or bytes and can
proceed to work with it. How can I do the same in Go? What I want is the
raw HTML data stored in either a string or a []byte (the two are easily
converted, so I don't mind which one I get). I am considering the gokogiri
package for web scraping in Go (I'm not sure I'll actually end up using
it!), but it looks like it requires the raw HTML text before it can do any
work with it...
So how can I obtain such an object?
Or is there a better way to do web scraping in Go?
Thanks.
