C# - Null reference exception when trying to get a link by class in HtmlAgilityPack


I have an ASP.NET MVC application that parses an HTML page using HtmlAgilityPack. When I try to loop over the elements, I get the following error in the foreach: "Object reference not set to an instance of an object." The code is below. Does anyone know what my mistake is? I'm new to HtmlAgilityPack.

Part of the HTML:

    <li class="b-serp-item i-bem" onclick="return {&quot;b-serp-item&quot;:{}}">
      <i class="b-serp-item__favicon" style="background-position: 0 -0px"></i>
      <h2 class="b-serp-item__title">
        <b class="b-serp-item__number">1</b>
        <a class="b-serp-item__title-link" href="http://googlescraping.com/google-scraper.php">google</a>
      </h2>
    </li>

Code:

    DateTime dt = DateTime.Now;
    string dtf = string.Format("{0:u}", dt);
    string wp = "page" + dtf + ".html";
    HtmlDocument hd = new HtmlDocument();
    hd.Load(wp);
    string output = "";
    foreach (HtmlNode node in hd.DocumentNode.SelectNodes("//a[@class='b-serp-item__title-link']"))
    {
        output += node.GetAttributeValue("href", null) + " ";
    }

The HTML output is shared on Google Drive: https://drive.google.com/file/d/0b3-m-r5ce0gostlzugltt1vbb00/edit?usp=sharing

I ran your code with one slight change: I used HtmlDocument.LoadHtml(stringContents) instead of HtmlDocument.Load(path), and it works flawlessly.

I suspect your code is unable to find the file at that path. Ensure the file exists using File.Exists(wp), and consider using a fully qualified path instead of just the file name, for example wp = Path.GetFullPath(wp).
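The path check suggested above could be sketched roughly as follows. HtmlFileLocator and ResolveExisting are hypothetical names introduced only for illustration; the sketch uses nothing beyond System.IO and assumes a relative file name should be resolved against the process's current directory, which in a web application may not be the folder you expect.

```csharp
using System;
using System.IO;

static class HtmlFileLocator
{
    // Hypothetical helper: turns a file name into an absolute path and
    // fails loudly if no file exists at that location.
    public static string ResolveExisting(string fileName)
    {
        // GetFullPath resolves relative names against the current directory.
        string fullPath = Path.GetFullPath(fileName);

        if (!File.Exists(fullPath))
        {
            // Including the resolved path in the exception shows where
            // the application actually looked for the file.
            throw new FileNotFoundException("HTML file not found.", fullPath);
        }

        return fullPath;
    }
}
```

With a helper like this, hd.Load(HtmlFileLocator.ResolveExisting(wp)) would report a missing file together with the resolved location, rather than failing later in the loop.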

Or read the contents first using string contents = File.ReadAllText(wp); then pass the contents to the LoadHtml method on HtmlDocument.
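A minimal sketch of that approach, assuming the HtmlAgilityPack package is referenced. LinkExtractor and ExtractTitleLinks are hypothetical names, not part of the library. The sketch also guards against a detail that likely explains the exception in the question: SelectNodes returns null, not an empty collection, when the XPath matches nothing, so iterating over its result directly throws.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class LinkExtractor
{
    // Hypothetical helper: returns the href of every title link in the HTML.
    public static List<string> ExtractTitleLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//a[@class='b-serp-item__title-link']");

        var links = new List<string>();

        // No matches: SelectNodes gave us null, so return the empty list.
        if (nodes == null)
        {
            return links;
        }

        foreach (HtmlNode node in nodes)
        {
            // Fall back to an empty string when the attribute is missing.
            string href = node.GetAttributeValue("href", string.Empty);
            if (href.Length > 0)
            {
                links.Add(href);
            }
        }

        return links;
    }
}
```

It could be called as LinkExtractor.ExtractTitleLinks(File.ReadAllText(wp)), with the results then joined using string.Join(" ", links) to build the same space-separated output as the original loop.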

