Question

My Perl code using HTML::TableExtract doesn't work.

Here is my code

#!/usr/bin/perl
use strict;
use warnings;

use HTML::TableExtract;


## Extract table from HTML file
my $te = new HTML::TableExtract( attribs => { border => 0} );
$te->parse_file("file_path.html");
my $table = $te->tables;

for my $row ($table->rows) {
    print join(',', @$row), "\n";
}

I keep getting this error:

Can't call method "rows" without a package or object reference at ./parse_table.pl line 13.

Here is my HTML file, truncated to show only the table I am interested in. http://phucnvo.myvnc.com/sandbox/out.html

<div>
  <form name="listAssignmentsForm" action="https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?panel=Main"
    method="post">
    <input type="hidden" name="source" value="0"/>
    <table class="listHier lines nolines" border="0" cellspacing="0"
      summary="List of assignments. Column headers are also links which can be used to sort the table by that column. Column 1: Indicates if the assignment has attachments. Column 2: assignment title and links to edit, duplicate or grade(if allowed). Column 3: status. Column 4: opening date. Column 5: due date. The rest of the columns may or may not be present. Column 6: may have the number submitted and graded. Column 7: may have checkboxes to select and remove the assignment.">
      <tr>
        <th id="attachments" class="attach"> &nbsp; </th>
        <th id="title">
          <a href="#" onclick="location='url'; return false;" title="Sort by title"> Assignment title </a>
        </th>
        <th id="For">
          <a href="#" onclick="location='url'; return false;" title="Sort by audience">For</a>
        </th>
        <th id="status">
          <a href="#"
            onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=assignment_status&amp;panel=Main&amp;sakai_action=doSort'; return false;"
            title="Sort by status"> Status </a>
        </th>
        <th id="openDate">
          <a href="#"
            onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=opendate&amp;panel=Main&amp;sakai_action=doSort'; return false;"
            title="Sort by section"> Open </a>
        </th>
        <th id="dueDate">
          <a href="#"
            onclick="location='https://t-square.gatech.edu/portal/tool/3a34f619-99d1-4548-be57-9ee977fd8127?criteria=duedate&amp;panel=Main&amp;sakai_action=doSort'; return false;"
            title="Sort by due date"> Due </a>
        </th>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment1" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 7</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jul 24, 2013 12:24 am </td>
        <td headers="openDate"> Jul 19, 2013 12:00 pm </td>
        <td headers="dueDate"> Jul 26, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment2" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 6</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jul 19, 2013 4:33 am </td>
        <td headers="openDate"> Jul 11, 2013 12:00 pm </td>
        <td headers="dueDate"> Jul 18, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment3" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 5</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jul 10, 2013 11:37 pm </td>
        <td headers="openDate"> Jun 27, 2013 12:00 pm </td>
        <td headers="dueDate"> Jul 10, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment4" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Threads Practice </a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Not Started </td>
        <td headers="openDate"> Jun 27, 2013 12:00 pm </td>
        <td headers="dueDate"> Jun 27, 2013 12:05 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment5" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 4</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jun 27, 2013 4:58 am </td>
        <td headers="openDate"> Jun 20, 2013 1:00 am </td>
        <td headers="dueDate"> Jun 26, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment6" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 3</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jun 20, 2013 3:19 am </td>
        <td headers="openDate"> Jun 6, 2013 12:00 pm </td>
        <td headers="dueDate"> Jun 19, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment7" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 2</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted Jun 5, 2013 5:39 am </td>
        <td headers="openDate"> May 28, 2013 12:00 pm </td>
        <td headers="dueDate"> Jun 4, 2013 11:55 pm </td>
      </tr>
      <tr>
        <td headers="attachments" class="attach">
          <img id="attachment8" src="/library/image/sakai/attachments.gif?panel=Main" alt="Attachments" width="13" height="11" border="0"/>
        </td>
        <td headers="title">
          <h4><a href="url">Project 1: Processor Design</a></h4>
        </td>
        <td style="padding-bottom:0"> site </td>
        <td headers="status"> Submitted May 31, 2013 2:09 am </td>
        <td headers="openDate"> May 16, 2013 1:40 pm </td>
        <td headers="dueDate"> May 30, 2013 11:55 pm </td>
      </tr>
    </table>
  </form>
</div>

What I expect to see is each assignment's title, status, open date, and close date.


Solution

As ysth suggested, your problem is right here:

my $table = $te->tables;

tables is plural, suggesting it should be called in list context. You're calling it in scalar context. In Perl, many functions that return a list will instead return the number of elements in that list when called in scalar context. tables is one of them, so $table gets set to 1 (the number of matching tables) rather than to a table object. You can't call methods on a number (well, not without autobox).

Try this:

my ($table) = $te->tables;

The parens around $table on the left-hand side make it a list assignment. $table gets the first table found, and any additional tables are discarded.
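
The difference between the two assignments can be seen without HTML::TableExtract at all. The following is only a sketch: matched_tables is a hypothetical stand-in that returns an array, the way a method returning all matches might, and the strings in it are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method that returns an array of matches.
sub matched_tables {
    my @found = ('table_a', 'table_b');
    return @found;
}

my $count   = matched_tables();   # scalar context: the number of elements
my ($first) = matched_tables();   # list assignment: the first element
my @all     = matched_tables();   # list assignment: every element

print "count=$count first=$first all=@all\n";
```

Only the parentheses differ between the first two assignments, yet $count holds a number and $first holds an element, which is why calling a method on the former fails.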

OTHER TIPS

The doc says:

tables()

Return table objects for all tables that matched. Returns an empty list if no tables matched.

It is expecting to be called like:

my @tables = $te->tables();

and apparently it isn't finding any, so it is returning nothing.
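
A minimal sketch of that calling style, with an explicit check for the empty-list case, might look like the following. It assumes HTML::TableExtract is installed from CPAN; the inline $html string is a made-up two-row sample standing in for the real page, and the die message is only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TableExtract;   # CPAN module, as in the question

# Hypothetical sample standing in for the real page.
my $html = <<'HTML';
<table border="0">
  <tr><th>Assignment title</th><th>Status</th></tr>
  <tr><td>Project 7</td><td>Submitted</td></tr>
</table>
HTML

my $te = HTML::TableExtract->new( attribs => { border => 0 } );
$te->parse($html);
$te->eof;

# List context: one table object per matching table.
my @tables = $te->tables;
die "No table with border=0 was found\n" unless @tables;

my @lines;
for my $table (@tables) {
    for my $row ($table->rows) {
        # Guard against undefined cells before joining.
        push @lines, join ',', map { defined $_ ? $_ : '' } @$row;
    }
}
print "$_\n" for @lines;
```

Checking @tables before looping separates "nothing matched" from "called in the wrong context", two cases that are easy to confuse when the only symptom is a failed method call.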

Perhaps you could provide a trimmed-down version of your HTML that still demonstrates the problem, and say what you expect to happen?

Licensed under: CC-BY-SA with attribution