Geekzone: technology news, blogs, forums
Python examples not working
#302932 4-Jan-2023 19:11
I'm not sure whether it's a fault of my installation, the version I'm using, or just newbie mistakes, but none of the examples that use the PDF modules to open a table and export it to Excel are working; they all give errors.

For example:

[code]
import openpyxl
import tabula

# Read in the PDF file
df = tabula.read_pdf("table.pdf", pages="all")

# Create a new Excel file
wb = openpyxl.Workbook()
ws = wb.active

# Iterate through the rows and columns of the dataframe and write the values to the Excel worksheet
for r in df.index.tolist():
    for c in df.columns:
        ws.cell(row=r+1, column=c+1).value = df.iloc[r, c]

# Save the Excel file
wb.save("table.xlsx")
[/code]

This gives the following error on the first for line:

AttributeError: 'builtin_function_or_method' object has no attribute 'tolist'

The examples from this page seem to work:

Scraping Tables from PDF Files Using Python | Towards Data Science

... but only after I removed the all=true option.

What's at fault here?




ezbee
1309 posts

Uber Geek


  #3017022 4-Jan-2023 19:54
Hi, I, Um, asked ChatGPT.

This could be rubbish, but see what you think.

 

""

 


This error is occurring because df.index is returning a built-in function or method (i.e. range or enumerate), rather than a list-like object. This is causing the tolist() method to raise an AttributeError, as built-in functions and methods do not have a tolist() attribute.

 

To fix this error, you can try using df.index.to_list() instead of df.index.tolist(). This will convert the index of the dataframe to a list, which should allow you to iterate over it in the for loop.

 

Alternatively, you can try iterating over the rows of the dataframe directly using for r in df.iterrows():, which should work regardless of the type of the dataframe's index.

 

""
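
Reading the error message again, a different cause looks more likely than the one suggested above. As far as I can tell, current versions of tabula-py return a list of DataFrames from read_pdf (one per table found) rather than a single DataFrame. That would make df.index the built-in list.index method, which has no tolist attribute, and to_list() or iterrows() would fail for the same reason. The sketch below first reproduces the message with a plain list standing in for the tabula result, then shows one possible way to export every table. The export_tables helper and the Table1, Table2 sheet names are made up for illustration, and Part 2 assumes pandas, openpyxl and tabula-py (with Java) are installed.

```python
# Part 1: reproduce the error without tabula.
# A plain list stands in for the return value of tabula.read_pdf(...).
tables = ["first table", "second table"]

# On a list, .index is the built-in list.index method, not a pandas Index.
print(type(tables.index).__name__)

try:
    tables.index.tolist()
except AttributeError as exc:
    message = str(exc)
    print(message)


# Part 2: one possible fix (hypothetical helper, not from the thread).
def export_tables(pdf_path, xlsx_path):
    """Write each table found in the PDF to its own worksheet."""
    # Imported here so Part 1 runs even without these packages installed.
    import pandas as pd
    import tabula

    # Assumption: read_pdf returns a list of DataFrames, one per table.
    found = tabula.read_pdf(pdf_path, pages="all")
    if not found:
        raise ValueError(f"no tables found in {pdf_path}")

    # pandas uses openpyxl behind the scenes to write .xlsx files.
    with pd.ExcelWriter(xlsx_path) as writer:
        for number, table in enumerate(found, start=1):
            table.to_excel(writer, sheet_name=f"Table{number}", index=False)
    return len(found)
```

Calling export_tables("table.pdf", "table.xlsx") would then take the place of the nested loops. Those loops have a second problem anyway: c is a column label (usually a string), so c+1 and df.iloc[r, c] would most likely fail even on a single DataFrame.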

